When AI learns from what you wrote - An example in the context of the AI web scraping debate


The use of content created by others to train AI models has become a significant debate, touching on intellectual property rights, fair compensation, and the very nature of creativity. Proponents of AI training argue that it is a "transformative" fair use, akin to how a human learns by consuming and being inspired by a vast array of existing works. They contend that AI systems are not simply copying and pasting content, but rather analyzing patterns and relationships within massive datasets to generate new, original outputs.

On the other hand, critics, particularly content creators, artists, and writers, argue that training AI on their work without permission or compensation constitutes a form of theft. They believe that AI companies are profiting from their labor, and that AI-generated content can directly compete with and devalue the original human-made creations, threatening their livelihoods.

Recently, an online dispute broke out when Cloudflare accused Perplexity of systematically ignoring website blocks and masking its identity to scrape data from sites, as reported in this article. So I decided to run a small experiment on some content I created, to see how AI reads and uses it.

I picked my top answer to a Stack Overflow question and asked Perplexity the same question.



As expected, it gives the answer and provides a reference to my answer on SO. Had it been Google, the user would have had to visit the site to find the answer, but here the answer is available right there, with even better explanations.

The question is whether the assistance AI provides, even for a simple task like writing this blog post, justifies the lack of credit given to the original creators whose content trained the AI.

