The use of content created by others to train AI models has sparked significant debate over intellectual property rights, fair compensation, and the very nature of creativity.
Critics, particularly content creators, artists, and writers, argue that training AI on their work without permission or compensation constitutes a form of theft.
Recently, an online dispute broke out when Cloudflare accused Perplexity of systematically ignoring website blocks and masking its identity to scrape data from sites, as reported in this article. So I decided to run a small experiment on some content I created, to see how AI reads and uses it.
I picked my top answer to a Stack Overflow question and asked Perplexity the same question.
As expected, it gave the answer and cited my answer on SO as a reference. With Google, the user would have had to visit the site to find the answer, but here it is available right there, with even better explanations.
The question is whether the assistance AI provides, even for a simple task like writing this blog post, justifies the lack of credit given to the original creators whose content trained the AI.