Reddit hopes to get more content deals by blocking scrapers

Update, 1/08, 5:10 pm: Reddit is blocking search engines and AI bots so that the platform’s content cannot be used to train LLMs without its permission. According to Reddit CEO Steve Huffman, content deals are necessary to prevent the content from being used for unauthorized purposes.

Reddit’s original stance was close to complete denial, but the CEO now tells a different story. Initially, the company stated that the block was not due to a lack of content deals. It was striking, however, that search engines with a deal, including Google, were not blocked from the platform.

Huffman now offers a different explanation: “Without these agreements, we have no control or knowledge over how our data is displayed and what it is used for, which has now put us in a position where we can block people who are unwilling to come to terms with how we want our data to be used or not.” He further elaborates on the failed negotiations with Microsoft, Anthropic and Perplexity.

Original, 25/07, 1:18 pm: Reddit is taking a new step in combating online scraping for LLM development. Only search engines that have struck a paid deal with the social media platform will still have access.

Users of the search engines Bing, DuckDuckGo, Mojeek and Qwant will no longer see recent results from Reddit. The search engines were blocked to prevent the scraping of Reddit content, a practice in which AI companies extract content from the Internet to train their models.

According to Reddit’s terms and conditions, scraping is prohibited without permission from the platform. AI companies easily ignore those terms, so Reddit now seems to have decided to crack down harder. It has modified its robots.txt file to tell web crawlers they are no longer welcome. Crawlers for research purposes were not blocked by the modification.
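
To illustrate how a robots.txt block of this kind works, here is a minimal Python sketch using the standard library’s robots.txt parser. The rules shown are a hypothetical example, not Reddit’s actual file, and the crawler names ExampleResearchBot and SomeSearchBot are invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, NOT Reddit's actual file: one named crawler is
# allowed everywhere, and every other user agent is disallowed site-wide.
ROBOTS_TXT = """\
User-agent: ExampleResearchBot
Allow: /

User-agent: *
Disallow: /
"""


def may_crawl(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://www.reddit.com/r/technology/"
    for agent in ("ExampleResearchBot", "SomeSearchBot"):
        print(agent, may_crawl(agent, url))
```

A robots.txt file only states the site’s wishes. It is up to each crawler to run a check like this one and respect the answer.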

‘No relation to content deal’

The social media platform still allows Google and Brave. These companies have already signed a deal to use Reddit’s content to train AI models. These deals assure Reddit that it has something to gain financially from the rise of AI. The deal with Google, for example, reportedly earns the platform $60 million annually.

404 Media reported on the block based on its own research. It found that searches using “site:reddit.com,” which restricts search results to Reddit’s website, no longer show recent posts from the platform.

According to the author, the exception for Google and Brave results from the deals they have concluded. A Reddit spokesperson told The Verge in response that this conclusion is incorrect. “This has nothing to do with our recent partnership with Google. We have been in talks with multiple search engines. We have been unable to come to an agreement with all of them because some cannot or will not make enforceable promises regarding their use of Reddit content, including their use for AI.”
