"To help preserve a safe Internet for content creators, we’ve just launched a brand new “easy button” to block all AI bots. It’s available for all customers, including those on our free tier." This is really neat! Whatever you land on AI scraping, giving site owners the one-click ability to make a choice is great. Some will choose not to use this; others will hit the button. Making it this easy means it's a choice about the principles, not any kind of technical considerations. Which is what it should be. Not every site is on Cloudflare (and some also choose not to use it because of how it's historically dealt with white supremacist / Nazi content). But many are, and this makes it easy for them. Other, similar providers will likely follow quickly.
Ben Werdmuller’s Post
More Relevant Posts
-
Who is scraping your data for AI purposes? I've already blocked some ByteDance spiders, but this new tool from Cloudflare highlighted (for me) significant crawling activity from another ByteDance spider, Amazon and OpenAI, all of which I've now blocked. It will be interesting to see how much of a sales tool this new AI blocking feature in Cloudflare will be. Are people bothered about having their data scraped? Large corporations are surely already blocking by default, but smaller organisations may simply not be aware. Cloudflare is relatively inexpensive and the new AI feature certainly adds value. https://lnkd.in/ej5Zr9DS
Cloudflare moves to end free, endless AI scraping with one-click blocking
arstechnica.com
-
Cloudflare announced new tools that allow websites on its network to stop unrestricted AI scraping while also helping content creators identify which content is most scanned by bots, allowing them to eventually charge for access. Cloudflare is also creating a marketplace for sites to negotiate content deals based on more detailed AI audits. These tools will enable content creators to understand how AI models use their content and decide whether to allow or block access. The rise of generative AI has complicated the value exchange for content creators. Bots no longer fall clearly into “good” or “bad” categories. AI bots, such as those used by large language models (LLMs), do not drive traffic like good bots, but are not malicious like bad bots, which creates a dilemma for site operators. Cloudflare’s tools allow site operators to block AI bots across their entire site with a single click, or audit which areas are being scraped the most. They can then make decisions about which bots to allow, possibly striking deals with AI model providers. Cloudflare has also created model terms of use that content creators can add to their sites to legally protect against unwanted AI scraping. Notwithstanding any of the above, blocking bots like Google’s could (and most probably will) reduce or remove sites from search engine results pages (SERPs). Cloudflare CEO Matthew Prince acknowledged that Google’s practices may need to change, but what would motivate Google to modify its practices? Regulation? Perhaps. There are some laws pending. As you can imagine, this is only the beginning. -s
-
An old problem has resurfaced with new technology: taking content from websites and using it for other purposes. As generative AI grows more capable, there need to be newer guidelines on what content can be used and how. Check out this informative InformationWeek article, "A Lesson in Nightshade and Defenses Against GenAI Content Abuses", to learn more about the potential dangers of generative AI and how we can protect our content: https://lnkd.in/ek9PJmzz
A Lesson in Nightshade and Defenses Against GenAI Content Abuses
informationweek.com
-
"AI For Real And Scraping": Cloudflare's New Tool To Stop AI Bots From Crawling Sites. To help maintain a safe Internet for content creators, Cloudflare has introduced a new "easy button" to block all AI bots. This feature is available to all customers, including those on free tier, announced Cloudflare. "We understand that customers don't want AI bots, especially dishonest ones, accessing their websites. To address this, we've added a one-click option to block all AI bots", it said in an announcement. To enable it, go to the Security > Bots section of the Cloudflare dashboard and click the toggle labeled AI Scrapers and Crawlers. -Breaking Down AI For Everyone- https://lnkd.in/dJqBjHgc #AIbots #bots #cloudflare Cloudflare
Declare your AIndependence: block AI bots, scrapers and crawlers with a single click
blog.cloudflare.com
-
Good on Cloudflare for being part of the solution and not part of the problem.
This is a pretty big deal. Generative AI companies have become increasingly emboldened, with the Microsoft CEO even feeling comfortable enough to publicly state that not only are they stealing your content but they can take anything on the internet that they want. I'm sure eventually the law will catch up to these plagiarism machine peddlers, but in the meantime it's basically been entirely up to website admins to deal with AI crawlers, and it's no easy task. The crawlers often don't respect robots.txt, don't have clearly identifiable user agents, IP ranges, or ASNs, and it's difficult to ensure you don't accidentally block legitimate search engine crawlers in the process as well. Cloudflare providing a one-click solution for blocking AI crawlers is a game changer for their customers, and will hopefully encourage more platforms to follow suit in helping users defend against these mega corporations and their automated copyright infringement.
Declare your AIndependence: block AI bots, scrapers and crawlers with a single click
blog.cloudflare.com
-
Not only is Marcus Hutchins correct, but what some companies are doing is outright illegal. Robots.txt or not, nothing excuses a company violating another’s T&Cs or its copyright. I completely support open source efforts like “the pile”, or best practices like those of OWASP for the common good. But for a company to just siphon everybody’s content for its own gain and expect no consequences is baffling. Side note, I’m just going to leave this here: https://lnkd.in/dXKEk2-N
-
Finally, someone has created one of the best pest control technologies for the web. Nobody likes spiders. You never see thousands of fan pages dedicated to cute spiders, especially the ones that AI companies use to scrape your proprietary content. Cloudflare has created a feature that prevents spiders and crawlers from accessing websites, and it uses machine learning to continuously update itself to stay ahead of these internet arachnids. With one click you can now prevent OpenAI, ByteDance (TikTok’s parent), Amazon and Anthropic from scraping your data to train their latest AI models. Cloudflare also listed the biggest scraper as ByteDance, the Chinese-owned parent of TikTok; interestingly, they don’t even have an LLM or AI model. Wonder what they’re doing with your content. OpenAI was second. The AI companies have developed workarounds for, or outright ignore, the existing protocol that search engines obey when told not to crawl or index your site. A welcome development to protect your proprietary content. #ai #artificialintelligence #marketing #cloudflare https://lnkd.in/g65yj_Ub
Cloudflare Enables Websites To Block AI Bots With One-Click Solution
social-www.forbes.com
-
“…Bytespider, Amazonbot, ClaudeBot, and GPTBot are the top four AI crawlers. Operated by ByteDance, the Chinese company that owns TikTok, Bytespider is reportedly used to gather training data for its large language models (LLMs), including those that support its ChatGPT rival, Doubao. Amazonbot and ClaudeBot follow Bytespider in request volume.” Nifty control from Cloudflare gives content creators the ability to block AI crawlers and scrapers. #ai #datacollection #bots
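For readers wondering what the robots.txt approach mentioned in these posts looks like, here is a minimal Python sketch. It assumes the four crawler names quoted above are the user-agent tokens those bots send. The robots.txt body, the helper names `build_robots_txt` and `is_allowed`, and the example.com URL are made up for illustration. This is not how Cloudflare's toggle works, and robots.txt is only advisory: as noted elsewhere in this thread, crawlers often ignore it.

```python
from urllib.robotparser import RobotFileParser

# Crawler names are taken from the quote above. Treating them as the
# user-agent tokens the bots send is an assumption.
AI_CRAWLERS = ["Bytespider", "Amazonbot", "ClaudeBot", "GPTBot"]


def build_robots_txt(agents):
    """Return robots.txt text that disallows every path for each named agent."""
    blocks = [f"User-agent: {agent}\nDisallow: /" for agent in agents]
    # All other agents (for example search engine crawlers) stay allowed.
    blocks.append("User-agent: *\nAllow: /")
    return "\n\n".join(blocks) + "\n"


def is_allowed(robots_txt, agent, url):
    """Check whether a well-behaved crawler with this agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


robots_txt = build_robots_txt(AI_CRAWLERS)

# Hypothetical URL, used only to exercise the rules.
for agent in AI_CRAWLERS + ["Googlebot"]:
    print(agent, is_allowed(robots_txt, agent, "https://example.com/article"))
```

Under these rules the four AI crawlers should come back as blocked and Googlebot as allowed, which is the "don't accidentally block search engines" property, at least for crawlers that honor the file.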
-
Free* Help to STOP AI scraping website content… *Whilst the article says this protection is free, my due diligence found it was only available on paid accounts. https://lnkd.in/gbQtJAj6
Cloudflare is taking a stand against AI website scrapers
engadget.com