Cloudflare blocks Google from reading robots.txt
I have a strange situation.
Across many of our sites on Cloudflare, we see that Google is no longer able to read the robots.txt file. This leads Google's crawler to start crawling a lot of URLs it shouldn't.
When we open Google Search Console for any of the domains, it shows that Google sees the robots.txt file as empty. There is a manual function for triggering Google to re-read the robots.txt. When we use it, we get "An error has occurred. Please try again later.", and afterwards Google still reports that it considers the robots.txt to be empty.
When we disable the proxy in Cloudflare and try again, it works fine and Google can read the robots.txt file.
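To see what the proxy is actually serving, one check is to request robots.txt with a browser User-Agent and with Googlebot's User-Agent and compare the responses. A minimal sketch in Python (example.com is a placeholder for one of our zones; note that Cloudflare verifies the real Googlebot by IP range, so a spoofed User-Agent from our own machines is only a rough approximation of what Google gets):

```python
import requests

URL = "https://example.com/robots.txt"  # placeholder: one of the affected zones

USER_AGENTS = {
    "browser": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}

for name, ua in USER_AGENTS.items():
    resp = requests.get(URL, headers={"User-Agent": ua}, timeout=10)
    # A block or challenge tends to show up as a non-200 status, or as a
    # body whose size differs from the real robots.txt.
    print(f"{name}: status={resp.status_code} bytes={len(resp.content)} "
          f"cf-ray={resp.headers.get('cf-ray')}")
```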
The strange thing is that this seems to have started suddenly, about 2 or 3 weeks ago, on many of our sites.
I have checked the following settings (a scripted version of this check is sketched below the list):
- No custom WAF rules
- No user-agent blocking rules created
- Bot Fight Mode is disabled
- "Block AI scrapers and crawlers" is disabled
- Security Level is Medium
- Browser Integrity Check is disabled
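Since this affects many zones, the settings can be verified in bulk via the Cloudflare API. A rough sketch, covering only the two settings that have stable setting IDs I'm sure of (security_level for Security Level, browser_check for Browser Integrity Check), and assuming an API token with zone settings read access:

```python
import requests

API = "https://api.cloudflare.com/client/v4"
TOKEN = "YOUR_API_TOKEN"  # placeholder: token with zone settings read access
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

# Setting IDs per the Cloudflare API: Security Level and Browser Integrity Check.
SETTINGS = ["security_level", "browser_check"]

zones = requests.get(f"{API}/zones", headers=HEADERS,
                     params={"per_page": 50}, timeout=10).json()["result"]
for zone in zones:
    values = {}
    for setting in SETTINGS:
        data = requests.get(f"{API}/zones/{zone['id']}/settings/{setting}",
                            headers=HEADERS, timeout=10).json()
        values[setting] = data["result"]["value"]
    print(zone["name"], values)
```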
When this happens, no events show up under Security >> Events.
Has anybody experienced anything similar, or have any idea why Cloudflare would cause this failure at Google?
2 Replies
Do you have Pro? You could filter Web Traffic Analytics to robots.txt and see stats on it.
Or you could try https://search.google.com/test/rich-results, enter your URL, and look at "View Tested Page" for what it sees (status code, etc.).
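You can also approximate that check from your own machine; a rough Python sketch (example.com is a placeholder, and the cf-mitigated header plus the "Just a moment" marker are my guesses at how a Cloudflare challenge would show up, not a guaranteed signature):

```python
import requests

# Placeholder URL: substitute the robots.txt of an affected zone.
resp = requests.get(
    "https://example.com/robots.txt",
    headers={"User-Agent": "Mozilla/5.0 (compatible; Googlebot/2.1; "
                           "+http://www.google.com/bot.html)"},
    timeout=10,
)

print("final URL:", resp.url)
print("redirects:", [r.status_code for r in resp.history])
print("status:", resp.status_code)
print("body length:", len(resp.text))
# If Google sees robots.txt as "empty", an interstitial or challenge page
# served in place of the real file is one plausible cause.
print("cf-mitigated:", resp.headers.get("cf-mitigated"))
print("challenge page?", "Just a moment" in resp.text)
```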
@Chaika The next day we tried again in Google Search Console, and suddenly Google could access robots.txt again without us changing anything.
Very strange problem, but it is not happening anymore.