crawling sitemaps rate limiting
Hi, I'm writing a crawler bot that fetches the sitemaps for some sites that are behind Cloudflare. Currently I'm limiting it to a maximum of 5 requests per second, but sometimes I get a 403 from Cloudflare, and looking at the HTML returned, it would redirect with an extra __cf_chl_rt_tk query parameter set. It seems like a rate limiting mechanism. Is there a set rate I should use, or a way to determine what rate I should use?
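For anyone landing here with the same question: as far as I know, limits on Cloudflare-fronted sites are set by each site owner, so there may be no single rate to aim for. One common approach is to let the server's responses set the pace. Below is a minimal sketch of that idea, not a Cloudflare-endorsed method. The class name `AdaptiveDelay`, the helper `fetch`, the user agent string, and all the numbers are hypothetical; the 0.2 s base just mirrors the 5 requests per second above. Treating 403, 429, and 503 as "slow down" signals is also an assumption.

```python
import time
import urllib.error
import urllib.request


class AdaptiveDelay:
    """Tracks the pause between requests and grows it when the server pushes back."""

    def __init__(self, base=0.2, max_delay=60.0, factor=2.0, recovery=0.9):
        self.base = base            # assumed starting pause: 0.2 s = 5 requests/second
        self.max_delay = max_delay  # assumed upper bound on the pause
        self.factor = factor        # multiplier applied after a blocked response
        self.recovery = recovery    # multiplier applied after a successful response
        self.delay = base

    def on_success(self):
        # Ease back toward the base delay after each good response.
        self.delay = max(self.base, self.delay * self.recovery)

    def on_blocked(self, retry_after=None):
        # Back off multiplicatively, capped at max_delay.
        self.delay = min(self.max_delay, self.delay * self.factor)
        if retry_after is not None:
            try:
                # Honor a numeric Retry-After if it asks for a longer pause.
                wanted = float(retry_after)
                self.delay = min(self.max_delay, max(self.delay, wanted))
            except ValueError:
                pass  # Retry-After can also be an HTTP date; this sketch ignores that form

    def wait(self):
        time.sleep(self.delay)


def fetch(url, limiter, user_agent="example-crawler/0.1"):
    """Fetch one URL, updating the limiter from the response. Returns bytes or None."""
    limiter.wait()
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            body = resp.read()
        limiter.on_success()
        return body
    except urllib.error.HTTPError as err:
        if err.code in (403, 429, 503):  # assumed "slow down" statuses
            limiter.on_blocked(err.headers.get("Retry-After"))
            return None
        raise
```

Share one `AdaptiveDelay` per host and call `fetch(sitemap_url, limiter)` in your loop, re-queuing any URL that comes back as `None`. Note that if the 403 is a challenge page rather than a plain rate limit, slowing down may not be enough on its own, and asking the site owner to allow your bot is probably the more reliable route.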
seems counter productive to sitemaps 🙂
thanks @Leo
aren't sitemaps designed for bots?
okay