thesis-research-bot
Our server is getting multiple hits from an IP address that is linked to Cloudflare, with the user agent string "thesis-research-bot".
Every request targets a different URL, which is why the REST cache does not help here.
Can anyone please help?
6 Replies
For now I've blocked the user agent.
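For anyone wanting to do the same at the application layer, here is a minimal sketch of a user-agent block. It assumes a Python WSGI application, which the thread does not confirm; `BLOCKED_AGENT_SUBSTRINGS`, `is_blocked_user_agent` and `block_bots` are names made up for this illustration.

```python
# Hypothetical sketch: reject requests whose User-Agent contains a blocked string.
BLOCKED_AGENT_SUBSTRINGS = ("thesis-research-bot",)


def is_blocked_user_agent(user_agent):
    """Return True if the User-Agent contains a blocked substring (case-insensitive)."""
    ua = (user_agent or "").lower()
    return any(blocked in ua for blocked in BLOCKED_AGENT_SUBSTRINGS)


def block_bots(app):
    """Wrap a WSGI app so that blocked user agents receive a 403 response."""
    def middleware(environ, start_response):
        if is_blocked_user_agent(environ.get("HTTP_USER_AGENT")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware
```

A user-agent block only stops clients that keep sending the same string, so it is a stopgap rather than a full fix.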
What do you mean by "linked to Cloudflare"? Is the IP owned by Cloudflare? Is it on https://cloudflare.com/ips?
Do the requests have a CF-Worker header?
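One way to answer the IP question is to test the address against Cloudflare's published ranges. A rough sketch using Python's standard `ipaddress` module follows; `SAMPLE_CLOUDFLARE_RANGES` is only an illustrative subset that may be out of date, so take the real list from https://cloudflare.com/ips. The function name `is_in_ranges` is made up for this example.

```python
import ipaddress

# Illustrative subset only (assumption): fetch the current, complete list
# from https://cloudflare.com/ips before relying on this check.
SAMPLE_CLOUDFLARE_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("173.245.48.0/20", "104.16.0.0/13", "172.64.0.0/13")
]


def is_in_ranges(ip, ranges=SAMPLE_CLOUDFLARE_RANGES):
    """Return True if the IP address falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in ranges)


if __name__ == "__main__":
    for candidate in ("104.16.0.1", "8.8.8.8"):
        print(candidate, is_in_ranges(candidate))
```

If the source address is not in Cloudflare's ranges, the traffic is merely passing through the proxy, and the real client address is the one to act on.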
What are you hoping someone can do for you here?
That was not logged in Google Cloud logs.
I just wanted to know whether anyone else has faced a similar issue.
If so, what approach did they follow to resolve it?
By "linked to Cloudflare" I mean that the application is being proxied via Cloudflare.
I've seen this issue with thesis-research-bot scraping one of my properties at every URL possible.
It was a multithreaded scrape from many IPs on AWS Singapore.
I started redirecting unauthenticated requests to various endpoints to a login page, and set up a fail2ban rule to ban users with > 10 redirects in 10 minutes.
I also enabled fail2ban's Cloudflare API interface.
GitHub: fail2ban/config/action.d/cloudflare.conf at master · fail2ban/fail2... (Daemon to ban hosts that cause multiple authentication errors)
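To make the "> 10 redirects in 10 minutes" rule concrete, here is a small sliding-window counter in Python. It is only a sketch of the counting logic, not fail2ban's code; fail2ban expresses the same idea through its `maxretry` and `findtime` jail settings. The class name `RedirectLimiter` and its parameters are invented for this example.

```python
from collections import defaultdict, deque


class RedirectLimiter:
    """Flag an IP once it exceeds max_redirects within window_seconds."""

    def __init__(self, max_redirects=10, window_seconds=600):
        self.max_redirects = max_redirects
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # ip -> timestamps of recent redirects

    def record_redirect(self, ip, now):
        """Record one redirect at time `now` (seconds); return True if the IP should be banned."""
        events = self._events[ip]
        events.append(now)
        # Discard redirects that have fallen out of the time window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        return len(events) > self.max_redirects


if __name__ == "__main__":
    limiter = RedirectLimiter()
    # Eleven redirects within eleven seconds: only the last one trips the limit.
    results = [limiter.record_redirect("203.0.113.7", t) for t in range(11)]
    print(results)
```

Because the scrape came from many IPs, a per-IP counter like this only helps when each address individually crosses the threshold.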
Thank you, I'll check this.