I got up today and saw this traffic spike for my site:
Sometime this morning, a flood of requests bypassed Cloudflare’s cache and hit my server at WP-Tweaks.com directly. Given that I cache all my pages, I was naturally curious to see what this spike was and whether I needed to tighten my firewall rules in response. On viewing the raw logs for my server, I saw that it was the Ahrefs bot performing its usual bi-weekly audit. In one hour, it crawled 1,447 URLs on my site, causing the spike seen here.
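If you want to run the same kind of check on your own raw logs, a rough sketch like the one below tallies requests per hour for each user agent. It assumes your server writes the common "combined" access log format, which may not match your setup, and the function name is my own invention for illustration.

```python
import re
from collections import Counter

# Assumption: logs are in the "combined" access log format.
# Adjust the pattern if your server logs differently.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def requests_per_hour_by_agent(lines):
    """Count requests grouped by (hour, user agent). Hypothetical helper."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that don't fit the assumed format
        # Timestamps look like 10/Oct/2023:13:55:36 +0000.
        # The first 14 characters keep the date and the hour only.
        hour = match.group("time")[:14]
        counts[(hour, match.group("agent"))] += 1
    return counts
```

Calling `requests_per_hour_by_agent(open("access.log"))` and then `.most_common(5)` on the result should surface any crawler that dominates a single hour. The file name here is a placeholder.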
I had been considering using Cloudflare’s “rate-limiting” service for a while. But today’s data point showed me just how hard it is to configure something like this without false positives.
How Rate Limiting Is Supposed to Work
You can configure rate limiting to say something like this:
“If an IP address issues X number of requests in Y amount of time, then take action”
The “action” can be a temporary block of the IP address or even a permanent one. Or you could instruct Cloudflare to issue a normal or a JS challenge or even simply log the requests. If you use Cloudflare, you can enable rate-limiting for free for the first 10,000 requests that match your rule.
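To make the "X requests in Y time" idea concrete, here is a minimal sliding-window sketch. It is not how Cloudflare implements rate limiting internally, which I don't know. The class and parameter names are made up for illustration, and the caller passes in the current time so the logic is easy to test.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per IP in any window_seconds span.

    Hypothetical illustration, not Cloudflare's implementation.
    """

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of allowed requests

    def allow(self, ip, now):
        hits = self._hits[ip]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: this is where the "action" would fire
        hits.append(now)
        return True
```

When `allow` returns `False`, you would apply whichever action you chose: block, challenge, or just log. Note that this sketch only counts requests it allowed, so a blocked client recovers as soon as its older requests age out.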
Even Normal Requests Can Seem Like Too Much
When someone requests a URL on your site, their browser follows up almost immediately with a flood of concurrent requests for images, fonts, CSS, JS, and other static content, all of which are needed to display the page as you intended. So it’s not at all unusual for each person to request (say) 20 resources whenever they move from one page to the next on your site.
Of course, browser caching ensures that they won’t be asking for much of the same static content over and over. But it’s something to keep in mind. Some people might be visiting incognito for example.
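A bit of back-of-the-envelope arithmetic shows how quickly this adds up. The sketch below uses the 20-resources-per-page figure from above. The number of cached resources and the 50-requests-per-minute threshold are assumptions I picked purely for illustration.

```python
def requests_for_visit(pages, resources_per_page=20, cached_per_page=0):
    """Estimate total requests for a visit. Hypothetical helper.

    The first page loads everything. Each later page skips whatever
    the browser already has cached.
    """
    if pages <= 0:
        return 0
    first_page = resources_per_page
    later_pages = (pages - 1) * (resources_per_page - cached_per_page)
    return first_page + later_pages


THRESHOLD = 50  # assumed limit: requests per minute from one IP

cold_cache = requests_for_visit(pages=3)                      # nothing cached
warm_cache = requests_for_visit(pages=3, cached_per_page=15)  # most assets cached

print(cold_cache, cold_cache > THRESHOLD)
print(warm_cache, warm_cache > THRESHOLD)
```

Under these assumptions, a visitor who clicks through three pages in a minute with nothing cached crosses the threshold, while the same visit with most assets cached stays well under it. The same person can look like an attacker or a normal reader depending only on their cache.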
And of course, you don’t want to block the all-important Google, Bing, and other legit search bots that crawl your site. Just like the Ahrefs crawler in the above example, these can cause unexpected, but necessary traffic spikes. I don’t see an easy way to enable rate-limiting without risking blocking these bots as well.
Limited Use Case for Rate Limiting
Where I CAN see myself using rate limiting is on certain resource-consuming pages. A prime example is your login page. I’ve spent considerable thought on securing my WordPress login page, and rate limiting is a good example of a simple solution. There’s no good reason why anyone would need to request your login page 20 times in a row. So you can protect specific URLs like these from abuse, particularly because each request to them consumes so many resources that you can’t afford to have bots hammering them.
I could say the same for other requests that use the database or CPU time. Off the top of my head, I can think of contact pages, or other form submission areas on your site since these require a database action. It might even be worth looking into limiting access to admin-ajax.php on WordPress installations. The exact URLs will vary from site to site, and only you know which URLs tend to get spammed.
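Here is a rough sketch of what targeted limiting could look like: requests are counted only for a short list of protected paths, and everything else passes through untouched. The `/contact/` path, the class name, and the limits are placeholders of mine. Swap in whichever URLs get spammed on your own site.

```python
from collections import defaultdict

# /wp-login.php and /wp-admin/admin-ajax.php are standard WordPress paths.
# /contact/ is a placeholder for whatever form URLs your site uses.
PROTECTED_PATHS = frozenset(
    {"/wp-login.php", "/wp-admin/admin-ajax.php", "/contact/"}
)


class TargetedLimiter:
    """Fixed-window counter applied only to protected paths. Hypothetical."""

    def __init__(self, max_requests, window_seconds, protected_paths=PROTECTED_PATHS):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.protected_paths = protected_paths
        # (ip, path, window index) -> request count.
        # Old windows are never purged in this sketch.
        self._counts = defaultdict(int)

    def allow(self, ip, path, now):
        if path not in self.protected_paths:
            return True  # ordinary pages and assets are never limited
        window = int(now // self.window_seconds)
        key = (ip, path, window)
        self._counts[key] += 1
        return self._counts[key] <= self.max_requests
```

With something like `TargetedLimiter(max_requests=5, window_seconds=60)`, a crawler can fetch as many regular pages as it likes, while a sixth hit on the login page within the same minute is refused. I chose a fixed window here because it is the simplest thing to reason about. Its known weakness is that a burst straddling two windows can briefly get through at up to twice the limit.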
Overall, I would use rate-limiting only in a very targeted manner where I can be sure I’m not blocking legitimate users. So that means protecting those areas of your site where ordinary people and bots don’t usually have a reason to go.
I’m a NameHero team member, and an expert on WordPress and web hosting. I’ve been in this industry since 2008. I’ve also developed apps on Android and have written extensive tutorials on managing Linux servers. You can contact me on my website WP-Tweaks.com!