Malicious bots disguise themselves as regular human traffic, so they might not be visible when you check your website traffic statistics. That can hurt your business decisions because you don’t have the correct data. You might see random spikes in traffic without understanding why. Or you might be confused as to why you receive traffic but no conversions.
You can decide which parts of your site you don’t want bots to crawl and disallow them in robots.txt. Well-behaved bots follow those rules, which not only saves energy but also helps to optimize your crawl budget.
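To make the robots.txt idea concrete, here is a minimal sketch that uses Python's standard `urllib.robotparser` module to check what a set of rules would allow. The rules, the paths (`/cart/`, `/internal-search/`), the bot name `ExampleBot`, and the helper `is_crawl_allowed` are all made up for illustration. Keep in mind that robots.txt only works for bots that choose to respect it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: asks every bot to stay out of two example sections.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /internal-search/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawl_allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules above let this bot fetch the URL."""
    return parser.can_fetch(user_agent, url)


# A blog post is open to crawlers, the cart is not.
print(is_crawl_allowed("ExampleBot", "https://example.com/blog/post"))   # True
print(is_crawl_allowed("ExampleBot", "https://example.com/cart/items"))  # False
```

Testing rules this way before publishing them can help you avoid accidentally hiding pages you do want crawled.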
Bot traffic gets a bad name sometimes, and in many cases, it is indeed bad. But there are good and legitimate bots too. It depends on the purpose of those bots. Some bots are essential for operating digital services like search engines or personal assistants. Others want to brute-force their way into your website and steal sensitive information. So which bot activities are ‘good’ and which ones are ‘bad’? Let’s go a bit deeper into these two kinds of bots.
The ‘good’ bots
If they support the crawl-delay directive in robots.txt, you should try to limit their crawl rate, so they don’t come back every 20 seconds and crawl the same links over and over. This is very useful for medium to large websites that crawlers often visit. But small websites also benefit from using crawl delays. Most likely, you don’t update your content 100 times a day, even on a larger website. And if you have copyright bots visiting your site to check for copyright infringement, do they need to come every few hours?
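As a rough illustration of a crawl delay, the sketch below parses a hypothetical robots.txt with Python's standard `urllib.robotparser` and reads back the delay each bot is asked to respect. The bot name `ExampleCopyrightBot`, the delay values, and the helper functions are assumptions made for this example. Not every crawler honors this directive, so treat it as a request.

```python
from urllib.robotparser import RobotFileParser

SECONDS_PER_DAY = 24 * 60 * 60

# Hypothetical rules: one named bot is asked to wait 10 seconds between
# requests, every other bot 5 seconds.
ROBOTS_TXT = """\
User-agent: ExampleCopyrightBot
Crawl-delay: 10

User-agent: *
Crawl-delay: 5
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def requested_delay(user_agent: str):
    """Seconds the bot is asked to wait between requests, or None if unset."""
    return parser.crawl_delay(user_agent)


def max_requests_per_day(delay_seconds: int) -> int:
    """Upper bound on daily requests for a bot that respects the delay."""
    return SECONDS_PER_DAY // delay_seconds


delay = requested_delay("ExampleCopyrightBot")
print(delay, max_requests_per_day(delay))  # 10 8640
```

Even a modest delay caps how often a compliant bot can hit your server, which is the point of setting one.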
And malicious bots are bad for your site’s security. They will try to brute-force their way into your website using various username/password combinations, or seek out weak entry points and report them to their operators. If you have security vulnerabilities, these malicious players might even attempt to install viruses on your website and spread those to your users. And if you own an online store, you will have to manage sensitive information like credit card details that hackers would love to steal.
For the environment
The most basic way to do this is to block an individual IP address or an entire range of IP addresses. If you identify irregular traffic from a source, you can block its IP address. This approach works, but it’s labor-intensive and time-consuming. Alternatively, you can use a bot management solution from providers like Cloudflare. These companies have an extensive database of good and bad bots. They also use AI and machine learning to detect malicious bots and block them before they can cause harm to your site.
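Blocking a single address or a whole range comes down to a membership check. The sketch below shows one way to do that with Python's standard `ipaddress` module. The blocklist entries are reserved documentation addresses, not real offenders, and `BLOCKED_NETWORKS` and `is_blocked` are names invented for this example. In practice, you would usually enforce such a list in your firewall or web server configuration.

```python
import ipaddress

# Hypothetical blocklist: one single address (/32) and one whole range (/24).
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.7/32"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


print(is_blocked("203.0.113.7"))    # True: the individually blocked address
print(is_blocked("198.51.100.42"))  # True: inside the blocked range
print(is_blocked("192.0.2.1"))      # False: not on the list
```

Maintaining such a list by hand is exactly the labor-intensive part, because every new offender needs its own entry.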
Now that you’ve got some knowledge about bot traffic, let’s talk about why you should care about it.
For your website security and performance
The ‘good’ bots carry out specific functions that do not cause harm to your website or server. They announce themselves and let you know what they do on your website.
When a bot visits your site, it makes an HTTP request to your server asking for information. Your server needs to respond to this request and return the necessary information. Whenever this happens, your server must spend a small amount of energy to complete the request. But if you consider all the bots on the internet, then the amount of energy spent on bot traffic is enormous.
In this sense, it doesn’t matter if a good or bad bot visits your site because the process is still the same. They both use energy to perform their tasks, and they both have an impact on the environment. Even though search engines are an essential part of the internet, they are guilty of being wasteful too.
Because of these malicious bots, bot traffic gets a bad name. Unfortunately, a significant amount of bot traffic comes from such ‘bad’ bots. It is estimated that bad bot traffic will account for 27.7% of internet traffic in 2022. Here are some of the bots that you don’t want on your site:
- Email scrapers: They harvest email addresses and send malicious emails to those contacts.
- Comment spam bots: They spam your website with comments and links that redirect people to a malicious website. In many cases, they also spam your website to advertise or to try to get backlinks to their sites.
- Scraper bots: These bots come to your website and download everything they can find. That can include your text, images, HTML files, and even videos. Bot operators will then re-use your content without permission.
- Bots for credential stuffing or brute force attacks: These bots will try to gain access to your website to steal sensitive information. They do that by trying to log in like a real user.
- Botnets, zombie computers: They are networks of infected devices used to perform DDoS attacks. DDoS stands for distributed denial-of-service. During a DDoS attack, the attacker uses such a network of devices to flood a website with bot traffic. This overwhelms your web server with requests, resulting in a slow or unusable website.
- Inventory and ticket bots: They go to websites to buy up tickets for entertainment events or to bulk purchase newly released products. Brokers use them to resell tickets or products at a higher price to make a profit.
Why you should care about bot traffic
Edwin is a strategic content specialist. Before joining Yoast, he spent years honing his skill at The Netherlands’ leading web design magazine.