What Is Bot Traffic?

Bot traffic describes any non-human traffic to a website or an app. The term bot traffic often carries a negative connotation, but in reality bot traffic is not necessarily good or bad; it all depends on the purpose of the bots and the preferences of the website operator.

Some bots are essential for useful services such as search engines and digital assistants (e.g. Siri, Alexa). Most companies welcome these sorts of bots on their sites.

Other bots can be malicious, for example those used for the purposes of credential stuffing, data scraping, and launching DDoS attacks. Even some of the more benign ‘bad’ bots, such as unauthorized web crawlers, can be a nuisance because they can disrupt site analytics and generate click fraud.

By some estimates, bots account for over 40% of all Internet traffic, and a significant portion of that traffic is malicious. This is why so many organizations are looking for ways to manage the bot traffic coming to their sites.

How can bot traffic be identified?

Web engineers can look directly at network requests to their sites and identify likely bot traffic. An integrated web analytics tool, such as Google Analytics or Heap, can also help to detect bot traffic.

The following analytics anomalies are the hallmarks of bot traffic:

Abnormally high pageviews: If a site undergoes a sudden, unprecedented and unexpected spike in pageviews, it’s likely that there are bots clicking through the site.
Abnormally high bounce rate: The bounce rate is the percentage of users who arrive at a single page on a site and then leave the site without clicking anything on that page. An unexpected lift in the bounce rate can be the result of bots being directed at a single page.
Surprisingly high or low session duration: Session duration, or the amount of time users stay on a website, should remain relatively steady. An unexplained increase in session duration could be an indication of bots browsing the site at an unusually slow rate. Conversely, an unexpected drop in session duration could be the result of bots that are clicking through pages on the site much faster than a human user would.
Junk conversions: A surge in phony-looking conversions, such as account creations using gibberish email addresses or contact forms submitted with fake names and phone numbers, can be the result of form-filling bots or spam bots.
Spike in traffic from an unexpected location: A sudden spike in users from one particular region, particularly a region that’s unlikely to have a large number of people who are fluent in the native language of the site, can be an indication of bot traffic.
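
As a rough illustration of the first hallmark, the sketch below flags days whose pageviews sit far above a trailing baseline. The function name `find_pageview_spikes`, the 7-day window, the threshold of 3 standard deviations, and the sample numbers are all assumptions made for this example; they are not taken from any analytics product, and real traffic may need seasonality-aware baselines.

```python
from statistics import mean, stdev

def find_pageview_spikes(daily_pageviews, window=7, threshold=3.0):
    """Return indexes of days that look like unexplained pageview spikes.

    Hypothetical helper: a day is flagged when it is more than `threshold`
    standard deviations above the mean of the preceding `window` days.
    """
    spikes = []
    for i in range(window, len(daily_pageviews)):
        baseline = daily_pageviews[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat baseline: the score is undefined, so skip the day.
            continue
        if (daily_pageviews[i] - mu) / sigma > threshold:
            spikes.append(i)
    return spikes

# Made-up daily pageview counts with one obvious outlier at index 8.
views = [1000, 1040, 980, 1010, 1025, 990, 1005, 1015, 4800, 1020]
print(find_pageview_spikes(views))  # prints [8]
```

A flagged day is only a prompt for closer inspection; a marketing campaign or a viral link can produce the same pattern as bot activity.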

How can bot traffic hurt analytics?

As mentioned above, unauthorized bot traffic can distort analytics metrics such as pageviews, bounce rate, session duration, geolocation of users, and conversions. These deviations in metrics can create a lot of frustration for the site owner; it is very hard to measure the performance of a site that’s being flooded with bot activity. Attempts to improve the site, such as A/B testing and conversion rate optimization, are also crippled by the statistical noise created by bots.

How to filter bot traffic from Google Analytics

Google Analytics does provide an option to “exclude all hits from known bots and spiders” (spiders are search engine bots that crawl webpages). If the source of the bot traffic can be identified, users can also provide a specific list of IPs to be ignored by Google Analytics.

While these measures will stop some bots from disrupting analytics, they will not stop all bots. Furthermore, most malicious bots pursue an objective besides disrupting traffic analytics, and these measures do nothing to mitigate harmful bot activity outside of preserving analytics data.

How can bot traffic hurt performance?

Sending massive amounts of bot traffic is a very common way for attackers to launch a DDoS attack. During some types of DDoS attacks, so much attack traffic is directed at a website that the origin server becomes overloaded, and the site becomes slow or altogether unavailable for legitimate users.

How can bot traffic be bad for business?

Some websites can be financially crippled by malicious bot traffic, even if their performance is unaffected. Sites that rely on advertising and sites that sell merchandise with limited inventory are particularly vulnerable.

For sites that serve ads, bots that land on the site and click on various elements of the page can trigger fake ad clicks; this is known as click fraud. While this may initially result in a boost in ad revenue, online advertising networks are very good at detecting bot clicks. If they suspect a website is committing click fraud, they will take action, usually in the form of banning that site and its owner from their network. For this reason, owners of sites that host ads need to be ever-wary of bot click fraud.

Sites with limited inventory can be targeted by inventory hoarding bots. As the name suggests, these bots go to e-commerce sites and dump tons of merchandise into their shopping carts, making that merchandise unavailable for purchase by legitimate shoppers. In some cases this can also trigger unnecessary restocking of inventory from a supplier or manufacturer. The inventory hoarding bots never make a purchase; they are simply designed to disrupt the availability of inventory.

Many websites have relied on producing original content to attract user traffic and generate revenue from that traffic, sometimes from ads. The spike in usage of AI tools in the 2020s has negatively impacted such business models. AI tools use original content from the web to train their underlying large language models (LLMs), build search indexes for use in connection with those models, and retrieve content in real time in response to user prompts. Users who receive responses from LLMs may never visit the websites on whose content the response was based. AI crawler bots that source original content can also impose direct costs on website operators, as they can send lots of requests for webpages.

How can websites manage bot traffic?

The first step to stopping or managing bot traffic to a website is for a website administrator to declare their preferences in a robots.txt file. Robots.txt files provide instructions for bots crawling the site, and they can be configured to instruct bots that they should not visit or interact with certain webpages. However, only some bots abide by the rules in robots.txt files; the files themselves do not actually prevent bots from crawling a website. Cloudflare offers a sophisticated managed robots.txt service to help website administrators express their preferences to crawler operators.
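
To make the robots.txt rules concrete, the sketch below uses Python's standard `urllib.robotparser` module to show how a well-behaved crawler would interpret a small robots.txt file. The file contents, the bot names `ExampleBot` and `OtherBot`, and the `example.com` URLs are invented for illustration. The check is voluntary on the crawler's side, which is why a bot that ignores the file is not stopped by it.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: block one named crawler entirely and keep
# every other crawler out of the checkout and account pages.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /checkout/
Disallow: /account/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def is_allowed(user_agent, url):
    """Return True if the rules above permit `user_agent` to fetch `url`."""
    return parser.can_fetch(user_agent, url)

print(is_allowed("ExampleBot", "https://example.com/blog/post"))    # False
print(is_allowed("OtherBot", "https://example.com/blog/post"))      # True
print(is_allowed("OtherBot", "https://example.com/checkout/cart"))  # False
```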

To police traffic from AI crawler bots, website operators should use a service like AI Audit from Cloudflare. This service allows website operators to either allow or block AI crawlers (blocking means the AI crawlers cannot access content for any purpose). AI Audit’s pay per crawl feature also lets website operators charge AI bot operators for crawling, if they wish to do so.

A number of other tools can also mitigate abusive bot traffic. A rate limiting solution, like Cloudflare's WAF product, can detect and prevent high-volume, abusive bot traffic originating from a single IP address.
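
The general idea behind per-IP rate limiting can be sketched with a sliding window counter. This is a minimal in-memory illustration, not a description of how Cloudflare's product works; the class name `SlidingWindowRateLimiter`, the limit of 3 requests per 10 seconds, and the sample IP address are assumptions chosen for the example.

```python
from collections import defaultdict, deque

class SlidingWindowRateLimiter:
    """Hypothetical limiter: at most `max_requests` per IP in any window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of allowed requests

    def allow(self, ip, now):
        """Return True if a request from `ip` at time `now` is within the limit."""
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True

limiter = SlidingWindowRateLimiter(max_requests=3, window_seconds=10)
# Timestamps are passed in explicitly (in seconds) to keep the example deterministic.
decisions = [limiter.allow("203.0.113.7", t) for t in (0, 1, 2, 3, 11)]
print(decisions)  # [True, True, True, False, True]
```

Because the limit is keyed on a single IP address, this approach does little against bots that spread their requests across many addresses.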

Network engineers can also review traffic manually, identifying suspicious network requests that originate from a range of IP addresses and blocking all requests from those addresses. This is a very labor-intensive process, however, and it is unlikely to stop the majority of malicious bot traffic that a website may face.
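
Part of that manual review can be scripted. The sketch below groups client IP addresses into networks with Python's standard `ipaddress` module and lists the networks that send the most requests. The function name `busiest_networks`, the /24 grouping, the minimum of 3 requests, and the sample addresses are assumptions for illustration; the example also assumes IPv4 addresses already extracted from a server log.

```python
from collections import Counter
import ipaddress

def busiest_networks(client_ips, prefix_len=24, min_requests=3):
    """Hypothetical helper: count requests per network, busiest first."""
    counts = Counter()
    for ip in client_ips:
        # strict=False lets a host address stand in for its enclosing network.
        network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[str(network)] += 1
    return [(net, n) for net, n in counts.most_common() if n >= min_requests]

# Made-up client addresses taken from documentation-only IP ranges.
ips = ["198.51.100.4", "198.51.100.9", "198.51.100.200",
       "203.0.113.5", "198.51.100.77"]
print(busiest_networks(ips))  # [('198.51.100.0/24', 4)]
```

A busy network is not proof of abuse, since many legitimate users can share a corporate or mobile carrier address range, so the output is best treated as a shortlist for further inspection.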

Separate from rate limiting and direct engineer intervention, the easiest and most effective way to stop bad bot traffic is with a bot management solution. A bot management solution can leverage intelligence and use behavioral analysis to stop malicious bots before they ever reach a website. For example, Cloudflare Bot Management uses intelligence from millions of Internet properties and applies machine learning to proactively identify and stop bot abuse. Super Bot Fight Mode, available on Pro and Business plans, offers smaller organizations similar visibility and control over their bot traffic.

FAQs

What is bot traffic?

Bot traffic refers to any non-human activity on a website or application. Bot traffic is not inherently good or bad; it depends on the purpose of the bot, with some bots being essential for services like search engines and others being malicious.

How can I tell if my website is receiving bot traffic?

You can identify bot traffic by looking for anomalies in your website analytics. Key signs include abnormally high pageviews or bounce rates, sudden changes in session duration, a spike in junk conversions, or a sudden surge in traffic from an unexpected geographic location.

Are all bots bad?

No. Some bots are beneficial and even essential. For example, search engine bots (also called spiders or crawlers) are necessary for a website to be indexed and appear in search results. However, malicious bots can perform harmful actions like scraping data, stuffing credentials, and launching DDoS attacks.

How can bot traffic negatively affect my website?

Malicious bot traffic can hurt your website in several ways. It can skew your analytics, making it difficult to measure performance. Malicious bots can also harm site performance by overloading your server. For businesses, bots can commit click fraud on ads or hoard inventory on e-commerce sites, disrupting sales.

How can I manage bot traffic on my site?

A starting point is to use a robots.txt file to provide instructions to bots, though this is not a foolproof method as malicious bots will ignore it. More effective tools include rate limiting to block high-volume traffic and, most effectively, a dedicated bot management solution that uses machine learning and behavioral analysis to distinguish between good and bad bots.

What is a robots.txt file?

A robots.txt file is a set of instructions for bots that visit your website. In this file, you can specify rules, such as which pages bots are not allowed to crawl. While good bots will follow these rules, many bad bots will not.