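It is believed that more than 40% of all Internet traffic is bot traffic, and a significant portion of that is malicious. This is why so many organizations are looking for ways to manage the bot traffic coming to their sites.
Web engineers can look directly at the network requests to their sites and identify likely bot traffic. An integrated web analytics tool, such as Google Analytics or Heap, can also help to detect bot traffic.
As mentioned above, unauthorized bot traffic can skew analytics metrics such as page views, bounce rate, session duration, geolocation of users, and conversions. These skewed metrics cause many problems for the site owner; it is very hard to measure the performance of a site that is being flooded with bot activity. Attempts to improve site performance, such as A/B testing and conversion rate optimization, are also hampered by the statistical noise that bots create.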
Google Analytics does provide an option to “exclude all hits from known bots and spiders” (spiders are search engine bots that crawl webpages). If the source of the bot traffic can be identified, users can also provide a specific list of IPs to be ignored by Google Analytics.
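Sending massive amounts of bot traffic is the most common way for attackers to launch a DDoS attack. During some types of DDoS attacks, so much attack traffic is directed at a website that the origin server becomes overloaded, and the site slows down or becomes altogether unavailable to legitimate users.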
For sites that serve ads, bots that land on the site and click on various elements of the page can trigger fake ad clicks; this is known as click fraud. While this may initially result in a boost in ad revenue, online advertising networks are very good at detecting bot clicks. If they suspect a website is committing click fraud, they will take action, usually in the form of banning that site and its owner from their network. For this reason, owners of sites that host ads need to be ever-wary of bot click fraud.
Sites with limited inventory can be targeted by inventory hoarding bots. As the name suggests, these bots go to e-commerce sites and dump tons of merchandise into their shopping carts, making that merchandise unavailable for purchase by legitimate shoppers. In some cases this can also trigger unnecessary restocking of inventory from a supplier or manufacturer. The inventory hoarding bots never make a purchase; they are simply designed to disrupt the availability of inventory.
The first step to stopping or managing bot traffic to a website is to include a robots.txt file. This is a file that provides instructions for bots crawling the page, and it can be configured to prevent bots from visiting or interacting with a webpage altogether. But it should be noted that only good bots will abide by the rules in robots.txt; it will not prevent malicious bots from crawling a website.
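To illustrate how a well-behaved bot consults these rules, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt contents, the paths, and the bot names (`FriendlyCrawler`, `ExampleBadBot`) are hypothetical and chosen only for illustration; a real site would choose its own rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; paths and bot names are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /cart/

User-agent: ExampleBadBot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text the way a compliant crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant crawler asks before fetching each URL.
blog_ok = parser.can_fetch("FriendlyCrawler", "https://example.com/blog/post")
admin_ok = parser.can_fetch("FriendlyCrawler", "https://example.com/admin/login")
named_ok = parser.can_fetch("ExampleBadBot", "https://example.com/blog/post")

print(blog_ok, admin_ok, named_ok)
```

The check happens entirely on the bot's side: nothing in this file stops a crawler that never calls `can_fetch`, which is why robots.txt only influences good bots.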
A number of tools can help mitigate abusive bot traffic. A rate limiting solution can detect and prevent bot traffic originating from a single IP address, although this will still overlook a lot of malicious bot traffic. On top of rate limiting, a network engineer can look at a site’s traffic and identify suspicious network requests, providing a list of IP addresses to be blocked by a filtering tool such as a WAF. This is a very labor-intensive process and still only stops a portion of the malicious bot traffic.
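As a rough sketch of the per-IP rate limiting idea, the following sliding-window counter rejects requests from any single address that exceeds a threshold within a time window. The class name, the limit of 3 requests per 10 seconds, and the sample IP addresses are assumptions made for this example; production rate limiters are typically provided by a proxy, CDN, or WAF rather than written by hand.

```python
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most max_requests per IP within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Maps each IP to the timestamps of its recently accepted requests.
        self._hits = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # Over the limit: reject this request.
        hits.append(now)
        return True


limiter = SlidingWindowRateLimiter(max_requests=3, window_seconds=10.0)

# Four requests from one address within four seconds: the fourth is rejected.
burst = [limiter.allow("203.0.113.7", t) for t in (0.0, 1.0, 2.0, 3.0)]
# A different address has its own independent budget.
other = limiter.allow("198.51.100.9", 3.0)

print(burst, other)
```

Because each address is counted separately, a bot operator who spreads requests across many IP addresses stays under the limit on every one of them, which is one reason rate limiting alone overlooks a lot of malicious bot traffic.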
Separate from rate limiting and direct engineer intervention, the easiest and most effective way to stop bad bot traffic is with a bot management solution. A bot management solution can leverage intelligence and use behavioral analysis to stop malicious bots before they ever reach a website. For example, Cloudflare Bot Management uses intelligence from over 25,000,000 Internet properties and applies machine learning to proactively identify and stop bot abuse. Super Bot Fight Mode, available on Pro and Business plans, offers smaller organizations similar visibility and control over their bot traffic.