Bot Traffic


    Bot traffic is any set of requests made to a website by an automated process rather than triggered by direct human action.

    The Difference Between Human Traffic And Bot Traffic

    The key difference between human traffic and bot traffic is what triggers the request: human traffic consists of visits made through a browser in response to direct human action, such as opening a page on your site or clicking an advertisement, while bot traffic is generated by software with no human behind each individual request. Analytics and advertising platforms such as Google attempt to tell the two apart using signals like IP address, user agent and on-page behaviour.

    Different Types Of Bot Traffic

    The main types of bot traffic are:

    • Spiders — A spider is a robot that moves through the pages of a website and reads their content, typically without interacting with page elements the way a human visitor would. Search spiders are used by search engines such as Google, Bing and Yahoo! to index websites they discover on the internet.
    • Crawlers — A crawler is a software application that traverses the World Wide Web automatically in order to create an index cache (or web archive) of Internet resources for a particular group of people (e.g., researchers, scholars).
    • Scrapers — A scraper is computer software that extracts data from websites for various purposes. The term scraping may also refer to a process whereby information is copied from the Internet with the purpose of putting it into a format that can be easily read and processed by a computer program.
    • Bot spam — Bot spam is an automated process (sometimes referred to as a “spambot”) which, unlike spiders and crawlers, does not index or retrieve information from the sites it visits but instead posts unwanted content such as comments, form submissions or fake sign-ups.
    • Malicious bots — Malicious bots are used by cybercriminals to perform a number of illegal actions such as sending spam, hacking into websites and spreading malware.
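    As a first-pass illustration of how these categories show up in practice, well-known spiders and crawlers usually identify themselves in the HTTP User-Agent header. The sketch below uses an illustrative, deliberately incomplete pattern list to flag requests whose user agent matches a known crawler; note that malicious bots routinely spoof browser user agents, so a check like this only catches bots that choose to identify themselves.

```python
import re

# A few user-agent substrings associated with well-known crawlers.
# Illustrative, not exhaustive: sophisticated bots spoof browser
# user agents, so header checks alone are never sufficient.
KNOWN_BOT_PATTERNS = re.compile(
    r"googlebot|bingbot|yandex|slurp|duckduckbot|baiduspider",
    re.IGNORECASE,
)

def looks_like_known_bot(user_agent: str) -> bool:
    """Return True if the User-Agent string matches a known crawler."""
    return bool(KNOWN_BOT_PATTERNS.search(user_agent or ""))

print(looks_like_known_bot(
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
))  # True
print(looks_like_known_bot(
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
))  # False
```

    In a real deployment this kind of check is usually one signal among many, combined with behavioural analysis and IP reputation.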

    How Bot Traffic Affects SEO Rankings

    Google’s Webmaster Guidelines discourage bot-generated traffic and content because they lead to poor-quality, thin pages. In essence, Google is trying to make sure that its search results for a particular keyword provide accurate information and are not filled with low-quality sites that do little more than push spammy links and/or annoy a website’s visitors.

    Ideally, if you run a high-quality site or blog you do not want large volumes of unwanted bot hits, as they skew your analytics and have no value from an organic-traffic perspective. Instead of seeing your genuine 400 human visitors per month on average, your reports might show 5,000 largely fake ones, making real engagement hard to measure and potentially harming how your site is assessed.

    How To Identify Bot Traffic On Social Media Platforms

    There are a number of ways to find out whether your social media pages are being crawled by bots. You can start with the following tips:

    • Check the request headers — The “referrer” field in an HTTP request tells you which page a visitor came from, and the “user-agent” field identifies the software making the request. If you check these fields you will often see identifiers like googlebot or YandexBot rather than a normal browser string as the source of a traffic hit on your site.
    • Look at traffic patterns — For example, if you get 400 hits one day and 50,000 another then chances are that something automated is happening here. This could be someone using a bulk software program to read content on your website or visit all of your social media profiles automatically.
    • Pick up on your name being misspelt — Bots scanning social media sites for content to scrape or steal often generate pages or links that misspell your name or brand while borrowing the URL of a high-ranking site, so that they appear in search results for their target keyword. If you see your name or brand misspelt in search results or inbound links, there is likely some kind of automated scraping at work.
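    The traffic-pattern tip above can be automated. The sketch below is a minimal example, assuming you can export daily hit counts from your server logs or analytics: it flags any day whose traffic exceeds ten times the median day, which is one crude way to surface the 400-hits-one-day, 50,000-the-next pattern described above.

```python
# Hypothetical daily hit counts exported from server logs or analytics.
daily_hits = {"2024-03-01": 420, "2024-03-02": 390, "2024-03-03": 51000}

def flag_spikes(hits, ratio=10):
    """Flag days whose traffic exceeds `ratio` times the median day."""
    counts = sorted(hits.values())
    median = counts[len(counts) // 2]
    return [day for day, n in hits.items() if n > ratio * median]

print(flag_spikes(daily_hits))  # ['2024-03-03']
```

    A flagged day is only a starting point: you would then inspect that day’s logs for repeated user agents or IP addresses to confirm automated activity.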

    Why People Use Bot Traffic

    Bots used to be largely confined to the technology industry, where they were often essential tools for grabbing data from specific websites such as blogs or news sites. Now, however, they are increasingly used for unscrupulous purposes: stealing content from other websites with malicious intent, and forming the backbone of bot networks that produce spam and malware.

    Ways To Combat Bot Traffic

    Here are some ideas for stopping or limiting any impact that bot traffic has on your business:

    1. Treat it as you would any other type of spam — If your content is being scraped or plagiarised by someone else without permission, tackle it much as you would any other form of online spam, such as comment spam or forum spammers. This means that you should:
    • Set up a pre-emptive filter for your site so that hits from known automated sources are blocked in advance.
    • Take steps to protect yourself using firewalls and anti-spam software which, if set up correctly, will reduce the amount of web crawler activity directed at your website.
    • Investigate each individual case where your content appears to have been scraped, with the aim of having the copies removed from the offending sites.
    2. Make your website as unfriendly to bots as possible — This way you limit the potential for bot traffic directed at your site in the first place, rather than tackling a problem that already exists. Useful tactics to consider include:
    • Use a CAPTCHA. Challenge–response tests such as reCAPTCHA make automated form submission and bulk scraping considerably harder, and should be considered essential for anyone wanting to protect their site against scraping and spam.
    • Limit the amount of content on your website that is not being directly shared. Keep it as short and interesting as possible, with associated social media URLs and CTAs doing most of the hard work for you so that everything looks natural.
    • Set up HTTPS to encrypt traffic to and from your site. This will not stop crawlers by itself, but it ensures that any data exchanged with genuine visitors is encrypted in transit rather than readable by intermediaries.
    3. Report suspicious activity to Google — You can report fraudulent or spam accounts, webpages and specific pieces of content using Google Webmaster Tools (now Search Console). Also, if you have looked at the bigger picture and spotted a pattern of suspicious activity that looks like the work of a targeted bot network, consider reporting the issue to Google through its malware reporting form for further investigation.
    4. Add rules to your robots.txt file — You can ask crawlers not to access specific content using a robots.txt file, a convention recognised in all major search engines’ crawling guidelines. This lets you request that certain directories or files are not crawled, although only well-behaved bots respect it; malicious scrapers typically ignore robots.txt entirely.
    5. Use hreflang tags if necessary — If you have different language versions of your website, hreflang tags tell search engines which pages are alternative versions of the same piece of content, so crawlers treat them as related rather than wasting time on apparent duplicates.
    6. Add a rel=”nofollow” attribute — You can tell Google’s web crawlers not to follow particular links by using the “nofollow” HTML attribute, which also prevents those links from passing link popularity to the pages they point to.
    7. Make it clear that you do not want links or contact information scraped — If you operate an email-only style business and have ensured that your contact form is only available via an HTTPS connection (which is highly advisable anyway), Google may well respect your wishes and not add your email address to the SERPs.
    8. Use a crawling control/bot traffic blocker — If you have tried all of the above, it is worth considering a bot traffic blocker that filters out requests not associated with real users conducting genuine research. Depending on your budget and technical knowledge, options range from free (simple) and paid (advanced) tools to products designed specifically for enterprises or particular content management systems.
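    To illustrate the robots.txt tactic from the list above, here is a minimal sketch using Python’s standard urllib.robotparser to check how a policy would be interpreted. The agent name BadScraperBot and the /private/ path are hypothetical examples; remember that robots.txt only constrains well-behaved crawlers, so it limits unwanted indexing rather than blocking malicious traffic.

```python
from urllib.robotparser import RobotFileParser

# A minimal robots.txt that shuts out one (hypothetical) scraper
# entirely and keeps all other crawlers out of /private/.
ROBOTS_TXT = """\
User-agent: BadScraperBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("BadScraperBot", "/index.html"))    # False
print(parser.can_fetch("Googlebot", "/private/data.html")) # False
print(parser.can_fetch("Googlebot", "/index.html"))        # True
```

    Checking your own policy with a parser like this, before deploying it, avoids accidentally blocking legitimate search engine crawlers from pages you want indexed.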

    Frequently Asked Questions About Bot Traffic

    Is bot traffic good?

    Bot traffic is not necessarily bad, but the intentions behind it can be. Bot-based attacks have been known to target the most vulnerable in society (the elderly and the young) as well as to steal from small businesses without their knowledge or permission.

    Be wary of taking advice about SEO techniques that insists on using bots for link building, because it may well backfire, with your website’s ranking negatively affected. Stick to the rules put in place by search engines and you will not only help protect yourself but also ensure that online users are protected too.

    Do I need an anti-bot blocker? Will this work against Google?

    Google has its own ways of dealing with bots, crawlers or scrapers that try to abuse its services, which include blocking offending IP addresses and providing reporting forms.

    Blocking bots is not about preventing them from doing their job (more on that below) but about giving websites the choice of taking action to protect themselves from any unwanted attention. If you have been affected by bot-based attacks and they appear to be getting worse, it could be worth contacting Google through its malware reporting form so the issue can be reviewed further.

    Does bot traffic affect SEO?

    The impact of bot traffic is not always clear-cut, but in general, the more bots accessing your website, the more diluted your analytics and engagement metrics become.

    This can lead to all kinds of problems for your SEO ranging from the obvious ranking drops to redirects and errors within your analytics.

    How common is bot traffic on Facebook Ads?

    Bot traffic is a serious problem for Facebook, which has been fighting an uphill battle to try and reduce it, primarily through automated methods.

    As you would expect the social network’s primary focus is on fraudulent accounts but they also make efforts to block bots and crawlers from accessing content so that real people are not adversely affected.

    Google likewise wants to give its users the best quality search results possible, determined by factors such as engagement levels (clicks), shares and connections, which means it does not want any kind of middleman or bot in the mix.

    Can I do anything about bot traffic?

    The key to reducing bot spam is through an integrated approach that takes all possible steps to make it as difficult and undesirable for malicious bots to access your content in the first place.
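    One concrete building block in such an integrated approach is per-IP rate limiting, which makes high-volume automated access impractical without affecting normal visitors. The sketch below is a simple in-memory sliding-window limiter; in production this is usually handled at the proxy or CDN layer (nginx, Cloudflare and similar) rather than in application code, and the thresholds shown are arbitrary examples.

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window` seconds per client IP."""

    def __init__(self, max_requests=10, window=1.0):
        self.max_requests = max_requests
        self.window = window
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits[ip]
        # Drop timestamps that have fallen out of the window.
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) < self.max_requests:
            q.append(now)
            return True
        return False

limiter = SlidingWindowLimiter(max_requests=3, window=1.0)
results = [limiter.allow("203.0.113.7", now=0.1 * i) for i in range(5)]
print(results)  # [True, True, True, False, False]
```

    A human browsing at normal speed stays well under such limits, while a scraper hammering the same endpoint is throttled after its first few requests in each window.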

    What should I do when I receive bot traffic?

    There are various steps that can be taken depending on what kind of traffic you receive.

    If it is obviously spammy, then there should be clear guidelines in place to deal with this. You will also need to establish whether the source(s) of bot attacks are external (hackers, malicious software etc.) or internal (employees using company equipment/data).

    This information will help you decide how best to deal with bot traffic and if reporting them directly to search engines is worthwhile.

    Block Bots Effortlessly with Netacea

    Book a demo and see how Netacea autonomously prevents sophisticated automated attacks.



