SEO Poisoning Part 2: How Bots Fuel SEO Poisoning Attacks
In our last blog post, we unpacked what SEO poisoning is and how it diverts organic search traffic. We examined some prevalent rank theft techniques used in SEO poisoning attacks.
In this follow-up, we will dive into how automation can be used to further SEO poisoning and ranking theft attacks. With bots, adversaries can execute coordinated ranking theft at a speed and scale not possible manually.
Specifically, we will analyse how automation is being used for content scraping, website cloning, fake user interactions and ad fraud, and how the resulting bot traffic skews analytics. We will show how these bot tactics are undermining organic search results.
As marketing professionals strive to win organic traffic through SEO best practices, there’s a growing trend of malicious actors unleashing weaponised automation to expand the scale and scope of harmful SEO poisoning campaigns.
Autonomy of Automated Poisoning Campaigns
While various forms of SEO poisoning have historically manipulated search algorithms through the tactics we discussed in our previous post, automation coupled with tools like proxies has amplified the issue. This has led to the rise of several attack types.
Content Scraping Bots
Content scraping bots are automated tools that extract the content from websites, often without the website owner’s permission. These bots will crawl through web pages, copying text, images, and other elements to create duplicate content.
Scraping has become an important part of SEO poisoning because it lets attackers steal content quickly and efficiently at scale. They then use the stolen content to artificially boost traffic to their malicious sites by diverting organic visitors away from the original.
Typical scraper bots employ various tactics to evade detection by traditional bot management tools and extract target content. These bots will often route traffic through proxy networks to mask their real IP address and may mimic human behavior with randomized actions. They may spoof device fingerprints, solve CAPTCHAs, replay user sessions, and falsify referrers to appear organic. Scrapers may also use headless browser frameworks like Puppeteer and Playwright to dynamically render pages and bypass any obfuscation.
Serving requests to these bots uses up server resources, which can slow down or even crash a website and greatly increase infrastructure costs for no commercial benefit. Worse, if the content is duplicated for SEO poisoning, the business has spent resources creating original content that now appears elsewhere.
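As a rough illustration of spotting this kind of load, the sketch below counts requests per client in a sliding time window and flags any client that exceeds a threshold. It is a minimal baseline under stated assumptions, not how any particular bot management product works: the function name, the (timestamp, client key) input shape, and the default thresholds are all hypothetical, and scrapers that rotate through proxy networks will spread their requests across many client keys and slip under a per-client limit.

```python
from collections import defaultdict, deque


def flag_high_rate_clients(requests, window_seconds=60, max_requests=120):
    """Return the client keys that exceed max_requests inside any sliding window.

    `requests` is an iterable of (timestamp_seconds, client_key) tuples,
    already sorted by timestamp. The client key is an assumption: it could be
    an IP address, a session ID, or any other identifier taken from your logs.
    """
    recent = defaultdict(deque)  # client_key -> timestamps still inside the window
    flagged = set()
    for timestamp, client in requests:
        window = recent[client]
        window.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while window and timestamp - window[0] >= window_seconds:
            window.popleft()
        if len(window) > max_requests:
            flagged.add(client)
    return flagged
```

In practice a signal like this would only be one input among many, since the evasion tactics described above are designed to defeat simple rate limits.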
Below is a typical scraper bot kill chain from the BLADE Framework, which details some of the tactics they might use.
For SEO poisoning attacks specifically, once the scraper bot has evaded detection and copied parts of the site, the stolen data is published on an adversary-controlled site. The adversary is then likely to use fake interaction bots to boost that site's search ranking further.
Website Cloning
In some cases, web scrapers clone the whole website rather than specific pieces of content. This is typically done to facilitate malware campaigns against the site’s users. Comprehensive duplication enables the cloned site to hijack search visibility and traffic for core brand terms, undermining years of the brand’s SEO efforts.
Cloned sites integrate the scraped content seamlessly with additional stolen resources to out-optimize the authentic pages across topics. These sites often use malicious links, malicious code, and deceptive typo-squatted domains to maximize their disruption.
Fake Interaction Bots
While scraped or cloned content plays a vital role in search result visibility, algorithms also use engagement as a signal of quality and relevance. These metrics include social shares, comments, likes, views, click-through rates and backlinks.
Attackers can manipulate this using bots programmed to interact with a website or API in a manner intended to imitate human interactions, artificially inflating metrics.
While this interaction usually takes place on the fake site, there are hidden costs for the business whose content has been stolen. By artificially gaming these signals, the duplicate sites may rank above the original pages. This allows them to hijack organic visibility and divert website traffic away from the true source.
This may affect leads and sales as customers are siphoned off to the counterfeit content. Compounding matters, customer trust in the brand could erode as they unwittingly interact with fraudulent pages first. The business may then need to devote resources to getting these sites removed or winning back its search engine visibility.
Below is the typical fake interaction bot kill chain for form spam from the BLADE Framework, which details some of the tactics and techniques these bots often use:
Skewed Analytics
Scraper and interaction bots also skew website analytics and metrics. Skewed analytics is a side effect of bot activity on a website, whether the bots are there to scrape content or to clone the site for use in SEO poisoning.
Whether the bots succeed or not, the site receives traffic that does not come from real customers. Bot operators usually do not set out to skew analytics, but the distortion can affect the decisions businesses make without them realizing.
Marketers pay close attention to web analytics as these tools provide valuable insights about prospects and customers. Where are people arriving from? What are they doing when they are on the site? What does that tell us about our current marketing strategies? This data is incredibly important and is used to make a range of business decisions.
Unfortunately, a high proportion of bot traffic alters this data significantly. If you don’t have an accurate view of customers’ behavior on your website, you are likely to make poor decisions about their buying journey and have little understanding of what they want and need from your website.
As a result, skewed analytics can have substantial financial impacts on businesses. In a survey commissioned by Netacea, we found that skewed analytics drains 5% of total online revenue for the average enterprise business.
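To make the distortion concrete, here is a small worked example of how bot sessions can understate a conversion rate. The function names and the figures are illustrative assumptions, not data from the survey, and the sketch assumes bots never place orders.

```python
def conversion_rate(orders, sessions):
    """Orders divided by sessions, or 0.0 when there are no sessions."""
    return orders / sessions if sessions else 0.0


def analytics_skew(human_sessions, bot_sessions, orders):
    """Compare the rate an analytics tool reports with the rate for humans only.

    Assumption: every order comes from a human session, so bot sessions
    inflate the denominator without adding any orders.
    """
    reported = conversion_rate(orders, human_sessions + bot_sessions)
    actual = conversion_rate(orders, human_sessions)
    return {
        "reported": reported,
        "actual": actual,
        "understated_by": actual - reported,
    }


# Hypothetical month: 8,000 human sessions, 2,000 bot sessions, 200 orders.
skew = analytics_skew(human_sessions=8000, bot_sessions=2000, orders=200)
```

With these made-up numbers the reported conversion rate is 2.0% while the rate among real visitors is 2.5%, so a campaign that is performing well could look like it is underperforming.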
The Rise in Ad Fraud
In addition to distorted analytics and SEO rankings, scraped content opens more avenues for criminals to profit through ad fraud. Digital advertising is a lucrative business, with millions being funnelled into online ad campaigns every day. Ad spend is expected to reach $885 billion this year. This makes the digital marketing industry a prime target for ad fraud.
Ads work by matching advertisers with publishers, serving the most relevant ad to each user based on factors such as search intent and demographic information. Criminals can turn a profit by hosting genuine ads on fake sites built from copied content and simulating visitors with fake interaction bots.
There are many digital ad fraud schemes, including:
Click fraud
Bots click on ads placed on pages with scraped content to boost payouts from advertisers. This is sometimes done through a large botnet or a so-called click farm.
Hidden ads
Scraped content pages deliberately hide ads from real users. The operators then use ad networks to record fake interactions and bill the advertisers.
Ad stacking
Ads are stacked on top of scraped pages that have been boosted through SEO poisoning tactics. Only the top ad will be visible, but all ads in the stack will record impressions when engaged.
For advertisers, fake impressions and clicks can rapidly drain budgets without generating any tangible leads. Skewed analytics will further obscure campaign performance, impairing decision-making for future campaigns.
For publishers hosting original content, content scraping exposes them to ad fraud that can siphon revenue while degrading website performance and the experience of valued customers. Ad exchanges may impose chargebacks or ban their inventory entirely.
Protecting Your Interests from Scraping-Enabled SEO Poisoning
This persistent threat makes ongoing monitoring and rapid response critical for brands. Here are some tips for detecting and responding to poisoning and rank theft campaigns.
Closely Track Your Core Rankings
Review and monitor the most important keywords related to your brand. Any sudden gains or losses may indicate SEO poisoning. If you do see sudden changes in search rankings, use Google Analytics to check your top referral sources for odd spikes that could stem from newly poisoned results.
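One simple way to spot sudden movement is to compare two snapshots of keyword positions and list anything that shifted by more than a few places or dropped out of the results. The sketch below assumes you already export positions from a rank tracking tool as keyword-to-position mappings; the function name, the threshold, and the sample keywords are hypothetical.

```python
def rank_changes(previous, current, threshold=5):
    """Return keywords whose position moved by at least `threshold` places.

    `previous` and `current` map keyword -> search position (1 is the top).
    A keyword missing from `current` is reported with a new position of None.
    """
    changes = {}
    for keyword, old_position in previous.items():
        new_position = current.get(keyword)
        if new_position is None:
            changes[keyword] = (old_position, None)
        elif abs(new_position - old_position) >= threshold:
            changes[keyword] = (old_position, new_position)
    return changes


# Hypothetical snapshots taken a week apart.
last_week = {"acme widgets": 1, "acme login": 2, "widgets": 8}
this_week = {"acme widgets": 9, "acme login": 3}
flagged = rank_changes(last_week, this_week)
```

Here "acme widgets" falling from position 1 to 9 and "widgets" disappearing are flagged for review, while the one-place move for "acme login" is treated as normal fluctuation.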
Conduct Routine Backlink Audits
Thoroughly review backlinks pointing to your site using specialised tools like Semrush or Ahrefs. Promptly disavow any toxic, irrelevant, or questionable links.
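Google Search Console's disavow tool accepts a plain-text file with one URL or one "domain:" entry per line, and treats lines starting with "#" as comments. As a minimal sketch, the helper below turns the output of an audit into that format; the function name and the example domains are hypothetical, and deciding which links are actually toxic still needs human review.

```python
def build_disavow_file(domains=(), urls=(), note=""):
    """Build disavow file text: an optional comment, then domains, then URLs.

    Domains are normalised to lowercase and de-duplicated; URLs are
    de-duplicated but otherwise left exactly as supplied.
    """
    lines = []
    if note:
        lines.append(f"# {note}")
    unique_domains = sorted({d.strip().lower() for d in domains if d.strip()})
    lines.extend(f"domain:{d}" for d in unique_domains)
    lines.extend(sorted({u.strip() for u in urls if u.strip()}))
    return "\n".join(lines) + "\n"


# Hypothetical audit results using reserved example domains.
disavow_text = build_disavow_file(
    domains=["Spam.example", "spam.example "],
    urls=["https://bad.example/page"],
    note="backlink audit",
)
```

The resulting text can be saved as a .txt file and uploaded through the disavow tool for the affected property.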
Publish Unique, Useful Content
Create fresh, engaging, relevant content optimised around core focus keywords. Quality content builds credibility and can help counteract search term dilution over time.
Document and Report Abusive Tactics
Keep dated records of any suspected poisoning pages, keywords, links, or other tactics to support cleanup requests. Submit these details directly to search engines when required.
Monitor Brand Assets
Consider defensively registering domain names, trademarks, and social media profiles similar to your brand to prevent fake sites or accounts from appearing via typo-squatting or brand impersonation.
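To get a starting list of lookalike domains worth checking or registering, you can generate common typing mistakes from your own domain name. The sketch below covers only three simple patterns (a dropped character, two swapped neighbours, and a doubled character); the function name and the sample domain are hypothetical, and real typo-squatting also uses other tricks such as different TLDs and lookalike characters.

```python
def typo_variants(domain):
    """Return sorted typo candidates for the first label of `domain`."""
    name, dot, tld = domain.partition(".")
    variants = set()
    # Omission: drop one character ("acme" -> "ame").
    for i in range(len(name)):
        variants.add(name[:i] + name[i + 1:])
    # Transposition: swap two neighbouring characters ("acme" -> "amce").
    for i in range(len(name) - 1):
        variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    # Repetition: double one character ("acme" -> "accme").
    for i in range(len(name)):
        variants.add(name[:i] + name[i] + name[i:])
    variants.discard(name)  # the real name is not a typo
    variants.discard("")    # a one-character name has no omission variant
    return sorted(v + dot + tld for v in variants)


candidates = typo_variants("acme.com")
```

Each candidate can then be checked for existing registrations before deciding which ones are worth buying defensively.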
Issue Takedowns and Legal Notices If Necessary
For stolen content, you should consider issuing a DMCA takedown request or cease and desist demand via your legal team.
Detect and Mitigate Automated Attacks
Mitigating scraper bot attacks disrupts the SEO poisoning kill chain by thwarting tactics such as content theft and website cloning. Netacea Bot Protection analyses all website traffic to detect and block automated attacks. With Netacea Bot Protection, you can be confident that website traffic is genuine and that your marketing analytics accurately reflect real customer activity.
Book a demo of Netacea today and get the impact of bad bots under control across your web estate.