Artificial Engagement and Ad Fraud – Stewart Boutcher, Beacon
In this month’s episode, we’re talking about ad fraud and the role bots play in this lucrative space. Marketers care intensely about engagement and pay advertisers good money to get it, but how do they know they aren’t paying for visits from malicious bots? And what other kinds of harm do ad fraud bots cause businesses as a result?
To find out, Andy invites Beacon’s Stewart Boutcher onto the Cybersecurity Sessions. As two CTOs focused on tackling bots, but from different perspectives, Andy and Stewart find plenty to discuss!
Key points
- The winners and losers of digital ad fraud
- Why marketers can’t rely on ad networks to solve the bot problem
- How bot traffic pollutes data, steals customers, and skews decision making
- How the demise of third-party cookies will impact artificial customer profiling by bots
Speakers
Andy Still
Episode Transcript
Andy Still 00:03
So, here we are, back again for another instalment of the Cybersecurity Sessions, our regular podcast talking about all things cybersecurity, with myself, Andy Still, CTO and co-founder of Netacea, the world's first fully agentless bot management product. This time, we're going to be talking about the thorny issue of fraud in the advertising space, particularly the role bots play in this. This is obviously a subject that's close to my heart, as I've spent the last few years developing technology specifically to identify bot activities against web platforms. So, I know as well as anyone how tricky this can be. Now, the ad fraud space is even more challenging, with so many different parties involved, each with their own angle and tolerance of bot activities. I know when I first wrote about this subject, a couple of years ago, I had to keep redrawing diagrams for myself to remember how it all worked, who was making money from where, who was losing money, and why, in which area. Luckily, today we're joined by Stewart Boutcher, CTO at Beacon, whose full-time job is stopping bots from committing ad fraud. Welcome, Stewart. Thanks very much for joining us today. Before we dig into the actual challenge, could you just quickly introduce yourself for our listeners?
Stewart Boutcher 01:10
Of course. So, my name is Stewart Boutcher and, as Andy said, I am the CTO and Data Lead for Beacon. I've been involved in the technology sector for a long time. My first jobs were working on missile tracking systems and nuclear power control systems, which were very data heavy, obviously. I've been involved at senior board level with entrepreneurial organizations since about 1995-96, and have run a few companies myself with varying degrees of success. And I have now been involved with Beacon for about three and a half, four years in this role. I am also a member of the Data and Marketing Association North Council, and I'm Digital and Project Lead for the UK Police Memorial, which is a charity to honor officers who were killed or died in the line of duty.
Andy Still 01:52
Wide range of duties there. I think before we go into any detail, I think it's probably worth starting at the very top. Who are the winners and who are the losers from ad fraud? And how do they lose out?
Stewart Boutcher 02:02
Okay, good question. On the loser side, it's very much anyone who's running digital marketing on the internet, no matter what they're doing, because ad fraud is a term that encompasses a lot of different areas: fake clicks, fake impressions, programmatic ad fraud. But broadly speaking, it's any attempt to defraud digital advertising networks for financial gain. Scammers and criminal gangs can use bots and click farms to carry out ad fraud, but they're not the only methods. So, broadly speaking, those who run these scams, these ad bots, this ad fraud, are the winners. It is arguable, I think, in some sense that ad networks themselves benefit from an increase in traffic across their platforms, which they tend to get paid for. And the losers, 100%, are those who are paying for those adverts.
Andy Still 02:59
Okay, because I think one of the challenges in this is that the ad networks historically have been the people who determine which is fraud. And it's not necessarily in their interest, is it, to be too stringent on clamping down on fraud? Would that be in line with your experience?
Stewart Boutcher 03:17
It's a tricky question you jump into straightaway there. I have to be careful how I answer that one, as I'm sure you do yourself. Yeah, I think it's fair to say that, generously, it would be a difficult problem for an ad network to be 100% certain about the veracity of a particular click, impression, visit, whatever it might be, because they don't have access to most of the data that organizations like Beacon, and indeed yourselves, do. We've got a lot of server-side data, we've got a lot of what's going on on a website; you can track and look at visitor actions and behavioral metrics, and that kind of stuff, which is beyond where they are. They can only look at what's on their own platform. So, generously, you can say that it's a challenge for them to do that. And I think that perhaps something that many people would not contend with is that they don't necessarily do as much as some people want them to do. How's that for a political answer?
Andy Still 04:09
That was very nicely sat on the fence.
Stewart Boutcher 04:13
Well, we're being recorded. In the pub, we'll talk differently.
Andy Still 04:16
Where I thought this got particularly interesting when I was looking into it was around the elements and the approaches that the ad fraud people took to build reputation up because obviously, the value of a digital ad is significantly higher based on the individual and the likelihood of converting, can you just talk us through the level of sophistication of the bot operators and some of the techniques that they use to be able to create something very, very valuable?
Stewart Boutcher 04:49
So, as many of our listeners will know, a bot is a bit of software that's just programmed to do a job, and it's been directed to a particular kind of job, and a botnet is just a whole collection of those. Now, you can run a bot anywhere. And what a lot of bot farm owners would do is compromise people's machines, so there'd be a host of viral payloads that would basically install bots on as many different machines as they could. And then there'd be some kind of controlling mechanism by which you tell them to do something. And that might be a DDoS service or whatever it might be. More recently, the advent of mass cloud computing has made it substantially easier for people to purchase cloud servers anywhere and run their bot networks in a more legitimate-looking fashion. You don't have to worry about compromising someone's computer, you don't have to worry about downloading something and breaking it. Because defenses at the personal level have become much stronger, and at the operating system level have become much stronger. So, bots tend to run on cloud computers now. They have become a lot more sophisticated because they're trying to look human. That's fundamentally the whole point about it. Many people may be aware of an online service called Fiverr, where for $5 or $10, or $50, you can get a certain number of likes on a post or a certain number of followers on a Facebook page, so you can spend money to look like whatever you're doing is more popular. And I think a lot of marketeers have used those kinds of services, particularly before they understood what they really were, which was basically that they're clearly not humans, they are bots that are being sold for money. So, ad networks, and they're run by intelligent people, tend to have good tech people inside them. They're not fools. They don't want to have bots running on the network, and certainly not ostentatiously anyway. And so, these things can't obviously look like a bit of software.
So how do you turn something that's obviously software into something that looks human? And a degree of the problem we face now is because of the way the whole digital marketing system works, which is that it was very reliant, and it's certainly still very reliant, on third-party cookies and tracking people across their entire internet journey, so that you have an understanding about their buying intent, about what it is that they want to do and what they might buy in the near future, so that you can market to them more effectively. As a side effect of that, you can use that third-party cookie ecosystem to build up something that looks like it's human because of the way it behaves. So, what do humans do on social media networks, for example? Well, you know, they engage with posts or articles, they view and click on adverts, but the point is they look real, they do the real things. And so, what's happening here in the whole world of ad fraud is that bots are engaging with adverts primarily to look human. Now, what's interesting, I think, is the way that the industry itself has really kind of got ahead of regulation in terms of moving away from third-party cookies. Because real people have suddenly realized, hang on a second, what's going on with my data here? Why are people making money out of it? Why does someone know so much about me? But actually, if you look at the alternatives to third-party cookies, which are all about, what's your intent, what are you going to buy, how do I market to you more effectively, they have the same problems built in, which is that they're gameable by a sophisticated bit of software that behaves in certain kinds of ways, making it look human. So, it doesn't solve that problem really.
Andy Still 08:01
No, I think that was one of the really interesting things I found when looking into this, the amount of effort that the bot operators would put into building the effective persona of a particular type of customer. And they would do this over a number of months or even years to build that history, of someone who has visited the right sites at the right time of year, to populate that with the right level of interest so that they then have something that is the particular target market for high value clicks. And like you say, that can include visiting lower value sites and actually delivering some value to them. Less popular and less well visited sites can actually benefit from this in a certain way, because they have visitors that are visiting them just so that they can build up that reputation.
Stewart Boutcher 08:52
There are certain sectors where that's extremely rife. So, without naming any names at all, the job services industry is rife with bots, because people go online looking for a job and there are a lot of particularly smaller players who buy traffic from each other. And they will list jobs that are elsewhere; they scrape jobs from somewhere and list them on their own sites. And then they basically get some kind of fee or referral for a click-through. And that really depends on having high numbers of visitors to their websites, so that they look valuable, and so that they rank better. The trouble with that is that you can put in the time and effort to build a really good quality site with great quality content, or you can shortcut it and you can buy traffic. And the trouble with a lot of bought traffic is that there's no way to validate where it's come from, the source of it, whether it's really human, and cynically, you might argue that some of these sites aren't that bothered about whether it's human, they're merely looking at the traffic levels.
Andy Still 09:45
Yeah. And I think that definitely was my kind of experience that actually it was a way for them to drive sufficient traffic, which actually then pushed them higher up and drove more legitimate traffic to them. And it was a shortcut way of getting to quality.
Stewart Boutcher 10:00
It's like when you used to... I mean, 10 years ago, when you first set up a Twitter account, almost one of the first things you do, and I remember marketeers recommending it, you would just go and buy 500 or 1,000 followers, just because you then looked like you were a bit more popular, and therefore, people were more likely to follow you. There's almost nothing worse than going looking at a Twitter account and going "oh, that's interesting". And they've got three followers. And so there is a degree of credibility, which I suppose is the intrinsic human issue that we're not going to solve here today. But it's the same kind of principle.
Andy Still 10:31
Okay, so you were mentioning before that there's a move within the industry away from using third party cookies, what sort of techniques are they using instead of third-party cookies?
Stewart Boutcher 10:40
Well, I mean, first party, really, generally. But this isn't exactly my area of expertise. But there are a number of attempts by the larger networks to control the data more. So for example, if you're logged into Google, so I've got, you know, I'm running on, I think Edge here is my browser. And top right is a little picture of me, because, you know, I'm logged into Google. So it knows I'm logged into Google at all times. And so that is gaining first-party data. So in other words, it's not third-party cookies, it's their own known data about what sites I'm looking at. And so the larger networks are going that way. And what's, I think, potentially very dangerous about that, it's an interesting tactic, is that third-party cookies were, in principle, a good idea, it's just they were abused and manipulated by those in the industry. And everyone got involved with it to a greater or lesser extent, whether they knew it or not. This is a cynical attempt at a power grab, rather than saying, actually, we could make those things work in the way they're supposed to work, with genuine protection for people. It's not very hard, you know, there are so many cookie notices now, you know, it's not very hard to say we could keep using third-party cookies and just allow a user to decide whether or not they want a particular site to know about them. It's easy, it's just that it wasn't done. But what's happened is the big networks are now going, okay, well hang on a second here, we can create these walled gardens of our own data. And then what happens is smaller players have to buy from us. So it makes it even more difficult to have good oversight. And if you have players such as ourselves, it makes it harder for us to do our job, I think. The problem with the bots isn't gonna go away.
Andy Still 12:10
No, it's also more insidious to users as well, because that data is being captured via them using what they think is a legitimate search engine for other purposes than that, they're gathering a lot more data around what they're searching.
Stewart Boutcher 12:25
It's not just search, is it? It's everything you do on the internet. So my recommendation, if you don't want people to know what you're looking at, is to use Incognito, because there's good safety and security around that. There are lawsuits going through in the US at the moment, early days. But I think the initial judgement, I can't remember, but the initial documentation released by the judge against Google, you may well be aware of this, talks about several projects. There are two in particular I remember: Project Jedi Blue and Project Jedi. Project Jedi was effectively the attempt to create this user data as a walled garden without telling users what they're doing, which obviously is insidious. But Jedi Blue, perhaps even more worrying, is allegedly an attempt by Google to prioritize their own and Facebook's traffic within the programmatic advertising ecosystem. So in other words, to give them an advantage over smaller players. So this is an abuse of position. I mean, it's alleged. It's going to take many years to play out. Who knows where it'll go, but you can see the concerns. I think with third-party cookies, lots of people have access to data. With the replacement mechanisms, very few organizations have access to probably more data. And you don't know what it is.
Andy Still 13:35
Yes, I think it'll be very interesting to see where that goes. We've seen such a clampdown, with privacy restrictions being put into browsers. It will be interesting to see where that goes, and also how advertisers react to that. Because, as you say, there's a very large amount of money behind the advertising industry, which obviously keeps a large amount of content on websites advertising-driven.
Stewart Boutcher 13:59
I understand, it's difficult because people use the internet and they don't pay for stuff. I mean, you pay for an internet connection, but they don't pay for the content they get, generally. And there is that kind of disconnect in people's minds between saying, well, actually, no one understands. If you can't see the mark, you're the mark. But I think marketing's traditionally never been very good at transparency, never been very good at openness. A very famous comedian, Bill Hicks, who you may well be aware of, was very scathing about marketing. He makes some good points, which is that marketing is potentially malicious, evil. But it clearly isn't, because it serves a purpose: you know, it makes you aware of things which you might not have known about, that you might want to know about. But because of the way that it deals with things, where its product is basically the people it's selling to, it's not open.
Andy Still 14:41
It's a very interesting subject around targeted advertising anyway, that theoretically, it is much better for you because you will see things that are relevant to you and should be driven by things that you have chosen that are relevant that you want to buy, but actually it isn't like that. You know, it's driven by people who then know far too much about you that you haven't explicitly given them permission. So I think you know, there is a right balance.
Stewart Boutcher 15:07
So it’s also flawed, because I mean, how many times have you been followed around the internet by ads for a hotel you just booked, or a product you just bought?
Andy Still 15:18
Yeah. It's built on imperfect data. You're followed by adverts for something you looked at and have decided you are never going to look at again. So I think we should dive into getting your opinion on what people can do about it. If you're building a site with advertising on it, what level of protection should you have? And what steps can you take to stop these bots from taking the place of real viewers seeing real adverts on your site?
Stewart Boutcher 15:40
I mean, it's an interesting one, because you have to be very careful not to just say, use us. But the reason that Beacon came about is because I was running a digital agency in Leeds, UK, where I'm based, and we saw real problems with disparities in data. We'd see a number from an ad network, and we'd see numbers coming through to the website according to GA and other analytics platforms. But also, when you look at the server data itself, raw server logs and traffic traces, I know you do a lot with raw server logs, you just see something completely different. And so it came about because of asking, what's going on here? And the answer was, you're getting a lot of invalid traffic, you're getting a lot of traffic that doesn't convert, traffic that isn't real. And so I mentioned earlier on the North Council of the Data and Marketing Association, the DMA, and we're coming up with an Artificial Engagement resource, and I'll happily share their website, which talks about that. But one of the things is that not burying your head in the sand is key. I think a lot of marketeers understand that programmatic marketing is absolutely rife with fraud. And despite some very good solutions out there, it still is. You just have to do the best you can. But I think a lot of marketers also don't really think of paid search, where we operate, as being particularly problematical. But actually, it is. We see numbers of between 20% and 40% of budget wasted on bots on unprotected clients, which is a huge number. If you're spending, you know, half a million pounds a month, or even 10,000 pounds a month, 20-40% is still a lot of money, because it's relative to your budget. So it's about asking questions like, what percentage of my traffic is fake on digital media campaigns, and who within an organization is responsible for monitoring this and preventing it? Taking those initial steps, and saying okay, there is a problem. And actually, a good understanding of the scope of the problem is very, very useful as a starting point.
And then you can ask questions internally, you can ask questions of a digital marketing agency, and ask them: do they monitor for fraudulent traffic? What do they do about it? What do they mean by terminology like engagement? How do they measure that kind of thing? And do they have numbers around different channels, different campaigns, different geographies, around bot engagement, and what they do to prevent it? So just asking those questions puts organizations into a completely different mindset, whereby they go from "oh well, this is probably not... it's massive, we can't even think about it" to "okay, we'll break it down here, what do we do?" You know, ultimately, unless you are a very techy organization who happens to be blessed with some good data scientists and people who understand this problem, you're going to end up engaging with third-party specialists such as Beacon, because detecting bots is a complicated technical process. And I know full well that you're very aware of that. There are so many indicators that you're looking at, you know, behavioral metrics around mouse movements, timing, edge touch points, you know, a whole heap of things. And it's complicated, and there has to be a good level of domain knowledge to be able to do anything about it, but just being aware is a good start.
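To put the figures Stewart mentions into concrete terms, here is a minimal Python sketch of the arithmetic, assuming the 20-40% range and the two monthly budgets quoted in the conversation. The function names and the click counts in the second example are hypothetical, chosen purely for illustration, and a gap between ad network clicks and sessions seen in server logs is only a rough first signal, not proof of bot traffic.

```python
def wasted_spend(monthly_budget: float, bot_share: float) -> float:
    """Return the part of a monthly ad budget spent on bot traffic.

    bot_share is the assumed fraction of paid traffic that is not human (0 to 1).
    """
    if not 0.0 <= bot_share <= 1.0:
        raise ValueError("bot_share must be between 0 and 1")
    return monthly_budget * bot_share


def click_discrepancy(network_clicks: int, server_sessions: int) -> float:
    """Return the fraction of billed clicks that never appear as sessions in server logs.

    This is a hypothetical sanity check: the two numbers can differ for
    benign reasons too, so treat the result as a prompt to investigate.
    """
    if network_clicks <= 0:
        raise ValueError("network_clicks must be positive")
    missing = max(network_clicks - server_sessions, 0)
    return missing / network_clicks


# Budgets and percentages quoted in the conversation.
for budget in (10_000, 500_000):
    low = wasted_spend(budget, 0.20)
    high = wasted_spend(budget, 0.40)
    print(f"Budget of {budget:,} pounds a month: {low:,.0f} to {high:,.0f} pounds potentially wasted")

# Made-up click counts, purely for illustration.
print(f"Unexplained clicks: {click_discrepancy(1_000, 700):.0%}")
```

Running a comparison like this per channel, campaign, or geography is one way to start answering the question "what percentage of my traffic is fake?" before bringing in specialist tooling.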
Andy Still 18:29
Yeah, I would agree with that. And I think the thing is, when you look into how complex a solution you need, it leads you to think about the level of sophistication and expertise of the people who are running these bots. This is not people with a simple bit of software, just trying things out. This is a multi-billion-dollar industry, with big development teams on this full time, with technical experts. So it's not something you can easily counter without bringing in some sort of expertise in understanding what they're trying to do, and also how you can respond.
Stewart Boutcher 19:03
I mean, it's also about a return on investment, I suppose, in terms of that. And there is a law of diminishing returns. I mean, what you guys do is obviously about protecting actual data, you know, you're protecting websites, you know, from scraping, from data stealing and all that kind of stuff. So with Beacon, what we're doing is we're stopping our customers from wasting their budget on bots. But no data is actually stolen; I mean, it's money that's stolen. And that's one thing. And I suppose the point is that we generally say to our clients, when we engage with them first, you're not going to get rid of these bots. But if you can reduce it by 50%, or, you know, by 80%, then that's a huge return on investment. And so it's like, where's your level of acceptance? So for example, we have some clients who actually spent some of their marketing budget deliberately drawing bots in so that we could detect them, build them into an audience, and then use that lookalike audience to prevent bots from engaging with them. Now, they were hugely invested in this, and that's because their owner was a real data geek, and a lot of people are more risk averse. But as a result of that, they've taken their bot level down from over 40%, I think 45% or 46% when we started engaging with them, down to sort of 7% or 8%. Even with that level of engagement, they can't take it completely to nothing. But I tell you what, that's huge for them.
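As a rough illustration of the return Stewart describes, the short Python sketch below works out the relative reduction when the bot share of paid traffic falls from about 45% to about 7.5%, the midpoints of the ranges he quotes. The function names and the example budget are assumptions made for this sketch, not figures from Beacon.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional drop from `before` to `after` (both bot shares, 0 to 1)."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


def budget_recovered(monthly_budget: float, before: float, after: float) -> float:
    """Return the monthly spend no longer going to bots after the reduction."""
    return monthly_budget * (before - after)


before, after = 0.45, 0.075  # midpoints of the quoted 45-46% and 7-8%
print(f"Relative reduction in bot traffic: {relative_reduction(before, after):.0%}")  # prints 83%

# Hypothetical budget of 10,000 pounds a month, for illustration only.
print(f"Recovered spend: {budget_recovered(10_000, before, after):,.0f} pounds a month")
```

The bot share never reaches zero in this example, which matches the point made in the conversation: the goal is an acceptable level, not complete elimination.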
Andy Still 20:16
I mean, that's a massive improvement. It's not just, obviously, the wasted money associated with that bot; that bot has taken the spot of a real customer on your site. So every bot is costing you money, but then it's also costing you customers. So the more of those you can get rid of, the more real customers...
Stewart Boutcher 20:33
One final thing, if I may. You're right, opportunity cost, effectively, is what you're talking about there. You can't spend the money on someone real who might buy something from you, or whatever it is you want them to do. But also, it pollutes the data stack in a way that means you're likely to make decisions based upon traffic that's never gonna convert for you, and make content decisions, you know, user flow decisions, all kinds of stuff. And again, that's something that's very important to marketeers, that they need to have good data to make good decisions about what to do next. And that is probably it, really.
Andy Still 21:01
I couldn't agree more. We've engaged with a number of customers who've ended up making decisions that have optimized their sites for bots, which is clearly not what they wanted. This seems like a good point... I think we're just about out of time. So thank you very much again, Stewart, for joining us today, that's been absolutely fascinating. A subject close to my heart and one of the more complex areas of the whole bot management infrastructure. So thank you very much, Stewart.
Stewart Boutcher 21:31
That was a really interesting conversation. So thank you.
Andy Still 21:35
And thank you very much everyone for tuning in and listening in today. If you are interested in giving us any feedback or reviews, please do that on our Twitter account @cybersecpod. Or you can email podcast@netacea.com. Till then, thank you very much and tune in again for our next episode of the Cybersecurity Sessions.