YouTube’s recommendation engine is an artificial-intelligence-based system designed to automatically serve up videos that stand a good chance of keeping users on the platform longer. It’s been in the news a lot this year. In January, for instance, YouTube tweaked its system in response to criticism that the recommendation engine led users to conspiracy videos or extremist content. Now, researchers at Harvard’s Berkman Klein Center for Internet and Society have happened upon another troubling finding: The recommendation engine, they say, has been curating innocuous videos of young children—such as kids swimming in a pool—for pedophiles.
I talked with Jonas Kaiser, part of the Harvard team that uncovered the disturbing phenomenon, about his and his colleagues’ work, which was featured last week in an exposé in The New York Times. Kaiser and his colleagues were working on a project about how YouTube is being used in Brazil. As part of their research, they defined clusters of channels, or, as they call them, communities; they would identify, for instance, a community of channels created by video game enthusiasts.
At some point during their work, the researchers came across a group of 50 or so channels of sexually suggestive content. Taking the top videos of these channels as a starting point, they looked for what videos YouTube recommended they watch next. That’s when they noticed the pattern: the automated recommendation system was promoting videos of young children. I asked Kaiser about how his team came across this startling finding. (YouTube says it has made changes to help protect children, including ending children’s ability to livestream by themselves, disabling comments on videos with children, and limiting how videos of children are recommended.)
Matt Field: You wanted to see what would happen to someone if they followed YouTube’s video recommendations?
Jonas Kaiser: We kind of wanted to replicate what it was like to stumble on these videos. We did this in two steps: we looked for the top 20 related videos of the top videos of these channels, and then we took the new videos that were added and looked for their recommended videos as well.
We then basically created two networks: a network of the videos, to understand which videos are the most recommended, and then we aggregated the videos back to their channels and created a network based on these recommendations to understand which channels are prominent in that context. There we were able to identify this community of channels that consisted of videos of children…
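The two-step construction Kaiser describes—a video-level recommendation network, then an aggregation of those edges up to the channel level—can be sketched roughly as follows. This is a minimal illustration with invented video and channel IDs, not the team’s actual code; the real study gathered its recommendation data from YouTube itself.

```python
from collections import Counter

# Hypothetical crawl output: each video mapped to its top recommended
# videos, plus a lookup from video to the channel that posted it.
recommendations = {
    "v1": ["v2", "v3"],
    "v2": ["v3", "v4"],
    "v5": ["v3", "v1"],
}
channel_of = {"v1": "A", "v2": "A", "v3": "B", "v4": "B", "v5": "C"}

# Video-level network: count how often each video is recommended,
# which surfaces the most-recommended videos.
video_indegree = Counter(
    target for targets in recommendations.values() for target in targets
)

# Channel-level network: collapse video-to-video edges into weighted
# channel-to-channel edges, revealing prominent channels and the
# communities they form.
channel_edges = Counter()
for src, targets in recommendations.items():
    for dst in targets:
        channel_edges[(channel_of[src], channel_of[dst])] += 1

print(video_indegree.most_common(1))  # -> [('v3', 3)]
print(channel_edges[("A", "B")])      # -> 3
```

In the actual research, networks like these would then be clustered to identify communities of channels, such as the group of sexually suggestive channels the team started from.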
You would stumble over a channel where it was clear that it was a family channel, or a kid experimenting with having his or her own YouTube channel. There would be three, four, or five videos, and most of them would have 100 views or so at best, and then one video would have over 200,000 views.
MF: These videos of children were put in the recommendation stream of people with sexual perversions?
JK: Obviously that’s speculation. The way YouTube’s recommendations usually work is that the system optimizes for people staying on the site, recommending content that is more likely to keep them watching. In this context it seems, though obviously we don’t know, that YouTube’s machine learning algorithm was basically optimizing on pedophile behavior.
MF: And you theorize this is because it’s inconceivable that these sorts of videos could get so many views?
JK: Two reasons. For one, the videos themselves are very innocuous; nothing really happens. Obviously we know that some of these videos just sometimes go viral. However, that is unlikely in this context, as they were embedded in [a] community of very similar channels and very similar videos.
If you had just stumbled over one or two of these videos and they happened to go viral, that’s just how the internet works sometimes. That’s what we highlighted in The New York Times…
Most of these videos, probably all of them, are not illegal per se; they’re also not necessarily sexual. It’s really the context that the algorithm creates. You just see a child swimming in a pool, or doing splits. That usually is totally innocent, and you could show it to your friends or your family and everybody would laugh. But if YouTube takes this video and connects it with videos of other small children, that’s a totally different story.
MF: The YouTube algorithm somehow decided, for instance, that a person who watches a video of a woman asking for a sugar daddy, wants to see videos of kids swimming in a pool?
JK: It’s in the same ballpark for the algorithm. I think that’s fair to say—or it used to be in the same ballpark. As you know, YouTube introduced some changes; I’m not sure what the same network would look like now. In the end it’s always up to the user whether to go down that rabbit hole. The algorithm doesn’t force you to click on anything. At the same time, the algorithm creates the context in which it’s easier to go down that rabbit hole. The decision is still always up to you.
MF: But it’s because plenty of people have gone down that rabbit hole that the algorithm thinks that’s the direction people want to go?
JK: Exactly. I think The New York Times also had a different video, one with over 400,000 views. Besides being shocked that this was on YouTube in the first place, we were also stunned by the number of views we’ve seen on some of these videos.
MF: Could you get data on the actual users, like the 400,000 who watched one video The New York Times looked into about a young girl in Brazil?
JK: We definitely couldn’t, [but] YouTube probably could to some extent. Were the users signed into their Google accounts, for example? Even if YouTube were to do that, it’s important to note that those are borderline cases. The really problematic thing is the context the algorithm creates, not the individual videos themselves.
MF: What did you think about YouTube’s reaction to the Times piece?
JK: I think it is definitely a step in the right direction. Obviously there’s always more you can do. I also realize it’s kind of hard to make cuts on a platform that has so much content and to which more is constantly being uploaded. …[For example,] when YouTube decided to get rid of neo-Nazi content, especially Holocaust denial, it inadvertently also deleted content from teachers and professors teaching about that history. Even good intentions on YouTube’s part can have bad consequences.
MF: What solutions do you suggest?
JK: In general, making it harder for kids to upload their content—in this case, streaming. Disabling the comment section when kids are in the video seems reasonable, because it’s often the context that’s really problematic.
Then obviously, rethinking when and how to use the algorithm. Does it make sense for YouTube to recommend videos in the context of children?
(Editor’s note: This interview has been condensed and edited for clarity.)