There are four main goals for TikTok’s algorithm: “user value,” “long-term user value,” “creator value,” and “platform value.”
That set of goals is drawn from a frank and revealing document for company employees that offers new details of how the most successful video app in the world has built such an entertaining — some would say addictive — product.
The document, headed “TikTok Algo 100,” was produced by TikTok’s engineering team in Beijing. A company spokesperson, Hilary McQuaide, confirmed its authenticity and said it was written to explain to nontechnical employees how the algorithm works. The document offers a new level of detail about the dominant video app, providing a revealing glimpse of both the app’s mathematical core and the company’s understanding of human nature (our tendencies toward boredom, our sensitivity to cultural cues) that helps explain why it’s so hard to put down. The document also lifts the curtain on the company’s seamless connection to its Chinese parent company, ByteDance, at a time when the U.S. Department of Commerce is preparing a report on whether TikTok poses a security risk to the United States.
If you’re among the billion people (literally!) who spend time on TikTok every month, you’re familiar with the app as 2021’s central vehicle for youth culture and online culture generally. It displays an endless stream of videos and, unlike the social media apps it is increasingly displacing, serves more as entertainment than as a connection to friends.
It succeeded where other short video apps failed in part because it makes creation so easy, giving users background music to dance to or memes to enact, rather than forcing them to fill dead air. And for the many users who consume without creating, the app is shockingly good at reading their preferences and steering them to one of its many “sides,” whether they’re interested in socialism or Excel tips or sex, conservative politics or a specific celebrity. It’s astonishingly good at revealing people’s desires even to themselves: “The TikTok Algorithm Knew My Sexuality Better Than I Did,” reads one in a series of headlines about people marveling at the app’s X-ray of their inner lives.
TikTok has publicly shared the broad outlines of its recommendation system, saying it takes into account factors including likes and comments as well as video information like captions, sounds and hashtags. Outside analysts have also sought to crack its code. A recent Wall Street Journal report demonstrated how TikTok relies heavily on how much time you spend watching each video to steer you toward more videos that will keep you scrolling, and that process can sometimes lead young viewers down dangerous rabbit holes, in particular toward content that promotes suicide or self-harm — problems that TikTok says it’s working to stop by aggressively deleting content that violates its terms of service.
The new document was shared with The New York Times by a person who was authorized to read it, but not to share it, and who provided it on the condition of anonymity. The person was disturbed by the app’s push toward “sad” content that could induce self-harm.
The document explains frankly that in the pursuit of the company’s “ultimate goal” of adding daily active users, it has chosen to optimize for two closely related metrics in the stream of videos it serves: “retention” — that is, whether a user comes back — and “time spent.” The app wants to keep you there as long as possible. The experience is sometimes described as an addiction, though it also recalls a frequent criticism of pop culture. Playwright David Mamet, writing scornfully in 1998 about “pseudoart,” observed that “people are drawn to summer movies because they are not satisfying, and so they offer opportunities to repeat the compulsion.”
To analysts who believe algorithmic recommendations pose a social threat, the TikTok document confirms their suspicions.
“This system means that watch time is key. The algorithm tries to get people addicted rather than giving them what they really want,” said Guillaume Chaslot, the founder of Algo Transparency, a group based in Paris that has studied YouTube’s recommendation system and takes a dark view of the effect of the product on children, in particular. Chaslot reviewed the TikTok document at my request.
“I think it’s a crazy idea to let TikTok’s algorithm steer the life of our kids,” he said. “Each video a kid watches, TikTok gains a piece of information on him. In a few hours, the algorithm can detect his musical tastes, his physical attraction, if he’s depressed, if he might be into drugs, and many other sensitive information. There’s a high risk that some of this information will be used against him. It could potentially be used to micro-target him or make him more addicted to the platform.”
The document says watch time isn’t the only factor TikTok considers. It offers a rough equation for how videos are scored, in which a prediction driven by machine learning and actual user behavior are combined for each of three signals (likes, comments and playtime), along with an indication that the video has been played:
P_like × V_like + P_comment × V_comment + E_playtime × V_playtime + P_play × V_play
“The recommender system gives scores to all the videos based on this equation, and returns to users videos with the highest scores,” the document says. “For brevity, the equation shown in this doc is highly simplified. The actual equation in use is much more complicated, but the logic behind is the same.”
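To make the arithmetic concrete, here is a minimal Python sketch of the simplified equation quoted above. It is not TikTok’s code. Reading the P and E terms as model predictions and the V terms as fixed weights is an assumption, and every name, weight and number below is invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Hypothetical model outputs for one (user, video) pair."""
    p_like: float      # predicted probability of a like
    p_comment: float   # predicted probability of a comment
    e_playtime: float  # expected playtime, in seconds
    p_play: float      # predicted probability the video is played


# Illustrative weights standing in for the V terms. These values are made up.
WEIGHTS = {"like": 1.0, "comment": 2.0, "playtime": 0.1, "play": 0.5}


def score(pred: Prediction) -> float:
    """Weighted sum with the same four terms as the simplified equation."""
    return (
        pred.p_like * WEIGHTS["like"]
        + pred.p_comment * WEIGHTS["comment"]
        + pred.e_playtime * WEIGHTS["playtime"]
        + pred.p_play * WEIGHTS["play"]
    )


def rank(candidates: dict[str, Prediction], top_k: int = 2) -> list[str]:
    """Return the ids of the top_k highest-scoring videos, best first."""
    ordered = sorted(candidates, key=lambda vid: score(candidates[vid]), reverse=True)
    return ordered[:top_k]


# Three made-up candidate videos for a single user.
candidates = {
    "tennis_trick": Prediction(0.30, 0.05, 12.0, 0.90),
    "cooking_clip": Prediction(0.10, 0.02, 25.0, 0.80),
    "like_bait": Prediction(0.60, 0.01, 2.0, 0.50),
}

print(rank(candidates))
```

With these invented weights, the clip with the longest expected playtime ranks first even though it has the lowest predicted like rate, which is the dynamic critics of watch-time optimization describe.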
The document illustrates in detail how the company tweaks its system to identify and suppress “like bait” — videos designed to game the algorithm by explicitly asking people to like them — and how the company thinks through more nuanced questions.
Another chart in the document indicates that “creator monetization” is one of the company’s goals, a suggestion that TikTok may favor videos in part if they are lucrative, not just entertaining.
Julian McAuley, a professor of computer science at the University of California San Diego, who also reviewed the document, said in an email that the paper was short on detail about how exactly TikTok does its predictions, but that the description of its recommendation engine is “totally reasonable, but traditional stuff.” The company’s edge, he said, comes from combining machine learning with “fantastic volumes of data, highly engaged users, and a setting where users are amenable to consuming algorithmically recommended content (think how few other settings have all of these characteristics!). Not some algorithmic magic.”
And indeed, the document does much to demystify the sort of recommendation systems that tech companies often present as impossibly hard for critics and regulators to grasp, but that typically focus on features any ordinary user can understand. The Journal’s coverage of leaked Facebook documents, for instance, illustrated how Facebook’s decision to give more weight to comments helped divisive content spread. While the models may be complex, there’s nothing inherently sinister or incomprehensible about the TikTok recommendation algorithm outlined in the document.
But the document also makes clear that TikTok has done nothing to sever its ties with its Chinese parent, ByteDance, whose ownership became a spasmodic focus at the end of President Donald Trump’s administration in 2020, when he attempted to force the sale of TikTok to an American company allied with his administration, Oracle.
The TikTok document refers questions to an engineering manager whose LinkedIn biography says he works on both TikTok and ByteDance’s similar Chinese app, Douyin, offering a glimpse at the remaining global element of an increasingly divided tech industry, the engineering talent. The document says the engineering manager attended Peking University, received a master’s degree in computer science at Columbia University and worked for Facebook for two years before coming to ByteDance in Beijing in 2017. The document is written in clear, but nonnative, English, and comes from the perspective of the Chinese tech industry. It makes no references, for instance, to rival American companies like Facebook and Google, but includes a discussion of “if Toutiao/Kuaishou/Weibo have done something similar, can we launch the same strategy as they have done?”
Concern about Chinese consumer technology is bipartisan in the United States. Trump’s executive order attempting to ban the app in August 2020 warned that TikTok’s “data collection threatens to allow the Chinese Communist Party access to Americans’ personal and proprietary information.” The Chinese government could “build dossiers of personal information for blackmail, and conduct corporate espionage,” it said. That ban stalled in court and faded after the presidential election. President Joe Biden rescinded the executive order, but his administration then announced its own investigation into security threats posed by TikTok, with an unnamed senior administration official telling reporters that China was “working to leverage digital technologies and American data in ways that present unacceptable national security risks.”
In an emailed statement, McQuaide said that “while there’s some commonality in the code, the TikTok and Douyin apps are run entirely separately, on separate servers, and neither code contains user data.”
She also said, “TikTok has never provided user data to the Chinese government, nor would we if asked.”
TikTok, whose CEO lives in Singapore, hired a raft of well-connected American and European executives and security experts as political pressure intensified under Trump. It says it has no formal headquarters. It has sought to soothe U.S. concerns by storing user data in the United States, with a backup in Singapore.
The U.S. government’s security concerns come in two forms. The first, as Trump suggested in his executive order, is whether the vast trove of data TikTok holds (about the private sexual desires of fans of the app who might end up becoming U.S. public officials, for instance) should be viewed as a national security issue. There’s no evidence the data has ever been used that way, and TikTok is hardly the only place Americans share details of their lives on social media. The second concern is whether TikTok censors politically sensitive posts.
A report this year by Citizen Lab, the cybersecurity watchdog organization in Toronto, suggested that both of these concerns are, at best, latent: It did not find any indication that TikTok was either censoring sensitive topics or transmitting data to China.
Some American analysts see TikTok as a profound threat; others view the alarm as the kind of clueless panic that Americans now approaching middle age faced when their parents warned them that if they shared details of their lives on social media, they’d never get a job. Many, many other products, from social networks to banks and credit cards, collect more precise data on their users. If foreign security services wanted that data, they could probably find a way to buy it from the shadowy industry of data brokers.
“Freaking out about surveillance or censorship by TikTok is a distraction from the fact that these issues are so much bigger than any specific company or its Chinese ownership,” said Samm Sacks, a cybersecurity policy fellow at the research organization New America. “Even if TikTok were American-owned, there is no law or regulation that prevents Beijing from buying its data on the open data broker market.”
One thing that reporting this column has reminded me: The menace that TikTok poses to American national security appears to be entirely hypothetical, and depends on your analysis of both the U.S.-China relationship and the future of technology and culture. But the algorithm’s grasp on what keeps me hooked — between trick tennis shots, Turkish food videos and all the other things it’s figured out I like to watch — did pose a clear and present danger to my ability to finish this column.
©2019 New York Times News Service