Inside Facebook's data wars

The more Facebook shares about what happens on its platform, the more it risks exposing uncomfortable truths that could further damage its image

By Kevin Roose
Published: Jul 17, 2021

Brian Boland, a former Facebook vice president in charge of partnerships strategy and an advocate for more transparency, stands for a portrait in Mercer Island, Wash., June 19, 2021. Boland, who argued that Facebook should publicly share as much information as possible about what happens on its platform (good, bad or ugly), left the company in November; Image: Christian Sorensen Hansen/The New York Times

One day in April, the people behind CrowdTangle, a data analytics tool owned by Facebook, learned that transparency had limits.

Brandon Silverman, CrowdTangle’s co-founder and CEO, assembled dozens of employees on a video call to tell them that they were being broken up. CrowdTangle, which had been running quasi-independently inside Facebook since being acquired in 2016, was being moved under the social network’s integrity team, the group trying to rid the platform of misinformation and hate speech. Some CrowdTangle employees were being reassigned to other divisions, and Silverman would no longer be managing the team day to day.

The announcement, which left CrowdTangle’s employees in stunned silence, was the result of a yearlong battle among Facebook executives over data transparency and how much the social network should reveal about its inner workings.

On one side were executives, including Silverman and Brian Boland, a Facebook vice president in charge of partnerships strategy, who argued that Facebook should publicly share as much information as possible about what happens on its platform — good, bad or ugly.

On the other side were executives, including the company’s chief marketing officer and vice president of analytics, Alex Schultz, who worried that Facebook was already giving away too much.

They argued that journalists and researchers were using CrowdTangle, a kind of turbocharged search engine that allows users to analyze Facebook trends and measure post performance, to dig up information they considered unhelpful — showing, for example, that right-wing commentators like Ben Shapiro and Dan Bongino were getting much more engagement on their Facebook pages than mainstream news outlets.

These executives argued that Facebook should selectively disclose its own data in the form of carefully curated reports, rather than handing outsiders the tools to discover it themselves.

Team Selective Disclosure won, and CrowdTangle and its supporters lost.

An internal battle over data transparency might seem low on the list of worthy Facebook investigations. And it’s a column I’ve hesitated to write for months, in part because I’m uncomfortably close to the action. (More on that in a minute.)

But the CrowdTangle story is important because it illustrates the way that Facebook’s obsession with managing its reputation often gets in the way of its attempts to clean up its platform. And it gets to the heart of one of the central tensions confronting Facebook in the post-Donald Trump era. The company, blamed for everything from election interference to vaccine hesitancy, badly wants to rebuild trust with a skeptical public. But the more it shares about what happens on its platform, the more it risks exposing uncomfortable truths that could further damage its image.

The question of what to do about CrowdTangle has vexed some of Facebook’s top executives for months, according to interviews with more than a dozen current and former Facebook employees as well as internal emails and posts.

These people, most of whom would speak only anonymously because they were not authorized to discuss internal conversations, said Facebook’s executives were more worried about fixing the perception that Facebook was amplifying harmful content than figuring out whether it actually was amplifying harmful content. Transparency, they said, ultimately took a back seat to image management.

Facebook disputes this characterization. It says that the CrowdTangle reorganization was meant to integrate the service with its other transparency tools, not weaken it, and that top executives are still committed to increasing transparency.

“CrowdTangle is part of a growing suite of transparency resources we’ve made available for people, including academics and journalists,” said Joe Osborne, a Facebook spokesperson. “With CrowdTangle moving into our integrity team, we’re developing a more comprehensive strategy for how we build on some of these transparency efforts moving forward.”

But the executives who pushed hardest for transparency appear to have been sidelined. Silverman has been taking time off and no longer has a clearly defined role at the company, several people with knowledge of the situation said. (Silverman declined to comment about his status.) And Boland, who spent 11 years at Facebook, left the company in November.

“One of the main reasons that I left Facebook is that the most senior leadership in the company does not want to invest in understanding the impact of its core products,” Boland said, in his first interview since departing. “And it doesn’t want to make the data available for others to do the hard work and hold them accountable.”

Boland, who oversaw CrowdTangle as well as other Facebook transparency efforts, said the tool fell out of favor with influential Facebook executives around the time of last year’s presidential election, when journalists and researchers used it to show that pro-Trump commentators were spreading misinformation and hyperpartisan commentary with stunning success.

“People were enthusiastic about the transparency CrowdTangle provided until it became a problem and created press cycles Facebook didn’t like,” he said. “Then, the tone at the executive level changed.”

The Twitter Account That Launched 1,000 Meetings

Here’s where I, somewhat reluctantly, come in.

I started using CrowdTangle a few years ago. I’d been looking for a way to see which news stories gained the most traction on Facebook, and CrowdTangle — a tool used mainly by audience teams at news publishers and marketers who want to track the performance of their posts — filled the bill. I figured out that through a kludgey workaround, I could use its search feature to rank Facebook link posts — that is, posts that include a link to a non-Facebook site — in order of the number of reactions, shares and comments they got. Link posts weren’t a perfect proxy for news, engagement wasn’t a perfect proxy for popularity, and CrowdTangle’s data was limited in other ways, but it was the closest I’d come to finding a kind of cross-Facebook news leaderboard, so I ran with it.

At first, Facebook was happy that I and other journalists were finding its tool useful. With only about 25,000 users, CrowdTangle is one of Facebook’s smallest products, but it has become a valuable resource for power users including global health organizations, election officials and digital marketers, and it has made Facebook look transparent compared with rival platforms like YouTube and TikTok, which don’t release nearly as much data.

But the mood shifted last year when I started a Twitter account called @FacebooksTop10, on which I posted a daily leaderboard showing the sources of the most-engaged link posts by U.S. pages, based on CrowdTangle data.

Last fall, the leaderboard was full of posts by Trump and pro-Trump media personalities. Since Trump was barred from Facebook in January, it has been dominated by a handful of right-wing polemicists like Shapiro, Bongino and Sean Hannity, with the occasional mainstream news article, cute animal story or K-pop fan blog sprinkled in.

The account went semiviral, racking up more than 35,000 followers. Thousands of people retweeted the lists, including conservatives who were happy to see pro-Trump pundits beating the mainstream media and liberals who shared them with jokes like “Look at all this conservative censorship!” (If you’ve been under a rock for the past two years, conservatives in the United States frequently complain that Facebook is censoring them.)

The lists also attracted plenty of Facebook haters. Liberals shared them as evidence that the company was a swamp of toxicity that needed to be broken up; progressive advertisers bristled at the idea that their content was appearing next to pro-Trump propaganda. The account was even cited at a congressional hearing on tech and antitrust by Rep. Jamie Raskin, D-Md., who said it proved that “if Facebook is out there trying to suppress conservative speech, they’re doing a terrible job at it.”

Inside Facebook, the account drove executives crazy. Some believed that the data was being misconstrued and worried that it was painting Facebook as a far-right echo chamber. Others worried that the lists might spook investors by suggesting that Facebook’s U.S. user base was getting older and more conservative. Every time a tweet went viral, I got grumpy calls from Facebook executives who were embarrassed by the disparity between what they thought Facebook was — a clean, well-lit public square where civility and tolerance reign — and the image they saw reflected in the Twitter lists.

As the election approached last year, Facebook executives held meetings to figure out what to do, according to three people who attended them. They set out to determine whether the information on @FacebooksTop10 was accurate (it was) and discussed starting a competing Twitter account that would post more balanced lists based on Facebook’s internal data.

They never did that, but several executives — including John Hegeman, the head of Facebook’s news feed — were dispatched to argue with me on Twitter. These executives argued that my Top 10 lists were misleading. They said CrowdTangle measured only “engagement,” while the true measure of Facebook popularity would be based on “reach,” or the number of people who actually see a given post. (With the exception of video views, reach data isn’t public, and only Facebook employees and page owners have access to it.)

Last September, Mark Zuckerberg, Facebook’s CEO, told Axios that while right-wing content garnered a lot of engagement, the idea that Facebook was a right-wing echo chamber was “just wrong.”

“I think it’s important to differentiate that from, broadly, what people are seeing and reading and learning about on our service,” Zuckerberg said.

But Boland said that was a convenient deflection. He said that in internal discussions, Facebook executives were less concerned about the accuracy of the data than about the image of Facebook it presented.

“It told a story they didn’t like,” he said of the Twitter account, “and frankly didn’t want to admit was true.”

The Trouble With CrowdTangle

Around the same time that Zuckerberg made his comments to Axios, the tensions came to a head. The Economist had just published an article claiming that Facebook “offers a distorted view of American news.”

The article, which cited CrowdTangle data, showed that the most engaged American news sites on Facebook were Fox News and Breitbart and claimed that Facebook’s overall news ecosystem skewed right wing. John Pinette, Facebook’s vice president of global communications, emailed a link to the article to a group of executives with the subject line “The trouble with CrowdTangle.”

“The Economist steps onto the Kevin Roose bandwagon,” Pinette wrote. (See? Told you it was uncomfortably close to home.)

Nick Clegg, Facebook’s vice president of global affairs, replied, lamenting that “our own tools are helping journos to consolidate the wrong narrative.”

Other executives chimed in, adding their worries that CrowdTangle data was being used to paint Facebook as a right-wing echo chamber.

David Ginsberg, Facebook’s vice president of choice and competition, wrote that if Trump won reelection in November, “the media and our critics will quickly point to this ‘echo chamber’ as a prime driver of the outcome.”

Fidji Simo, the head of the Facebook app at the time, agreed.

“I really worry that this could be one of the worst narratives for us,” she wrote.

Several executives proposed making reach data public on CrowdTangle, in hopes that reporters would cite that data instead of the engagement data they thought made Facebook look bad.

But Silverman replied in an email that the CrowdTangle team had already tested a feature to do that and found problems with it. One issue was that false and misleading news stories also rose to the top of those lists.

“Reach leader board isn’t a total win from a comms point of view,” Silverman wrote.

Schultz had the dimmest view of CrowdTangle. He wrote that he thought “the only way to avoid stories like this” would be for Facebook to publish its own reports about the most popular content on its platform, rather than releasing data through CrowdTangle.

“If we go down the route of just offering more self-service data you will get different, exciting, negative stories in my opinion,” he wrote.

Osborne said Schultz and the other executives were discussing how to correct misrepresentations of CrowdTangle data, not strategizing about killing off the tool.

A few days after the election in November, Schultz wrote a post for the company blog called “What Do People Actually See on Facebook in the U.S.?” He explained that if you ranked Facebook posts based on which got the most reach, rather than the most engagement — his preferred method of slicing the data — you’d end up with a more mainstream, less sharply partisan list of sources.

“We believe this paints a more complete picture than the CrowdTangle data alone,” he wrote.

That may be true, but there’s a problem with reach data: Most of it is inaccessible and can’t be vetted or fact-checked by outsiders. We simply have to trust that Facebook’s own private data tells a story that’s very different from the data it shares with the public.

Tweaking Variables

Zuckerberg is right about one thing: Facebook is not a giant right-wing echo chamber.

But it does contain a giant right-wing echo chamber — a kind of AM talk radio built into the heart of Facebook’s news ecosystem, with a hyperengaged audience of loyal partisans who love liking, sharing and clicking on posts from right-wing pages, many of which have gotten good at serving up Facebook-optimized outrage bait at a consistent clip.

CrowdTangle’s data made this echo chamber easier for outsiders to see and quantify. But it didn’t create it or give it the tools it needed to grow — Facebook did — and blaming a data tool for these revelations makes no more sense than blaming a thermometer for bad weather.

It’s worth noting that these transparency efforts are voluntary and could disappear at any time. There are no regulations that require Facebook or any other social media companies to reveal what content performs well on their platforms, and American politicians appear to be more interested in fighting over claims of censorship than getting access to better data.

It’s also worth noting that Facebook can turn down the outrage dials and show its users calmer, less divisive news anytime it wants. (In fact, it briefly did so after the 2020 election, when it worried that election-related misinformation could spiral into mass violence.) And there is some evidence that it is at least considering more permanent changes.

This year, Hegeman asked a team to figure out how tweaking certain variables in the core news feed ranking algorithm would change the resulting Top 10 lists, according to two people with knowledge of the project.

The project, which some employees refer to as the “Top 10” project, is still underway, the people said, and it’s unclear whether its findings have been put in place. Osborne said that the team looks at a variety of ranking changes and that the experiment wasn’t driven by a desire to change the Top 10 lists.

As for CrowdTangle, the tool is still available, and Facebook is not expected to cut off access to journalists and researchers in the short term, according to two people with knowledge of the company’s plans.

Boland, however, said he wouldn’t be surprised if Facebook executives decided to kill off CrowdTangle entirely or starve it of resources, rather than dealing with the headaches its data creates.

“Facebook would love full transparency if there was a guarantee of positive stories and outcomes,” Boland said. “But when transparency creates uncomfortable moments, their reaction is often to shut down the transparency.”

©2019 New York Times News Service