
Hicham Oudghiri and Marc DaCosta: Data's cartographers

Two philosophers are using machine learning and artificial intelligence to create a real-time map of the global economy, and the world's largest financial firms are lining up at its door

Published: Mar 27, 2019 02:32:25 PM IST
Updated: Mar 27, 2019 04:51:08 PM IST

In the early days, Enigma’s Marc DaCosta and Hicham Oudghiri funded their startup by hawking goods on eBay and building websites for companies like Siggi’s yogurt
Image: Jamel Toppin for Forbes

How much did the federal government shutdown cost? After 35 days of deadlock, $6,354,845,148 in wages had gone unpaid to 747,573 federal employees. Every second that ticked by added $2,118 to the figure. Some staffers were furloughed; many more worked without pay. At the Department of Homeland Security, for example, some 245,405 employees were unpaid and 32,706 furloughed. At Treasury, 36,309 workers were furloughed and 82,336 worked unpaid. At the Environmental Protection Agency, 52 percent of its staffers worked without compensation.

Getting a handle on the magnitude of this disruption—as well as its granular details—was no small feat. A little-known New York City fintech company named Enigma tabulated the rising costs in real time on a website it threw up called Government Shutdown 2018–2019.

In order to do so, Enigma searched the Office of Management & Budget’s contingency plans for 109 federal departments. It also tapped FederalPay.org for average salaries and worker counts for 46 agencies. In all, it combed through 2,000 pages and 2 million spreadsheet rows of federal data to create the tracker. From idea to launch, the effort took three Enigma number crunchers a total of 36 hours.

The shutdown data is available free, but Enigma’s ability to rapidly make sense of multiple disconnected data sources, public and private, and create a customisable view of the global economy has attracted some of the world’s leading companies, from BlackRock to PayPal and Celgene, with many clients paying more than $1 million a year for fingertip access to its insights.

Enigma is the brainchild of Hicham Oudghiri and Marc DaCosta, best friends since they met 16 years ago as undergraduates at Columbia University, where they studied philosophy. Their startup organises information from thousands of sources around the world into a single, fully linked interface.

“People know how the internet works and how to log users and serve them cookies to suggest products on Amazon. That problem is solved. What we’re doing is building a model of the real world,” says Oudghiri, 34, speaking from Enigma’s Humboldt boardroom, named after the Enlightenment-era Prussian philosopher. His co-founder, DaCosta, 34, adds, “This isn’t just about a faster microprocessor or better statistics. Enigma is a knowledge graph of what’s going on in the economy.”

Oudghiri and DaCosta’s journey into the field of data mining began after the financial crisis of 2008. DaCosta was doing graduate work in the cultural anthropology of data at the University of California, Irvine, while Oudghiri was managing renewable-energy projects for BCME Bank in Casablanca, Morocco. Both were curious about explaining the world in light of the global disruptions going on. So they reunited and began to organise sets of publicly available data, starting with Federal Aviation Administration flight logs. They soon discovered a trove of valuable information hiding in plain sight—buried in government logs, university research publications, arcane business filings and shipping manifests. If they could collect, scrub, organise and analyse it, they thought, it might produce a near real-time rendering of the macro economy.

In 2011, Oudghiri and DaCosta formed Enigma and went to work amalgamating public data, mostly from government sources like the Census Bureau, the FCC, the Federal Election Commission and the IRS, as well as import records from the US Customs & Border Protection agency and building permits, and putting it together as a single source. They also became experts in uncovering complex, hard-to-find information. For instance, using Freedom of Information Act requests, Enigma taps the CBP’s Automated Manifest System to track every container ship that arrives in the US, including the importer and port of call. From the National Fire Incident Reporting System, Enigma retrieves the cause and location of every fire in the US. To cover energy, Enigma leans on oil-well data from the Railroad Commission of Texas, founded in 1891 to establish tariffs.

The firm’s breakthrough year was 2014, when Enigma raised $4.5 million from Comcast, American Express and the New York Times Co and entered the Fintech Innovation Lab, created by Accenture and the Partnership Fund for New York City. There, embedded with banks and Wall Street giants, Oudghiri and DaCosta discovered that their data had immense utility in financial services. Its information could be connected with customer data in banks’ systems, helping them more quickly recognise fraud or identify businesses and individuals to underwrite. “We walked out with a battle plan,” DaCosta says of the accelerator. They quickly built a software package, complete with colourful knowledge graphs, and put specialised tools into a compliance interface they called Dossier.

To date, Enigma has synthesised 100,000 data sets in more than 100 countries, organised intelligence on 30 million small businesses and accumulated 140 billion points of data on the US population. It has mapped every molecule used by the US pharmaceutical industry, as well as all trials, patent filings and adverse events.

Braininess permeates Enigma’s headquarters in New York’s Flatiron District: Conference rooms are named after philosophers like Michel de Montaigne and Augustine. Bookshelves are filled with scholarly tomes, from coding manuals like Machine Learning for Hackers and Data Quality and Record Linkage Techniques to classics like the Dialogues of Plato, Ulysses and Rousseau’s Confessions. Covering the walls are maps and experimental artwork exploring “the intersection of data and experience”.

But Enigma’s nerdy academic culture hasn’t stopped it from attracting corporate clients and funding from Silicon Valley and Wall Street. BlackRock, PayPal, American Express, MetLife, BB&T, Celgene, Merck and EMD Millipore have signed up. Some $130 million in capital has flowed into the firm in the past seven years from venture investors like NEA, Crosslink Capital and Glynn Capital and hedge funds like Two Sigma Ventures and Third Point Ventures.

Forbes estimates Enigma’s valuation is $750 million, with revenues pushing $30 million annually, the company having doubled its customer base in 2018.

If a hedge fund wants to know which restaurant chains are growing the fastest, Enigma can check FCC logs for radio licences, which are required to open drive-through windows. Insurers use Enigma for risk assessment, and pharma companies query its data-crunching machines to improve drug safety.

Want to quickly and precisely identify the best candidates for small business loans? Instead of cold-calling storefronts or schmoozing local Chamber of Commerce officials, Enigma will synthesise property-tax filing information with state business filings and Uniform Commercial Code liens to come up with automated credit identities. Need to avoid underwriting in risky fire zones? Why not sew together data sets on emergency-call logs and building permits?

At MetLife, chief digital officer Greg Baxter is starting to use Enigma data pulled from public health systems and universities by connecting it to its own systems to detect pockets of illness or risk and to improve underwriting. And in MetLife’s $588 billion investment management arm, Baxter is using Enigma data to quantify how the quality of restaurants, parks and community event spaces affects real estate prices. “They discover data sources, they organise the data and then they find ways through machine learning of connecting the data,” Baxter says. “When you combine that external data with our internal domain data, then you start to get phenomenally predictive and insightful trends.”

Investor John Fogelsong of Glynn Capital thinks the startup could present a threat to legacy closed-system “big box” technology vendors like Oracle, IBM, SAS and SAP: “Each new data set Enigma ingests compounds the company’s ability to improve a customer’s business processes.”

BlackRock’s recent experience is illustrative. When Frank Cooper, its newly minted chief marketing officer, was asked to shake up how the $6-trillion-in-assets firm prospects for clients, Enigma made a surprising discovery: Contrary to the traditional approach based on regional and demographic targeting, there was little correlation between a client’s location and his or her preparedness for retirement, but political engagement ranked high. “If someone is politically active and even if they are a renter, they’re much more likely to plan for retirement,” Cooper says. “It was shocking for us.”
 
Not all of Enigma’s machine-learning algorithms are trained on the hot pursuit of profits. The firm has volunteered its data efforts toward studying the gender gap across 558 occupations and has already identified that some of the most egregious disparities occur in accounting, retail and sales.

After a November 2014 fire killed five people in a smoke-detector-less home in New Orleans, Enigma worked with the city’s fire department and Office of Performance & Accountability to identify areas where fire safety is weakest.

On Enigma’s public website, it offers national fire-incident data free for city planners to use, and 50 years of weather anomalies in the US, plus nationwide data sets on everything from cancer statistics to so-called adverse events, as defined by the FDA. In New York City, for example, an Enigma FOIA allowed it to synthesise years of rail incident and injury data reported to the Metropolitan Transportation Authority. The firm is also working with a non-profit called Polaris to combat and prevent slavery and human trafficking. The idea is that data can not only lead to better underwriting but also improve government. Oudghiri and DaCosta call it “data for social good”.

Is Enigma vulnerable to the kind of data-usage scandals that have plagued Facebook?

“I think the behaviour from the internet companies has been suspicious if not malicious, creating a picture of your behaviour that is probably above and beyond what you would consent to,” Oudghiri says, arguing that Enigma’s work with financial services firms strikes a more mutually beneficial covenant.

“Your data is a means to understand if you are a legitimate person,” he says. “From a privacy standpoint, you enter those relationships knowingly and explicitly. The need for sharing data arises from a pretty good place.”

(This story appears in the 12 April, 2019 issue of Forbes India.)
