
Finance firms scrape alternative data from unexpected sources

Investment firms and individual investors use data from sources as diverse as Facebook and TikTok, private jet tracking, geolocation and sentiment analysis to buy and sell securities.

NEW YORK -- Alternative data technology is hot in the world of finance.

As the stock market rose dramatically over the past decade, institutional and individual traders became less and less content to rely on historical pricing information to buy and sell securities.

Instead, they started looking to alternative data from sources as diverse as TikTok and Reddit, sentiment analysis, geolocation and product reviews to inform market trades.

Diverse data sources

At the Ai4 2022 Finance Summit here on March 1, where independent AI vendors and large finance industry users of the latest AI technologies gathered, alternative data was a prominent topic of discussion.

Alternative data vendors on the exhibition floor, such as RavenPack and Bitvore, attracted steady interest, and speakers extolled the virtues of alternative data. These new sources of investment information give traders fresh opportunities to discover market inefficiencies, said Ciamac Moallemi, a professor of business at Columbia University's Graduate School of Business.

"Historically, the type of data that has been used is … so-called technical data -- data about orders in the market, data about trades, data about existing prices, house prices and so on," Moallemi said during a talk about AI and quantitative trading.

"What we see now is a shift toward … alternative data, things like news, like natural language processing, things like information from social media, from Twitter, satellite images, credit card transactions," he continued.

Ciamac Moallemi, William von Mueffling professor of business at Columbia University's Graduate School of Business.

Crawling the web for alternative data

Alternative data vendors deploy proprietary machine learning algorithms to crawl the web.

These search engine bots scrape terabytes of information from public record sources such as government contracts, as well as from social media platforms, employee sentiment data and job listings, satellite images of vehicles parked at retail stores, and corporate aviation records -- a means to verify merger and acquisition rumors.

The alternative data vendors then aggregate the data, normalize and cleanse it for aberrations such as false positives and negatives, and sell it to customers by the terabyte, as curated data sets or as real-time streams delivered through APIs.
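The aggregate, normalize and cleanse steps described above can be illustrated with a minimal Python sketch. It is not any vendor's actual pipeline: the field names ("company", "employer", "metric", "value"), the sample rows and the median-based outlier rule are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Record:
    company: str
    metric: str
    value: float


def normalize(raw: dict) -> Record:
    """Map one raw scraped row (hypothetical field names) onto a common schema."""
    company = raw.get("company") or raw.get("employer") or ""
    return Record(company.strip().upper(), raw["metric"].strip().lower(), float(raw["value"]))


def cleanse(records: list, max_ratio: float = 10.0) -> list:
    """Drop exact duplicates, then drop values far above their group's median."""
    unique = list(dict.fromkeys(records))  # keeps first occurrence, preserves order
    groups = {}
    for r in unique:
        groups.setdefault((r.company, r.metric), []).append(r.value)
    kept = []
    for r in unique:
        med = median(groups[(r.company, r.metric)])
        # Assumed rule: keep a value unless it exceeds max_ratio times the median.
        if med == 0 or abs(r.value) <= max_ratio * abs(med):
            kept.append(r)
    return kept


# Made-up rows standing in for data scraped from two differently shaped sources.
raw_rows = [
    {"company": " Acme ", "metric": "Job_Listings", "value": "120"},
    {"employer": "ACME", "metric": "job_listings", "value": 120},  # duplicate once normalized
    {"company": "Acme", "metric": "job_listings", "value": 118},
    {"company": "Acme", "metric": "job_listings", "value": 125},
    {"company": "Acme", "metric": "job_listings", "value": 99999},  # implausible spike
]

clean = cleanse([normalize(row) for row in raw_rows])
for record in clean:
    print(record)
```

In this sketch the duplicate row and the implausible spike are filtered out, leaving three records. A production system would need far more careful rules for what counts as an aberration.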

"Businesses and investors have begun to edge out competitors by scouring the web for alternative data," said Marta Lopata, chief growth officer at Thinknum, a fast-growing alternative data vendor that tracks information on more than 5 million companies and currently provides more than 34 industry and topic data sets.

Lopata spoke to an audience of about 75 at her session at the conference -- one of the few tech conferences to have returned to an in-person format after two years of pandemic-spawned virtual events.

She covered how asset managers can use alternative data and detailed some applications for which Fortune 500 companies use Thinknum's alternative data technology.

From employee sentiment to cryptocurrency

For example, investors mine employee sentiment data -- a Thinknum specialty -- from sites like Glassdoor, Indeed and Kununu to gauge how happy employees are, both at their own firms and at the companies they invest in.

In light of the "Great Resignation" and unprecedented job mobility in part sparked by the pandemic, such data about job happiness is "top of mind for investors today," Lopata said.

Another timely use for alternative data is tracking how inflation in the U.S. is disrupting markets. Thinknum is following used car sales on CarMax and Carvana, two of the big auto sales apps.

"We're tracking all that data in real time down to a VIN number, so that allows you to understand whether prices are peaking," Lopata said. "Beyond just tracking the peaks … we're tracking when the peak ends."

"We're able to identify that in January '22, we finally started to see some decrease in pricing," she added.
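One simple way to spot the end of a price peak in a time series is to track the running high and flag the first reading that falls below it. The Python sketch below shows that idea only; it is not Thinknum's method, and the period labels and prices are invented for illustration.

```python
def first_decline_after_peak(series):
    """Return (peak_label, decline_label) for the first drop below the running high.

    `series` is a list of (label, price) pairs in time order.
    Returns None if the price never falls below its running high.
    """
    peak_label, peak_value = series[0]
    for label, value in series[1:]:
        if value > peak_value:
            peak_label, peak_value = label, value  # new running high
        elif value < peak_value:
            return peak_label, label  # first reading below the high
    return None


# Invented median prices for five consecutive periods (not real market data).
prices = [("m1", 100.0), ("m2", 104.0), ("m3", 109.0), ("m4", 109.0), ("m5", 106.0)]

result = first_decline_after_peak(prices)
print(result)
```

With these made-up figures the function reports a high in period m3 and the first decrease in period m5. Real pricing data is noisier, so a practical version would likely smooth the series or require a sustained decline before calling the peak over.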

Other current market trends for which Thinknum is digging up alternative data include changes in the food delivery services business and cryptocurrency price fluctuations, where the vendor has discovered that GitHub, the provider of internet hosting for software development, is a prime source of data.

"We've been looking at where we can find a signal before it hits the market," Lopata said.

The volume of GitHub "commits," or revisions or changes, about specific cryptocurrencies and blockchain companies peaked last spring and early summer, but there's been a steady decline in commits on the platform about cryptocurrency since then.

"That has proven to be a very interesting proxy for understanding and leveraging the volatility of the crypto market right now," Lopata said.
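To make the commit-volume idea concrete, the short Python sketch below groups commit timestamps by month and finds the busiest month. The timestamps are placeholders rather than real GitHub data, and how a vendor actually collects and weights commits is not described here.

```python
from collections import Counter
from datetime import datetime


def commits_per_month(timestamps):
    """Count commits per calendar month from ISO 8601 timestamp strings."""
    counts = Counter(datetime.fromisoformat(ts).strftime("%Y-%m") for ts in timestamps)
    return dict(sorted(counts.items()))


# Placeholder timestamps (hypothetical, not taken from any real repository).
commit_times = [
    "2000-01-03T10:00:00",
    "2000-01-17T09:30:00",
    "2000-02-01T12:00:00",
    "2000-02-02T08:15:00",
    "2000-02-20T16:45:00",
    "2000-03-05T11:00:00",
]

monthly = commits_per_month(commit_times)
peak_month = max(monthly, key=monthly.get)
print(monthly, peak_month)
```

A falling monthly count after the peak month is the kind of decline the article describes; whether it says anything about prices is a separate question the sketch does not address.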

Old investing methods declining

Old methods of investing, such as relying on past prices of securities to predict future prices, are fading quickly with the sharp rise of alternative data, Moallemi said.

Nearly every investor has long been able to use price data to research stocks and decide whether to buy or sell them.

"There's a shift to using novel data beyond price data," Moallemi said. "Why? Because everybody has access to prices."

But "if everybody has access to that same underlying sorts of prices, it's less likely you can see something that the others don't, right, versus having unique data, alternative data," he said.
