How the Financial Times is Using Text Mining to Sift Through the News
Behind Newssift, FT’s new standalone search tool.
Search used to be a buzzword in publishing circles—before social media came around.
But some publishers are leveraging improved search technology to go deeper.
One such publisher, the Financial Times, recently launched Newssift, a standalone search tool that uses text mining technology to “sift” through thousands of global business articles to give users results that go far beyond the traditional Google keyword search.
“Qualitative news is a powerful determinant affecting stock prices and corporate reputations,” Newssift’s mission statement reads. “However, this type of news has been difficult to search and nearly impossible to analyze through keyword searches alone.”
“It allows for the extracting of meaning,” said David Crouy, marketing director at Montreal-based Nstein Technologies, which provides the back-end extraction for the Financial Times. “What it does is index content from multiple sources, and adds a layer of metadata to make the search more relevant for users,” a task that would otherwise be labor-intensive for editors.
“Rather than have 10 to 15 people manually tagging articles,” Crouy said, “the initial step of ranking and extracting of metadata is done for you.”
A quick search of “Barack Obama” on the Financial Times’ site shows off the engine’s promise. The topline results are grouped into five areas of interest: “Business Topic” (such as environmental and trade policy), “Organization” (like Al Qaeda and GM), “Place” (Iran, Iraq and D.C.), “Person” (from Hillary to Mahmoud Ahmadinejad) and, perhaps most useful, “Theme” (“Administration,” “Muslim World,” etc.).
From there, the results are broken down by “sentiment” (positive, negative or neutral) and “sources” (online, portals, magazines, newspapers or, of course, original articles from the Financial Times). Traditional search options, including keywords and date, are offered below the fold.
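For readers curious how this kind of drill-down can work in principle, here is a minimal Python sketch. It is not Nstein's or the Financial Times' code, which this article does not describe. It assumes each article already carries the metadata the extraction step would produce; the `Article` class, the `facet_counts` and `drill_down` helpers, and the sample records are all hypothetical and invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Article:
    """A hypothetical article record with metadata already attached."""
    title: str
    source: str      # e.g. "newspapers", "online", "magazines"
    sentiment: str   # "positive", "negative" or "neutral"
    facets: dict = field(default_factory=dict)  # facet name -> set of tags


def facet_counts(articles, facet):
    """Count how many articles carry each tag within one facet."""
    counts = Counter()
    for article in articles:
        counts.update(article.facets.get(facet, ()))
    return counts


def drill_down(articles, facet=None, tag=None, sentiment=None, source=None):
    """Keep only the articles that match every filter that was supplied."""
    matches = []
    for article in articles:
        if facet is not None and tag not in article.facets.get(facet, ()):
            continue
        if sentiment is not None and article.sentiment != sentiment:
            continue
        if source is not None and article.source != source:
            continue
        matches.append(article)
    return matches


# Invented sample data; the tags echo the facet names mentioned above.
ARTICLES = [
    Article("Sample A", "newspapers", "positive",
            {"Place": {"Iran"}, "Theme": {"Administration"}}),
    Article("Sample B", "online", "negative",
            {"Place": {"Iran", "Iraq"}, "Organization": {"GM"}}),
    Article("Sample C", "magazines", "neutral",
            {"Place": {"Iraq"}, "Theme": {"Muslim World"}}),
]

if __name__ == "__main__":
    print(facet_counts(ARTICLES, "Place"))
    hits = drill_down(ARTICLES, facet="Place", tag="Iran", sentiment="negative")
    print([a.title for a in hits])
```

The sketch only covers the filtering side. The harder part, deciding which tags and which sentiment an article should receive in the first place, is the text mining step that Crouy describes and is not attempted here.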
“It’s a drilling down of relevancy that doesn’t really exist elsewhere,” Crouy said.
The cost of Nstein’s solution, Crouy said, ranges anywhere from $50,000 to $500,000, depending on the size and scope of the project.