How the Financial Times is Using Text Mining to Sift Through the News
Behind Newssift, FT’s new standalone search tool.
Search used to be a buzzword in publishing circles, before social media came around.
But some publishers are leveraging improved search technology to go deeper.
One such publisher, the Financial Times, recently launched Newssift, a standalone search tool that uses text mining technology to “sift” through thousands of global business articles and give users results that go far beyond the traditional Google keyword search.
“Qualitative news is a powerful determinant affecting stock prices and corporate reputations,” Newssift’s mission statement reads. “However, this type of news has been difficult to search and nearly impossible to analyze through keyword searches alone.”
“It allows for the extracting of meaning,” said David Crouy, marketing director at Montreal-based Nstein Technologies, which provides the back-end extraction for the Financial Times. “What it does is index content from multiple sources, and adds a layer of metadata to make the search more relevant for users.” That task would otherwise be labor-intensive for editors.
“Rather than have 10 to 15 people manually tagging articles,” Crouy said, “the initial step of ranking and extracting of metadata is done for you.”
A quick search of “Barack Obama” on the Financial Times’ site shows off the engine’s promise. The topline results are grouped into five areas of interest: “Business Topic” (such as environmental and trade policy), “Organization” (like Al Qaeda and GM), “Place” (Iran, Iraq and D.C.), “Person” (from Hillary to Mahmoud Ahmadinejad) and, perhaps most useful, “Theme” (“Administration,” “Muslim World,” etc.).
From there, the results are broken down by “sentiment” (positive, negative or neutral) and “sources” (online, portals, magazines, newspapers or, of course, original articles from the Financial Times). Traditional search options, including keywords and date, are offered below the fold.
“It’s a drilling down of relevancy that doesn’t really exist elsewhere,” Crouy said.
The cost of Nstein’s solution, Crouy said, ranges anywhere from $50,000 to $500,000, depending on the size and scope of the project.