Thanks to the Snowden debacle, the news is full of stories and speculation on the topic of big data, namely the ability to draw meaningful conclusions and make more informed decisions based on analyses of very large data sets. Debate topics typically include individual privacy, consumer convenience, trust, and of course the economic or governmental interest in the value of data.

Stories about big data seem like a new phenomenon, but they’re not. Surveys and statistics have been a fact of life for many decades. Large data samples are known to provide a better, more predictive view of a demographic fact or a behavioral trend than smaller samples. The colloquial phrase "margin of error" is literally a mathematical formula based on data sample size. The current hype is due to the fact that sheer data volumes have increased dramatically in recent years, plus the fact that tools like Tableau make it easier to visualize and extract meaning from mountains of local and cloud-based data.
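To make the "margin of error" point concrete, here is a minimal sketch of the textbook formula for a surveyed proportion. It assumes a simple random sample and a 95 percent confidence level (z = 1.96); the function name and the sample sizes are illustrative, not drawn from any publisher's data.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a surveyed proportion.

    Assumes a simple random sample; z=1.96 corresponds to a 95%
    confidence level, and proportion=0.5 is the worst case.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

# Illustrative sample sizes only: larger samples shrink the margin.
for n in (100, 1_000, 10_000):
    print(n, round(margin_of_error(n) * 100, 2))
```

Under these assumptions the margin falls from roughly 9.8 percentage points at 100 respondents to about 3.1 at 1,000 and about 0.98 at 10,000, which is why larger samples support finer conclusions.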

Can Publishers Participate?

Big data discussions are generally laced with impressive-sounding words like "petabyte" and "exabyte" (one million and one billion gigabytes, respectively). The mind boggles at these numbers, and most laymen correctly assume that such data volumes are the purview of science, government, and very large corporations. Although this is generally true, smaller companies (including publishers of any size) should not assume that big data analysis is out of their reach.

We spoke with publishing professionals and a publishing-specific data company on the subject. These individuals did represent large publishers, whose activities are typically more visible than those of their smaller counterparts. However, they all agreed that smaller, more specialized magazine publishers can take advantage of big data analyses now, rather than wait for more data or better tools.

When discussing big data, sample size is only one factor. Equally important are data quality and reliability, which often can only be validated over time and through familiarity with the source. Publishers’ data are unique in that respect, since subscriber information is often based on a long-term relationship with the brand, not on isolated transactions.

In fact, marketers familiar with typical consumer purchasing data are often surprised at the rich nature of subscriber data, especially that of specialty magazines with loyal followings.

Finally, it is critical to remember that the purpose of data analysis by publishers is somewhat different. A typical business may use it to improve its website’s ability to attract, qualify, and convert web visitors into paying (and satisfied) customers. Publishers’ goals are similar, but expressed differently:

• For advertising, the purpose of a magazine publisher’s data analysis is to present detailed, accurate, and compelling profiles of its audience to prospective advertisers and sponsors, and, once they’ve advertised, to measure the actual results
• For circulation, its purpose is to facilitate the "journey" of readers or users towards becoming long-term brand adherents, through paid subscriptions and renewals
• For editorial, its purpose is to confirm (and occasionally inform) editors’ and writers’ understanding of their audience

Two Sets of Data are Better than One

We spoke with Condé Nast’s vice president of data and marketing analytics, Christopher Reynolds, whose previous experience with Kraft Foods only partially prepared him for the magazine world. Upon joining the company four years ago, Reynolds was tasked with making the publisher’s data a more practical means for advertising and circulation growth. Like most publishers, Condé Nast had made use of web and mobile analytics, and had regularly surveyed its subscribers for many years. However, until recently, it had not fully leveraged both sets of data together.

"From the analytics end of publishing, the application of big data is one of the more exciting events in the past 20 years," Reynolds says. He believes this is changing what analytics can do for a publisher, through the integration of data management platforms and analytics tools. "Historically, an analytics tool has been ‘a count of people looking at certain things.’ A data management platform is the other side. It says ‘we know who these people are, and we’re trying to deliver them something because of what we know.’ The exciting part about combining those two data sets is that it gives us a picture that we’ve never had access to before."

By combining these data, the company has a better view of what kinds of people are attracted to certain content, and can learn to better manage audiences and deliver content. It can also better identify the "consumer journey" and its relationship with a publishing brand. "All this comes from the fact that you’re combining your knowledge of the person and of what you are delivering," he says.
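The idea of combining "who the reader is" with "what the reader looked at" can be sketched as a simple join. The following is only an illustration under stated assumptions: the reader IDs, segments, and page views are invented, and the function name is hypothetical. It does not describe Condé Nast's actual systems.

```python
from collections import defaultdict

# Hypothetical subscriber profiles keyed by reader ID (the "who").
subscribers = {
    "r1": {"segment": "luxury auto"},
    "r2": {"segment": "home cooking"},
    "r3": {"segment": "luxury auto"},
}

# Hypothetical analytics events (the "what"): (reader_id, content_section).
page_views = [
    ("r1", "cars"), ("r1", "travel"), ("r2", "recipes"),
    ("r3", "cars"), ("r4", "cars"),  # r4 has no matching subscriber record
]

def views_by_segment(subscribers, page_views):
    """Count page views per (segment, section), skipping unmatched readers."""
    counts = defaultdict(int)
    for reader_id, section in page_views:
        profile = subscribers.get(reader_id)
        if profile is None:
            continue  # analytics alone cannot say who this visitor is
        counts[(profile["segment"], section)] += 1
    return dict(counts)

combined = views_by_segment(subscribers, page_views)
print(combined)
```

Neither data set alone answers the question "which kinds of readers are drawn to which content"; the joined view does, at least for visitors who can be matched to a known profile.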

As someone who had been somewhat suspicious of consumer databases, Reynolds was impressed by the rich nature of Condé Nast’s subscriber survey data, which has been systematically collected over the past 25-30 years. "Historically, the data was very logistics oriented," he says. "We needed to get magazines to houses. Now, we are able to use that data fully, because we understand that reader. We have a relationship with consumers that third-party data providers just don’t have."

Condé Nast’s Catalyst system incorporates over 55 million records, based on reader survey data. These include extensive surveys of their "preferred subscriber network" of over 300,000 respondents per year, including feedback on over 800 products in 35 product categories, plus demographic and psychographic data and responses to lifestyle questions. Although Reynolds described the resulting data set as almost too much, he acknowledged that it enabled Condé Nast to segment the data more narrowly while maintaining a low margin of error.

The Catalyst system has allowed the company to create standard consumer profiles or "catalyst" groups for advertisers, including "Alpha-Millennial" (young peer leaders), "Motor Maven" (luxury car experts/friends’ source), and others. Custom profiles can also be created for individual advertisers, giving them a clearer picture of their potential audience.

Reynolds maintains that smaller publishers have an even greater potential to leverage their subscriber and analytics data, since their knowledge of their highly specialized readership is far greater than that of any third-party data provider. By adding new data sets, including transactions, sign-ups, and advertising calls to action, the final result will be a compelling argument to specialized advertisers, and a way to maintain subscriber brand loyalty.

Over the past five years, Reynolds notes, it has become easier to put extraction layers and even entire data sets into the cloud, greatly reducing the need to have rigidly structured data in hand before one could extract meaningful conclusions. "You’re now able to pull together a lot of different data from different sources because of the technology that exists today."

More is Not Always Better

Until very recently, Dean Praetorius was the director of trends and social media at The Huffington Post. During his tenure, he found that extremely large data sets required some innovative filtering in order to inform truly useful decisions. "For a long time, the trend has simply been ‘more is better,’" he says. "And to some extent that’s still true. But the trend now is to figure out how to pick out key points that help you build an effective audience. Especially around social insights, which can make or break a brand."

Social networking is one of the best sources of data that smaller publishers can use, outside their own subscriber data, according to Praetorius. "There are more services than ever scanning social networks and looking at huge data sets, looking for outliers and patterns," he says. "Those insights can totally change a company’s approach, and give them a serious look into the competition." He especially recommends BuzzSumo and NewsWhip’s Spike for scanning large volumes of social data and surfacing viral content. For internal analysis, he recommends Chartbeat, which provides real-time content updates to show how successful a publisher is at promoting the right pieces.

Praetorius notes that performance expectations for all content have changed. "It used to be that if content was considered great, it would make it to the front page. Now it has to perform. Publishers are held accountable for how engaging their content is. When it’s done right, both publisher and consumer win."

Fine-Tuning Advertising Metrics

Forbes Media’s chief revenue officer, Mark Howard, has applied big data methodologies to enhance the effectiveness of b-to-b advertising on the company’s website. "There is no shortage of data available to marketers or publishers; the challenge is figuring out what data to focus on," he says.

He maintains that the combining of a publisher’s own data on its audience and interests with more open, third-party data, when properly productized, results in a powerful and highly differentiated advertising offering.

Howard notes that publishers’ greatest challenge is figuring out how to engage key audiences at scale. Publishers need to focus on understanding "how readers use their site, what that means for the exposure of ads in each respective area of the site, and how site users interact with the content and the ads," he says. To that end, Forbes has partnered with the ad and content measurement firm Moat. For the past 2.5 years, they have tagged all ads and pages in order to best qualify the experience of running a campaign on

"Businesses are more focused on understanding the ever-changing customer journey," he says. "It’s well documented that for most b-to-b purchase decisions, 70 percent or more of the research takes place before ever getting in touch with the companies you’re evaluating. This dramatic change to the traditional purchase funnel means that marketers need to nurture their databases in new ways and look to publishers for insights on their customers that can become actionable in addressing them."

A practical result of Forbes’ and Moat’s data analyses has been a re-adjustment of their advertising cost model. "We’ve gone as far as to work with an advertiser on a program where a significant portion of the campaign was bought on a cost-per-interaction as tracked by Moat," Howard says. The advertiser ran a content-rich half-page ad unit, where users could interact with individual content elements. Forbes and Moat data focused on significant interactions, as opposed to clicks, allowing them to drive a significant number of interactions, as well as optimize line items to placement on the site that facilitated the most efficient interactions.

"For the advertiser, the CPI went down and for us the eCPM went up. All of this was only made possible because we were transparent with the advertiser about the metrics we had available and their understanding of the methodology that Moat utilizes to track the impressions, time in view and interaction rates."

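How cost per interaction (CPI) can fall for the advertiser while effective CPM (eCPM) rises for the publisher is easier to see with numbers. The sketch below uses the standard definitions of both metrics; the spend, impression, and interaction figures are entirely hypothetical and are not Forbes or Moat data.

```python
def cost_per_interaction(spend, interactions):
    """Advertiser's cost for each tracked interaction."""
    return spend / interactions

def effective_cpm(revenue, impressions):
    """Publisher's revenue per thousand impressions served."""
    return revenue * 1000 / impressions

# Hypothetical campaign before and after optimizing ad placement.
# Impressions stay flat; better placement doubles interactions.
before = {"spend": 2_000, "impressions": 100_000, "interactions": 500}
after = {"spend": 3_000, "impressions": 100_000, "interactions": 1_000}

for label, c in (("before", before), ("after", after)):
    cpi = cost_per_interaction(c["spend"], c["interactions"])
    ecpm = effective_cpm(c["spend"], c["impressions"])
    print(f"{label}: CPI ${cpi:.2f}, eCPM ${ecpm:.2f}")
```

In this invented example the advertiser's CPI drops from $4.00 to $3.00 while the publisher's eCPM climbs from $20.00 to $30.00, because the same impressions produce more of the interactions the advertiser is paying for.
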
Like others we interviewed, Howard maintains that such analyses are well within the reach of smaller publishers, particularly in the area of understanding ad performance. "To not know who your audience is, and not know how ads perform from an exposure and interactions standpoint on every page of your site, puts you at a disadvantage," he says. "Most publishers understand this but not everyone has invested fully in it yet."

Platform Options

Cloud developer Ekho’s Product VP, Josh Camire, described his company’s platform that lets publishers combine big data and social data to deliver curated content and build community around their brands. It uses contextual filters on keywords, demographic data, geo-targeting, and other methodologies, then ranks and scores the results, using custom algorithms, to provide actionable information. The company’s publishing clients include Summit Professional Networks.
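As a rough illustration of the general "filter, then score and rank" pattern described here, consider the sketch below. It is not Ekho's algorithm, which the company describes only as custom; the scoring rule, field names, and sample posts are all hypothetical.

```python
def score(item, keywords, region):
    """Hypothetical score: one point per keyword hit, plus one for region match."""
    text = item["text"].lower()
    hits = sum(1 for kw in keywords if kw in text)
    bonus = 1 if item["region"] == region else 0
    return hits + bonus

def rank(items, keywords, region):
    """Keep items matching at least one keyword, highest score first."""
    matched = [i for i in items if any(kw in i["text"].lower() for kw in keywords)]
    return sorted(matched, key=lambda i: score(i, keywords, region), reverse=True)

# Invented social posts for a hypothetical insurance trade publisher.
posts = [
    {"id": "b", "text": "Insurance brokers react abroad", "region": "UK"},
    {"id": "c", "text": "Weekend recipes", "region": "US"},
    {"id": "a", "text": "New insurance rules for brokers", "region": "US"},
]

ranked_ids = [p["id"] for p in rank(posts, ["insurance", "brokers"], "US")]
print(ranked_ids)
```

Here the off-topic post is filtered out, and the on-topic post from the target region ranks above the one from elsewhere. A production system would weigh many more signals, but the shape of the process is the same.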

According to Camire, the company provides data analyses optimized for many facets of a publisher’s workflow. This includes editorial, providing a framework for understanding trends and delivering crowd-sourced, curated content. Editors also use the service to recognize patterns and conduct investigative research. On the sales and marketing fronts, Ekho can be used to sell sponsorship and native advertising units, deliver promotions to subscribers, and reach non-subscribers with contextually relevant content.

Like others we interviewed, Ekho recommends combining multiple data sources to achieve actionable results. These include customer data, public information, and data from social media.


From our brief survey, it is clear that publicly available data tools are only beginning to emerge. Combining the relatively dry event data of web analytics with the more personal, and commercially powerful, subscriber information has the power to change advertising, subscription, and even editorial models. For publishers, these changes can be an economic edge.

However, with big data come big responsibilities. Publishers, like any business, must be mindful of the hazards as well as the opportunities of big data analyses. Their trust relationship with subscribers is vulnerable if either the publisher or an advertiser goes too far with their newfound power, and crosses the line between consumer engagement and intrusion.

John Parsons is the principal of IntuIdeas LLC in Seattle. He writes and advises on a variety of topics and technologies, including mobile publishing, online video, editorial and design workflow, e-books, digital color, and Web-to-print.
