“Big data” is the big buzzword these days when it comes to media buying. There’s no doubt that the large amounts of data being collected offer tremendous promise for more effective marketing and advertising. However, big data does not by itself guarantee accuracy. As media buyers and sellers incorporate more and more data into their systems, marketers need to ensure that the data they use is vetted, managed and integrated with the same rigor as current media measurement. This holds true for all media, including digital, TV and print.
In order for big data to be truly smart data, biases and inaccuracies must be corrected. Advertisers, agencies and media companies need to make sure that big data integration is done in a transparent, consistent manner and with the most appropriate rigor possible. Data produced using the highest quality measurement will provide the marketplace with the confidence to transact billions of dollars in advertising.
Why Return Path Data Is Not Enough on Its Own
For many multichannel video programming distributors (MVPDs), such as cable, satellite and telecommunications companies, data is collected exclusively from set-top boxes. This Return Path Data (RPD) on its own can yield skewed results, for several reasons. One is that set-top boxes often remain powered on throughout the day and overnight, even when viewers aren't watching any programming, which means the data keeps flowing, resulting in higher metrics than warranted. Also, because there are costs associated with aggregating RPD, some MVPDs collect it only where there is a clear business reason to do so. For example, a provider may have 15 million subscribers but collect data on only 60% of its households. As a result, the RPD contains substantial gaps, making the data difficult to interpret accurately without a panel or a statistically valid data set to calibrate against.
A Deeper Look at the Potential for Errors in Raw Data
First and foremost, it’s important to keep in mind that RPD sets are not ratings, but rather signals that, with the proper modeling and calibration sets, can represent true television audiences. However, there are several layers of limitations and biases that must be corrected.
As mentioned previously, video programming distributors using RPD may have gaps in viewing, creating significant underreporting of TV viewership. Some distributors don’t account for time-shifted viewing because they don’t include DVR viewership. Without this data, the RPD omits a critical and growing audience behavior, calling the accuracy of the result into question.
Additional errors and omissions can further affect the accuracy of RPD. Set-top box maintenance events, such as software downloads, are often indistinguishable from viewing events. These non-viewing events can inflate viewing numbers. Some RPD is missing crucial information such as end times, or contains incorrect time stamps and wrong station codes, resulting in viewing being attributed to the wrong time slot, program or network.
Without a high-quality calibration panel, there is no way to know the extent of these limitations, let alone correct them.
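To make the record-level problems above concrete, here is a minimal Python sketch of the kind of screening rules a data team might apply to raw tuning events before any calibration. Everything in it is an assumption for illustration: the `TuningEvent` layout, the `event_type` labels (many feeds do not distinguish maintenance from viewing at all) and the four-hour `MAX_SESSION` cap are hypothetical, and any such threshold would itself need to be validated against a calibration panel.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class TuningEvent:
    """Hypothetical RPD record; real feeds differ by distributor."""
    box_id: str
    station: str
    start: datetime
    end: Optional[datetime]  # None when the feed is missing the end time
    event_type: str          # assumed labels: "tune" or "maintenance"


# Assumed cap: a tune lasting longer than this is treated as a box left on.
MAX_SESSION = timedelta(hours=4)


def screen_events(events: List[TuningEvent]) -> List[TuningEvent]:
    """Drop or trim records that cannot be trusted as viewing."""
    kept = []
    for ev in events:
        if ev.event_type != "tune":
            continue  # non-viewing event, e.g. a software download
        if ev.end is None or ev.end <= ev.start:
            continue  # missing or inconsistent time stamps: exclude, don't guess
        if ev.end - ev.start > MAX_SESSION:
            # Trim implausibly long sessions to the assumed cap.
            ev = replace(ev, end=ev.start + MAX_SESSION)
        kept.append(ev)
    return kept


def viewing_minutes(events: List[TuningEvent]) -> float:
    """Total tuned minutes across the given (already screened) events."""
    return sum((ev.end - ev.start).total_seconds() for ev in events) / 60
```

Rules like these only remove the errors that are visible in the records themselves. They cannot tell whether the TV was on or who was watching, which is why the screened data still needs an independent benchmark.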
Significant RPD Biases That Should Be Corrected Through Calibration
RPD is prone to significant biases for a number of reasons. First, set-top boxes are unable to determine whether TVs are actually on, so a box left running while the TV is off can still register as viewing.
Second, because in some cases RPD homes receive more channels and services than broadcast or cable homes without a return path, it is difficult to accurately compare viewing behaviors. Essentially, it’s like comparing apples to oranges.
Lastly, RPD can’t determine who is actually in front of the TV at a given moment. Nearly all set-top box data is reported only at the household level, and those households are not representative of the U.S. population as a whole. Attaching RPD to characteristic data from third-party providers does not solve this problem, as there are too many gaps and a low level of accuracy around specific characteristic categories.
Without a way to calibrate results, video programming distributors can make overall viewing levels anything they want. Publishing these metrics without reference to an accredited calibration set raises concerns similar to those the online ad world is dealing with around viewability and fraud.
RPD Can Lead to Smart Data
When used alongside statistically valid and accredited data sets, such as those created from long-standing local and national household panels, RPD can successfully deliver smart data. By identifying where the gaps, errors and overstatements are, it becomes possible to model and project person-level viewing to determine who is in fact in front of the TV. Marketers and advertisers who ensure that RPD is vetted, managed and integrated with the same attention to detail as a currency-quality dataset are more likely to reap the benefits of smart data they can rely on.
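As a rough illustration of what calibrating against a panel can mean in practice, the Python sketch below derives a simple ratio adjustment per stratum (here, daypart) from periods where both panel and RPD measurements exist, then applies it to RPD elsewhere. This is a deliberately simplified assumption, not a description of any measurement company's methodology; the function names, the daypart strata and the numbers in the usage example are all hypothetical.

```python
from typing import Dict


def calibration_factors(
    panel_rating: Dict[str, float],
    rpd_rating: Dict[str, float],
) -> Dict[str, float]:
    """Panel-to-RPD ratio per stratum, for strata measured by both sources."""
    factors = {}
    for stratum, rpd in rpd_rating.items():
        if stratum in panel_rating and rpd > 0:
            factors[stratum] = panel_rating[stratum] / rpd
    return factors


def calibrate(
    rpd_rating: Dict[str, float],
    factors: Dict[str, float],
) -> Dict[str, float]:
    """Scale RPD ratings by the learned factors.

    Strata with no panel benchmark are left unadjusted (factor 1.0),
    which is itself an assumption a real system would need to justify.
    """
    return {s: r * factors.get(s, 1.0) for s, r in rpd_rating.items()}


# Hypothetical usage: overnight RPD is overstated because boxes stay on.
overlap_panel = {"overnight": 2.0, "prime": 27.0}
overlap_rpd = {"overnight": 5.0, "prime": 30.0}
factors = calibration_factors(overlap_panel, overlap_rpd)
adjusted = calibrate({"overnight": 10.0, "prime": 20.0, "daytime": 8.0}, factors)
```

In this made-up example the overnight figure is scaled down sharply while daytime, which has no panel benchmark, passes through unchanged. A production approach would also need to project household-level tuning to person-level viewing, which a ratio adjustment alone does not do.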
Megan Clarken is president of product leadership at Nielsen.