Minimizing Bias from Data Loss: A Unified Measurement Approach

December 11, 2019


With legislation like the California Consumer Privacy Act and platform policies limiting the availability of identity data, it has never been more important to accurately understand the biases and risks associated with data loss. More importantly, brands need to understand how best to combat those biases.

At a recent ARF event, Analytic Partners shared results from a series of experiments designed to assess the impact of data loss on the business outcomes derived from measurement. In these experiments, we used a Unified Modeling approach to measurement, which integrates Commercial Mix Models (CMM) with Multi-touch Attribution (MTA) modeling in an iterative process. The process synchronizes the model insights from each technique through proprietary algorithms designed for speedy convergence.

 

To combat traditional MTA challenges such as walled gardens, blind spots, seasonality, and data quality issues, Analytic Partners utilizes an innovative attribution modeling approach that estimates the impact of marketing activities on conversions, while simultaneously adding clarity.

 

The Data Loss Experiments

 

We performed simulated data loss experiments within two different scenarios:

  • Scenario 1: In the first scenario, we removed all identifiers from 35% of individuals/households to simulate data loss. We then randomly removed an additional 20%, 30%, 40%, 50%, and 60% of user data to evaluate whether there is a point at which models lose most of their accuracy and stability. We evaluated the impact on results under three methodologies: last click, MTA model only, and Analytic Partners’ Unified Model.
  • Scenario 2: In the second scenario, we removed 10% from all digital data files, plus 25% from each group of two or more demographic pairs. We ran two different versions of demographic data loss (2a and 2b), once again using the three methodologies listed above. In addition, we tested the effects of data loss on digital display rank order under those methodologies.
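As an illustration of what simulated identifier loss can look like, here is a minimal Python sketch. It is not the code used in the experiments: the event layout, the field names (user_id, channel, income), and both helper functions are hypothetical. It shows random loss across all users (in the spirit of scenario 1) and loss confined to one demographic group (in the spirit of scenario 2).

```python
import random


def remove_identifiers(events, fraction, seed=0):
    """Blank user_id on every event of a random `fraction` of users."""
    rng = random.Random(seed)
    users = sorted({e["user_id"] for e in events if e["user_id"] is not None})
    dropped = set(rng.sample(users, round(len(users) * fraction)))
    # Return copies so the original event log is left untouched.
    return [
        {**e, "user_id": None} if e["user_id"] in dropped else dict(e)
        for e in events
    ]


def remove_identifiers_by_group(events, key, value, fraction, seed=0):
    """Blank user_id for a random `fraction` of users within one group."""
    rng = random.Random(seed)
    users = sorted(
        {e["user_id"] for e in events
         if e["user_id"] is not None and e[key] == value}
    )
    dropped = set(rng.sample(users, round(len(users) * fraction)))
    return [
        {**e, "user_id": None} if e["user_id"] in dropped else dict(e)
        for e in events
    ]


# Toy event log (assumption): 200 users with two touchpoints each.
events = [
    {"user_id": u, "channel": ch, "income": "high" if u % 2 else "low"}
    for u in range(200)
    for ch in ("display", "search")
]

# Randomized loss: 35% of all users lose their identifier.
random_loss = remove_identifiers(events, 0.35)

# Specific loss: 25% of users in the "high" income group lose their identifier.
specific_loss = remove_identifiers_by_group(events, "income", "high", 0.25)
```

Each attribution methodology would then be run on both the original and the degraded event logs, and the outputs compared.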

 

The Results

 

Effects of Data Loss by Methodology Type

The graphs below represent the effects of data loss by methodology both when the data is randomized (scenario 1) and when it’s specific (scenario 2).

 

 

Results remain relatively stable, with a Unified Model performing best, until the point of 50% data loss, when results degrade regardless of methodology.

 

 

 

Specific data loss (by demographic, income) has a greater impact on outcomes than randomized data loss.

 

Key Insights:

 

Across scenarios, a Unified Model outperformed MTA-only and Last Click methodologies. We’re able to clearly see that specific data loss can have a major impact on outcomes relative to randomized loss. These experiments demonstrate that:

  • The majority of results after data loss were within +/- 15% deviation from original results
  • A Unified measurement methodology minimizes the bias introduced from data loss
  • Results degrade with 50% and greater data loss regardless of methodology
  • Specific demographic and income loss (scenario 2) has a greater impact on outcomes
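To make the +/- 15% deviation measure concrete, here is a small Python sketch of how deviation from original results could be computed per channel. The channel names and all numbers are invented for illustration and are not results from the experiments; the helper names are hypothetical.

```python
def percent_deviation(baseline, after_loss):
    """Signed percent deviation of each channel's result from its baseline."""
    return {
        ch: 100.0 * (after_loss[ch] - baseline[ch]) / baseline[ch]
        for ch in baseline
    }


def share_within(deviations, threshold=15.0):
    """Fraction of channels whose absolute deviation is within the threshold."""
    flags = [abs(d) <= threshold for d in deviations.values()]
    return sum(flags) / len(flags)


# Hypothetical attributed conversions before and after simulated data loss.
baseline = {"display": 200.0, "search": 400.0, "social": 100.0, "video": 50.0}
after_loss = {"display": 180.0, "search": 420.0, "social": 70.0, "video": 55.0}

deviations = percent_deviation(baseline, after_loss)
stable_share = share_within(deviations)

print(deviations)
print(f"{stable_share:.0%} of channels within +/- 15%")
```

In this toy example three of the four channels stay within the band, while the hypothetical "social" channel deviates by 30%.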

 

Effects of Data Loss and Methodology on Rank Order

 

In addition to the results and insights above, we also tested the impact of data loss on digital display rankings in scenario 2. The first graph demonstrates rank order for digital display types post-data loss utilizing only the Unified Model approach, while the second graph demonstrates the differences in outcome across all three methodologies pre-data loss.

 

 

While the majority of best- and worst-performing display types remain similar, meaningful differences in rank occur post-data loss.

 

 

The methodology used drives significantly greater differences in outcomes than simulated data loss does.
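One way to quantify changes in rank order is to compare each display type's rank before and after data loss, and to summarize agreement with a Spearman rank correlation. The sketch below is an assumption about how such a comparison could be made, not the method used in the experiments; the display type names and scores are hypothetical, and the formula assumes no tied scores.

```python
def rank_order(scores):
    """Map each item to its rank (1 = best) by descending score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: pos for pos, name in enumerate(ordered, start=1)}


def rank_shifts(before, after):
    """Change in rank per item; positive means the item fell in the ranking."""
    rb, ra = rank_order(before), rank_order(after)
    return {name: ra[name] - rb[name] for name in rb}


def spearman(before, after):
    """Spearman rank correlation, assuming no tied scores."""
    rb, ra = rank_order(before), rank_order(after)
    n = len(rb)
    d2 = sum((rb[k] - ra[k]) ** 2 for k in rb)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical performance scores per display type.
before = {"banner": 5.0, "video": 4.0, "native": 3.0,
          "rich_media": 2.0, "retargeting": 1.0}
after = {"banner": 4.8, "video": 3.1, "native": 3.9,
         "rich_media": 2.2, "retargeting": 0.9}

shifts = rank_shifts(before, after)
rho = spearman(before, after)
print(shifts)
print(round(rho, 3))
```

Here the top and bottom performers keep their positions while two mid-ranked types swap places. The same comparison can be applied between methodologies instead of between pre- and post-loss results.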

 

Recommendations

 

The key takeaway from the results seen in these scenarios is that oftentimes methodology, more than data loss, affects outcomes. With that in mind, we recommend that brands consider the following when handling similar challenges in data loss:

  • Ensure experiments/testing, validation, and a Unified methodology are incorporated within your measurement strategy
  • Continuously assess the quality of the data/sample and adjust measurement granularity to ensure robust results
  • Monitor major shifts in spend for the expected business impact

 

The challenges associated with data loss are inevitable in our constantly changing landscape. Disruption is the new normal, but with the right tools and partners in place – in tandem with a holistic view of their business – brands can adapt to new environments and navigate these challenges with confidence.

 

Analytic Partners would like to thank the ARF Cross-Platform Measurement Council Attribution Working Group for providing the original parameters of these experiments. These results were originally shared at ARF DataXScience 2019.

 

Preeti Croke, Senior Director at Analytic Partners
