The importance of getting users to trust your data

When I had just joined my current company, I heard stories about people leaving because they could never trust the numbers being thrown around. They were frustrated that more time and effort went into getting the correct numbers for their reporting than into doing any “actual” work. What is more, because they couldn’t trust the data, many decisions were made on “feel” instead of facts (with outcomes that were probably not the best).

One big reason for the data discrepancies was that each region (e.g. Asia and America) was measured differently. In cross-region discussions, each side would use numbers generated by its own reporting and analytics team (mine included), each tailored to that region’s needs and way of measuring.

As you can imagine, this was always going to lead to conflict and confusion.

Joe Bigshot: “Your team made a loss last month with horrible profit margins. How do you explain that?”

David Salesmaster: “What do you mean I made a loss? It was strategically breaking even! And in fact, by my own estimates, I turned in a slight profit.”

(There is a beautiful Chinese saying for this: 鸡同鸭讲, literally translated as “chicken and duck talk”. One side clucks; the other side quacks.)

But inter-region differences weren’t the only thing affecting the integrity of the reported numbers. Even within regions, differences abounded. Plenty of people kept their own data silos (e.g. spreadsheets with data copied and pasted from elsewhere) that were often not maintained very well.

Joe Average: “Hey Donn, the data you sent yesterday is wrong.”

Me: “What do you mean wrong? Wait a minute. When was the data you used for your comparison last extracted?”

Joe: “About two months ago?”

Me: “…”

One way we dealt with inter-region differences was to get very clear about what the differences were. Once that was done, we always explained to recipients exactly how our region’s data differed from the others’. If they wanted to compare numbers with their counterparts in other regions, they should factor these differences in. You might think this was an insignificant move, but you’d be surprised at the number of “oh, I didn’t know that” or “so that’s why it’s different” exclamations from grateful listeners.

Even though the numbers themselves may not have been more accurate, the perception of their accuracy no doubt improved significantly. People started to trust the numbers we were giving out, and started confidently using the data in their decision-making.

Though I cannot be sure about this, I believe this move also reduced the number of data silos people were maintaining. There are far fewer incidents of “why does your data differ from mine?” than there were early on. With people trusting our numbers more than their own, there just isn’t a need for them to maintain such silos (especially when the datasets they were maintaining were growing too fast for their comfort).

In the end, the small action of telling people why the data differed led to an increase in trust in the data, which led to an increase in its use, and eventually to better, data-backed decisions.

This reminds me a little of the broken windows theory, which holds that visible “broken windows” (little acts of vandalism, or literally broken windows) lead to a greater incidence of overall crime, and that eliminating them reduces overall crime significantly. It may or may not be true, but it makes for a great story.
