Competitive Intelligence/Analysis

It’s strange that today’s the first day I realise how closely related competitive intelligence is to so much of what I’ve been doing and thinking and writing about in the fields of data analysis *slash* business analysis *slash* business intelligence *slash* data science, etc.

This discipline even has a professional society called, aptly enough, Strategic and Competitive Intelligence Professionals, something I quite serendipitously discovered after picking up a book in the library published by them and feeling curious as to what or who they were.

It’s interesting to figure out how competitive intelligence/analysis fits into the business intelligence umbrella (or how it falls outside of it, for that matter), but it certainly warrants a look from anyone interested in leveraging business knowledge/information.

The Case for Modelling

Last week I wrote about the need for experimentation over models. Models are generally so abstract and so far afield from reality that it’s hard to get any accurate answers out of them.

If you want to find out if an intervention works, try it. Do experiments on it. Carry out pilot projects and see what happens.

Don’t just model something to see what’s “likely to happen”. What’s “likely to happen” in your model is going to be highly dependent on the assumptions you make, which in turn are likely to be highly dependent on your view of the world. We tend to see what we expect to see, which can make modelling a particularly self-referencing exercise if we’re not careful. Experiments help us see the world as it really is.

But that’s not the full story. Don’t discount modelling just yet.

Joshua Epstein makes a great case for modelling in what is one of the best essays on modelling I’ve read so far. He has one particularly salient point about models that I’d never thought of before: everyone builds models all the time; it’s just that people don’t build explicit models all the time.

By building an explicit model, listing down assumptions and having all the numbers laid out in front of you, at least you know what’s going into your thinking. Not building explicit models, on the other hand, doesn’t mean you’re not using a model, as Epstein explains:

It is just an implicit model that you haven’t written down.

The choice, then, is not whether to build models; it’s whether to build explicit ones. In explicit models, assumptions are laid out in detail, so we can study exactly what they entail. On these assumptions, this sort of thing happens. When you alter the assumptions that is what happens. By writing explicit models, you let others replicate your results.

Overall, the points Epstein makes on modelling are practical and well thought out, and certainly worth the read. Models are, in a nutshell, great at testing assumptions and straightening out thinking before any real decision is made, even if that decision is to go ahead on a pilot project.
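To make that concrete, here’s a minimal sketch of what an explicit model looks like in practice (my own toy illustration, not an example from Epstein’s essay): a deliberately simple epidemic model where every assumption is a named, visible parameter that anyone can inspect, alter, and rerun.

```python
# A deliberately simple epidemic model: every assumption is an
# explicit, named parameter laid out for the reader to see and change.
def run_model(population=1000, initially_infected=1,
              contacts_per_day=4, transmission_prob=0.05,
              recovery_days=10, days=60):
    s = population - initially_infected   # susceptible
    i = initially_infected                # infected
    r = 0                                 # recovered
    for _ in range(days):
        new_infections = contacts_per_day * transmission_prob * i * s / population
        new_recoveries = i / recovery_days
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
    return r  # total ever infected (and recovered) by the end

# "When you alter the assumptions, that is what happens":
print(run_model(transmission_prob=0.05))
print(run_model(transmission_prob=0.03))
```

Because the assumptions are written down, anyone can replicate the result, or show exactly which assumption the conclusion hinges on.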

The Brilliance of Malcolm Gladwell

I’ve been seeing a lot of Gladwell posts lately on the blogs I read, not least because he’s come up with what seems to be a killer new book, David and Goliath.

I’ll be honest and admit I’m a Gladwell fan. Why? I’m not too sure myself. But there’s a pretty interesting post (in the vein of Gladwell) about Gladwell, and why everything he writes is “so compelling”.


Decision Making: The Need for Experimentation Over Models

Jim Manzi, in his book Uncontrolled, makes a very good point about analytical models and their shortcomings, in particular the need for experimentation (i.e. controlled interventions) to figure out what really happens in the real world (emphasis mine):

Cost changes often could be predicted reliably through engineering studies. But when it came to predicting how people would respond to interventions, I discovered that I could almost always use historical data, surveys, and other information to build competing analyses that would “prove” that almost any realistically proposed business program would succeed or fail, just by making tiny adjustments to analytical assumptions. And the more sophisticated the analysis, the more unavoidable this kind of subterranean model-tuning became. Even after executing some business program, debates about how much it really changed profit often would continue, because so many other things changed at the same time. Only controlled experiments could cut through the complexity and create a reliable foundation for predicting consumer response to proposed interventions.

I’ve been involved in my fair share of modelling, and can’t help but agree with what was written. It’s only too easy to tweak assumptions to suit the popular vote for a business case, telling people what they want to hear.
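To illustrate Manzi’s point with a toy example (all the numbers here are made up for illustration, not from his book): a projected-profit calculation for a hypothetical marketing program, where nudging a single assumption by a fraction of a percentage point flips the verdict.

```python
# A toy business case with made-up numbers: projected profit of a
# hypothetical marketing program under explicit assumptions.
def projected_profit(customers_reached=100_000, conversion_rate=0.020,
                     profit_per_conversion=30.0, program_cost=65_000.0):
    revenue = customers_reached * conversion_rate * profit_per_conversion
    return revenue - program_cost

# A "tiny adjustment" to one analytical assumption flips the verdict:
print(projected_profit(conversion_rate=0.022))  # the program "succeeds"
print(projected_profit(conversion_rate=0.020))  # the program "fails"
```

A 0.2-percentage-point change in an assumed conversion rate is well within any honest error bar, which is exactly why only a controlled experiment can settle the question.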

Update (12 October 2013): I found a very insightful essay on modelling by Joshua Epstein that talks about the reason behind modelling. It’s worth the read, and makes a good case for modelling alongside experimentation.

Big Data and Personality

Andrew McAfee posted about a very intriguing study on personality, gender, and age in relation to language. In essence, the study looked at the correlations between the words in people’s Facebook statuses and those people’s personality, gender, and age.
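For a rough idea of the kind of computation involved, here’s a toy sketch with made-up data (not the study’s actual pipeline): for each word, correlate how often each person uses it with that person’s trait score.

```python
# Toy sketch of word-trait correlation (made-up data, not the study's
# actual method): for each word, correlate each person's usage
# frequency with that person's trait score (e.g. extraversion).
from statistics import mean

def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    varx = sum((x - mx) ** 2 for x in xs)
    vary = sum((y - my) ** 2 for y in ys)
    return cov / (varx * vary) ** 0.5

# Each list holds one value per person (four hypothetical people).
word_freqs = {
    "party": [0.9, 0.8, 0.1, 0.2],
    "anime": [0.1, 0.0, 0.7, 0.9],
}
extraversion = [0.9, 0.8, 0.2, 0.1]

for word, freqs in word_freqs.items():
    print(word, round(pearson(freqs, extraversion), 2))
```

Words with strongly positive correlations end up big in the extraverted word map, strongly negative ones in the introverted map.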

You’ll know why I say it’s intriguing when you take a look at some of the findings. Especially interesting are the word maps.

Here’s one showing the words used by people who were extraverted/introverted, and by their emotional stability (i.e. personality). Neurotic people are sad, angry, and existential. Emotionally stable people are… hmm… outdoorsy/active? As McAfee mentioned in his post, it’s an interesting correlation between those sorts of activity and emotional stability, but one for which cause and effect are difficult to determine. Does physical activity lead to a more emotionally stable personality, or do emotionally stable people just tend towards physical activity?

Image of Facebook status updates by personality

I’m pretty much a 60/40 introvert (60% introvert, 40% extravert), so I’m always intrigued by studies on introversion, and I just couldn’t ignore the huge “anime” (and its related terms, like “Pokemon”) popping up in the introversion word map. I do wonder how much of a part cultural influence (i.e. a person’s country of origin/residence) plays. And did you notice the number of emoticons in that map? Me too 🙂

And here’s the word map for males vs. females. I love this one. It seems the biggest things on females’ minds are shopping and relationships, while for males it’s all about sex and games. As McAfee mentions on his blog, this “does not reflect well at all on my gender”.

Image of Facebook status updates by gender

And here’s one for age. My guess as to why daughters are talked about more by the 30s-to-65s groups is that women are the ones talking about them (men just talk about sex and sports). In the gender map, relationships dominate what women talk about (apart from chocolate and shopping), and from my experience of TV watching, women don’t really talk about sons because sons pretty much take care of themselves. Daughters, on the other hand, are always worth worrying about.

Image of Facebook status updates by age

I could imagine fiction writers using these to build character dialogues; or academics building ever more insightful anthropological maps; or marketers with targeted campaigns. It’s a really imaginative use of big data, and one that I think is brilliant.

Who says Big Data’s failed?

The dismal failure of ‘Big Data’?

I just read an article on ZDNet on the dismal failure of ‘Big Data’ that was so bad I don’t even know where to start. The author states that economics is a “big data” profession (why? what makes a profession “big data” or not?), and then goes on to say that because big data hasn’t been able to solve all the world’s economic ills, it has failed.

Surely, a Big Data profession such as the study of economics over the past 150 plus years would by now be refined and almost scientific in its precision, especially since these days we have as much compute power as an economist might need, not to mention even more data to analyze. But it’s not even close.

This much data (enough to be called “big data”) wasn’t available until recently, so why drag 150 years of history into it? It isn’t as if we’ve been working with big data related to economics for the past 150 years. The technologies, skills, and mindsets for handling this much data are still very young.

What is more, I don’t really get what he means by the failure of big data. “Big data”, by definition, simply means data that cannot be handled (or is difficult to handle) using legacy techniques involving just a single computer. Saying that big data has failed implies that it could have succeeded. But how? What differentiates the success of “big data” from its failure?

If success means solving all the world’s economic problems then, by golly, it has failed. But that’s like saying a pair of running shoes that promises to help you run with less pain has failed because it didn’t help you win the Olympics. Certainly, there are limits to what big data can do, but saying that it has failed doesn’t make sense.

Possible successes, if you will call them that: web analytics (where data grows so easily because of the ease of data collection and the number of people and actions that can be measured) and politics (which was, actually, driven quite a bit by web analytics).

BONUS possible success: sports, specifically baseball (not big data per se, but it’s an interesting read and it’s about data, so there).