Disruptive Marketing Blog

Big Data at the service of Marketing

“If it’s free it’s because you ARE the product”

This simple phrase, which we all recognize when talking about Facebook, Twitter and social media in particular, probably applies to online marketing in general as well.

The premise is quite simple: Facebook, LinkedIn, Twitter and Instagram are free because their users represent the actual value of the company. For example, LinkedIn was recently acquired by Microsoft for $250 per active user.

Equally, Google, as we all know, is no “charity”. Let’s be clear: it has launched some fantastic products and has revolutionized an entire industry in such a smart way that it should be an example to all of us who work in this sector, worthy of being studied, imitated and admired. In which case, how on earth can Google Analytics still be free?

Let’s get one thing straight: Google has been giving us “sweeteners” for years. It regulated and standardized Internet metrics, but I still cannot quite understand why the majority of web analysts still depend entirely on Google to carry out their day-to-day activities.

In other words, if our job is to measure and analyze traffic and then take action based on it, I don’t understand why we don’t obtain this data directly ourselves.

I certainly don’t mean creating an internal tagging system from scratch. I mean, for example, why don’t we routinely use tools such as Piwik, which lets you keep your own data in your own database and can also be used to manage or analyze server logs?


I believe that Google, by providing us with Google Analytics, has been happy to give us a “tranquillizer” that keeps us snoozing on the job!

How is it that we can export Google Spreadsheets as XLS files, whereas with GA we are unable to export our stats to Piwik, for example?

In today’s market we often talk about “Big Data”, and we love everything “Big”, but sometimes it seems that we don’t place enough importance on the data itself. We also talk a lot about deep learning, machine learning and artificial intelligence. However, I believe that the main mistake we make when using these techniques is not looking closely enough at where the data has actually come from. How can we apply mathematical calculations to a set of data that we have no control over, when we don’t really know how each click is being counted or whether our server has lost any data? In other words: how can we predict the future if we are unsure of the accuracy of our database or don’t understand how it works?

Another aspect that is also quite “odd”, and that we should think about in a little more detail, is this:

If each piece of data is the ultimate driver of web analytics, how can it be free?

If each piece of data is the ultimate driver of web analytics, how can it be that even when it is free it costs so much?

In my own case, I have been lucky enough to work with web analytics in three separate and distinct phases:

– From 1999 to 2002: Analyzing logs. Your server logs contain absolutely everything in terms of information. However, working with them involves a steep learning curve and quite a large investment of effort. Once you have overcome these obstacles, you can simply enjoy the rewards and benefits.

– From 2002 to 2004: Implementing various tag systems (pre-GA) with complete control over the database. We knew exactly what was being saved, how it was being saved, and how the data was processed and displayed. This is the real key: if you don’t know what is being stored, or how the information is being processed, saved and displayed, then you are quite simply “blind”.

– Since 2004: Working closely with GA.
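To give a flavour of the log-based approach from the first item above, here is a minimal Python sketch that counts successful page views per URL straight from raw access log lines. It assumes the logs are in the common Apache/Nginx “combined” format; the function name `count_page_views` and the sample lines are invented for illustration, and a real setup would also need to filter bots, static assets and so on.

```python
import re
from collections import Counter

# Assumption: lines follow the Apache/Nginx "combined" access log format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def count_page_views(lines):
    """Count successful GET requests per path (hypothetical helper)."""
    views = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        if match["method"] == "GET" and match["status"] == "200":
            views[match["path"]] += 1
    return views


# Invented sample data, only to show the shape of the input.
sample_log = [
    '203.0.113.5 - - [10/Oct/2016:13:55:36 +0000] "GET /blog/ HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2016:13:56:01 +0000] "GET /blog/ HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2016:13:56:20 +0000] "GET /pricing HTTP/1.1" 200 1800 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2016:13:57:12 +0000] "GET /missing HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2016:13:58:40 +0000] "POST /contact HTTP/1.1" 200 128 "-" "Mozilla/5.0"',
    'this line is not a valid log entry',
]

if __name__ == "__main__":
    for path, hits in count_page_views(sample_log).most_common():
        print(path, hits)
```

The point is not the script itself but that every counting rule (which methods, which status codes, which lines are skipped) is visible and under your control.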

Google Analytics is a wonderful tool and I have nothing but positive things to say about my experience of working with it. This post is not meant as a criticism of GA. It is simply a note to all those web analysts who insist on working only with data and statistics obtained from Google Analytics.

I warmly encourage you to install Piwik and coordinate with the colleagues who oversee your database. Look at how you are currently storing information, what the current logic is, what information is stored, and what kind of server you need to cope with processing all the data you collect. Then try modifying the stats or changing certain parameters. When you “own” the data you can start asking questions about exactly what you are collecting. When you do not, you can only ask the questions that Google Analytics allows you to ask.

Make the effort to try running two solutions at the same time and loosen the chains of Google Analytics. You will probably end up using Google Analytics even more, because even though it’s a fantastic tool, it should NEVER be the only way that we measure data, and often, due to our lack of diligence, it is! Also, don’t automatically assume that it is perfect. Until you start comparing its data with other sources, you cannot know whether that is true. The point is that you just don’t know without trying.
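As a rough sketch of what “comparing its data with other sources” might look like in practice, the Python snippet below lines up daily visit counts from two sources and flags the days where they disagree by more than a chosen tolerance. Everything here is an assumption made for illustration: the function name `compare_sources`, the 5% tolerance, and the numbers, which are invented rather than real measurements.

```python
def compare_sources(ga_counts, own_counts, tolerance=0.05):
    """Compare daily counts from two sources (hypothetical helper).

    Both arguments map a day (e.g. "2016-10-10") to a visit count.
    A day is flagged when the relative gap exceeds `tolerance`.
    """
    report = {}
    for day in sorted(set(ga_counts) | set(own_counts)):
        ga = ga_counts.get(day, 0)
        own = own_counts.get(day, 0)
        baseline = max(ga, own)
        # Relative gap, measured against the larger of the two counts.
        gap = abs(ga - own) / baseline if baseline else 0.0
        report[day] = {"ga": ga, "own": own, "gap": gap, "flagged": gap > tolerance}
    return report


# Invented numbers, only to show the idea.
ga_daily = {"2016-10-10": 1000, "2016-10-11": 900}
own_daily = {"2016-10-10": 1020, "2016-10-11": 1200}

if __name__ == "__main__":
    for day, row in compare_sources(ga_daily, own_daily).items():
        status = "CHECK" if row["flagged"] else "ok"
        print(f"{day}  GA={row['ga']}  own={row['own']}  gap={row['gap']:.1%}  {status}")
```

A flagged day does not tell you which source is wrong, only that there is a question worth asking about how each one counts.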

“Assuming” is dangerous because we want and need more and more data, sometimes without really looking at where this data has come from, how it has been calculated, and whether it really is accurate.