Big data may be hot, but little data is what matters


This post was also published on CNET.

Big data is in vogue, with a glut of startups and numerous large installations appearing in corporations. At CBS Interactive, we process almost one billion events a day. They flow from our Web and application servers over message queues to a cluster of 80 twelve-core Hadoop nodes, which in turn feed a Teradata data warehouse.

Processing and analyzing such a large volume of data helps us ask important questions: Which pages on which properties are most profitable? Who goes where across our various sites? What types of content generate the greatest number of advertising conversions?

But here's the thing: Most of our conversations with product and business managers are spent discussing what I like to call "little data."

Little data constitutes the nuts-and-bolts metrics of running a business. For a Web property, that means getting a handle on measures such as the bounce rate, SEO session starts, social session starts, funnels of how users flow through a property, and page views per session. Too many people lose sight of these simple but critical metrics.
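
To make two of these metrics concrete, here is a minimal Python sketch that computes bounce rate and page views per session. It assumes each session has already been reduced to an ordered list of page paths; the function name `session_metrics` and the sample paths are hypothetical, and a real analytics tool may define a bounce differently (for example, by engagement time).

```python
def session_metrics(sessions):
    """Return bounce rate and page views per session.

    `sessions` is a list of sessions, each one a list of page paths
    viewed in order (an assumed input format for this sketch).
    """
    total = len(sessions)
    if total == 0:
        return {"bounce_rate": 0.0, "pages_per_session": 0.0}

    # A bounce is counted here as a session with exactly one page view.
    bounces = sum(1 for pages in sessions if len(pages) == 1)
    page_views = sum(len(pages) for pages in sessions)

    return {
        "bounce_rate": bounces / total,
        "pages_per_session": page_views / total,
    }


# Hypothetical sample data: four sessions, two of them single-page visits.
sample_sessions = [
    ["/home"],
    ["/home", "/news"],
    ["/news", "/video", "/home"],
    ["/video"],
]
print(session_metrics(sample_sessions))
```

With the sample data, two of four sessions are bounces and seven page views are spread over four sessions.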



Monitoring and actively managing a Web property to these "little" metrics creates significant lift in page views and conversions, and that helps revenue. Even something as simple as actively managing the hoops you make people jump through to register for your site and the sorts of emails you send when they do builds and grows a loyal following.

One of the most critical metrics we are tracking comes from the gaming world: the ratio of daily active users to monthly active users (DAU to MAU). When the ratio is low, say around 0.03 (roughly one day out of 30), it means that your unique users are coming only once a month, and you're essentially running a fly-by tourist site. When the ratio is high, such as Facebook's estimated 0.60, it means that a majority of your users are using your site on a daily basis and that your property is a key part of their online lives. (I can't share the DAU to MAU ratios for our sites, as they're confidential.)
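
For readers who want to compute the ratio themselves, here is a rough Python sketch. It assumes the common convention of dividing the average number of daily active users over a period by the number of distinct users active in that period; the function name `dau_mau_ratio`, the `(user_id, day)` input format, and the 30-day default are assumptions made for illustration, not a description of how any particular site measures it.

```python
from collections import defaultdict
from datetime import date


def dau_mau_ratio(visits, days_in_period=30):
    """Average daily active users divided by monthly active users.

    `visits` is an iterable of (user_id, day) pairs that all fall inside
    one period of `days_in_period` days (assumed input format).
    """
    users_by_day = defaultdict(set)
    for user_id, day in visits:
        users_by_day[day].add(user_id)

    if not users_by_day:
        return 0.0

    # Everyone seen at least once in the period counts as monthly active.
    monthly_active = set().union(*users_by_day.values())

    # Days with no visits contribute zero, so divide by the full period.
    average_daily_active = (
        sum(len(users) for users in users_by_day.values()) / days_in_period
    )
    return average_daily_active / len(monthly_active)


# Hypothetical "tourist" site: 30 users, each visiting on one day of April.
tourists = [(f"user{i}", date(2024, 4, i)) for i in range(1, 31)]
print(round(dau_mau_ratio(tourists), 3))
```

Each tourist shows up on exactly one of the 30 days, so the ratio works out to 1/30, which is the "around 0.03" case described above. A site whose users all return every day would score 1.0.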

Taking control of little data


There are a variety of inexpensive little data tools that are easy to implement. SimplyMeasured is excellent at managing social traction -- i.e., how much impact your site is making on various social networks. KISSMetrics is world class at managing "conversion funnels," the path a user follows through a site before "converting" to a sale. Google Analytics, which is free, does a great job of managing metrics such as bounce rate and SEO session starts, measures of how "sticky" your site is and how well you're doing at attracting new external visitors. Implementing such easy-to-use tools encourages product and business managers to actively manage their sites with "little data" metrics in mind.
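
To show what a conversion funnel measures, here is a small Python sketch that counts how many sessions reach each step of a funnel in order. This is an illustration only and not how KISSMetrics or any other product implements funnels; the function name `funnel_counts`, the step paths, and the session format (an ordered list of page paths) are all hypothetical.

```python
def funnel_counts(sessions, steps):
    """Count the sessions that reach each funnel step, in order.

    A session reaches step k only after it has hit steps 0..k-1 earlier
    in the same session; unrelated pages in between are ignored.
    """
    counts = [0] * len(steps)
    for pages in sessions:
        next_step = 0
        for page in pages:
            if next_step < len(steps) and page == steps[next_step]:
                counts[next_step] += 1
                next_step += 1
    return counts


# Hypothetical registration funnel and sample sessions.
steps = ["/landing", "/register", "/confirm"]
sessions = [
    ["/landing"],
    ["/landing", "/register"],
    ["/landing", "/about", "/register", "/confirm"],
    ["/register"],  # skipped the landing page, so it never enters the funnel
]
for step, count in zip(steps, funnel_counts(sessions, steps)):
    print(step, count)
```

The drop in the count from one step to the next shows where users abandon the path, which is the number a product manager would want to act on.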

Managing a site by the numbers shouldn't be taken to the extreme, however. Sites need to look good and say something relevant to readers; they shouldn't just be optimized within an inch of their lives to drive revenue. We don't, for instance, want to make CBSNews.com look like GoDaddy, which is -- understandably enough -- completely optimized to drive revenue.

Growing up to big data


Once the use of little data becomes pervasive in an organization, big data can begin to help with decision making, since a culture of data-driven decisions is already ingrained. Moving beyond simple Web metrics, big data can provide an integrated view of a business by folding in financial metrics, answering questions you hadn't even thought of when initially setting up a site, and deciphering trends across disparate sets of data.

Big data is hard to do, and can be very expensive and time-consuming. Integrating revenue and cost data in order to manage end-to-end business models is a complicated task. Some organizations decide to outsource to companies like Omniture and Webtrends (where I used to be a GM), which can help figure out how to tag and manage the process, in addition to storing the vast amounts of data required for meaningful analysis.

If your organization has enough volume and the technical competence to do your own implementation, keep in mind that it's easy to get lost in building out big data infrastructure and lose sight of the fact that, in the end, big data needs to be usable. This might sound straightforward, but in practice it can be anything but. It requires highly skilled data scientists and strategists who understand business problems and can distill the data into simple, actionable metrics.

In effect, the magic of big data is turning it into little data.