Don’ts of Data Science: Watch for these Gotchas


Big data has become a familiar topic, but the concept is still evolving and we’re very much in the early days of this era. For many, the core question is: What do we do with it all?

Since the dawn of the digital age, the amount of data available to a brand has come to seem endless, fed largely by digital footprints and machine learning. As that amount grows, brands can gain new insights that shape their path to success. There are incredible opportunities to make a difference by leveraging big data, but there’s also the harsh reality that many brands aren’t using data in the right way.

We aren’t all data scientists, but that doesn’t mean we shouldn’t roll up our sleeves and use data to its full potential. We often think that data lies in the hands of the IT organization, but with data proving to be one of the strongest proof points for a competitive strategy, it’s imperative that analytics plays a role in everyday workflows, even for those who might not have a background in the space. With plenty of commentary on exactly what to do with all your data, here are a few pitfalls to watch for, and how to avoid them:

Paralysis of perfection:

Data science follows one of the main tenets of financial investment: start early, even if you start small. Insights build up like a portfolio over time. They might not be perfect at the beginning, but they will eventually compound, giving your brand a competitive edge.

Even with just a toe in the water, it’s possible to transform a small subset of data into something highly predictive. Conversely, in large datasets, the amount of useful information can shrink as the volume of data grows, which can ultimately result in poor decisions. Be specific and deliberate about the question your organization wants to answer, and about what you will do with the results. Data science accelerates data exploration, allowing brands to gain insights and meaning quickly, even in a small way. But making sense of all the data, and turning it into actionable insights, takes time.

Dismissing outliers:

The bigger the dataset, the higher the probability of coming across an outlier. It might be easiest to classify it as an isolated anomaly and go back to business, but it’s increasingly important to look at the story the data is telling. Outliers could point to data quality issues, but don’t assume that they do: anomalies commonly point to unexpected patterns. To get value out of your data, it’s imperative to care about every component of a set.
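
One way to act on this advice is to set outliers aside for review rather than silently dropping them. The sketch below is only an illustration, not a method from the article: it assumes the common interquartile-range rule (Tukey’s fences with a multiplier of 1.5) as the definition of an outlier, and the function name flag_outliers and the daily_orders figures are made up for the example.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Split values into (typical, flagged) using Tukey's IQR fences.

    Assumption: an "outlier" is any value more than k * IQR outside
    the first or third quartile. Other definitions are equally valid.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    typical = [v for v in values if low <= v <= high]
    flagged = [v for v in values if v < low or v > high]
    return typical, flagged


# Hypothetical daily order counts with one unusual day.
daily_orders = [102, 98, 105, 97, 101, 99, 103, 100, 310, 104]

typical, flagged = flag_outliers(daily_orders)

# Nothing is deleted: flagged values are kept for a closer look.
print("Review before discarding:", flagged)
```

With these made-up numbers, the day with 310 orders is the one set aside. Whether it is a data-entry error or a promotion that worked is exactly the question worth asking before moving on.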

Read the source article at InformationWeek.