Citizen Data Scientists: Why Data Scientists Aren’t Enough


“AI is the new electricity,” proclaims Andrew Ng, adjunct Stanford professor and Baidu’s former chief scientist. Ng predicts that AI will become ubiquitous in our lives as consumers, used for everything from food recommendations to healthcare. His statement also holds for enterprises: AI and data science are increasingly standard for teams like marketing and finance. Led by trends in the data industry, falling barriers to understanding and applying analyses have turned industries into data-driven business environments.

As Gartner succinctly defines it, a citizen data scientist is a person whose primary job function is outside statistics or analytics but who nevertheless uses models that are predictive or have advanced analytics capabilities. Gartner foresees that citizen data scientists will foster greater depth of business analytics, because they provide increased support for (formal) data scientists and free them to focus on more complex analyses.

Many employees who use SQL aren’t computer scientists or engineers, and using SQL is only one part of their job. They have other responsibilities, such as creating spreadsheets, delivering strategy recommendations, and executing marketing campaigns. These other duties complement their data skills in a unique way.

Imagine if companies replaced these analysts with computer scientists. The CS majors might write better, faster code, but at the expense of the other duties. Data engineers could execute analyses, but they might lack the full background context, which lives with the business team. That is a problem, because the value of analytics lies in its application.

Algorithms may compute advanced statistics and run machine learning models, but it is humans who hold the domain expertise and deep industry knowledge needed to realize their value. A successful implementation pairs machine automation with human direction.

Three changes have made the citizen data scientist a possibility. First, more employees are technical and comfortable with new technology (SQL is now commonplace and seen as a business tool like Excel). Second, new tools and advances in the data space allow teams to self-serve analytics. Finally, advanced models, once seen as too complex, are more accessible through open-source libraries and other add-ons that let users apply complex equations without understanding the underlying math.

As data becomes more available and data-driven decisions more popular, managers face a growing number of ad hoc, revenue-impacting questions that need quick answers. There is little time to spec a project for the data science team. On the other hand, managers might lack the familiarity to pull the numbers themselves. And often, the most impactful analyses are small and simple data pulls, not large analytic engagements.

The time is ripe for the citizen data scientist to be every company’s next focus: the technical skill of today’s employees is rising, while the technical knowledge required to use new data tools is dropping.
