AI Trends Weekly Brief: Augmented Analytics


Augmented Analytics – New Paradigm For Using Machine Learning to Augment BI and Human Intelligence

A new group of startups is leveraging AI to solve complex problems that may have been unsolvable a few years ago. The startups are targeting every industry. General-purpose AI platforms are being fed huge amounts of data and are automatically discovering interesting patterns.

This is a new wave in analytics, following the one in which visual-based discovery tools such as Tableau and Qlik disrupted the traditional business intelligence market of IBM Cognos and SAP BusinessObjects.

Data volumes are increasing and the data itself is becoming more complex. Yet the processes organizations employ to prepare data for analysis, analyze data, build advanced analytics models, interpret results and tell stories with data remain largely manual and prone to bias, as Rita Sallam, research VP with Gartner, recently wrote on her blog.

The number of variables driving an outcome or best action is growing to the point where exploring every possible pattern and determining the most relevant finding is either impossible or impractical using manual approaches, she writes. Thus, analysts often resort to their own biased hypotheses, thereby missing key findings and drawing incomplete conclusions. Meanwhile, data science modeling requires specialist skills in short supply.
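To give a rough sense of the scale Sallam describes, the short Python sketch below counts how many combinations of variables an analyst would have to examine by hand. The function name and the variable counts are illustrative assumptions, not figures from Gartner.

```python
from math import comb


def count_variable_combinations(n_variables: int, max_size: int) -> int:
    """Count the non-empty groups of up to max_size variables
    that can be drawn from n_variables candidate variables."""
    return sum(comb(n_variables, k) for k in range(1, max_size + 1))


# Hypothetical data sets with 10, 100 and 1,000 candidate variables,
# looking only at interactions among up to three variables at a time.
for n in (10, 100, 1000):
    print(n, count_variable_combinations(n, 3))
```

Even when limited to groups of three, 100 variables already yield 166,750 combinations, which suggests why manual exploration tends to fall back on a handful of hypotheses.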

This sets the stage for a new paradigm — augmented analytics — which Sallam began writing about in 2015. Central to the development is the use of machine-learning automation to augment human intelligence, and provide context awareness across the workflow. “Augmented analytics will be crucial for delivering unbiased decisions,” Sallam states. Certain startups are pushing the modern BI and analytics vendors, including Tableau and Qlik, to invest in augmented analytics for their products. We take a look at a few of those startups here.

DataRobot Seeks to Automate Machine Learning

A leader in machine learning automation, DataRobot of Boston was founded in 2012 and has raised $124.6 million in six rounds, according to Crunchbase.

Cofounder and CEO Jeremy Achin is a data scientist turned entrepreneur. Prior to DataRobot, he was Director of Research and Modeling at Travelers Insurance where he built predictive models for pricing, retention, conversion, elasticity, lifetime value, customer behavior and claims. He studied math, physics, computer science, and statistics at the University of Massachusetts, Lowell.

Cofounder and CTO Tom de Godoy is also a data scientist who worked at Travelers Insurance, where he was senior director of research and modeling, managing a team of data scientists. He has a master’s degree in mathematics and an undergraduate degree in physics from the University of Massachusetts, Lowell.

DataRobot recently unveiled new features in its platform, including integration with file structures from SAS Institute, the long-established analytics software vendor. The new release is also aimed at making it easier for analysts of any skill level to quickly build and deploy accurate predictive models. New features for the insurance industry were also announced.

“We’re the only software in the market that not only automates the heavy lifting inherent in machine learning, but is built specifically for enterprise deployment,” states CEO Achin in a press release. “Our next few releases are going to shake the worlds of business data science and beyond.”

A new set of Generalized Additive Models for the insurance industry is aimed at helping users solve pricing and risk segmentation problems more accurately. The new release also offers the ability to export data preparation, preprocessing and scoring code in Java, improving the ability to employ DataRobot models in more environments.

“Today, DataRobot solves predictive modeling problems across all core functions in an insurance company,” stated Satadru Sengupta, general manager of insurance at DataRobot, in the release. “What is really differentiating is the user-centric design of the product. DataRobot is a technology company founded and built by people from the insurance industry.”

DataRobot recently acquired Nutonian, a data science software company specializing in time series analytics modeling; the Nutonian features are expected to be embedded into the DataRobot platform later this year.

DataRobot customer LendingTree is currently investing in customer analytics and AI projects, aimed at helping the online loan marketplace operator improve its customer interactions. Teams are currently using predictive analytics to better target marketing campaigns and to determine which financial products individual customers might be qualified to use.

Much of the analytics work is being done in a cloud-based tool from DataRobot. LendingTree also has a team of data scientists using open source tools, such as R and Python, to investigate AI projects that could enhance customer-targeting efforts.

“You have to have good data management to take advantage of AI,” stated Akshay Tandon, vice president and head of strategy and analytics at LendingTree, in an interview with TechTarget.

While it may take some time for the efforts to produce results, LendingTree wants to concentrate on proper data management to be in a position to benefit as AI matures. “The companies that will win out will be companies that are invested in data,” Tandon stated. “We’ve been able to take advantage because we invested in data a year or two ago.”

For now, LendingTree is putting a priority on developing data governance best practices to ensure that data is used and stored consistently. “For companies that are in the tech space, AI is undoubtedly going to give an edge,” Tandon stated. “You’re going to see an increased use by business. It’s already starting to happen. It’s impossible to think about every aspect of where the customer is. Investing through AI that thinks through that cycle is an undoubted edge.”

Endor Putting Social Physics on the Map

Social physics is based on the premise that every event representing human activity, such as a phone call, a credit card purchase or website usage, contains a set of mathematical patterns embedded within the data. These mathematical patterns can then serve as a filter for detecting emerging behavioral patterns before they can be observed by any other technique.

This new science, which uses big data analysis and mathematical laws of biology to understand the behavior of human crowds, was originated at MIT by Prof. Alex “Sandy” Pentland and Dr. Yaniv Altshuler. It was further developed by Endor, using proprietary technology, resulting in an engine that purports to explain and predict human behavior.

A CIO in the Israeli Intelligence Corps is quoted on the company website as saying that Endor offers “a revolutionary concept and a truly technological breakthrough. The results they presented are unmatched by any competing tool.”

Here is an example, provided on the Endor website, of how the tool can be used:

In a recent test, a customer provided 15 million Tweets’ meta-data to Endor as raw data for analysis. In addition, the customer revealed the identity of 50 Twitter accounts known to be ISIS activists that were contained in the input data. The customer wanted to test Endor’s ability to detect an additional 74 accounts that were hidden within the data.

Endor’s engine completed the task on a single laptop in 24 minutes, measured from the time the raw data was introduced into the system until the final results were available. Endor identified 80 Twitter accounts as “lookalikes” to the provided example, 45 of which (56%) turned out to be part of the list of the 74 hidden accounts. This provided an extremely low false alarm rate (35 False Positive results), so that the customer could practically have human experts investigate the identified targets.
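The figures in this example can be checked with a few lines of arithmetic. The Python sketch below recomputes them using the standard definitions of precision and recall; the function and variable names are illustrative assumptions and are not part of Endor’s product.

```python
def detection_metrics(flagged: int, true_hits: int, hidden_total: int) -> dict:
    """Summarize a detection test from three counts.

    flagged      -- accounts the engine reported as lookalikes
    true_hits    -- flagged accounts that were on the hidden list
    hidden_total -- size of the hidden list
    """
    return {
        "false_positives": flagged - true_hits,
        "precision": true_hits / flagged,    # share of flagged accounts that were correct
        "recall": true_hits / hidden_total,  # share of hidden accounts that were found
    }


# Counts taken from the example above: 80 flagged, 45 correct, 74 hidden.
metrics = detection_metrics(flagged=80, true_hits=45, hidden_total=74)
print(f"false positives: {metrics['false_positives']}")
print(f"precision: {metrics['precision']:.2%}")
print(f"recall: {metrics['recall']:.2%}")
```

By these definitions, the 56% quoted is the precision (45 of 80 flagged accounts), the 35 false positives are the remaining flagged accounts, and the engine recovered about 61% of the 74 hidden accounts.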

“We are coming to realize that human behavior is determined as much by the patterns of our culture as by rational, individual thinking,” states Prof. Pentland on the Endor website. “These patterns can be described mathematically and used to make accurate predictions.”

Customers of Endor feed data into the model offered as a cloud service. Then they can begin asking questions. Endor CEO Doron Alter said in an interview with TechCrunch that the biggest bottleneck is getting the data into the system.

Endor’s system searches for groups of individuals that display behaviors that Social Physics theory says cannot randomly occur. “We only look at human behavior using Social Physics equations to transcend the limitations of the traditional machine learning.”

Endor raised $5 million in seed funding when it launched in September 2014. That round was led by Market with participation from Innovation Endeavors, Eric Schmidt’s firm. Endor is still in stealth mode, working with early customers and design partners on the product. The company is based in Tel Aviv and has 10 employees. Endor plans to launch at a later date to be determined.

Paxata Offers Self-Service Data Preparation for Analytics

Paxata offers the Adaptive Information Platform, a self-service data preparation platform for analytics. The platform is a visual application used to access, explore and shape data with clicks, not code. It is built on Apache Spark and is optimized to run in Apache Hadoop. Paxata is said to leverage automated artificial intelligence, elastic cloud architecture and distributed computing in its quest to automate the data-to-insight pipeline.

Forrester Research recently named Paxata a leader in Data Preparation Tools, in a Q1 2017 Forrester Wave report. The report stated, “Paxata focuses on usability and fast time-to-insights for business analysts. Paxata is powered by a unified set of technologies designed to support and balance data integration, quality, governance, collaboration and enrichment.”

Also, Paxata combines a user experience that’s intuitive for business analysts with machine learning plus text and semantic analytics, so analysts can connect data quickly and get to insights faster. A ClicktoPrep feature enables bi-directional integration with BI tools. Paxata also uses Spark for large-scale data preparation and a multi-tenant architecture built for the cloud.

Paxata is said to have launched the self-service data preparation category in 2013. Co-founder and CEO Prakash Nanduri was quoted in the release as saying that the Forrester recognition is a testament to the firm’s business value of “delivering a comprehensive, enterprise grade self-service platform that empowers all business consumers to turn raw data into information, instantaneously.”

Customers include Standard Chartered Bank. Sarah Burnett, head of business intelligence at Standard Chartered, made a presentation entitled “Democratizing Data with Paxata” at a Paxata event in Singapore in December.

Customer experience stories on the firm’s website include an account of how a financial services company is using Paxata to streamline client data quality control. The Paxata Adaptive Information Platform can merge and reconcile customer records from different internal sources. If there are exceptions or duplicates, they are identified and sent to business analysts to remediate. This remediation process saves time and allows business groups and IT to better coordinate.
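The merge-and-remediate workflow described in this account can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Paxata’s implementation: the field names ("email", "name"), the source labels and the matching rule (normalized email address) are all hypothetical.

```python
from collections import defaultdict


def normalize_email(record: dict) -> str:
    """Build a matching key; here, a trimmed, lower-cased email address."""
    return record["email"].strip().lower()


def reconcile(*sources: tuple) -> tuple:
    """Merge customer records from several (source_name, records) pairs.

    Records that agree across sources are merged into one entry; records
    that share a key but disagree on the name are set aside as exceptions
    for a business analyst to review.
    """
    grouped = defaultdict(list)
    for source_name, records in sources:
        for record in records:
            grouped[normalize_email(record)].append((source_name, record))

    merged, exceptions = [], []
    for key, entries in grouped.items():
        names = {rec["name"].strip().lower() for _, rec in entries}
        if len(names) == 1:
            merged.append({
                "email": key,
                "name": entries[0][1]["name"].strip(),
                "sources": sorted({src for src, _ in entries}),
            })
        else:
            exceptions.append({"email": key, "conflicting_entries": entries})
    return merged, exceptions


# Hypothetical records from two internal systems.
crm = [{"email": "Ann@Example.com", "name": "Ann Lee"},
       {"email": "bob@example.com", "name": "Bob Ray"}]
billing = [{"email": "ann@example.com ", "name": "Ann Lee"},
           {"email": "bob@example.com", "name": "Robert Ray"}]

merged, exceptions = reconcile(("crm", crm), ("billing", billing))
print("merged:", merged)
print("needs review:", [e["email"] for e in exceptions])
```

In this sketch the two records for Ann reconcile automatically, while the conflicting names for Bob are routed to a reviewer, mirroring the split between automated merging and analyst remediation described above.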

In another account, a national food distributor uses Paxata to work with multiple data systems across the enterprise, combining internal production data with retail data feeds, to get a single view of a product progressing through the supply chain. This allows the company to gain efficiency and minimize waste as it manages perishable inventory. For example, the time it takes for a business analyst to prepare transit time data has been reduced from five hours per month to less than one. Also, the company can now more accurately identify “pinch points” in the distribution chain, and it has stopped the coupon fraud occurring on its website by identifying offending email addresses almost instantly. Paxata can feed its data to business units using the analytic tool of their choice, including Qlik, Excel and MicroStrategy.

Founded in 2012, Paxata has raised $61 million in four rounds, most recently $13 million in July, according to Crunchbase.

Written and compiled by John P. Desmond, AI Trends Editor