How TD Bank Uses Big Data, AI & Machine Learning


TD Bank is Canada’s second-largest bank and the tenth-largest bank in North America. Its focus is on retail banking, and it employs over 26,000 people.

The bank provides its services to around 25 million customers, and its guiding philosophy is “legendary customer service.” In short, this means that every interaction with the bank should be a memorable and enjoyable experience.

Over the last five years, TD Bank’s Enterprise Information Management team has rolled out the “Googlefication” of the business. Essentially this has meant transforming it from a finance company into a tech company – and managing the cultural shift that this entails.

It started with an idea that moving to a Big Data environment could bring about a 50-fold reduction in costs over the relational database infrastructure that had previously been in place. A tall order, on the face of it, but one that started with a fairly simple premise.

IT doesn’t do anything in a business that couldn’t already be done by someone else – so the sole reason for it to exist at all is to enable things to be done better, faster and cheaper than they could be done before.

By moving to a data-lake infrastructure, and switching to providing data-as-a-service functions, TD Bank effectively democratized access to the information it gathers and stores as part of its business. This includes transactional records and customer service interactions – enabling it to act far more quickly on data-driven insights.

The first part of this was getting all of the data into one place where it could be used together – the data lake. But simply throwing all of an organization’s information together – particularly a bank’s – isn’t a simple process. The data needs to be in a state where it can be quickly found and used by those who need it, but of course, there are obligations around access and data security too.

In fact, the whole point of a data lake is to make the data accessible and usable across an organization, rather than compartmentalized in “data silos” where its usefulness is restricted, usually to those who collected it in the first place. But if you aren’t careful how you go about it, the end result could more closely resemble a data swamp than a data lake!

The information management team overcame this by breaking down the essential information about the information – the metadata – that needed to be recorded to make the data useful.

The team established that this came down to:

1. What is the data?

2. Who can access it?

3. Under what circumstances can they access it?

Once these tags were filled in, the team knew that data could be loaded into the lake and would always be in a suitable format to be found and used.
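The article doesn’t describe TD Bank’s actual tagging implementation, but the three questions above map naturally onto a small metadata record attached to each dataset. The sketch below is a hypothetical illustration in Python: the field names, roles, and access rules are assumptions, not TD Bank’s system.

```python
from dataclasses import dataclass, field

@dataclass
class DatasetMetadata:
    """The three questions, expressed as fields on each data-lake asset."""
    description: str                      # 1. What is the data?
    allowed_roles: set                    # 2. Who can access it?
    conditions: dict = field(default_factory=dict)  # 3. Under what circumstances?

def can_access(meta: DatasetMetadata, role: str, purpose: str) -> bool:
    """Grant access only if the role is allowed and the purpose is permitted."""
    if role not in meta.allowed_roles:
        return False
    permitted_purposes = meta.conditions.get("purpose")
    return permitted_purposes is None or purpose in permitted_purposes

# Example: a tagged asset holding recorded customer calls (hypothetical).
calls = DatasetMetadata(
    description="Audio recordings of customer service calls",
    allowed_roles={"analyst", "compliance"},
    conditions={"purpose": {"quality-review", "complaint-investigation"}},
)

print(can_access(calls, "analyst", "quality-review"))        # True
print(can_access(calls, "marketing", "campaign-targeting"))  # False
```

With tags like these filled in at load time, any asset in the lake can be discovered by its description and gated by the same access check, which is what keeps a lake from turning into a swamp.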

Data gathered by TD Bank into its lake included data on customer behavior, personal data such as their interests, and internal and external data, in both structured and unstructured forms. This unstructured data includes audio and video recordings of customer interactions with the bank.

The next step was to set out what problems this newly accessible and available data can be used for. A decision was made at this stage to go for “quick wins.” These are mission-critical objectives where it can be quickly shown that building a Big Data infrastructure can pay off – generating savings through driving efficiencies that are greater than the infrastructure costs to deploy.

TD Bank’s Hadoop private cloud was built around Cloudera’s solution, and the bank looked to Talend as its integration partner to build services that enable value to be quickly extracted from data anywhere in the business.

The data infrastructure brings together open source tools such as Hive, Impala, Spark and Tableau that enable querying of data and output of reports and visualizations. Talend helps tie the pieces together and drive data transformations that deliver data in a format that can be easily consumed by a broad range of business teams.

Throughout this deployment, the team aimed to create an experience dependent on “configuration not coding.” To get at the insights, the bank’s employees should simply have to set up the software using the parameters relevant to their particular task – rather than code solutions from scratch.
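“Configuration not coding” means business users describe *what* they want in parameters and a shared engine does the work. The Python sketch below is a minimal illustration of that idea, not TD Bank’s actual tooling: the report name, config keys, and sample data are all hypothetical.

```python
# Hypothetical report definitions: users supply parameters, not code.
REPORT_CONFIG = {
    "monthly_spend": {
        "source": "transactions",   # which dataset to read
        "group_by": "category",     # field to group on
        "metric": "amount",         # field to sum per group
    },
}

def run_report(name: str, rows: list, config: dict = REPORT_CONFIG) -> dict:
    """Run a configured aggregation: sum `metric` for each `group_by` value."""
    spec = config[name]
    totals = {}
    for row in rows:
        key = row[spec["group_by"]]
        totals[key] = totals.get(key, 0) + row[spec["metric"]]
    return totals

transactions = [
    {"category": "groceries", "amount": 82.5},
    {"category": "transport", "amount": 12.0},
    {"category": "groceries", "amount": 41.5},
]

print(run_report("monthly_spend", transactions))
# {'groceries': 124.0, 'transport': 12.0}
```

A new report here is just a new entry in `REPORT_CONFIG` – no new code – which is the property the “configuration not coding” goal is after.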

This made it possible to develop tools that let the bank act on the wealth of data it holds on its customers in order to offer them tailor-made services. For example, if the bank knows that a customer is in the process of a major life event such as buying a house, marrying, or having a child, this data informs the products and services they might be offered.

This is done by rolling out what it describes as a “BI (business intelligence) in Hadoop” strategy. By deploying this technology, it says it has reduced operating costs per gigabyte of data processed by a factor of around 50.

The infrastructure has also made it possible to create customer-centric digital services such as its MySpend app, which allows customers to track their monthly spending. Insights derived from the aggregated data created by millions of customers are used to offer suggestions that can help improve individual spending habits.
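The article doesn’t say how MySpend computes its suggestions, but the core idea – comparing one customer’s habits against aggregates built from millions of customers – can be sketched simply. Everything below (customer IDs, categories, amounts, and the above-average rule) is an assumed toy example, not the app’s real logic.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-customer monthly spend by category.
spend = {
    "cust_1": {"dining": 320.0, "groceries": 410.0},
    "cust_2": {"dining": 150.0, "groceries": 480.0},
    "cust_3": {"dining": 90.0,  "groceries": 395.0},
}

def category_averages(spend: dict) -> dict:
    """Average spend per category across the whole customer base."""
    totals = defaultdict(list)
    for categories in spend.values():
        for category, amount in categories.items():
            totals[category].append(amount)
    return {category: mean(amounts) for category, amounts in totals.items()}

def spending_insights(customer: str, spend: dict) -> list:
    """Flag categories where this customer spends above the population average."""
    averages = category_averages(spend)
    return [category
            for category, amount in spend[customer].items()
            if amount > averages[category]]

print(spending_insights("cust_1", spend))  # ['dining']
```

In this toy data, cust_1’s dining spend (320) is well above the average (about 187), while their grocery spend is slightly below average, so only dining is flagged.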

This capability to act on data-driven insights received a boost with the acquisition this year of Toronto-based machine learning company Layer 6. This investment in smart, self-learning technology will help it build systems that can more accurately predict customer needs. Machine learning is being increasingly adopted throughout financial services for its ability to accurately predict customer needs, provide personalized product and service recommendations, anticipate complaints and power chatbots to provide a smoother feedback experience.

TD Bank had already shown its commitment to using automation to improve customer experience through its Twitter chatbot and its adoption of Amazon’s Alexa devices to offer voice banking to customers.

Read the source post in Forbes.