Got Data? The Importance of High-Quality Data for Building Effective Machine Learning-Based Solutions

When it comes to annotating data for academic purposes, specific industry standards are commonly used. In the commercial sector, however, building a solution that relies on machine learning requires different data annotation standards. To build a strong solution that can understand and mimic humans, high-quality, human-annotated training data is key.

In this webinar you will learn:

  • Pros and cons of licensable public data vs. building your own datasets
  • Choices and tradeoffs in the level of effort you invest in acquiring and labeling data
  • Why curated crowds yield higher quality data for your machine learning

Speaker: James Lyle is Director of the Custom Linguistic Solutions team at Appen Inc. After earning his Ph.D. in linguistics at the University of Washington, James joined Microsoft in 1999 and spent more than 14 years working on various natural language technologies, including proofing tools, information extraction, and text analytics. Since joining Appen in 2013, he has focused on providing tech industry clients with linguistic consultation and high-quality annotated data for machine-learned NLU solutions.
