From the Lab to the Factory: Building a Production Machine Learning Infrastructure


At most companies, advanced analytics expertise is contained in a lab environment: a small team of analysts sitting at their computers and churning out reports and insights to support business decisions. But the real impact from advanced analytics comes from building models that make real-time decisions within production workflows. We will discuss how to use the ecosystem of technologies around Hadoop to support bringing models out of the lab and into the factory, with a focus on strategies for data integration, large-scale machine learning, and experimentation.

About the Speaker:
Josh Wills (@josh_wills) is the Senior Director of Data Science at Cloudera, a leading distributor of Hadoop and related services. Wills is the creator of Apache Crunch (a top-level Apache project) and serves as the project’s chair. Prior to Cloudera, Wills was an engineer at Google. Wills is an open source advocate and all-around awesome guy.
