PAPER CLUB

Accelerating AI with Databricks

September 23rd | 12:00-1:30 pm EST

Databricks is a company strongly rooted in bringing R&D to production.

Join us on September 23rd from 12:00-1:30 pm EST to discuss how you can improve your ML reliability and performance with Databricks Lakehouse, Delta Engine, and hosted MLflow.

Additionally, hear case studies from two Databricks customers, Tealium and Arteria.ai, showcasing how they are accelerating data and machine learning initiatives with Databricks.

To join this session, apply to join Paper Club below.

Databricks Workshop V

Agenda

Time
12:00 - 12:05 Introduction
12:05 - 12:45 Lakehouse and Delta Lake Overview
12:45 - 1:10 Demo, Zach & Jesse @ Tealium
1:10 - 1:30 Demo, Amir & Kunal @ Arteria.AI

What is Paper Club?

Georgian’s Paper Club runs monthly to explore applied AI techniques. It brings together machine learning experts in our network and data scientists from our companies, such as Integrate.ai, Ritual, and Tractable.

We structure our sessions with 50% theory and 50% hands-on.

Just in the last year, we’ve had sessions from AI teams including HuggingFace, Layer6.ai, and Google AI.

About Delta Engine and MLflow

ABOUT DELTA ENGINE

Lakehouse is a modern data management paradigm that enables companies to get the most out of their data by leveraging the best of data lakes and data warehouses. Delta Lake serves as the foundation of Lakehouse, acting as an open source storage layer that brings reliability to your data lakes.

Delta Lake provides ACID transactions, scalable metadata handling, a transaction log, time travel, and compaction. These features are crucial for achieving the Lakehouse vision, which consists of both near real-time analytics and large batch processing.
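To make the transaction log and time travel ideas above more concrete, here is a minimal toy sketch in plain Python. It is not the Delta Lake API: the class `ToyDeltaTable`, its methods, and the file names are hypothetical and exist only for illustration. It assumes a simplified model in which every commit appends one entry to an ordered log, and any past version of the table can be rebuilt by replaying the log up to that entry.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class ToyDeltaTable:
    """Toy in-memory model of a table backed by an append-only commit log.

    Hypothetical illustration only; this is not how to call Delta Lake.
    """

    # Each log entry records which data files a commit added or removed.
    log: List[dict] = field(default_factory=list)

    def commit(self, add: Iterable[str] = (), remove: Iterable[str] = ()) -> int:
        """Append one commit as a single log entry and return its version."""
        self.log.append({"add": list(add), "remove": list(remove)})
        return len(self.log) - 1

    def snapshot(self, version: Optional[int] = None) -> List[str]:
        """Replay the log up to `version` (latest if None) to list live files."""
        if version is None:
            version = len(self.log) - 1
        if version < -1 or version >= len(self.log):
            raise ValueError(f"unknown version: {version}")
        live = set()
        for entry in self.log[: version + 1]:
            live.update(entry["add"])
            live.difference_update(entry["remove"])
        return sorted(live)


table = ToyDeltaTable()
# Version 0: the initial write produces two small files.
v0 = table.commit(add=["part-0.parquet", "part-1.parquet"])
# Version 1: a compaction-style commit swaps them for one larger file.
v1 = table.commit(
    add=["part-2.parquet"],
    remove=["part-0.parquet", "part-1.parquet"],
)

latest_files = table.snapshot()     # current state of the table
original_files = table.snapshot(0)  # "time travel" back to version 0
```

Because the log is only ever appended to, older versions stay readable after a compaction-style commit. That is the intuition behind time travel in this simplified model; the real system adds durable storage, concurrency control, and metadata handling that this sketch leaves out.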

ABOUT MLFLOW

The Databricks platform enables customers to build machine learning models on a single source of truth. Managed MLflow on Databricks is a fully managed version of MLflow providing practitioners with reproducibility and experiment management across Databricks Notebooks, Jobs, and data stores, with the reliability, security, and scalability of the Unified Data Analytics Platform.

For example, Tealium migrated its large-scale Parquet pipelines to Delta and successfully promoted its ML workloads to production using Databricks. Tealium also uses Databricks hosted MLflow as the backbone of its Fraud Detection product, training and serving models in real time with the built-in capabilities. This has freed up its ML scientists to do more with less overhead.

Arteria, another Databricks customer, has also embraced Databricks and MLflow as its primary method of training, monitoring, and serving product-grade machine learning pipelines. Not only has it successfully migrated multiple models over to Databricks, but it has also established best practices for MLOps, including how to properly set up governance, tracking, and versioning for its models.

Follow the Georgian Impact Blog.

For our latest insights on machine learning & AI research from the R&D team at Georgian, follow the Georgian Impact Blog on Medium.