A Discussion on ML and AI Opportunities and Gaps at CoLab Day 2021
At our recent CoLab Day event, Parinaz Sobhani, Georgian’s Head of Applied Research, hosted a session with special guest Zahi Karam, Vice President of Data Science at Bluecore, and two Georgian Applied Research Scientists, Azin Asgarian and Akshay Budhkar. The panel shared insights to help CoLab companies tackle their machine learning and artificial intelligence challenges and identify new opportunities.
Where to Start Your AI or ML Journey
We began the session by asking where a company should start its AI or ML journey.
My advice is don’t start with data science.
For companies that aren’t AI-first but want to use AI and ML to improve their customer experience and differentiate themselves, one lesson from Bluecore is to not start with data science. In many cases, companies can create substantial value with just heuristics (rule-based techniques), no machine learning required. That way, once the business is ready to start building out a data science team, it has already created the data foundation and can build on top of it.
In the beginning, the main priority is the data pipeline and capturing as much data as possible. Focus more on how you structure the data and how you collect it because you don’t yet know what data is going to be most useful.
When you start building your first model, start with the simplest thing that you can package in a proof-of-concept with one or two clients before you invest in building out the full model.
A data science product is really just a data product.
The data has to be structured well enough that you can take action on it. On the product side, it has to be packaged in a way that makes it easy for your initial users to test. Once the data and product sides have been established, you can add the intelligence, or data science, aspect to augment them.
How to Iterate or Fail Faster
Many teams building early models don’t focus on iterating quickly. Zahi shared that at Bluecore, they spent six months building a model based on an exciting vision, but when they released it, they didn’t get the adoption they expected.
The panel recommends that you start by proving out your hypothesis with a simple model built on the data you have. Get something out there and instrument it well enough to measure performance and iterate. Measuring performance is the key.
In terms of sequencing, Akshay recommends assessing the feasibility of the different options on your product docket, then ranking them by value to the customer and by effort, so that high-value, low-effort options are tackled first.
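One way to make this kind of sequencing concrete is to score each option and sort by the ratio of value to effort. The sketch below is only an illustration: the option names, the 1-to-5 scoring scale, and the `rank_options` helper are assumptions made for this example, not something the panel prescribed.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    value: float   # estimated value to the customer, 1 (low) to 5 (high)
    effort: float  # estimated effort to build, 1 (low) to 5 (high)


def rank_options(options):
    """Sort options so that high-value, low-effort items come first."""
    return sorted(options, key=lambda o: o.value / o.effort, reverse=True)


# Hypothetical backlog with made-up scores, for illustration only.
backlog = [
    Option("churn prediction model", value=4, effort=4),
    Option("rule-based product recommendations", value=3, effort=1),
    Option("send-time optimization", value=5, effort=4),
]

ranked = rank_options(backlog)
for option in ranked:
    print(f"{option.name}: value/effort = {option.value / option.effort:.2f}")
```

A single ratio is a blunt instrument, and the scores themselves are judgment calls, but writing them down makes the trade-offs easier to discuss with Product.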
Two lines of code with the easiest statistical models would get you to a very strong baseline, which will add a lot of value to your customer.
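To give a sense of how small such a baseline can be, here is a minimal sketch that predicts the historical average and measures its error on held-out data. The weekly order counts are invented for illustration, and the `mean_absolute_error` helper is an assumption of this example; the point is only that any later model should have to beat a number like this.

```python
from statistics import fmean


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return fmean([abs(a - p) for a, p in zip(actual, predicted)])


# Hypothetical weekly order counts, split into history and held-out weeks.
train_orders = [12, 15, 11, 14, 13]
test_orders = [10, 16, 13]

# The baseline itself: always predict the historical average.
baseline_prediction = fmean(train_orders)
baseline_mae = mean_absolute_error(
    test_orders, [baseline_prediction] * len(test_orders)
)

print(f"baseline prediction: {baseline_prediction}")
print(f"baseline mean absolute error: {baseline_mae}")
```

If a more sophisticated model cannot clearly improve on this error, the extra complexity is probably not yet worth shipping.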
Creating a Healthy Culture for Data Science Within the Broader Organization
What does a healthy, mature data science culture look like once it’s part of the broader product and engineering organization? How do you transition to create and foster that broader culture?
Creating close relationships between the Data and Engineering teams and the Product team, and then between the Product team and the client, enables all the teams to align based on an understanding of the pain point and not on what’s cool about the data. Without that cross-functional alignment, the Data team might try to hack their way into engineering processes and the product to serve up a model, which can lead to quick deliveries, but can also create a lot of technical debt and tension between the teams.
A good relationship with Product also helps ensure a good relationship with Design. This helps encourage adoption with your client, which in turn creates more organizational support.
When you start, having engineers work closely alongside data scientists makes it easier to have the conversation about how to interface with the product in a healthy, extensible way. Then, as you evolve, identify the repeated questions that you need to solve for. Knowing these questions allows the engineering team to build the interface into the product once, which then lets data scientists iterate on and swap in new models without explicit engineering support.
Azin added that because data science inherently has a high degree of uncertainty, ensuring that your team feels safe enough to explore, try new things, make mistakes and fail will allow them to be more creative and open-minded, ultimately leading to better problem solving and innovation.
Hiring Your First Data Scientist
When hiring a first data scientist, if you don’t have a Product team with a sense of ML and AI, then you probably want a data scientist who can think a little bit like a PM as well, because they’re going to have to play that role on the team to some extent. Your first data scientist can’t just be somebody who is deep in the data. They must be able to speak to the business and understand what product can be built to solve the pain point that the client is describing (or that the business is describing, if it’s an internal-facing data science team).
As a startup grows, one mistake a business might make is applying the “jack-of-all-trades” approach and hiring a developer or even a Head of Engineering to manage the data. It’s important to ensure you’ve got the right people, with the right skills, collecting the right data and performing the right analysis on that data.
At first, the data scientist will be focused on understanding the data, visualizing it, and making it accessible to the business and to your clients. As the practice matures, the team will tackle higher-level problems and create products that address a broader range of the client base. That might mean that you need to shelter the product-based data science team from some of the noise of one-off requests, interruptions and diversions so that it can make real progress.
Building Business Cases Around Data Products
Parinaz notes that at Georgian, we don’t charge our portfolio companies when we work with them, but we ensure that we’re working on high-value, high-impact opportunities for them. To do this, we help them build the ROI model and estimate the business value of the opportunity. However, many companies will push back saying that for data science, machine learning, or data products, it’s much harder to build ROI cases than for other, more traditional products. What does the panel think?
Zahi suggests that it depends on how well you’re instrumented and on what your business and pricing models are.
The instrumentation is key because without it, you can’t measure the lift that data science brought in and the value that adding intelligence to the data contributes.
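As a rough illustration of what measuring lift can look like once events are instrumented, the sketch below compares conversion rates between a group that received a model-driven experience and a holdout group that did not. The group sizes, conversion counts, and helper names are all hypothetical, and a real analysis would also need a significance test before drawing conclusions.

```python
def conversion_rate(conversions, users):
    """Fraction of users in a group who converted."""
    if users <= 0:
        raise ValueError("users must be positive")
    return conversions / users


def relative_lift(treatment_rate, control_rate):
    """Relative improvement of the treatment group over the control group."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return (treatment_rate - control_rate) / control_rate


# Hypothetical instrumented counts from a holdout experiment.
control_rate = conversion_rate(conversions=50, users=1000)    # no model
treatment_rate = conversion_rate(conversions=60, users=1000)  # with model

lift = relative_lift(treatment_rate, control_rate)
print(f"control: {control_rate:.1%}, treatment: {treatment_rate:.1%}")
print(f"relative lift: {lift:.1%}")
```

Without a holdout group and reliable event counts, neither rate can be computed, which is why the instrumentation has to come before the ROI case.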
In terms of the ROI, it depends on whether you sell each model as a separate engagement, whether the model is at the core of your offering, or whether it is a cost-reduction automation tool.