Research
Innovating with the latest technology.
Solving business challenges with machine learning and trustworthy AI.
We take the latest technology and research trends and turn them into business solutions.
Currently, we’re researching areas in advanced machine learning like transfer learning, NLP representation learning and AutoML.
We’re also focused on technologies around trustworthy AI, including differential privacy, fairness and bias detection, and explainability.
Focused on delivering competitive advantage.
When choosing where to focus, we look at four criteria: ROI, scalability, speed and differentiation.
Tangible ROI
Can this technology solve critical business problems?
Scalability
Can this technology be applied to multiple growth-stage problems?
Speed
Can we quickly create value by building expertise and applying it to multiple problems?
Differentiation
Will this technology create sustainable differentiation for our companies?
Read more about our research areas.
Advanced Machine Learning
Improve model performance and accelerate time to value
Transfer learning is the computer science equivalent of using a skill or knowledge you’ve picked up in the past and applying it to a new situation.
For example, consider a person who knows how to play tennis. This person can take many of the same skills and knowledge learned from tennis and apply them to a new racquet sport, like squash. This person will likely learn squash much faster than someone who is starting from scratch.
Transfer learning follows this logic, allowing you to leverage relevant existing information from past customers when building a machine learning model for a new customer. Take the example above – you might use the hand-eye coordination and the forehand/backhand strokes from tennis and apply them to squash. But court positioning? Not so much.
Transfer learning can significantly reduce the amount of time it takes to train the new model, easing the pain of the cold start problem. The model may also perform better because it uses existing knowledge as a starting point and improves from there.
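The warm-start idea behind transfer learning can be sketched in a few lines. Everything below is a toy stand-in, not our actual stack: a tiny hand-rolled logistic regression, with the "past customer" and "new customer" data generated from similar decision boundaries.

```python
import math, random

def train_logreg(X, y, w=None, epochs=200, lr=0.5):
    """Gradient-descent logistic regression. Passing `w` warm-starts
    training from existing weights -- the transfer-learning step."""
    w = list(w) if w is not None else [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = 1 / (1 + math.exp(-sum(wj * xj for wj, xj in zip(w, xi))))
            for j, xj in enumerate(xi):
                grad[j] += (p - yi) * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def logloss(w, X, y):
    """Average cross-entropy loss of weights `w` on a dataset."""
    loss = 0.0
    for xi, yi in zip(X, y):
        p = 1 / (1 + math.exp(-sum(wj * xj for wj, xj in zip(w, xi))))
        p = min(max(p, 1e-12), 1 - 1e-12)
        loss -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return loss / len(X)

random.seed(0)
# Source task: plenty of labelled data from past customers.
X_src = [[random.gauss(0, 1), random.gauss(0, 1), 1.0] for _ in range(200)]
y_src = [1 if x[0] + x[1] > 0 else 0 for x in X_src]
w_src = train_logreg(X_src, y_src)

# New customer: a related decision boundary, far less data and training time.
X_new = [[random.gauss(0, 1), random.gauss(0, 1), 1.0] for _ in range(20)]
y_new = [1 if x[0] + 0.9 * x[1] > 0 else 0 for x in X_new]
w_cold = train_logreg(X_new, y_new, epochs=5)           # from scratch
w_warm = train_logreg(X_new, y_new, w=w_src, epochs=5)  # transferred
```

Given the same five epochs, the warm-started model starts from useful knowledge and typically reaches a lower loss on the new customer's data than the cold-started one.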
Learn more about transfer learning

Improve language-based model performance and time to value
Natural Language Processing (NLP) is a field in machine learning concerned with the ability of a computer to understand, analyze, manipulate and generate human language. It allows you to unlock the value of your existing unstructured text data.
Representation learning is a family of approaches that automatically discover the representations (i.e., the feature engineering) needed for downstream NLP tasks. That means taking what we already know about language and encoding it to help models learn.
NLP representation learning models various linguistic characteristics in your data, so that your models can approach problems with prior knowledge, making them more effective at solving the task at hand.
This means you can get to the same level of performance by using much smaller datasets and leveraging pre-trained embeddings and models.
This can then dramatically improve the performance of tasks such as:
- Text and document classification.
- Sentiment analysis and opinion mining.
- Entity extraction.
- Dialogue systems.
- Text and music generation.
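The core idea of using pre-trained representations can be sketched with a toy example. The word vectors below are hand-made stand-ins; a real system would load pre-trained embeddings such as word2vec, GloVe or BERT outputs.

```python
# Hypothetical 3-dimensional "pre-trained" word embeddings. In this toy
# vocabulary the first axis happens to encode sentiment polarity.
EMBEDDINGS = {
    "great": [0.9, 0.1, 0.0], "love": [0.8, 0.2, 0.1],
    "awful": [-0.9, 0.0, 0.1], "hate": [-0.8, -0.1, 0.2],
    "movie": [0.0, 0.7, 0.3], "film": [0.0, 0.8, 0.2],
}

def embed(text):
    """Represent a document as the mean of its known word vectors."""
    vecs = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def sentiment(text):
    """Toy downstream task: classify polarity from the first embedding axis."""
    return "positive" if embed(text)[0] > 0 else "negative"
```

Because the embeddings already encode prior knowledge about language, the downstream classifier here needs no training data at all; in practice, pre-trained representations let you reach a given accuracy with far fewer labelled examples.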
Learn more about NLP representation learning
Automate manual tasks in the machine learning process
AutoML allows you to automate some manual tasks in the machine learning process to speed up the development of AI solutions.
This can increase the capacity of small ML teams, by minimizing the time to explore new opportunities and enabling them to focus on high-value activities.
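One building block of AutoML, automated configuration search, can be sketched as a loop over candidate settings scored on validation data. The search space and the toy threshold "model" below are hypothetical; real AutoML systems also automate data cleaning and feature engineering.

```python
import itertools

def evaluate(params, X_val, y_val):
    """Score one candidate configuration (here, a toy threshold rule)."""
    feature, threshold = params
    preds = [1 if x[feature] > threshold else 0 for x in X_val]
    return sum(p == y for p, y in zip(preds, y_val)) / len(y_val)

def automl_search(space, X_val, y_val):
    """Try every configuration in the search space and keep the best one --
    the loop a data scientist would otherwise run by hand."""
    candidates = itertools.product(*space.values())
    return max(candidates, key=lambda c: evaluate(c, X_val, y_val))

X = [[0.2, 5.0], [0.9, 1.0], [0.8, 2.0], [0.1, 4.0]]
y = [0, 1, 1, 0]
space = {"feature": [0, 1], "threshold": [0.5, 3.0]}
best = automl_search(space, X, y)
```

Exhaustive search works only for tiny spaces; production tools use smarter strategies such as Bayesian optimization, but the goal is the same: spend machine time instead of scientist time.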
Learn more about AutoML
Solve complex problems with scarce resources
Constrained optimization is a process that uses applied mathematics to tell you exactly what the optimal decision is when resources are limited.
In constrained optimization, there are usually three components:
- Variables (e.g., restaurants to recommend)
- Constraints (e.g., the number of people each restaurant can serve at lunch)
- An objective (e.g., how to maximize user satisfaction with recommendations)
By transforming each component into mathematical functions, you can build a model to answer the question: how do I set my variables so I can maximize my objective while staying within my constraints? How do I recommend the restaurants that will maximize user satisfaction while not overwhelming each restaurant?
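At toy scale, the restaurant question can be answered by brute force. The satisfaction scores and capacities below are made up, and real problems use dedicated solvers (e.g., linear or integer programming) rather than enumeration, but the three components are the same:

```python
import itertools

# Hypothetical satisfaction scores: sat[user][restaurant]
sat = [[9, 4], [8, 7], [3, 8], [2, 9]]
capacity = [2, 2]  # each restaurant can serve two people at lunch

def best_assignment(sat, capacity):
    """Variables: each user's restaurant choice. Constraint: capacity.
    Objective: total user satisfaction. Enumerate and keep the best."""
    n_rest = len(capacity)
    best, best_score = None, -1
    for choice in itertools.product(range(n_rest), repeat=len(sat)):
        counts = [choice.count(r) for r in range(n_rest)]
        if any(c > cap for c, cap in zip(counts, capacity)):
            continue  # violates a capacity constraint
        score = sum(sat[u][r] for u, r in enumerate(choice))
        if score > best_score:
            best, best_score = choice, score
    return best, best_score

assignment, total = best_assignment(sat, capacity)
```

Here the optimum sends the first two users to restaurant 0 and the last two to restaurant 1, maximizing satisfaction without overwhelming either restaurant.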
In this way, constrained optimization gives you a formal way to allocate resources in a complex and dynamic setting. Using the outputs from this process, you can confidently automate your decision making, knowing you have found the optimal allocation.
Learn more about constrained optimization
TRUSTWORTHY AI
Protect user data and maintain compliance
Machine learning scientists often use private data to train a model. Once in production, there is a risk that attackers could query the model enough times to identify personal information.
You can minimize this risk using a technique called differential privacy. This concept involves injecting noise—or random data—into your original dataset, so it’s harder to tell what is real data and what is noise. How much random data do you add? Enough that it’s virtually impossible to identify specific individuals with full certainty, but that the data is still useful from a statistical standpoint.
Using differential privacy gives you a dial that you can turn up if you’re particularly sensitive to private data loss. This is a great way to ensure that your AI product is measurably private and trustworthy.
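A classic instance of this idea is the Laplace mechanism for counting queries; the sketch below (with made-up data) shows the dial directly, since the noise scale is 1/epsilon and lowering epsilon adds more noise:

```python
import math, random

def laplace_noise(scale):
    """Sample from a Laplace(0, scale) distribution via inverse CDF."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def private_count(values, predicate, epsilon):
    """Laplace mechanism: a counting query has sensitivity 1 (one person
    changes it by at most 1), so adding Laplace(1/epsilon) noise makes
    the released count epsilon-differentially private."""
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon)

ages = [23, 35, 45, 52, 61, 19, 44]            # hypothetical user data
noisy = private_count(ages, lambda a: a >= 40, epsilon=1.0)
```

Each release is a little wrong, so no individual can be pinned down with certainty, but across many queries the answers stay statistically close to the truth.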
Learn more about differential privacy

Detect and remove bias
While technology has the power to greatly improve people’s lives, it can also reinforce existing societal biases and unintentionally create new ones. Fairness and objectivity in AI only exist if data and models are free of bias. If your machine learning model is trained on biased data sets, your product or service will perpetuate unfairness and discrimination.
To build fair machine learning systems, the first step is detecting potential biases and their roots. Tools such as FairTest and Google's What-If Tool can help identify unwanted associations between model predictions and sensitive attributes. The next step is mitigating bias with a thorough plan and transparent communication. Simply removing sensitive attributes from a model does not solve the fairness problem, because the remaining attributes can be correlated with the sensitive ones, and removing attributes can sacrifice the model's predictive power.
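One of the simplest detection checks is comparing positive-prediction rates across groups (demographic parity). The sketch below uses made-up predictions and group labels; dedicated tools probe many more fairness criteria than this single gap.

```python
def demographic_parity_gap(predictions, groups):
    """Compare positive-prediction rates across groups; a large gap is a
    red flag that the model treats one group differently."""
    by_group = {}
    for pred, group in zip(predictions, groups):
        by_group.setdefault(group, []).append(pred)
    rates = {g: sum(p) / len(p) for g, p in by_group.items()}
    return max(rates.values()) - min(rates.values()), rates

preds  = [1, 1, 0, 1, 0, 0, 0, 1]            # hypothetical model outputs
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap, rates = demographic_parity_gap(preds, groups)
```

Here group "a" receives positive predictions 75% of the time and group "b" only 25%, a gap worth investigating before the model ships.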
Learn more about fairness and bias detection
Understand how and why machine learning models make decisions
Explainability is a broad term that describes several machine learning techniques that can get inside the black box and explain your outputs.
One explainability technique that we have found to be particularly useful and scalable is the Shapley Additive Explanations (SHAP) technique.
In the SHAP technique, the reason for a specific outcome is broken down into a list of features. When assessing whether or not someone should receive a credit limit increase, relevant features might be historical credit score, existing assets, salary, and so on.
Each of these features is then assigned a SHAP value. The SHAP value tells you how much influence that particular feature had on the outcome achieved. This value can be positive or negative, depending on whether the feature made the outcome more (positive) or less (negative) likely to occur. In our credit approval example, a ‘bad’ credit score would likely have a negative SHAP value in the decision to approve a credit limit increase.
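For small models, Shapley values can be computed exactly by averaging a feature's contribution over every coalition of the other features. The credit model, applicant and baseline below are hypothetical; the SHAP library approximates this computation efficiently for real models.

```python
import itertools, math

def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating every feature coalition.
    'Missing' features are replaced by baseline values (e.g., the
    dataset mean). Exponential in feature count -- toy scale only."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for subset in itertools.combinations(others, size):
                weight = (math.factorial(size) * math.factorial(n - size - 1)
                          / math.factorial(n))
                with_i = [x[j] if j in subset or j == i else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi

# Hypothetical credit score from [credit history, assets, salary].
def credit_model(f):
    return 0.5 * f[0] + 0.3 * f[1] + 0.2 * f[2]

x = [700, 50, 80]          # this applicant
baseline = [650, 40, 60]   # the average applicant
phi = shapley_values(credit_model, x, baseline)
```

The values sum to the difference between this applicant's score and the baseline score, so they account for exactly why this outcome differed from the average one, with sign indicating whether each feature pushed the decision up or down.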
Once you have insights that explain how your model is working, you can use them to investigate any bias before releasing the product. When you’re happy with your model, you can translate your explanations into user-friendly language and find thoughtful ways to insert them into your product experience. Doing this effectively will make your users feel comfortable and help them perform your core actions with confidence.
Learn more about explainability
Our ML software toolkit
Our toolkit is an open-source collection of reusable software assets to scale our companies' R&D capacity to quickly adopt breakthroughs in our applied research areas. Here are some examples of the applications of our toolkit.
PUT ML MODELS INTO PRODUCTION FASTER
Our AutoML tool, Foreshadow, increases the productivity of ML teams by enabling them to build more AI models in less time. Foreshadow provides an end-to-end AutoML experience to automate decisions in data cleaning, feature synthesis/transformation and hyperparameter tuning.
TRAIN HIGH-PERFORMANCE NLP MODELS WITH LIMITED ANNOTATION
Alchemy allows you to reduce the amount of annotation work done by experts to train an NLP model, lowering the costs associated with training these models.
QUICKLY ADAPT THE LATEST NLP MODELS TO YOUR TARGET DOMAIN
Our domain adaptation tool allows ML teams to adapt a generic BERT language model to a target domain and task to improve performance, without the cost and development time of training a model from scratch.
MACHINE LEARNING WITH PRIVACY GUARANTEES
To convince customers to share data or models, you often need to guarantee their information privacy and gain their trust. We built our differentially private machine learning software to address this challenge in collaboration with Bluecore and WorkFusion.
We have since made it available as part of TensorFlow Privacy. With TensorFlow Privacy, you can guarantee your customers' privacy, earn their trust, gain access to more data and ultimately improve your products.
Read research papers written by the team.
Our team collaborates with their peers at our companies to write research papers that have been accepted at the leading global machine learning conferences.