Stability.AI – Democratizing Access to Machine Learning

Stability.AI, a UK-based startup famous (or notorious?) for releasing the Stable Diffusion image generation model, just raised $100 million at a $1 billion valuation.

Their goal is to “Democratize AI.” They have pursued this by open-sourcing the Stable Diffusion text-to-image model and are working on releasing other models, including large language models.

This approach is in stark contrast to the one taken by OpenAI, Facebook, Google, etc. These companies have gated access to ML models like GPT-3 via APIs or invite-only programs. The reasoning is that these models could be used to generate hateful text and images and are generally too dangerous to be released to the ignorant masses.

In a recent interview, Emad Mostaque, the CEO of Stability.AI and a fascinating thinker, talks about the inevitability of generative and large language models leaking out into the wild. He wants to focus on giving people a framework for the ethical use of AI while giving them the tools to build and train models for their specific uses.

Stability.AI has struck a deal with Eros Interactive to get access to their massive library of Indian content. Can you imagine what could be trained on that data?

Congratulations to Stability.AI. I am curious about what this more open (or perhaps reckless?) approach to ML will bring us.

Image generated by Stable Diffusion – Prompt: “A happy robot drinking champagne at a cocktail party at night, oil painting, muted, candid, high resolution, trending on artstation”

AlphaTensor – Speeding up number crunching with Machine Learning

For some, matrix multiplication may trigger memories of tedious high school algebra exercises. Last week, this humble mathematical operation was also the topic of a significant breakthrough in machine learning. 

Art generated by Stable Diffusion

Background – Matrix Multiplication

Matrix multiplication is the foundation on which many core computational operations are built. Graphics processing, machine learning, computer gaming, etc. – all rely on matrix multiplication. At any given moment, millions of computers are performing (probably) billions of matrix multiplication operations.
Making this humble operation faster would result in significant computational and efficiency gains.

Why do we want faster matrix multiplication?

Multiplying two matrices involves doing a large number of multiplication and addition operations. 
For example, multiplying a 4×5 matrix by a 5×5 matrix involves 100 multiplication operations (4 × 5 × 5) using the traditional matrix multiplication method, which has been around since the early nineteenth century.
In 1969, the mathematician Volker Strassen came up with an ingenious method that reduced the number of multiplications required by about 10%. It was hailed as a groundbreaking discovery in the world of mathematics.
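To make the saving concrete, here is a minimal sketch (plain Python, no libraries) of Strassen's trick for 2×2 matrices: the standard method needs 8 multiplications, while Strassen's clever combination of sums and differences needs only 7.

```python
def naive_2x2(A, B):
    """Standard 2x2 matrix multiplication: 8 scalar multiplications."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def strassen_2x2(A, B):
    """Strassen's 1969 method: only 7 scalar multiplications (p1..p7),
    at the cost of extra additions and subtractions."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    p1 = a * (f - h)
    p2 = (a + b) * h
    p3 = (c + d) * e
    p4 = d * (g - e)
    p5 = (a + d) * (e + h)
    p6 = (b - d) * (g + h)
    p7 = (a - c) * (e + f)
    return [[p5 + p4 - p2 + p6, p1 + p2],
            [p3 + p4, p1 + p5 - p3 - p7]]

# Both methods agree:
# strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]])  -> [[19, 22], [43, 50]]
```

Applied recursively to large matrices split into 2×2 blocks, that one saved multiplication compounds, which is why a seemingly small reduction matters so much at scale.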

DeepMind Enters the Arena

This brings us to DeepMind’s paper last week, where they used the AlphaTensor deep learning model to discover new matrix multiplication algorithms that are about 10–20% faster than Strassen’s method.
This is a *colossal deal*!
We are seeing a machine learning model discover new algorithms that solve material, real-world problems. We have already seen DeepMind make groundbreaking discoveries in computational biology with AlphaFold. Now we are seeing its game-playing deep learning techniques applied to foundational aspects of the modern world.
Exciting times are ahead!

TikTok – Succeeding with ML (and lots of cash)

TikTok has caused political controversies, made Meta change its Instagram platform to mimic it, and caused many a moral panic. All signs of success.

TikTok’s use of machine learning to present a never-ending stream of engaging content is an example of the successful application of machine learning at a gargantuan scale. 

But, as the linked WSJ article shows, TikTok’s growth is driven by massive investments in technology and advertising. 

  • ByteDance, which owns TikTok, lost more than $7 billion from its operations in 2021 on $61.4 billion in revenues
  • The company spent $27.4 billion on user acquisition and $14.6 billion on R&D

I believe that the value of applied machine learning technologies will accrue to those companies that can deploy vast resources to acquire data (in TikTok’s case – users who generate the data) and build massive data and ML infrastructure. I am sure we will see similar revenue and spending trends if we analyze Meta and Google’s results.

While Data Science and Machine Learning careers grab the limelight, careers focused on making ML platforms more efficient and data processing much cheaper may prove more lucrative in the long term.

If a company spends significant cash on ML and data infrastructure, it will always look for people to make things more efficient. Possible careers for the future:

  • Data Engineering
  • Data center operation and efficiency engineering
  • The broad “ML Operations” category

Natural Language Processing Made Easy with GPT-3

Natural Language Processing, or NLP, is a catch-all term for making sense of unstructured text data. Google search recommendations, chatbots, and grammar checkers are all forms of NLP.
This field has decades of research behind it, but for the last 5–7 years, machine learning approaches have reigned supreme.

Five years ago, machine learning approaches to NLP were labor intensive. Success meant having access to large amounts of clean and labeled training data that would train ML models. A text summarization model would be pretty different from one that did sentiment analysis. 

The development of large language models (LLMs) has revolutionized this field. Models like GPT-3 are general-purpose tools that can be applied to several different tasks with very little training.

To show GPT-3 in action, I built a tiny Slack bot that asks some questions and uses GPT-3 to generate actions. The video below is a demo of the bot and an explanation of how to prompt GPT-3 to do NLP tasks.
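The bot's actual code is shown in the video; as a rough sketch of the prompting technique it relies on (the example messages, prompt wording, and commented-out API call below are my own assumptions, not the bot's code), a few-shot sentiment classifier might build its prompt like this:

```python
def build_sentiment_prompt(text):
    """Few-shot prompting: show GPT-3 a couple of labeled examples,
    then the new input. The model continues the pattern, so its
    completion should be the label for the new message."""
    examples = [
        ("The onboarding flow was smooth and fast.", "positive"),
        ("The app crashes every time I open it.", "negative"),
    ]
    lines = ["Classify the sentiment of each message as positive or negative.", ""]
    for msg, label in examples:
        lines.append(f"Message: {msg}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Message: {text}")
    lines.append("Sentiment:")
    return "\n".join(lines)

# Sending the prompt requires an OpenAI API key; details may differ:
# import openai
# resp = openai.Completion.create(model="text-davinci-002",
#                                 prompt=build_sentiment_prompt("Love this bot!"),
#                                 max_tokens=1, temperature=0)
# print(resp.choices[0].text.strip())
```

The key idea is that no task-specific training happens at all: the same model handles sentiment analysis, summarization, or extraction depending purely on how the prompt is written.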

Machine Learning and its consequences

Machine Learning has brought huge benefits in many domains and generated hundreds of billions of dollars in revenue. However, the second-order consequences of machine learning-based approaches can lead to potentially devastating outcomes. 

This article by Kashmir Hill in the New York Times is exceptional reporting on a very sensitive topic: the identification of child sexual abuse material (CSAM).

As the parent of two young children in the COVID age, I rely on telehealth services and friends who are medical professionals to help with anxiety-provoking (yet often trivial) medical situations. I often send photos of weird rashes or bug bites to determine whether they are something to worry about.

In the article, a parent took a photo of their child to send to a medical professional. This photo was uploaded to Google Photos, where it was flagged as being potentially abusive material by a machine learning algorithm. 

Google ended up suspending and permanently deleting the parent's Gmail account and Google Fi phone service, and flagged the account to law enforcement.

Just imagine how you might cope with losing your primary email account, your phone number, and your authenticator app all at once.

Finding and reporting abuse is critical. But, as the article illustrates, ML-based approaches often lack context: a photo shared with a medical professional may share features with one depicting abuse.

Before we delegate more and more of our day-to-day lives and decisions to machine learning-based algorithms, we may want to consider the consequences of removing humans from the loop.