A trio of researchers from the Google Brain team recently unveiled the next big thing in AI language models: a massive transformer system with a trillion parameters.
The next largest model on the market, as far as we know, is OpenAI’s GPT-3, which only uses 175 billion parameters.
Background: Language models can perform a variety of functions, but perhaps the most popular is the generation of novel text. For example, you can talk to a “Philosopher AI” language model that tries to answer any question you ask it (with a number of notable exceptions).
While these incredible AI models sit at the cutting edge of machine learning technology, it’s important to remember that they are essentially just performing parlor tricks. These systems don’t understand language; they are just tuned to look like they do.
This is where the number of parameters comes into play – the more virtual knobs and dials you can turn and adjust to achieve the outputs you want, the more control you have over that output.
What Google did: Put simply, the Brain team found a way to make the model itself as simple as possible while throwing as much raw computing power at it as possible to accommodate the increased number of parameters. In other words, Google has a lot of money, so it can afford to use as much hardware compute as the AI model can possibly harness.
In the team’s own words:
Switch Transformers are scalable and effective natural language learners. We simplify Mixture of Experts to produce an architecture that is simple to understand, stable to train, and vastly more sample-efficient than equivalently sized dense models. We find that these models excel across a diverse set of natural language tasks and in different training regimes, including pre-training, fine-tuning, and multi-task training. These advances make it possible to train models with hundreds of billions to a trillion parameters that achieve substantial speedups relative to dense T5 baselines.
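To make the "simplified Mixture of Experts" idea a little more concrete, here is a minimal sketch of top-1 routing, the general technique the quote describes: a small router scores every expert for a given token, and only the single best-scoring expert actually runs. Everything in it is an illustrative assumption rather than Google's code: the names (`switch_layer`, `make_expert`, `count_params`), the toy sizes, and the random weights are hypothetical, and training details such as load balancing are left out.

```python
import math
import random


def softmax(logits):
    """Turn raw router scores into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def make_expert(rng, d_model, d_hidden):
    """A tiny feed-forward expert: two weight matrices (toy sizes, assumed)."""
    w_in = [[rng.gauss(0, 0.1) for _ in range(d_model)] for _ in range(d_hidden)]
    w_out = [[rng.gauss(0, 0.1) for _ in range(d_hidden)] for _ in range(d_model)]
    return w_in, w_out


def run_expert(expert, token):
    w_in, w_out = expert
    hidden = [max(0.0, h) for h in matvec(w_in, token)]  # ReLU activation
    return matvec(w_out, hidden)


def switch_layer(token, router, experts):
    """Send one token to its single highest-scoring expert (top-1 routing)."""
    probs = softmax(matvec(router, token))
    chosen = max(range(len(experts)), key=lambda i: probs[i])
    out = run_expert(experts[chosen], token)
    # Scale the expert output by the router probability for the chosen expert.
    return chosen, [probs[chosen] * y for y in out]


def count_params(router, experts):
    """Count every weight in the router plus the given experts."""
    n_router = sum(len(row) for row in router)
    n_experts = sum(len(row) for expert in experts for mat in expert for row in mat)
    return n_router + n_experts


rng = random.Random(0)
D_MODEL, D_HIDDEN, N_EXPERTS = 4, 8, 16  # hypothetical toy dimensions
router = [[rng.gauss(0, 1) for _ in range(D_MODEL)] for _ in range(N_EXPERTS)]
experts = [make_expert(rng, D_MODEL, D_HIDDEN) for _ in range(N_EXPERTS)]
token = [rng.gauss(0, 1) for _ in range(D_MODEL)]

chosen, output = switch_layer(token, router, experts)
total_params = count_params(router, experts)
active_params = count_params(router, [experts[chosen]])
print(f"expert {chosen} handled the token: "
      f"{active_params} of {total_params} parameters were used")
```

In this toy setup the layer holds 1,088 parameters but touches only 128 of them for any one token. Adding more experts raises the first number without raising the second, which is roughly how a sparse model can reach a very large parameter count without a matching increase in compute per token.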
Quick take: It’s unclear exactly what this means or what Google plans to do with the techniques described in the pre-print paper. There’s more to this model than just one-upping OpenAI, but exactly how Google or its customers might use the new system is a bit murky.
The big idea is that enough brute force will lead to better techniques for using compute, which will in turn make it possible to do more with less. The current reality, however, is that these systems do not justify their existence when compared with greener, more useful technologies. It’s hard to make the case for an AI system that can only be run by trillion-dollar tech companies willing to ignore its massive carbon footprint.
Context: Google has pushed the limits of what AI can do for years, and this is no different. Taken by itself, the achievement looks like the logical progression of what has been happening in the field. But the timing is a bit suspicious.
FYI @mmitchell_ai and I found out that there was a 40-person meeting about LLMs at Google in September which no one from our team was invited to or knew about. So they only want ethical AI to be a rubber stamp after they decide what they want to do in their playground. https://t.co/tlT0tj1sTt
– Timnit Gebru (@timnitGebru) January 13, 2021
H/t: VentureBeat
Published on January 13, 2021 – 17:08 UTC