Greetings from Malaga, Spain. I’m here to keynote about AI and emerging tech at the Digital Enterprise Show tomorrow. You’d think that it would be okay to miss one day of tech news while changing time zones, but no. There’s something new every single day.
I read a white paper on the plane about a new small-scale language model approach called SimPLE (Simple Pseudo-Label Editing) that outperforms counterparts up to 500 times its size on certain language understanding tests. SimPLE uses self-training, learning from its own predictions, which eliminates the need for additional annotated training data. It was developed by researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL).
The model’s performance enhancement is attributed to the use of “textual entailment,” a directional relationship between two statements: if the first (the premise) is true, the second (the hypothesis) must also be true. This training approach has improved the model’s comprehension and adaptability across various tasks.
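To make the entailment idea concrete, here is a minimal sketch of how a classification task can be recast as entailment: each candidate label becomes a hypothesis, and the model only has to judge whether the input text (the premise) entails it. The function name and template below are illustrative, not from the SimPLE paper.

```python
# Hypothetical sketch: recasting classification as textual entailment.
# Instead of predicting a label directly, each candidate label is turned
# into a hypothesis sentence; an entailment model then scores whether the
# input text (premise) entails that hypothesis.

def build_entailment_pairs(text, labels, template="This example is {}."):
    """Turn one classification example into (premise, hypothesis) pairs."""
    return [(text, template.format(label)) for label in labels]

pairs = build_entailment_pairs(
    "The movie was a delight from start to finish.",
    ["positive", "negative"],
)
for premise, hypothesis in pairs:
    print(f"premise: {premise!r} -> hypothesis: {hypothesis!r}")
```

Because only the task-defining template changes between tasks, one entailment model can serve many downstream classification problems.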
SimPLE also addresses privacy concerns associated with large language models. It requires only a task-defining template from annotators, avoiding direct handling of sensitive data. This method, combined with uncertainty estimates and voting strategies, produces more robust and accurate predictions.
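The voting-plus-uncertainty idea can be sketched roughly as follows. This is my own illustrative toy, not the researchers’ code; the function name, data shapes, and thresholds are all assumptions. The gist: a pseudo-label is kept for further self-training only when multiple prediction runs agree on it and the winning votes are low-uncertainty.

```python
from collections import Counter

# Hypothetical sketch of pseudo-label selection via voting plus an
# uncertainty cutoff, in the spirit of SimPLE (not the authors' code).

def select_pseudo_labels(predictions, min_agreement=0.75, max_uncertainty=0.3):
    """predictions: {example_id: [(label, uncertainty), ...]} across runs.
    Keep an example only when one label wins a clear majority of votes and
    the average uncertainty of those winning votes is low."""
    kept = {}
    for ex_id, votes in predictions.items():
        counts = Counter(label for label, _ in votes)
        label, n = counts.most_common(1)[0]
        agreement = n / len(votes)
        avg_unc = sum(u for lbl, u in votes if lbl == label) / n
        if agreement >= min_agreement and avg_unc <= max_uncertainty:
            kept[ex_id] = label
    return kept

preds = {
    "ex1": [("entail", 0.1), ("entail", 0.2), ("entail", 0.1), ("contradict", 0.4)],
    "ex2": [("entail", 0.1), ("contradict", 0.1), ("entail", 0.5), ("contradict", 0.2)],
}
print(select_pseudo_labels(preds))  # ex1 kept; ex2 dropped (no clear majority)
```

Filtering pseudo-labels this way is what keeps self-training from amplifying the model’s own mistakes.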
Why do you care? Right now, everyone is focused on GPT-4 and other large language models (LLMs) that require mountains of data, have inherent privacy risks when used with sensitive data, and are expensive to train. It’s easy to imagine models like SimPLE disrupting the disruptors.
If you want to go deeper into the world of generative AI and get a better understanding of how LLMs are trained, please sign up for our free online course, Generative AI for Execs.
Author’s note: This is not a sponsored post. I am the author of this article and it expresses my own opinions. I am not, nor is my company, receiving compensation for it.