Earlier this week, I had the pleasure of keynoting and hosting the MMA’s CMO AI Transformation Summit presented by Meta and Bain. Big thanks to Greg Stuart, his outstanding MMA team, and our presenting sponsors. Special thanks to Mark Marshall and his extraordinary team at NBCU.
During the Summit, one key topic was the pressing need for ways to control and moderate LLM output, so Mistral’s latest announcement is timely. The company has launched a new moderation API, powered by its fine-tuned Ministral 8B model, designed to classify content across nine categories, including violence, self-harm, and personally identifiable information. The tool, which underpins Mistral’s Le Chat chatbot platform, processes both raw and conversational text in multiple languages, offering customizable moderation capabilities that align with varying safety standards.
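To make the API concrete, here is a minimal sketch of what a raw-text moderation call might look like in Python. The endpoint path, model alias, request fields, and response shape below follow Mistral's documentation as I understand it at announcement time; treat them as assumptions and check the current API reference before building on this.

```python
import os

import requests

# A minimal sketch of a raw-text moderation call. The endpoint path, the
# "mistral-moderation-latest" model alias, the "input" field, and the
# response shape follow Mistral's docs as of the announcement; verify
# them against the current API reference before relying on this.
API_URL = "https://api.mistral.ai/v1/moderations"
API_KEY = os.environ["MISTRAL_API_KEY"]  # assumes your key is set in the environment

texts = ["I want to hurt myself.", "What's the weather like in Paris?"]

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"model": "mistral-moderation-latest", "input": texts},
    timeout=30,
)
resp.raise_for_status()

# Each result carries per-category boolean flags plus confidence scores
# (categories include violence_and_threats, selfharm, and pii). You choose
# the thresholds and which categories to act on, which is where the
# "customizable" moderation described above comes in.
for text, result in zip(texts, resp.json()["results"]):
    flagged = [cat for cat, hit in result["categories"].items() if hit]
    print(f"{text!r} -> flagged: {flagged or 'none'}")
```

Mistral also documents a conversational variant that accepts role-tagged chat messages rather than raw strings, which is what lets the classifier score a model's reply in the context of the exchange, the "model-generated harm" case the company emphasizes.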
Entering a competitive space, Mistral’s API joins Jigsaw’s Perspective API and OpenAI’s moderation API; each addresses critical AI moderation needs but struggles with bias and linguistic nuance. Jigsaw’s tool is known for toxicity scoring but can misclassify informal language as toxic, while OpenAI’s moderation API is valued for its flexibility across applications. Mistral says its solution “sets itself apart by focusing on model-generated harm reduction and cost efficiency through batch processing, cutting costs by 25% on high-volume requests with asynchronous operations.”
Making LLMs and AI tech stacks “safe for work” is extremely hard. If you’re not familiar with the post-training process that makes it possible, please reach out and I’ll point you to some resources.
Author’s note: This is not a sponsored post. I am the author of this article and it expresses my own opinions. I am not, nor is my company, receiving compensation for it. This work was created with the assistance of various generative AI models.