AudioCraft: Meta AI’s Text-to-Sound Model

Illustration created by SDXL 1.0 with the prompt “A chest shot of a young woman looking straight into the camera wearing over-the-ear headphones listening to music – which she is enjoying very much. The background is a bokeh of an exceptionally modern city street”

 

In a stunning display of AI prowess, Meta AI has introduced AudioCraft, a one-stop code base that does an impressive job of composing, arranging, orchestrating, producing, and playing music. AudioCraft is not just an AI that can mimic a melody. It’s a sophisticated system that can create music and sound effects, and it can even handle audio compression.

The system is built on a single autoregressive Language Model (LM) that works with a compressed, discrete representation of music, known as tokens. It’s a simple yet elegant approach: the model predicts one token at a time, each conditioned on the tokens that came before it, which lets it capture the long-term dependencies in the audio and generate high-quality sound.
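
For readers who like to see the idea in code, here is a minimal sketch of what “autoregressive over tokens” means. It is a toy illustration, not AudioCraft’s actual code: the tiny vocabulary, the `next_token_weights` function, and the `generate` helper are all hypothetical stand-ins for a trained model, and a real system would also need a decoder to turn the tokens back into audio.

```python
import random

VOCAB_SIZE = 8  # hypothetical codebook size, chosen small for illustration


def next_token_weights(history):
    """Toy stand-in for a trained language model.

    Returns one unnormalized weight per possible next token,
    favoring tokens close to the most recent one.
    """
    prev = history[-1]
    return [1.0 / (1 + abs(tok - prev)) for tok in range(VOCAB_SIZE)]


def generate(prompt, n_new, seed=0):
    """Autoregressive loop: sample a token, append it, then condition on it."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    tokens = list(prompt)
    for _ in range(n_new):
        weights = next_token_weights(tokens)
        tokens.append(rng.choices(range(VOCAB_SIZE), weights=weights, k=1)[0])
    return tokens


tokens = generate(prompt=[3], n_new=16)
print(tokens)
```

Each new token depends on everything generated so far, which is the property that lets a model of this kind keep a piece of music coherent over time.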

People are going to instantly ask questions about subjective quality, emotional integrity, the magic of music, etc. I will explore these subjects in detail in my upcoming Sunday essay, “AI Can Hum a Tune. But Is It Music or Just Notes?” For today, just visit the AudioCraft site and listen. Remember, you’re the world’s foremost expert on the music you like. And, when it comes to music, your opinion is the only one that matters.

If you want to learn more about how AI models create music, sign up for our free online course Generative AI for Execs. It will give you a good baseline understanding of the topic.

Author’s note: This is not a sponsored post. I am the author of this article and it expresses my own opinions. I am not, nor is my company, receiving compensation for it.
