
MusicLM: Google AI generates music in various genres at 24 kHz

An AI-generated image of an exploding ball of music. (credit: Ars Technica)

On Thursday, researchers from Google announced a new generative AI model called MusicLM that can create 24 kHz musical audio from text descriptions, such as “a calming violin melody backed by a distorted guitar riff.” It can also transform a hummed melody into a different musical style and generate pieces that last several minutes.

MusicLM uses an AI model trained on what Google calls “a large dataset of unlabeled music,” along with captions from MusicCaps, a new dataset composed of 5,521 music-text pairs. MusicCaps gets its text descriptions from human experts and its matching audio clips from Google’s AudioSet, a collection of over 2 million labeled 10-second sound clips pulled from YouTube videos.
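To make the pairing concrete, here is a minimal Python sketch of what one MusicCaps-style music-text entry could look like. The field names (`youtube_id`, `start_seconds`, and so on) and the example values are illustrative assumptions, not the dataset's actual schema:

```python
from dataclasses import dataclass

@dataclass
class MusicTextPair:
    # Field names are assumptions for illustration. MusicCaps keys its
    # audio to AudioSet clips, which are 10-second excerpts of YouTube videos.
    youtube_id: str
    start_seconds: float
    end_seconds: float
    caption: str  # free-text description written by a human expert

example = MusicTextPair(
    youtube_id="XXXXXXXXXXX",  # placeholder ID, not a real MusicCaps entry
    start_seconds=30.0,
    end_seconds=40.0,
    caption="a calming violin melody backed by a distorted guitar riff",
)
print(example.caption)
```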

Generally speaking, MusicLM works in two main parts: first, during training, it maps sequences of audio tokens (pieces of sound) to semantic tokens (tokens that capture the meaning expressed in captions). The second part takes a user caption and/or input audio and generates acoustic tokens (pieces of sound that make up the resulting song output). The system relies on an earlier AI model called AudioLM (introduced by Google in September 2022) along with other components such as SoundStream and MuLan.
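As a rough illustration of that two-stage flow, here is a minimal Python sketch with stub functions standing in for the conditioning embedding (MuLan's role), the semantic stage, and the acoustic stage. Every function name, vocabulary size, and token rate below is an assumption chosen for illustration, not MusicLM's actual configuration; the stubs emit random tokens rather than running real models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative placeholders, not the paper's actual values.
SEMANTIC_VOCAB = 1024    # size of the semantic token vocabulary (assumed)
ACOUSTIC_VOCAB = 1024    # size of the acoustic token vocabulary (assumed)
SEMANTIC_RATE_HZ = 25    # coarse "meaning" tokens per second (assumed)
ACOUSTIC_RATE_HZ = 50    # finer codec tokens per second (assumed)

def text_to_embedding(caption: str) -> np.ndarray:
    """Stand-in for a MuLan-style joint text/music embedding."""
    local = np.random.default_rng(abs(hash(caption)) % 2**32)
    return local.normal(size=128)

def semantic_stage(embedding: np.ndarray, seconds: int) -> np.ndarray:
    """Stage 1: produce semantic tokens describing *what* should play
    (melody, genre, instrumentation), conditioned on the embedding."""
    n_tokens = seconds * SEMANTIC_RATE_HZ
    return rng.integers(0, SEMANTIC_VOCAB, size=n_tokens)

def acoustic_stage(semantic_tokens: np.ndarray) -> np.ndarray:
    """Stage 2: expand semantic tokens into acoustic tokens, the codec
    units a SoundStream-style decoder would turn into a waveform."""
    ratio = ACOUSTIC_RATE_HZ // SEMANTIC_RATE_HZ
    return rng.integers(0, ACOUSTIC_VOCAB, size=len(semantic_tokens) * ratio)

caption = "a calming violin melody backed by a distorted guitar riff"
embedding = text_to_embedding(caption)
semantic = semantic_stage(embedding, seconds=10)
acoustic = acoustic_stage(semantic)
print(f"{len(semantic)} semantic tokens -> {len(acoustic)} acoustic tokens")
```

In the real system, these stages are learned sequence models rather than random stubs, and a SoundStream decoder converts the final acoustic tokens into the 24 kHz audio waveform.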
