Hands-on with Google's MusicLM: Hitting Highs and Lows

By Chris McKay May 11, 2023 • 2 min read

MusicLM, Google's AI-powered text-to-music generator, is now available via the AI Test Kitchen. First announced in January, MusicLM has sparked significant interest as a potentially transformative tool in the field of music creation. This new product is expected to advance the capabilities of generative AI technologies in music production. But how well does this new technology perform when put to the test? Let’s delve deeper.

To set the stage, we must first appreciate the broader context. The intersection of AI and music isn't a new phenomenon. From OpenAI's MuseNet to Sony's Flow Machines, we've witnessed AI's ability to compose music in a range of styles and genres. Google’s MusicLM is the latest entrant into this unfolding AI symphony, promising to transform text prompts into music.

Yet, every tool has its limitations. Google explicitly states that MusicLM will not generate music from prompts that mention specific artists or include vocals. It's an honest caveat that is important in setting user expectations, but it limits the AI’s creative ambit.

In a series of tests, we challenged MusicLM with prompts spanning a range of genres and styles. When asked for an "uplifting EDM track with heavy bass drops", MusicLM created a fast-paced rhythm filled with synthesizers and drum machines. However, the melody was inconsistent, highlighting a common issue with AI music generation: lack of continuity. The output had an EDM-like texture, but without a sustained melody the composition felt disjointed.

The tool faltered further when asked to generate "the score to an inspiring sports movie climax". This prompt returned an error message indicating that the model “cannot generate music for that.” Even after we removed the word "climax" (which satisfied the tool’s constraints), the result was far from cohesive, sounding like muffled vocals rather than an inspiring anthem.

For a "minimalist ambient post-rock instrumental", MusicLM delivered something, but again it lacked cohesion. This was a recurring theme: the tool can generate sound, but it struggles to stitch that sound together into a coherent piece of music.

Notably, the tool showed an intriguing trait when presented with a nonsensical input, "the galumph trundled over gormless unbirthday". Here it generated a moody guitar riff. This suggests that even when a prompt has no clear semantic meaning, the tool still generates music, albeit with varying results.

Interestingly, when asked to generate "the next big pop hit single", the tool fell flat. Rather than a catchy tune, it churned out muffled vocals and noise, reminding us of the vast gulf that still exists between AI and the human creativity that fuels pop culture.

So, where does this leave us? MusicLM is a fascinating experiment, a testament to our ever-evolving understanding of artificial intelligence. It has potential, but it's clear that we are still a long way from AI tools that can replicate the richness, depth, and coherence of human-created music.

Like the AI models that have come before it, MusicLM contributes to our understanding of the interaction between AI and human creativity. It's a step forward, and while it may not be ready for a Grammy, it's certainly worth watching as the AI symphony continues to play.

Chris McKay is the founder and chief editor of Maginative. His thought leadership in AI literacy and strategic AI adoption has been recognized by top academic institutions, media, and global brands.