AI music generation has received two major updates: ElevenLabs’ Music v2 and Stability AI’s Stable Audio 3.0. Both releases introduce advanced compositional capabilities and longer track generation, and both place a strong emphasis on licensed training data as legal challenges around AI-generated content mount.
Key Takeaways
- ElevenLabs has launched Music v2, featuring advanced functionalities like mid-track genre switching, section-by-section song construction, and precise audio inpainting.
- Stability AI has released Stable Audio 3.0, comprising four models with open weights for three variants, trained on licensed music, and capable of producing tracks up to six minutes and twenty seconds.
- Both ElevenLabs and Stability AI are prioritizing the use of licensed training data to mitigate copyright concerns, a critical factor following recent industry lawsuits.
- These developments aim to challenge the current market leader, Suno, which holds a substantial valuation and user base but faces ongoing legal scrutiny.
- The new models offer enhanced control for creators, enabling more sophisticated and coherent audio outputs for a variety of applications.
ElevenLabs, a voice AI company with a significant valuation, has unveiled Music v2, its second-generation music model. The company promises that the model keeps a track coherent through transitions between disparate genres, such as opera and heavy metal, and when non-musical sound effects are mixed in. That targets a common failing of generative audio, where quality tends to degrade as a composition grows more complex.
The new model introduces sophisticated editing capabilities, including inpainting, which allows users to regenerate specific segments of an audio track while preserving the rest. Furthermore, Music v2 facilitates a structured approach to song creation, enabling users to build compositions section by section (intro, verse, chorus) with maintained continuity. This level of control is a significant step forward for both amateur creators and professional sound designers.
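The idea behind inpainting can be illustrated with a minimal Python sketch. It is a conceptual illustration only, not ElevenLabs' API or implementation: the function names `inpaint_segment` and `linear_bridge` are hypothetical, and the "model" here is a stand-in that merely interpolates between neighboring samples. What it shows is the contract described above: only the selected segment changes, and everything outside it is preserved exactly.

```python
from typing import Callable, List, Sequence

# Hypothetical signature for a generator: it sees the audio before and after
# the gap and must return exactly `length` new samples.
Regenerator = Callable[[Sequence[float], Sequence[float], int], List[float]]


def inpaint_segment(samples: Sequence[float], start: int, end: int,
                    regenerate: Regenerator) -> List[float]:
    """Replace samples[start:end] with regenerated audio; keep the rest untouched."""
    if not 0 <= start < end <= len(samples):
        raise ValueError("segment must lie inside the track")
    before = list(samples[:start])
    after = list(samples[end:])
    patch = regenerate(before, after, end - start)
    if len(patch) != end - start:
        raise ValueError("regenerated segment has the wrong length")
    return before + patch + after


def linear_bridge(before: Sequence[float], after: Sequence[float],
                  length: int) -> List[float]:
    """Placeholder for a real model: interpolate between the neighboring samples."""
    left = before[-1] if before else 0.0
    right = after[0] if after else 0.0
    step = (right - left) / (length + 1)
    return [left + step * (i + 1) for i in range(length)]


# Samples 3 and 4 are a glitch; regenerate only that span.
track = [0.0, 0.1, 0.2, 9.9, 9.9, 0.5, 0.6]
fixed = inpaint_segment(track, 3, 5, linear_bridge)
```

A real system would condition a generative model on the surrounding audio rather than interpolate, but the splice-and-preserve structure is the same.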
Music v2 is accessible through multiple platforms: ElevenMusic for individual creators, ElevenAPI for developers integrating AI audio into their applications, and ElevenCreative for brand-focused audio solutions. ElevenLabs has also reduced pricing for its API and creative tiers, signaling a push to capture a wider market share, particularly targeting Suno’s user base with its consumer-facing ElevenMusic app.
Simultaneously, Stability AI has launched Stable Audio 3.0, an expanded offering that includes four distinct models. Notably, three of these models feature open weights, encouraging broader community development and innovation. The models are trained on licensed data, a crucial move given the ongoing copyright disputes in the AI music sector. Stable Audio 3.0 extends track generation capabilities to over six minutes, a significant increase from its predecessor.
The Stable Audio 3.0 family includes specialized models like Small SFX for sound effects and Small for on-device music composition, requiring no GPU. The larger models, Medium and Large, generate longer tracks at higher fidelity, with the Large model reserved for API access by enterprise clients. The architecture incorporates a new semantic-acoustic autoencoder designed to ensure melodic consistency over extended audio pieces. Support for LoRA fine-tuning allows artists to customize the models for specific styles or to incorporate their existing work, while inpainting features provide granular control for editing and extending tracks.
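For readers unfamiliar with LoRA, the sketch below shows the general technique in plain Python: a frozen weight matrix W is adjusted by a low-rank product, W + (alpha / r) * B @ A, so only the small matrices A and B are trained. This is the standard LoRA formulation, not a description of how Stable Audio 3.0 applies it; the function names, matrix sizes, and the `alpha` value are illustrative assumptions.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the rank (number of rows of A)."""
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)  # B is d x r, A is r x k, so delta is d x k like W
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d, k, r):
    """Compare parameters trained by full fine-tuning (d*k) and by LoRA (r*(d+k))."""
    return d * k, r * (d + k)


# Toy example: a 2 x 2 frozen weight with a rank-1 adapter.
w = [[1.0, 0.0], [0.0, 1.0]]
a = [[1.0, 2.0]]    # r x k = 1 x 2
b = [[1.0], [0.0]]  # d x r = 2 x 1
adapted = lora_effective_weight(w, a, b, alpha=1.0)

full, lora = trainable_params(1024, 1024, 8)
```

In the last line, a 1024 x 1024 layer has 1,048,576 weights, while a rank-8 adapter trains 16,384, which is why this style of fine-tuning is practical for individual artists with limited hardware.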
Stability AI’s strategy with open weights echoes its successful approach with Stable Diffusion, aiming to foster a vibrant developer ecosystem around its audio generation technology. Its partnerships with major music labels like Universal Music Group and Warner Music Group underscore a commitment to legally compliant audio generation.
Both ElevenLabs and Stability AI are positioning their new offerings as competitive alternatives to Suno, the current frontrunner in AI music generation, which boasts a substantial valuation and user engagement. However, Suno faces significant legal challenges from major record labels, prompting competitors to prioritize licensed data and transparent practices. While Udio has settled its lawsuits and become a closed platform, ElevenLabs and Stability AI are pursuing different avenues, with licensing deals and open-weight models, respectively, to ensure wider adoption and legal security.
Long-Term Technological Impact on the Industry
The advancements exemplified by ElevenLabs’ Music v2 and Stability AI’s Stable Audio 3.0 are poised to reshape the music industry’s technological landscape. The ability to generate coherent, genre-fluid, and structurally complex music from simple prompts or section-by-section composition lowers the barrier to music creation. This will likely lead to an explosion of new musical content, potentially blurring the lines between human-composed and AI-generated works. Tools such as inpainting and LoRA fine-tuning give creators finer control, enabling rapid prototyping and iteration of musical ideas. That should accelerate the creative process, allowing artists and producers to explore a wider sonic palette and develop unique musical signatures more efficiently.
Furthermore, the focus on licensed data and open-weight models suggests a future where AI audio generation stands on firmer ethical and legal ground, fostering greater trust and adoption within the professional music sphere. This could lead to new revenue streams through AI-powered composition tools, custom sound design for media, and personalized music experiences, altering music production workflows and the economics of music creation and distribution.
Source: decrypt.co
