AI firm Stability AI, described as the “world’s first community-driven, open-source artificial intelligence company”, has just raised $101 million in funding.
According to Bloomberg, the oversubscribed funding round values the London-based tech startup at $1 billion.
According to a press statement, Stability AI plans to use the funding to speed up the development of its open AI models for image, language, audio, video, 3D, and more, both for consumer and for enterprise use cases globally.
Stability AI is the company behind Stable Diffusion, a free and open-source text-to-image generator that launched in August.
Stability AI says that, since its launch, Stable Diffusion has been downloaded and licensed by more than 200,000 developers globally.
In addition to its work in image, language and video, Stability AI backs AI research in audio (and music) via a “community-driven” organization called Harmonai.
Harmonai, which releases open-source generative audio tools, claims on its website that it aims to “make music production more accessible and fun for everyone”.
It adds further that its tools will let you “generate your own custom infinite sound libraries”. It also claims to want to “bring the power back to the artists”.
Harmonai recently released an audio generation tool that it calls Dance Diffusion, described in this Weights & Biases blog post as a “family of audio-generating machine learning models”.
The article adds that, “you can use a pre-trained Dance Diffusion model (or train your own Dance Diffusion model) to generate random audio samples in a particular style, regenerate a given audio sample, or interpolate between two different audio samples”.
A few examples of audio samples generated by Harmonai’s Dance Diffusion can be heard here, ranging from piano pieces and strings to (rather unsettling) AI voices.
It’s still early days for this research, but with Stability AI’s major funding injection, we’ll likely be hearing about more developments from Harmonai in the future.
At the end of September, when Dance Diffusion was launched, TechCrunch pointed out that, “Assuming Dance Diffusion one day reaches the point where it can generate coherent whole songs, it seems inevitable that major ethical and legal issues will come to the fore”.
Back in August, Barry Scannell, a consultant lawyer with Ireland-based law firm William Fry LLP, wrote an op-ed for MBW covering the legal issues around AI authorship in the music business.
In that piece, Scannell referenced another text-to-image generator, OpenAI’s DALL·E 2, and noted that “as AI creative technologies become more common, the question of AI authorship is particularly important”.
He added: “Provided that the correct legal arrangements are in place and rights are being adequately protected, AI technology ought to be enormously beneficial to the music industry.
“AI technologies can be, and are, being used by authors and composers and musicians to create new music all the time, and the technology enhances creativity and productivity just as other music technologies have before it. The music industry needs to make the technology work for itself. AI could be an enormously beneficial technology to the music industry, provided that the underlying infrastructure is in place.”
Meanwhile, the development of tools that use artificial intelligence for music-making purposes has been attracting the attention of some of the most prominent players in the global entertainment industry.
For example, HYBE, the South Korea-based entertainment giant behind K-Pop stars like BTS and SEVENTEEN, recently acquired an AI voice startup called Supertone, and told its investors on Monday (October 17) that the company’s tech will “serve as a key piece of the technology sphere we aim to create”.
Founded in South Korea in 2020, Supertone claims to be able to create “a hyper-realistic and expressive voice that [is not] distinguishable from real humans”.
The startup generated global media attention in January 2021 with its so-called Singing Voice Synthesis (SVS) technology.
Supertone used this tech to “resurrect” the voice of South Korean folk superstar Kim Kwang-seok, with the resulting AI-generated voice debuting on Korean television show Competition of the Century: AI vs Human.
We also recently learned that games giant Activision has been developing tech for AI-generated personalized music for players within a video game, according to a US patent filed for an invention titled, “Systems and Methods for Dynamically Generating and Modulating Music Based on Gaming Events, Player Profiles and/or Player Reactions”.
Activision explains in its filing that “Multiplayer online gaming has seen explosive proliferation across the globe with access to a wide range of age groups” and “while many features of video games have become highly customizable, musical elements tend to be standardized across all players”.
The filing explains further that, “By automating the process of what kind of music is being played and to what intensity based on the situation, player experience, etc., music and audio can create more immersive and enjoyable gameplay experiences”.
It adds: “By leveraging artificial intelligence (AI), an infinite combination of music and audio can be automatically generated to avoid having to manually create music/audio which then needs to be tagged for play based on different situational queues.”
Emad Mostaque, founder and CEO of Stability AI, said: “AI promises to solve some of humanity’s biggest challenges. But we will only realize this potential if the technology is open and accessible to all.
“Stability AI puts the power back into the hands of developer communities and opens the door for ground-breaking new applications. An independent entity in this space supporting these communities can create real value and change.”
Sri Viswanath, general partner at Coatue, said: “At Coatue, we believe that open source AI technologies have the power to unlock human creativity and achieve a broader good.
“Stability AI is a big idea that dreams beyond the immediate applications of AI. We are excited to be part of Stability AI’s journey, and we look forward to seeing what the world creates with Stability AI’s technology.”
Gaurav Gupta, partner at Lightspeed Venture Partners, said: “At Lightspeed, we believe that an independent company like Stability AI is best positioned to democratize generative AI.
“The company’s open-source model, along with its collaboration with AI developer communities, has allowed it to innovate rapidly and make the technology freely available to people everywhere.”
The funding round was led by Coatue, Lightspeed Venture Partners, and O’Shaughnessy Ventures LLC.
Music Business Worldwide