- OpenVoice, a new open-source voice cloning model, offers rapid and detailed voice cloning.
- OpenVoice allows users to control tone, emotion, and accent.
- It is publicly available through MyShell's web app and on HuggingFace, where it can be used without an account.
January 3, 2024: OpenVoice, a new open-source voice cloning model, has been released.
This innovative tool, developed through a collaboration between the Massachusetts Institute of Technology (MIT), Tsinghua University, and Canadian AI startup MyShell, allows users to clone voices with extraordinary precision and control.
MyShell announced the release in a social media post:
"Today, we proudly open source our OpenVoice algorithm, embracing our core ethos – AI for all. Experience it now: https://t.co/zHJpeVpX3t. Clone voices with unparalleled precision, with granular control of tone, from emotion to accent, rhythm, pauses, and intonation, using just a…"
MyShell (@myshell_ai), January 2, 2024
Unlike proprietary algorithms and software that require significant development funds, OpenVoice stands out with its near-instant cloning capabilities and detailed control options.
It lets users adjust several aspects of the cloned voice, including tone, emotion, accent, rhythm, and intonation, using just a short audio clip. This level of control is uncommon in existing voice cloning platforms.
OpenVoice’s release was accompanied by a research paper detailing its development.
The model can be tried through the MyShell web app and on HuggingFace.
Here is an example of me (Mukund Kapoor) using OpenVoice to generate a clone of my own voice. I don't think the result sounds very close to me, but it is still a great tool.
MyShell’s lead researcher, Zengyi Qin, emphasized the company’s commitment to supporting the open-source research community, aligning with its vision of “AI for All.”
The creation of OpenVoice involved two distinct AI models: a text-to-speech model and a tone converter.
These models were trained on diverse audio samples and languages, enabling them to capture the nuances of human speech and emotion. As a result, OpenVoice can replicate a user’s voice and modify its emotional expression.
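To make the two-model design concrete, here is a minimal conceptual sketch in Python. It is not OpenVoice's actual code or API: the names `Style`, `Utterance`, `base_tts`, `convert_tone`, and `clone_voice` are hypothetical, no audio is processed, and the split of duties (style set by the text-to-speech stage, speaker identity swapped by the tone converter) is an assumption made for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Style:
    """Hypothetical style controls; field names are illustrative only."""
    emotion: str = "neutral"
    accent: str = "default"
    speed: float = 1.0


@dataclass(frozen=True)
class Utterance:
    """Stand-in for generated audio: what is said, in whose tone, in what style."""
    text: str
    speaker: str
    style: Style


def base_tts(text: str, style: Style) -> Utterance:
    # Stage 1 (assumed): a text-to-speech model renders the text in a
    # generic base voice, applying the requested style.
    return Utterance(text=text, speaker="base_speaker", style=style)


def convert_tone(utterance: Utterance, reference_speaker: str) -> Utterance:
    # Stage 2 (assumed): the tone converter changes only the speaker
    # identity to match the reference clip, leaving text and style intact.
    return replace(utterance, speaker=reference_speaker)


def clone_voice(text: str, reference_speaker: str, style: Style = Style()) -> Utterance:
    # Chain the two stages: synthesize first, then convert the tone.
    return convert_tone(base_tts(text, style), reference_speaker)


result = clone_voice(
    "Hello there",
    reference_speaker="reference_clip.wav",  # hypothetical short audio clip
    style=Style(emotion="cheerful", accent="british"),
)
print(result)
```

In this sketch, the reference clip influences only the second stage, which is one way a design like this could let emotion or accent be changed without recording a new sample.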
MyShell, a startup founded in Calgary, Alberta, in 2023, has quickly gained popularity.
It offers various AI-native apps and services, including OpenVoice. While the voice cloning tool is open-source, MyShell generates revenue through a subscription model for its web app and by charging for AI training data.
OpenVoice marks a significant step in the realm of AI voice technology.
Its open-source nature and advanced capabilities pave the way for more accessible and versatile voice cloning applications.
This development is a glimpse into the future of Artificial General Intelligence, showcasing the potential of AI in language, vision, and voice modalities.