A simple, robust, and OpenAI-compatible FastAPI wrapper for the MyShell-AI/MeloTTS text-to-speech engine.
This project provides an easy-to-use HTTP interface for MeloTTS, allowing you to integrate high-quality, natural-sounding text-to-speech into your applications with simple API calls. The API structure is designed to mimic common TTS services for easy integration.
- High-Quality TTS: Leverages the power of MeloTTS for fast and natural-sounding speech synthesis.
- OpenAI-Compatible Endpoint: Includes a
/v1/audio/speechendpoint that mirrors the structure of OpenAI's TTS API for drop-in compatibility with tools like Open WebUI. - RESTful Interface: Provides clear endpoints to list available models and voices.
- Reliable Installation: A comprehensive
requirements.txtfile ensures a clean and complete installation in an isolated environment. - CPU Ready: Configured to run on CPU out-of-the-box, no GPU required.
For a reliable setup, please follow these steps exactly. This guide has been tested on Debian-based systems like Linux Mint 20 and Ubuntu 24.04.
sudo apt update && sudo apt upgrade -y
sudo apt install git python3 python3-pip python3-venv -ygit clone https://github.com/highfillgoods/MeloTTS-API-Locally.git cd MeloTTS-API-Locally 🌿 Step 3: Create and Activate a Virtual Environment
conda create --name melo_api_env python=3.11 -y conda activate melo_api_env
python3 -m venv venv source venv/bin/activate
This single command installs all necessary Python packages from the perfected requirements.txt file.
pip install -r requirements.txt
The text processor requires data packages from the NLTK library. This command downloads the necessary models.
python3 -m nltk.downloader averaged_perceptron_tagger punkt
uvicorn melotts_api:app --host 0.0.0.0 --port 8000The server will start, load the MeloTTS model, and become available at http://0.0.0.0:8000.
This API is designed to work directly with Open WebUI.
In Open WebUI, navigate to Settings > Audio. Configure the TTS Settings section with the following values: Text-to-Speech Engine: OpenAI OpenAI API Base URL: http://localhost:8000/v1 OpenAI API Key: Can be set to anything (e.g., 12345). Your settings should look like this:
You can also interact with the API directly using tools like curl.
curl http://localhost:8000/v1/models
curl http://localhost:8000/v1/audio/voices Synthesize Speech
curl -X POST \
http://localhost:8000/v1/audio/speech \
-H "Content-Type: application/json" \
--data '{
"input": "Hello, this is a test from the land down under.",
"voice": "EN-AU"
}' \
--output test_audio.mp3These instructions are for running the original Gradio WebUI developed by MyShell-AI, which is separate from the FastAPI server above. It's recommended to do this in a different folder and a new, clean environment.
Clone the Original MeloTTS Repository
git clone https://github.com/myshell-ai/MeloTTS.git cd MeloTTS Create a Separate, Clean Environment (e.g., conda create --name melo_ui_env python=3.11 -y) and activate it.
pip install torch torchvision torchaudio gradio librosa tqdm transformers cn2an pypinyin jieba eng_to_ipa inflect unidecode num2words pykakasi fugashi g2p_en anyascii jamo gruut cached_path unidic
python3 -m nltk.downloader averaged_perceptron_tagger punkt python3 -m melo.app
