“Suno has developed proprietary state-of-the-art models that generate music and speech using AI. Modal's superb developer experience enables our team to ship new models to production quickly, and with confidence that we'll scale to thousands of simultaneous users.”
“At Phonic, we train our own proprietary models for audio generation. We moved all our large-scale audio processing batch jobs to Modal. Our engineers are ecstatic with the result – we can run at a much larger scale than before, no longer have to babysit our batch jobs, and we can ship much faster.”
Outperform managed APIs
Get faster speeds at lower cost than popular transcription APIs like AssemblyAI and Deepgram by running open-source models on Modal.
Scale on demand
Distribute transcription tasks across hundreds of containers simultaneously.
Transform text into natural-sounding speech using the latest open-source models.
Deploy text-to-speech models like XTTS directly on Modal's platform.
Cutting-edge hardware access
Tap into Modal's fleet of A100 and H100 GPUs for memory-intensive voice models.
Lightning-fast cold starts
Generate speech on demand without lengthy startup times, thanks to our optimized container file system and engine.
Use Cases