Export Voice
Kyutai Pocket TTS allows you to generate speech with voice cloning from an audio sample. However, processing an audio file each time is relatively slow and inefficient.
The export-voice command allows you to convert an audio file to a voice embedding (it's actually the kvcache) in safetensors format. The safetensors file can then be loaded very quickly whenever you generate speech.
Basic Usage
uvx pocket-tts export-voice audio-path export-path
# or if installed manually:
pocket-tts export-voice audio-path export-path
Only the first 30 seconds of the audio file will be processed.
Command Options
Required Parameters
audio-path: Path of the audio file to convert.audio-pathcan point to anhttp:orhf:(hugging face) file. Supports popular audio file formats like wav and mp3.export-path: Path of the output safetensors file to write.
Options
--quiet: Do not print any output except errors.--config: Model config path or signature
Examples
# export a single file
pocket-tts export-voice voice_memo127762.mp3 jack.safetensors
# export an online file to current directory
pocket-tts export-voice https://huggingface.co/kyutai/tts-voices/resolve/main/alba-mackenna/announcer.wav ./announcer.safetensors
# use the exported safetensors
pocket-tts generate --text "Hello, welcome to today's game between the Bears and Cubs." --voice announcer.safetensors