Supported languages

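The real-time API accepts ISO 639-1 language codes (en, es, zh, …). Two supported languages have no two-letter code and use their three-letter ISO 639-3 codes instead: fil (Filipino) and yue (Cantonese). Regional tags such as en-US and pt-BR are accepted, but only the primary subtag is used.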
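The primary-subtag rule can be mirrored on the client to check a tag before sending it. The following is a minimal sketch, not part of the API: the helper names primary_subtag and is_supported are hypothetical, the code set is copied from the table on this page, and lowercasing the tag is an assumption about how the service compares codes.

```python
# Codes copied from the language table on this page.
SUPPORTED_CODES = frozenset({
    "ar", "cs", "da", "de", "el", "en", "es", "fa", "fi", "fil",
    "fr", "hi", "hu", "id", "it", "ja", "ko", "mk", "ms", "nl",
    "pl", "pt", "ro", "ru", "sv", "th", "tr", "vi", "yue", "zh",
})


def primary_subtag(tag: str) -> str:
    """Return the part of a language tag before the first hyphen.

    Lowercasing is an assumption made for local comparison only.
    """
    return tag.strip().split("-", 1)[0].lower()


def is_supported(tag: str) -> bool:
    """Check whether the tag's primary subtag appears in the table."""
    return primary_subtag(tag) in SUPPORTED_CODES
```

With this helper, primary_subtag("pt-BR") yields "pt", so a regional tag and its bare code are treated the same way.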

Pass auto as sourceLanguage to let Pinch detect the spoken language; the detected code is returned on every transcript frame as detected_language.
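As an illustration of reading the detected code, the sketch below pulls detected_language out of one transcript frame. It assumes frames arrive as JSON objects with detected_language as a top-level string field; the wire format is not specified on this page, and the function name read_detected_language is hypothetical.

```python
import json
from typing import Optional


def read_detected_language(raw_frame: str) -> Optional[str]:
    """Return the detected_language value of a frame, or None if absent.

    Assumes the frame is a JSON object; this layout is an assumption,
    not something confirmed by this page.
    """
    frame = json.loads(raw_frame)
    if not isinstance(frame, dict):
        return None
    code = frame.get("detected_language")
    return code if isinstance(code, str) else None
```

Returning None rather than raising keeps the caller's loop simple when a frame carries no detected_language field.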


Languages

Every code below is supported as both sourceLanguage (input speech) and targetLanguage (output text). The Voice output column marks which languages also have a synthesized voice; use these with audioOutputEnabled=true. Languages without a voice are text-only and must be requested with audioOutputEnabled=false.

Code  Language            Voice output
ar    Arabic
cs    Czech
da    Danish
de    German
el    Greek
en    English
es    Spanish
fa    Persian (Farsi)
fi    Finnish
fil   Filipino
fr    French
hi    Hindi
hu    Hungarian
id    Indonesian
it    Italian
ja    Japanese
ko    Korean
mk    Macedonian
ms    Malay
nl    Dutch
pl    Polish
pt    Portuguese
ro    Romanian
ru    Russian
sv    Swedish
th    Thai
tr    Turkish
vi    Vietnamese
yue   Cantonese
zh    Chinese (Mandarin)

Every voice-output language ships with both a male and a female voice. Select one via voiceType (male or female).

Requesting a targetLanguage that has no synthesized voice with audioOutputEnabled=true returns an error frame. Pass audioOutputEnabled=false to get transcripts in any of the languages above.
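The rules above can be checked locally before a request is sent, so that a missing voice is caught without waiting for an error frame. This is a sketch under stated assumptions: build_settings is a hypothetical helper, the returned dictionary is only an illustrative container for the documented parameter names, and voice_languages must be supplied by the caller from the Voice output column because that set is not hard-coded here.

```python
from typing import Collection, Dict, Optional, Union

# Codes copied from the language table on this page.
SUPPORTED_CODES = frozenset({
    "ar", "cs", "da", "de", "el", "en", "es", "fa", "fi", "fil",
    "fr", "hi", "hu", "id", "it", "ja", "ko", "mk", "ms", "nl",
    "pl", "pt", "ro", "ru", "sv", "th", "tr", "vi", "yue", "zh",
})
VOICE_TYPES = ("male", "female")


def build_settings(
    source_language: str,
    target_language: str,
    audio_output_enabled: bool,
    voice_languages: Collection[str],
    voice_type: Optional[str] = None,
) -> Dict[str, Union[str, bool]]:
    """Validate the documented constraints and return a settings dict."""
    # "auto" is only meaningful for the source side.
    if source_language != "auto" and source_language not in SUPPORTED_CODES:
        raise ValueError(f"unsupported sourceLanguage: {source_language}")
    if target_language not in SUPPORTED_CODES:
        raise ValueError(f"unsupported targetLanguage: {target_language}")

    settings: Dict[str, Union[str, bool]] = {
        "sourceLanguage": source_language,
        "targetLanguage": target_language,
        "audioOutputEnabled": audio_output_enabled,
    }
    if audio_output_enabled:
        # Text-only languages must be requested with audio output off.
        if target_language not in voice_languages:
            raise ValueError(f"no synthesized voice for: {target_language}")
        if voice_type is not None:
            if voice_type not in VOICE_TYPES:
                raise ValueError(f"voiceType must be one of {VOICE_TYPES}")
            settings["voiceType"] = voice_type
    return settings
```

Passing voice_languages explicitly keeps the helper correct as the set of voiced languages changes; only the caller's copy of the Voice output column needs updating.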