Request Session - DeepL Documentation

Authorizations

Authorization

string

header

required

Authentication with Authorization header and DeepL-Auth-Key authentication scheme. Example: DeepL-Auth-Key <api-key>
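As a minimal sketch of the scheme described above, the header can be assembled as follows. The helper name `build_auth_headers` is hypothetical and the key shown is a placeholder.

```python
def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return the Authorization header using the DeepL-Auth-Key scheme."""
    if not api_key:
        raise ValueError("api_key must not be empty")
    # Scheme name, a single space, then the key.
    return {"Authorization": f"DeepL-Auth-Key {api_key}"}


headers = build_auth_headers("your-api-key")  # placeholder, not a real key
print(headers)
```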

Body

application/json
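To make the shape of the body concrete, here is a minimal Python sketch that assembles a request from fields documented on this page. The endpoint in `SESSION_URL` is an assumption (this page does not state the request URL), the helper name is hypothetical, and the API key is a placeholder. The request is only built here, not sent.

```python
import json
import urllib.request

# Assumed endpoint; verify against the API reference before use.
SESSION_URL = "https://api.deepl.com/v3/voice/realtime"


def build_session_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the JSON body."""
    return urllib.request.Request(
        SESSION_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"DeepL-Auth-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Only source_media_content_type is marked as required on this page.
body = {
    "source_media_content_type": "audio/ogg;codecs=opus",
    "message_format": "json",
    "source_language": "en",
    "source_language_mode": "fixed",
    "target_languages": ["de", "fr", "es"],
    "formality": "formal",
}

request = build_session_request("your-api-key", body)
print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen(request)`) should return a JSON object with the `streaming_url` and `token` fields described under Response.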

source_media_content_type

enum<string>

required

The audio format for streaming, which specifies container, codec, and encoding parameters. See the table below for supported formats. If audio/auto is specified, the server auto-detects the container and codec for all supported combinations except PCM, which requires explicit encoding parameters. All formats must be single-channel (mono) audio.

Content Type	Container	Codec
`audio/auto`	Auto-detect: FLAC / Matroska / MPEG / Ogg / WebM	Auto-detect: AAC / FLAC / MP3 / OPUS
`audio/flac`	FLAC (flac)	FLAC
`audio/mpeg`	MPEG (mp3/m4a)	MP3
`audio/ogg`	Ogg (ogg/oga)	Auto-detect: FLAC / OPUS
`audio/webm`	WebM (webm)	OPUS
`audio/x-matroska`	Matroska (mkv/mka)	Auto-detect: AAC / FLAC / MP3 / OPUS
`audio/ogg;codecs=flac`	Ogg (ogg/oga)	FLAC
`audio/ogg;codecs=opus`	Ogg (ogg/oga)	OPUS
`audio/pcm;encoding=alaw;rate=8000`	-	PCM A-Law 8000 Hz (G.711)
`audio/pcm;encoding=ulaw;rate=8000`	-	PCM µ-Law 8000 Hz (G.711)
`audio/pcm;encoding=s16le;rate=8000`	-	PCM signed 16-bit little-endian 8000 Hz
`audio/pcm;encoding=s16le;rate=16000`	-	PCM signed 16-bit little-endian 16000 Hz
`audio/pcm;encoding=s16le;rate=44100`	-	PCM signed 16-bit little-endian 44100 Hz
`audio/pcm;encoding=s16le;rate=48000`	-	PCM signed 16-bit little-endian 48000 Hz
`audio/webm;codecs=opus`	WebM (webm)	OPUS
`audio/x-matroska;codecs=aac`	Matroska (mkv/mka)	AAC
`audio/x-matroska;codecs=flac`	Matroska (mkv/mka)	FLAC
`audio/x-matroska;codecs=mp3`	Matroska (mkv/mka)	MP3
`audio/x-matroska;codecs=opus`	Matroska (mkv/mka)	OPUS
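Because PCM cannot be auto-detected, the content type string has to carry the encoding and sample rate explicitly. The sketch below builds such a string and rejects combinations that are not in the table above. The helper and constant names are hypothetical.

```python
# (encoding, rate) pairs copied from the PCM rows of the source format table.
SUPPORTED_PCM_SOURCE = {
    ("alaw", 8000),
    ("ulaw", 8000),
    ("s16le", 8000),
    ("s16le", 16000),
    ("s16le", 44100),
    ("s16le", 48000),
}


def pcm_content_type(encoding: str, rate: int) -> str:
    """Return the audio/pcm content type for a supported encoding and rate."""
    if (encoding, rate) not in SUPPORTED_PCM_SOURCE:
        raise ValueError(f"unsupported PCM source format: {encoding} at {rate} Hz")
    return f"audio/pcm;encoding={encoding};rate={rate}"


print(pcm_content_type("s16le", 16000))
```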

We recommend the following bitrates as a good tradeoff between quality and bandwidth:

AAC: 96 kbps
FLAC: 256 kbps (16000 Hz)
MP3: 128 kbps
OPUS: 32 kbps (recommendation for low bandwidth scenarios)
PCM: 256 kbps (16000 Hz, default recommendation)
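The PCM figure follows directly from the format: uncompressed bitrate is sample rate times bits per sample times channel count. A short sketch of that arithmetic (the function name is hypothetical):

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int = 16, channels: int = 1) -> float:
    """Bitrate of uncompressed PCM audio in kbps (1 kbps = 1000 bit/s)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# 16-bit mono at 16000 Hz, the default recommendation above.
print(pcm_bitrate_kbps(16000))  # 256.0
# G.711 A-law and µ-law use 8 bits per sample at 8000 Hz.
print(pcm_bitrate_kbps(8000, bits_per_sample=8))  # 64.0
```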

Available options:

audio/auto,

audio/flac,

audio/mpeg,

audio/ogg,

audio/webm,

audio/x-matroska,

audio/ogg;codecs=flac,

audio/ogg;codecs=opus,

audio/pcm;encoding=alaw;rate=8000,

audio/pcm;encoding=ulaw;rate=8000,

audio/pcm;encoding=s16le;rate=8000,

audio/pcm;encoding=s16le;rate=16000,

audio/pcm;encoding=s16le;rate=44100,

audio/pcm;encoding=s16le;rate=48000,

audio/webm;codecs=opus,

audio/x-matroska;codecs=aac,

audio/x-matroska;codecs=flac,

audio/x-matroska;codecs=mp3,

audio/x-matroska;codecs=opus

Example:

"audio/ogg;codecs=opus"

message_format

enum<string>

default:json

Message encoding format for WebSocket communication. Determines how messages are serialized and transmitted. Using json, messages are JSON-encoded and sent as TEXT WebSocket frames. All binary fields (such as audio data) are base64-encoded strings. Using msgpack, messages are MessagePack-encoded and sent as BINARY WebSocket frames. All binary fields (such as audio data) contain raw binary data.

For more details, see Message Encoding.
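To illustrate the json mode, the sketch below base64-encodes a chunk of audio bytes into a JSON text frame and decodes it again. The message shape (a field name wrapping a `data` key) and the helper name are assumptions for illustration only; the actual message schema is defined in Message Encoding.

```python
import base64
import json


def encode_json_frame(field_name: str, audio: bytes) -> str:
    """Serialize binary audio as a JSON string, base64-encoding the bytes."""
    # Hypothetical message shape: {field_name: {"data": "<base64>"}}
    payload = {field_name: {"data": base64.b64encode(audio).decode("ascii")}}
    return json.dumps(payload)


chunk = b"\x00\x01\x02\x03"
frame = encode_json_frame("audio_chunk", chunk)

# Round trip: the receiver reverses the base64 step to recover the raw bytes.
decoded = base64.b64decode(json.loads(frame)["audio_chunk"]["data"])
print(decoded == chunk)
```

Base64 produces 4 output characters for every 3 input bytes, so json frames carry roughly one third more audio payload than the raw bytes sent with msgpack.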

Available options:

json,

msgpack

Example:

"json"

source_language

enum<string>

The source language of the audio stream. It may be left empty; otherwise it must be one of the supported Voice API source languages and a valid IETF BCP 47 language tag. Note: Some source transcription languages are provided through external service partners. See the supported languages table for details.

Available options:

ar,

bg,

bn,

cs,

da,

de,

el,

en,

es,

et,

fi,

fr,

ga,

he,

hr,

hu,

id,

it,

ja,

ko,

lt,

lv,

mt,

nb,

nl,

pl,

pt,

ro,

ru,

sk,

sl,

sv,

th,

tl,

tr,

uk,

vi,

zh

Example:

"en"

source_language_mode

enum<string>

default:auto

Controls how the source_language value is used.

auto: Treats source language as a hint; server can override
fixed: Treats source language as mandatory; server must use this language

Available options:

auto,

fixed

Example:

"fixed"

target_languages

enum<string>[]

List of target languages for translation. The stream emits translations for each language. At most 5 target languages are allowed per stream. Language identifiers must comply with IETF BCP 47. See the supported languages table for details.

Maximum array length: 5

Available options:

ar,

bg,

bn,

cs,

da,

de,

el,

en,

en-GB,

en-US,

es,

et,

fi,

fr,

ga,

he,

hr,

hu,

id,

it,

ja,

ko,

lt,

lv,

mt,

nb,

nl,

pl,

pt,

pt-BR,

pt-PT,

ro,

ru,

sk,

sl,

sv,

th,

tl,

tr,

uk,

vi,

zh,

zh-HANS,

zh-HANT

Example:

["de", "fr", "es"]

target_media_languages

enum<string>[]

(EAP) List of target languages for which to generate synthesized audio. Languages specified here are automatically added to target_languages if not already present, so you receive both text translation and audio synthesis for these languages. If omitted, only text transcription and translation are provided (no audio synthesis). At most 5 target media languages are allowed per stream. Language identifiers must comply with IETF BCP 47. Note: Some translated audio languages are provided through external service partners. See the supported languages table for details.

Maximum array length: 5

Available options:

ar,

bg,

cs,

da,

de,

el,

en,

en-GB,

en-US,

es,

fi,

fr,

hu,

id,

it,

ja,

ko,

nb,

nl,

pl,

pt,

pt-BR,

pt-PT,

ro,

ru,

sk,

sv,

tr,

uk,

vi,

zh,

zh-HANS,

zh-HANT

Example:

["de", "en-GB"]
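The interaction between target_languages and target_media_languages can be sketched as a simple merge. This is a client-side illustration of the documented behavior, not the server's implementation. The function name is hypothetical, and the ordering of the merged list and the exact-match comparison of language codes are assumptions.

```python
MAX_LANGUAGES = 5  # documented maximum array length for each list


def effective_target_languages(target_languages: list[str],
                               target_media_languages: list[str]) -> list[str]:
    """Return target_languages plus any media languages not already listed."""
    for name, langs in (("target_languages", target_languages),
                        ("target_media_languages", target_media_languages)):
        if len(langs) > MAX_LANGUAGES:
            raise ValueError(f"{name} allows at most {MAX_LANGUAGES} entries")
    merged = list(target_languages)
    for lang in target_media_languages:
        if lang not in merged:
            merged.append(lang)
    return merged


print(effective_target_languages(["de", "fr", "es"], ["de", "en-GB"]))
# ['de', 'fr', 'es', 'en-GB']
```

This page does not say whether the merged list is itself subject to the limit of 5, so the sketch checks only the two input lists.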

target_media_content_type

enum<string>

default:audio/webm;codecs=opus

(EAP) The audio format for synthesized target media streaming. Specifies container, codec, and encoding parameters for the audio returned in target_media_chunk messages. If not specified, defaults to audio/webm;codecs=opus. Only applies when target_media_languages is specified.

Content Type	Container	Codec
`audio/flac`	FLAC (flac)	FLAC 24000 Hz
`video/mp2t;codecs=aac`	MPEG Transport Stream (Audio only)	AAC 70 kbit/s
`video/mp2t;codecs=opus`	MPEG Transport Stream (Audio only)	OPUS 32 kbit/s
`audio/ogg`	Ogg (ogg/oga)	OPUS 32 kbit/s
`audio/ogg;codecs=flac`	Ogg (ogg/oga)	FLAC 24000 Hz
`audio/ogg;codecs=opus`	Ogg (ogg/oga)	OPUS 32 kbit/s
`audio/opus`	-	OPUS 32 kbit/s
`audio/pcm;encoding=alaw;rate=8000`	-	PCM A-Law 8000 Hz (G.711)
`audio/pcm;encoding=ulaw;rate=8000`	-	PCM µ-Law 8000 Hz (G.711)
`audio/pcm;encoding=s16le;rate=16000`	-	PCM signed 16-bit little-endian 16000 Hz
`audio/pcm;encoding=s16le;rate=24000`	-	PCM signed 16-bit little-endian 24000 Hz
`audio/webm`	WebM (webm)	OPUS 32 kbit/s
`audio/webm;codecs=opus`	WebM (webm)	OPUS 32 kbit/s
`audio/x-matroska;codecs=aac`	Matroska (mkv/mka)	AAC 70 kbit/s
`audio/x-matroska;codecs=flac`	Matroska (mkv/mka)	FLAC 24000 Hz
`audio/x-matroska;codecs=opus`	Matroska (mkv/mka)	OPUS 32 kbit/s

We recommend the following formats as good tradeoffs between quality and bandwidth:

OPUS (WebM): 32 kbps, recommended for low bandwidth scenarios (default)
PCM 24 kHz: 384 kbps, high quality

Available options:

audio/flac,

video/mp2t;codecs=aac,

video/mp2t;codecs=opus,

audio/ogg,

audio/ogg;codecs=flac,

audio/ogg;codecs=opus,

audio/opus,

audio/pcm;encoding=alaw;rate=8000,

audio/pcm;encoding=ulaw;rate=8000,

audio/pcm;encoding=s16le;rate=16000,

audio/pcm;encoding=s16le;rate=24000,

audio/webm,

audio/webm;codecs=opus,

audio/x-matroska;codecs=aac,

audio/x-matroska;codecs=flac,

audio/x-matroska;codecs=opus

Example:

"audio/webm;codecs=opus"

target_media_voice

enum<string>

(EAP) Target audio voice selection for synthesized speech. The default voice is language dependent.

Available options:

male,

female

Example:

"female"

glossary_id

string

A unique ID assigned to a glossary.

Example:

"def3a26b-3e84-45b3-84ae-0c0aaf3525f7"

formality

enum<string>

default:default

Sets whether the translated text should lean towards formal or informal language. Possible options are:

default - use the default formality for the target language
formal or more - use more formal language
informal or less - use more informal language

Available options:

default,

formal,

more,

informal,

less

Example:

"formal"

Response

Successfully obtained streaming URL and token.

streaming_url

string

required

The WebSocket URL to use for establishing the stream connection.

Example:

"wss://api.deepl.com/v3/voice/realtime/connect"

token

string

required

A unique ephemeral token for authentication with the streaming endpoint. Pass it as a query parameter when connecting to the streaming URL. The token is valid only for a short time and for a single use.
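A minimal sketch of attaching the token to the streaming URL follows. The query parameter name `token` is an assumption (this page does not name it), and `build_stream_url` is a hypothetical helper. The values are the example values from this page.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_stream_url(streaming_url: str, token: str, param_name: str = "token") -> str:
    """Append the token as a URL-encoded query parameter."""
    parts = urlsplit(streaming_url)
    query = urlencode({param_name: token})
    if parts.query:
        # Keep any query string already present on the streaming URL.
        query = f"{parts.query}&{query}"
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


url = build_stream_url(
    "wss://api.deepl.com/v3/voice/realtime/connect",
    "VGhpcyBpcyBhIGZha2UgdG9rZW4K",
)
print(url)
```

Since the token is single-use and short-lived, request a new session rather than reusing a URL built earlier.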

Example:

"VGhpcyBpcyBhIGZha2UgdG9rZW4K"

session_id

string

Internal use only. A unique identifier for the requested stream.

Example:

"4f911080-cfe2-41d4-8269-0e6ec15a0354"
