Daisys API JSON models¶
Pydantic classes representing the JSON interface for the Daisys API.
- class daisys.v1.speak.models.AffectProsody(*, pitch: int, pace: int, valence: int, dominance: int, arousal: int)¶
- Prosody features based on analysis of affect. See also parent class ProsodyFeatures for other fields. - valence¶
- The valence; -10 for negativity, 10 for positivity, 0 for neutral. - Type:
- int 
 
 - arousal¶
- The arousal; -10 for unexcited, 10 for very excited, 0 for neutral. - Type:
- int 
 
 - dominance¶
- The dominance; -10 for docile, 10 for commanding, 0 for neutral. - Type:
- int 
 
 - model_config: ClassVar[ConfigDict] = {}¶
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class daisys.v1.speak.models.ProsodyFeatures(*, pitch: int, pace: int)¶
- Base prosody features supported by all models. - pitch¶
- The normalized pitch; -10 to 10, where 0 is a neutral pitch. - Type:
- int 
 
 - pace¶
- The normalized pace; -10 to 10, where 0 is a neutral pace. - Type:
- int 
 
 - model_config: ClassVar[ConfigDict] = {}¶
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
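As a rough illustration of the normalized ranges described above, the sketch below uses a plain dataclass as a stand-in for ProsodyFeatures and rejects values outside -10 to 10. The name `ProsodySketch` and the client-side range check are assumptions made for illustration; this page does not state whether the real Pydantic model enforces the range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProsodySketch:
    """Hypothetical stand-in for ProsodyFeatures (not the library class)."""

    pitch: int  # -10 to 10, where 0 is a neutral pitch
    pace: int   # -10 to 10, where 0 is a neutral pace

    def __post_init__(self) -> None:
        # Check each field against the documented normalized range.
        for name in ("pitch", "pace"):
            value = getattr(self, name)
            if not -10 <= value <= 10:
                raise ValueError(f"{name} must be within -10..10, got {value}")


neutral = ProsodySketch(pitch=0, pace=0)
slightly_faster = ProsodySketch(pitch=0, pace=4)
```

Checking the range before sending a request gives an earlier and more specific error than waiting for the server to respond.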
 
- daisys.v1.speak.models.ProsodyFeaturesUnion¶
- A union type representing different prosody feature variations. - alias of SimpleProsody | AffectProsody | SignalProsody
- class daisys.v1.speak.models.ProsodyType(value)¶
- An enum representing different prosody feature types. - Not all models accept all prosody types. See the prosody_types field of TTSModel. - SIMPLE¶
- corresponds with SimpleProsody 
 - AFFECT¶
- corresponds with AffectProsody 
 - SIGNAL¶
- corresponds with SignalProsody 
 - static from_class(prosody: SimpleProsody | AffectProsody | SignalProsody)¶
- Return an enum value based on the prosody class provided. - Parameters:
- prosody – The prosody object from which to derive the enum value. 
 
 - prosody(**kwargs)¶
- Return a prosody object corresponding to this value, initialized with the given arguments. 
 
- class daisys.v1.speak.models.SignalProsody(*, pitch: int, pace: int, tilt: int, pitch_range: int)¶
- Prosody features based on signal analysis. See also parent class ProsodyFeatures for other fields. - tilt¶
- The normalized spectral tilt; -10 for flat, 10 for bright, 0 for neutral. - Type:
- int 
 
 - pitch_range¶
- The normalized pitch range; -10 for flat, 10 for highly varied pitch, 0 for neutral. - Type:
- int 
 
 - model_config: ClassVar[ConfigDict] = {}¶
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class daisys.v1.speak.models.SimpleProsody(*, pitch: int, pace: int, expression: int)¶
- Simplified prosody features, supported by all models. See also parent class ProsodyFeatures for other fields. - expression¶
- The normalized “expression”; -10 to 10, where 0 is neutral. - Type:
- int 
 
 - model_config: ClassVar[ConfigDict] = {}¶
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class daisys.v1.speak.models.Status(value)¶
- Represents the status of a take or voice generation process. - WAITING¶
- Item is waiting to be processed. 
 - STARTED¶
- Processing has started for this item. 
 - PROGRESS_25¶
- Item has been 25% processed. 
 - PROGRESS_50¶
- Item has been 50% processed. 
 - PROGRESS_75¶
- Item has been 75% processed. 
 - READY¶
- Item is ready to be used; for takes, audio is available. 
 - ERROR¶
- An error occurred during processing of this item. 
 - TIMEOUT¶
- Processing did not finish for this item. 
 - Note that TIMEOUT is used for very long intervals; it does not indicate a few seconds or minutes, but rather that an item has been in the queue for more than a day and has therefore been removed. It should only be considered to represent circumstances where processing errors were not detected by normal means.
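The status values above split naturally into "still in progress" and "finished". The sketch below shows one way a client might decide when to stop polling. `StatusSketch` is a hypothetical stand-in for Status, and its lower-case string values are assumed rather than taken from the library.

```python
from enum import Enum


class StatusSketch(str, Enum):
    """Hypothetical stand-in for Status; the string values are assumed."""

    WAITING = "waiting"
    STARTED = "started"
    PROGRESS_25 = "progress_25"
    PROGRESS_50 = "progress_50"
    PROGRESS_75 = "progress_75"
    READY = "ready"
    ERROR = "error"
    TIMEOUT = "timeout"


# States after which no further status change is expected.
FINISHED = frozenset(
    {StatusSketch.READY, StatusSketch.ERROR, StatusSketch.TIMEOUT}
)


def is_finished(status: StatusSketch) -> bool:
    """Return True once polling for this item can stop."""
    return status in FINISHED


def has_result(status: StatusSketch) -> bool:
    """Only READY means the item can be used (for takes, audio is available)."""
    return status is StatusSketch.READY
```

Treating ERROR and TIMEOUT as finished but unsuccessful keeps the polling loop separate from the error handling.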
- class daisys.v1.speak.models.StreamMode(value)¶
- Whether websocket messages should contain whole parts or chunks of parts. - Note: upper case in Python, lower case in JSON. - Values:
- PARTS, CHUNKS 
 
- class daisys.v1.speak.models.StreamOptions(*, mode: StreamMode = StreamMode.PARTS)¶
- Options for streaming. - mode¶
- The streaming mode to use. 
 - model_config: ClassVar[ConfigDict] = {}¶
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
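The note "upper case in Python, lower case in JSON" can be pictured with a small string enum. This is a sketch using only the standard library: `StreamModeSketch` is a hypothetical stand-in for StreamMode, and the exact JSON spellings "parts" and "chunks" are inferred from that note rather than confirmed here.

```python
import json
from enum import Enum


class StreamModeSketch(str, Enum):
    """Hypothetical stand-in for StreamMode."""

    PARTS = "parts"    # member name is upper case, value is lower case
    CHUNKS = "chunks"


# Shape of a StreamOptions-like object, using the enum's JSON-side value.
options = {"mode": StreamModeSketch.CHUNKS.value}
encoded = json.dumps(options)

# Going the other way: look the member up by its lower-case value.
decoded_mode = StreamModeSketch(json.loads(encoded)["mode"])
```

The same pattern would apply to VoiceGender, which carries the same note about case.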
 
- class daisys.v1.speak.models.TTSModel(*, name: str, displayname: str, flags: list[str] = [], languages: list[str], genders: list[VoiceGender], styles: list[list[str]] = [], prosody_types: list[ProsodyType], voice_inputs: list[VoiceInputType] | None)¶
- Information about a speech model. - name¶
- The unique identifier of this model. - Type:
- str 
 
 - displayname¶
- A friendlier name that might contain spaces. - Type:
- str 
 
 - flags¶
- A list of flags that indicate some features of this model. - Type:
- list[str] 
 
 - languages¶
- A list of languages supported by this model. - Type:
- list[str] 
 
 - genders¶
- A list of genders supported by this model. - Type:
- list[daisys.v1.speak.models.VoiceGender] 
 
 - styles¶
- A list of style sets; each sublist is a list of mutually exclusive style tags. - Type:
- list[list[str]] 
 
 - prosody_types¶
- A list of which prosody types are supported by this model. - Type:
- list[daisys.v1.speak.models.ProsodyType] 
 
 - voice_inputs¶
- A list of which voice input types are supported by this model. - Type:
- list[daisys.v1.speak.models.VoiceInputType] | None 
 
 - model_config: ClassVar[ConfigDict] = {}¶
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
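Because each sublist of styles holds mutually exclusive tags, a client may want to check a style selection before using it. The sketch below is one possible check; the function name `check_styles` and the example tags are hypothetical and do not come from any real model.

```python
def check_styles(chosen: list[str], style_sets: list[list[str]]) -> list[str]:
    """Return a list of problems; an empty list means the selection is consistent."""
    problems = []
    known = {tag for group in style_sets for tag in group}

    # Every chosen tag must appear in at least one style set.
    for tag in chosen:
        if tag not in known:
            problems.append(f"unknown style: {tag}")

    # At most one tag may be chosen from each mutually exclusive set.
    for group in style_sets:
        picked = [tag for tag in chosen if tag in group]
        if len(picked) > 1:
            problems.append(f"mutually exclusive styles: {picked}")

    return problems


# Hypothetical style sets, shaped like TTSModel.styles.
example_sets = [["narration", "conversation"], ["soft", "loud"]]

ok = check_styles(["narration", "soft"], example_sets)
clash = check_styles(["narration", "conversation"], example_sets)
```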
 
- class daisys.v1.speak.models.TakeGenerate(*, text: str, override_language: str | None = None, style: list[str] | None = None, prosody: SimpleProsody | AffectProsody | SignalProsody | None = None, status_webhook: Webhook | None = None, done_webhook: Webhook | None = None, user_data: Annotated[str, StringConstraints(strip_whitespace=None, to_upper=None, to_lower=None, strict=None, min_length=None, max_length=256, pattern=None)] | int | float | None = None, voice_id: str)¶
- Parameters necessary to generate a “take”, an audio file containing an utterance of the given text by the given voice. See TakeGenerateWithoutVoice for documentation on the remaining fields. - voice_id¶
- The id of the voice to be used for generating audio. The voice is attached to a specific model. - Type:
- str 
 
 - model_config: ClassVar[ConfigDict] = {}¶
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class daisys.v1.speak.models.TakeGenerateWithoutVoice(*, text: str, override_language: str | None = None, style: list[str] | None = None, prosody: SimpleProsody | AffectProsody | SignalProsody | None = None, status_webhook: Webhook | None = None, done_webhook: Webhook | None = None, user_data: Annotated[str, StringConstraints(strip_whitespace=None, to_upper=None, to_lower=None, strict=None, min_length=None, max_length=256, pattern=None)] | int | float | None = None)¶
- Parameters necessary to generate a “take”, an audio file containing an utterance of the given text. No voice is provided here, for the purpose of embedding in VoiceGenerate for the voice example. - text¶
- The text that the voice should say. - Type:
- str 
 
 - override_language¶
- Normally a language classifier is used to detect the language of the speech; this allows for multilingual sentences. However, if the language should be enforced, it should be provided here. Currently accepted values are “nl-NL” and “en-GB”. - Type:
- str | None 
 
 - style¶
- A list of styles to enable when speaking. Note that most styles are mutually exclusive, so a list of 1 value should be provided. Accepted styles can be retrieved from the associated voice’s VoiceInfo.styles or the model’s TTSModel.styles field. Note that not all models support styles, thus this can be left empty if specific styles are not desired. - Type:
- list[str] | None 
 
 - prosody¶
- The characteristics of the desired speech not determined by the voice or style. Here you can provide a SimpleProsody; most models also accept the more detailed AffectProsody.
 - status_webhook¶
- An optional URL to be called using POST whenever the take’s status changes, with TakeResponse in the body content. - Type:
- daisys.v1.speak.models.Webhook | None 
 
 - done_webhook¶
- An optional URL to be called exactly once using POST when the take is READY, ERROR, or TIMEOUT, with TakeResponse in the body content. - Type:
- daisys.v1.speak.models.Webhook | None 
 
 - user_data¶
- An optional string (max 256 chars) or numerical value that can be attached to a take for use in user applications; for example, storing video timestamps, sentence index, or external database keys. - Type:
- str | int | float | None 
 
 - model_config: ClassVar[ConfigDict] = {}¶
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
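To make the take parameters more concrete, here is a minimal sketch that assembles a request body as a plain dictionary using the field names documented above. It assumes the JSON body mirrors those field names one to one. The helper `build_take_request`, the voice id, and the webhook URL are all hypothetical placeholders.

```python
import json
from typing import Optional, Union


def build_take_request(
    text: str,
    voice_id: str,
    style: Optional[list[str]] = None,
    done_webhook_url: Optional[str] = None,
    user_data: Union[str, int, float, None] = None,
) -> dict:
    """Assemble a dict shaped like the TakeGenerate fields (wire format assumed)."""
    # user_data strings are documented as limited to 256 characters.
    if isinstance(user_data, str) and len(user_data) > 256:
        raise ValueError("user_data strings are limited to 256 characters")

    body: dict = {"text": text, "voice_id": voice_id}
    if style is not None:
        body["style"] = style
    if done_webhook_url is not None:
        # The other Webhook fields (timestamp_ms, status_code) default to None.
        body["done_webhook"] = {"post_url": done_webhook_url}
    if user_data is not None:
        body["user_data"] = user_data
    return body


request = build_take_request(
    "Hello there.",
    voice_id="v-example",  # placeholder, not a real voice id
    done_webhook_url="https://example.com/hooks/take-done",
    user_data=17,  # e.g. a sentence index
)
request_json = json.dumps(request)
```

Optional fields are omitted rather than sent as null here; either choice should be checked against the actual API behaviour.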
 
- class daisys.v1.speak.models.TakeInfo(*, duration: int, audio_rate: int, normalized_text: list[str])¶
- Some information available when a take is READY, attached to the TakeResponse. - duration¶
- The length of the audio in samples. To get the length in seconds, divide by audio_rate. - Type:
- int 
 
 - audio_rate¶
- The number of samples per second in the audio. - Type:
- int 
 
 - normalized_text¶
- The text used for text-to-speech after normalization, i.e. translated from “as written” to “as spoken”. Provided as a list of sentences. - Type:
- list[str] 
 
 - model_config: ClassVar[ConfigDict] = {}¶
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
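Since duration is expressed in samples, converting it to seconds is a single division by audio_rate. The helper name below is hypothetical, and the sample rate in the usage line is only an illustrative value.

```python
def take_duration_seconds(duration: int, audio_rate: int) -> float:
    """Convert a TakeInfo-style duration in samples to seconds."""
    if audio_rate <= 0:
        raise ValueError("audio_rate must be a positive number of samples per second")
    return duration / audio_rate


# 66150 samples at an illustrative rate of 22050 samples per second.
seconds = take_duration_seconds(66150, 22050)  # 3.0
```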
 
- class daisys.v1.speak.models.TakeResponse(*, text: str, override_language: str | None = None, style: list[str] | None = None, prosody: SimpleProsody | AffectProsody | SignalProsody | None = None, status_webhook: Webhook | None = None, done_webhook: Webhook | None = None, user_data: Annotated[str, StringConstraints(strip_whitespace=None, to_upper=None, to_lower=None, strict=None, min_length=None, max_length=256, pattern=None)] | int | float | None = None, voice_id: str, take_id: str, status: Status, timestamp_ms: int, info: TakeInfo | None = None)¶
- Information about a take, returned during and after take generation. Also includes fields from TakeGenerate. - take_id¶
- The unique identifier of this take. - Type:
- str 
 
 - status¶
- The status of this take, whether it is ready, in error, or in progress. 
 - timestamp_ms¶
- The timestamp that this take generation was requested, in milliseconds since epoch. - Type:
- int 
 
 - model_config: ClassVar[ConfigDict] = {}¶
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
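The timestamp_ms field is given in milliseconds since epoch. Assuming this means the Unix epoch in UTC, it can be converted to a timezone-aware datetime as sketched below. The helper name is hypothetical.

```python
from datetime import datetime, timezone


def timestamp_ms_to_datetime(timestamp_ms: int) -> datetime:
    """Interpret milliseconds since the Unix epoch as an aware UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


requested_at = timestamp_ms_to_datetime(1500)
print(requested_at.isoformat())
```

Keeping the result timezone-aware avoids accidental comparisons against naive local times.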
 
- class daisys.v1.speak.models.Version(*, version: int, minor: int)¶
- Represents the version of the API. - version¶
- The major version number of the API. - Type:
- int 
 
 - minor¶
- The minor version number of the API. - Type:
- int 
 
 - model_config: ClassVar[ConfigDict] = {}¶
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class daisys.v1.speak.models.VoiceGender(value)¶
- Represents the gender of a voice. - Note: upper case in Python, lower case in JSON. - Values:
- MALE, FEMALE, NONBINARY 
 
- class daisys.v1.speak.models.VoiceGenerate(*, name: str, model: str, gender: VoiceGender, description: str | None = None, default_style: list[str] | None = None, default_prosody: SimpleProsody | AffectProsody | SignalProsody | None = None, example_take: TakeGenerateWithoutVoice | None = None, done_webhook: Webhook | None = None)¶
- Parameters necessary to generate a voice. - name¶
- A name to give the voice, may be any string, and does not need to be unique. - Type:
- str 
 
 - gender¶
- The gender of this voice. 
 - description¶
- A description of this voice. - Type:
- str | None 
 
 - default_style¶
- An optional list of styles to associate with this voice by default. It can be overridden by a take that uses this voice. Note that most styles are mutually exclusive, and not all models support styles. - Type:
- list[str] | None 
 
 - default_prosody¶
- An optional default prosody to associate with this voice. It can be overridden by a take that uses this voice. 
 - example_take¶
- Parameters for an example take to generate for this voice. If not provided, a default example text will be used, depending on the language of the model. - Type:
- daisys.v1.speak.models.TakeGenerateWithoutVoice | None 
 
 - done_webhook¶
- An optional URL to call using POST when the voice is available, with the response of VoiceInfo in the body content. This shall be called once, after the voice and example take have been generated. - Type:
- daisys.v1.speak.models.Webhook | None 
 
 - model_config: ClassVar[ConfigDict] = {}¶
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class daisys.v1.speak.models.VoiceInfo(*, name: str, model: str, gender: VoiceGender, description: str | None = None, default_style: list[str] | None = None, default_prosody: SimpleProsody | AffectProsody | SignalProsody | None = None, example_take: TakeGenerateWithoutVoice | None = None, done_webhook: Webhook | None = None, voice_id: str, status: Status, timestamp_ms: int, example_take_id: str | None = None)¶
- Information about a voice. - voice_id¶
- The unique identifier of this voice. - Type:
- str 
 
 - status¶
- The status of this voice, whether it is ready, in error, or in progress. 
 - timestamp_ms¶
- The timestamp that this voice generation was requested, in milliseconds since epoch. - Type:
- int 
 
 - example_take_id¶
- An optional identifier for a take that represents an example of this voice. - Type:
- str | None 
 
 - model_config: ClassVar[ConfigDict] = {}¶
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class daisys.v1.speak.models.VoiceUpdate(*, name: str | None = None, gender: VoiceGender | None = None, default_style: list[str] | None = None, default_prosody: SimpleProsody | AffectProsody | SignalProsody | None = None)¶
- Update parameters of a voice. - name¶
- A name to give the voice, may be any string, and does not need to be unique. - Type:
- str | None 
 
 - gender¶
- The gender of this voice. - Type:
- daisys.v1.speak.models.VoiceGender | None 
 
 - default_style¶
- An optional list of styles to associate with this voice by default. It can be overridden by a take that uses this voice. Note that most styles are mutually exclusive, and not all models support styles. - Type:
- list[str] | None 
 
 - default_prosody¶
- An optional default prosody to associate with this voice. It can be overridden by a take that uses this voice. 
 - model_config: ClassVar[ConfigDict] = {}¶
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class daisys.v1.speak.models.Webhook(*, post_url: str, timestamp_ms: int | None = None, status_code: int | None = None)¶
- Store information about a registered webhook and its status. - When specifying a webhook, only post_url needs to be provided. - post_url¶
- The URL to be called with POST. - Type:
- str 
 
 - timestamp_ms¶
- The time it was last called at, milliseconds since epoch. - Type:
- int | None 
 
 - status_code¶
- The HTTP status code of the last response from the webhook. - Type:
- int | None 
 
 - model_config: ClassVar[ConfigDict] = {}¶
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
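On the receiving side, a done_webhook is called with a TakeResponse in the body content. The sketch below shows how a handler might read such a body, assuming the body is JSON with `take_id` and `status` keys and that status values are lower case in JSON. The function name and the sample payload are hypothetical.

```python
import json

# Assumed JSON spellings of the READY, ERROR and TIMEOUT statuses.
FINAL_STATUSES = {"ready", "error", "timeout"}


def summarize_done_webhook(raw_body: bytes) -> tuple[str, bool]:
    """Return (take_id, succeeded) from a done_webhook POST body."""
    payload = json.loads(raw_body)
    status = str(payload["status"]).lower()
    if status not in FINAL_STATUSES:
        # done_webhook is documented to fire only for final statuses.
        raise ValueError(f"unexpected status for done_webhook: {status}")
    return payload["take_id"], status == "ready"


# Hypothetical body; a real TakeResponse carries more fields.
sample_body = b'{"take_id": "t-example", "status": "ready"}'
take_id, succeeded = summarize_done_webhook(sample_body)
```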