Example: Websocket example, synchronous client
This example shows:
How to open a websocket connection using a context manager.
How to generate a take, specifying status and audio callbacks.
The signature of each of these callbacks and how to interpret their arguments.
How to make a request with and without chunks enabled. (Add the argument --chunks to enable them.)
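The argument combinations that the audio callback can receive can be summarized in a small helper. The sketch below is not part of the Daisys client: classify_audio_event and its labels are hypothetical names, and the rules are inferred from the example script and the output shown on this page.

```python
from typing import Optional


def classify_audio_event(chunk_id: Optional[int], audio: Optional[bytes]) -> str:
    """Label one audio callback invocation (hypothetical helper, not a Daisys API)."""
    if audio is not None:
        # Audio is present: a whole part (no chunking) or one chunk of a part.
        return 'whole_part' if chunk_id is None else 'chunk'
    # No audio: a positive chunk_id closes the current part,
    # while 0 or None means no more audio will arrive for the take.
    return 'end_of_part' if chunk_id else 'end_of_take'


# Argument combinations of the kinds seen in the example output.
print(classify_audio_event(None, b'RIFF'))        # part received whole
print(classify_audio_event(3, b'\x00' * 4096))    # one chunk of a part
print(classify_audio_event(30, None))             # chunked part finished
print(classify_audio_event(0, None))              # take finished (chunked)
print(classify_audio_event(None, None))           # take finished (not chunked)
```

Keeping this decision in one place makes it easier to see why the example script branches first on audio and only then on chunk_id.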
Example output
$ python3 -m examples.websocket_example
Found Daisys Speak API version=1 minor=0
Status.WAITING
Status.STARTED
[0.739s] Received part_id=0 (chunk_id=None) for take_id='t01jqrjc9bfx8z0w6zarf8hcq8y' with audio length 235564
Read 235564 bytes of wav data, wrote "websocket_part1.wav".
Status.PROGRESS_50
[1.166s] Received part_id=1 (chunk_id=None) for take_id='t01jqrjc9bfx8z0w6zarf8hcq8y' with audio length 106540
Read 106540 bytes of wav data, wrote "websocket_part2.wav".
[1.166s] Received part_id=2 (chunk_id=None) for take_id='t01jqrjc9bfx8z0w6zarf8hcq8y' with audio length (empty -- done receiving)
Status.READY
Deleting take t01jqrjc9bfx8z0w6zarf8hcq8y: True
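Because each part arrives with its own WAV header, the bytes of a part received whole can be inspected with Python's standard wave module before, or instead of, writing them to disk. The following is a minimal sketch that uses a synthetic WAV built in memory; the helper names are hypothetical, and the 16-bit mono 22050 Hz format is an assumption for illustration, not a statement about what the Daisys API returns. It also assumes the header states the data length, which has not been checked here for chunked streams.

```python
import io
import wave


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Return the duration of a complete in-memory WAV file."""
    with wave.open(io.BytesIO(wav_bytes), 'rb') as w:
        return w.getnframes() / w.getframerate()


def make_silent_wav(seconds: float, rate: int = 22050) -> bytes:
    """Build a silent 16-bit mono WAV, standing in for a received part."""
    buf = io.BytesIO()
    with wave.open(buf, 'wb') as w:
        w.setnchannels(1)
        w.setsampwidth(2)   # 2 bytes per sample = 16-bit audio
        w.setframerate(rate)
        w.writeframes(b'\x00\x00' * int(seconds * rate))
    return buf.getvalue()


part = make_silent_wav(0.5)
print(f'{len(part)} bytes, {wav_duration_seconds(part):.2f} seconds')
```

In the example script, the same function could be applied to the audio argument of a non-chunked callback.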
Example output (chunks enabled)
$ python3 -m examples.websocket_example --chunks
Found Daisys Speak API version=1 minor=0
Status.WAITING
Status.STARTED
[0.311s] Received part_id=0 (chunk_id=0) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4140
[0.323s] Received part_id=0 (chunk_id=1) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.335s] Received part_id=0 (chunk_id=2) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.347s] Received part_id=0 (chunk_id=3) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.359s] Received part_id=0 (chunk_id=4) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.372s] Received part_id=0 (chunk_id=5) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.385s] Received part_id=0 (chunk_id=6) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.396s] Received part_id=0 (chunk_id=7) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.409s] Received part_id=0 (chunk_id=8) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.421s] Received part_id=0 (chunk_id=9) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.433s] Received part_id=0 (chunk_id=10) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.445s] Received part_id=0 (chunk_id=11) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.457s] Received part_id=0 (chunk_id=12) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.469s] Received part_id=0 (chunk_id=13) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.479s] Received part_id=0 (chunk_id=14) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.489s] Received part_id=0 (chunk_id=15) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.500s] Received part_id=0 (chunk_id=16) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.510s] Received part_id=0 (chunk_id=17) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.520s] Received part_id=0 (chunk_id=18) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.530s] Received part_id=0 (chunk_id=19) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.540s] Received part_id=0 (chunk_id=20) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.550s] Received part_id=0 (chunk_id=21) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.560s] Received part_id=0 (chunk_id=22) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.570s] Received part_id=0 (chunk_id=23) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.580s] Received part_id=0 (chunk_id=24) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.590s] Received part_id=0 (chunk_id=25) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.600s] Received part_id=0 (chunk_id=26) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.610s] Received part_id=0 (chunk_id=27) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.621s] Received part_id=0 (chunk_id=28) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.631s] Received part_id=0 (chunk_id=29) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 2560
[0.631s] Received part_id=0 (chunk_id=30) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length (empty -- done receiving)
Read 121388 bytes of wav data, wrote "websocket_part1.wav".
Status.PROGRESS_50
[0.979s] Received part_id=1 (chunk_id=0) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4140
[0.989s] Received part_id=1 (chunk_id=1) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[0.998s] Received part_id=1 (chunk_id=2) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[1.007s] Received part_id=1 (chunk_id=3) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[1.018s] Received part_id=1 (chunk_id=4) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[1.028s] Received part_id=1 (chunk_id=5) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[1.038s] Received part_id=1 (chunk_id=6) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[1.048s] Received part_id=1 (chunk_id=7) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[1.058s] Received part_id=1 (chunk_id=8) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[1.069s] Received part_id=1 (chunk_id=9) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[1.079s] Received part_id=1 (chunk_id=10) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[1.089s] Received part_id=1 (chunk_id=11) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[1.100s] Received part_id=1 (chunk_id=12) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 4096
[1.109s] Received part_id=1 (chunk_id=13) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length 2048
[1.110s] Received part_id=1 (chunk_id=14) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length (empty -- done receiving)
Read 55340 bytes of wav data, wrote "websocket_part2.wav".
[1.110s] Received part_id=2 (chunk_id=0) for take_id='t01jqrjq6xs1vbkdm1493e60gv2' with audio length (empty -- done receiving)
Status.READY
Deleting take t01jqrjq6xs1vbkdm1493e60gv2: True
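The chunk lengths in this output can be checked against the reported file sizes. The sketch below copies the lengths from the log above and sums them per part. The first chunk of each part is 44 bytes larger than the 4096-byte chunks that follow, which would fit a WAV header travelling with the first chunk; that reading is an inference from these numbers, not documented behavior.

```python
# Chunk lengths copied from the chunked example output above.
part1_chunks = [4140] + [4096] * 28 + [2560]   # chunk_id 0..29 of part_id=0
part2_chunks = [4140] + [4096] * 12 + [2048]   # chunk_id 0..13 of part_id=1

# File sizes reported by the "Read ... bytes of wav data" lines.
reported = {'websocket_part1.wav': 121388, 'websocket_part2.wav': 55340}

for name, chunks in [('websocket_part1.wav', part1_chunks),
                     ('websocket_part2.wav', part2_chunks)]:
    total = sum(chunks)
    print(f'{name}: {len(chunks)} chunks, {total} bytes, '
          f'matches log: {total == reported[name]}')
```

Both totals agree with the log, so the written files are the plain concatenation of the chunks of each part.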
examples/websocket_example.py
import sys, os, time
from typing import Optional
from daisys import DaisysAPI
from daisys.v1.speak import (DaisysWebsocketGenerateError, HTTPStatusError, Status, TakeResponse,
                             StreamOptions, StreamMode)

# Override DAISYS_EMAIL and DAISYS_PASSWORD with your details!
EMAIL = os.environ.get('DAISYS_EMAIL', 'user@example.com')
PASSWORD = os.environ.get('DAISYS_PASSWORD', 'pw')

# Please see tokens_example.py for how to use an access token instead of a password.

def main(chunks):
    with DaisysAPI('speak', email=EMAIL, password=PASSWORD) as speak:
        print('Found Daisys Speak API', speak.version())

        # A buffer to receive parts; we initialize with a single empty bytes()
        # because we will use it to accumulate chunks of the current wav file
        # there. In total we will end with a list of wav files, one for each
        # part. Parts are bits of speech, usually full sentences, that end with
        # silence.
        audio_wavs = [bytes()]

        # Assume at least one voice is available
        voice = speak.get_voices()[0]

        with speak.websocket(voice_id=voice.voice_id) as ws:
            # Flags we can use to only wait on our one take request; we wait
            # until the take is READY, and we also wait until we are done
            # receiving all audio parts.
            done = False
            ready = False

            # Time the latency from when we submit the request until each part
            # is received.
            t0 = time.time()

            # The audio callback receives "parts" consisting of audio .wav files
            # with WAV headers on each part. Depending on the stream settings,
            # the file may be divided into chunks, where chunk_id==None indicates
            # the last chunk of a part. If audio==None, then no more parts will
            # arrive for that take_id.
            def audio_cb(request_id: int, take_id: str, part_id: int, chunk_id: Optional[int],
                         audio: Optional[bytes]):
                nonlocal done

                # Report timing info and function arguments
                print(f'[{time.time()-t0:0.3f}s] Received {part_id=} ({chunk_id=}) for {take_id=} '
                      'with audio length', len(audio) if audio else '(empty -- done receiving)')

                # We only requested one take_id; the take_id is generated by the
                # Daisys API, so we do not know it until the first status
                # message arrives. Therefore we can check that the request_id
                # is the expected one.
                assert request_id == generate_request_id
                assert generated_take is None or take_id == generated_take.take_id

                if audio is None:
                    # If stream is done for this part
                    if chunk_id in [0, None]:
                        # If we have any audio data, write out the last file
                        if len(audio_wavs[-1]) > 0:
                            with open(f'websocket_part{len(audio_wavs)}.wav', 'wb') as f:
                                f.write(audio_wavs[-1])
                                print(f'Read {len(audio_wavs[-1])} bytes of wav data, wrote "{f.name}".')
                        # Flag that we are done receiving audio
                        done = True

                    # If we are receiving the last chunk of a part
                    elif chunk_id > 0:
                        # Write out the part
                        with open(f'websocket_part{len(audio_wavs)}.wav', 'wb') as f:
                            f.write(audio_wavs[-1])
                            print(f'Read {len(audio_wavs[-1])} bytes of wav data, wrote "{f.name}".')

                        # Start a new part
                        audio_wavs.append(bytes())

                # Otherwise append the chunk.
                else:
                    audio_wavs[-1] = audio_wavs[-1] + audio

                    # If non-chunked stream, the part is ended immediately
                    if chunk_id is None:
                        # If we have any audio data, write out the file
                        with open(f'websocket_part{len(audio_wavs)}.wav', 'wb') as f:
                            f.write(audio_wavs[-1])
                            print(f'Read {len(audio_wavs[-1])} bytes of wav data, wrote "{f.name}".')

                        # Start a new part
                        audio_wavs.append(bytes())

            # The status callback is called every time the take's status
            # changes. Here we use it to end the update loop.
            def status_cb(request_id: int, take: TakeResponse):
                nonlocal ready, generated_take
                assert request_id == generate_request_id
                generated_take = take
                print(take.status)
                if take.status == Status.READY:
                    ready = True

            # Submit a request to generate a take over the websocket connection.
            generate_request_id = ws.generate_take(
                voice_id=voice.voice_id,
                text='Hello from Daisys websockets! How may I help you?',
                status_callback=status_cb,
                audio_callback=audio_cb,

                # Optional
                stream_options=StreamOptions(mode=StreamMode.CHUNKS) if chunks else None,
            )

            # Will be filled in by callbacks. On submitting the generate
            # request, we do not yet know what take_id will be assigned so we
            # must discover it by means of the status callback.
            generated_take = None

            # We loop on the websocket, waiting up to 5 seconds per update,
            # and end when the take has been set to READY and all audio has
            # been received, or after 60 seconds in total. The update waits
            # 1 second by default; here we set it to 5 seconds, but it can
            # also wait forever by setting timeout to None or be made a
            # non-blocking operation by setting timeout to 0.
            # (Important: in the async client, timeout=0 leads to TimeoutError;
            # it cannot be used for non-blocking operations with asyncio.)
            while not (ready and done) and (time.time() - t0) < 60:
                try:
                    ws.update(timeout=5)
                except DaisysWebsocketGenerateError as e:
                    # As opposed to other websocket errors, if a generate error
                    # occurs it does not necessarily mean we want to close the
                    # stream.
                    print(e)

                    # In this example, however, we actually do, because we only
                    # requested a single take, so stop here.
                    break

        # Delete the take
        if generated_take:
            print(f'Deleting take {generated_take.take_id}:', speak.delete_take(generated_take.take_id))

if __name__=='__main__':
    try:
        main('--chunks' in sys.argv[1:])
    except HTTPStatusError as e:
        try:
            print(f'HTTP error status {e.response.status_code}: {e.response.json()["detail"]}, {e.request.url}')
        except Exception:
            # The response body was not JSON or had no "detail" field.
            print(f'HTTP error status {e.response.status_code}: {e.response.text}, {e.request.url}')
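The accumulation logic in audio_cb can be exercised without a network connection by feeding it recorded callback arguments. The sketch below isolates that logic in a hypothetical PartAssembler class (not part of the Daisys client) and keeps completed parts in memory instead of writing files. The event sequences mimic the shapes seen in the example output, with made-up payload bytes. Unlike the script above, this sketch skips empty parts.

```python
from typing import List, Optional


class PartAssembler:
    """Collects audio callback events into complete parts (hypothetical helper)."""

    def __init__(self) -> None:
        self.parts: List[bytes] = []   # completed parts, in arrival order
        self.done = False              # True once no more audio will arrive
        self._current = bytearray()    # chunks of the part in progress

    def _finish_part(self) -> None:
        # Move the part in progress to the completed list, if it has any data.
        if self._current:
            self.parts.append(bytes(self._current))
            self._current = bytearray()

    def on_audio(self, part_id: int, chunk_id: Optional[int],
                 audio: Optional[bytes]) -> None:
        if audio is None:
            # End marker: close any part still in progress.
            self._finish_part()
            # A chunk_id of 0 or None means the whole take is finished.
            if not chunk_id:
                self.done = True
        else:
            self._current += audio
            # Without chunking, each callback carries one complete part.
            if chunk_id is None:
                self._finish_part()


# Simulated chunked stream: (part_id, chunk_id, audio)
chunked = PartAssembler()
for event in [(0, 0, b'RIFFaa'), (0, 1, b'bb'), (0, 2, None),
              (1, 0, b'RIFFcc'), (1, 1, None), (2, 0, None)]:
    chunked.on_audio(*event)

# Simulated non-chunked stream carrying the same audio
whole = PartAssembler()
for event in [(0, None, b'RIFFaabb'), (1, None, b'RIFFcc'), (2, None, None)]:
    whole.on_audio(*event)

print(chunked.parts == whole.parts, chunked.done, whole.done)  # prints "True True True"
```

Separating the assembly from the file writing also makes it straightforward to unit test both stream modes with the same expectations.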