Omar
Omar•2w ago

Just built this. Using GPU cloud inference for the transcription + generation, so that's another 2 WebSockets. Convex is great for real-time sync of messages between client and server.
3 Replies
daleal
daleal•2w ago
How did you achieve this? I haven't been able to upgrade the HTTP connection to a WebSocket with the client 😭
Omar
OmarOP•2w ago
The STT/TTS models are deployed via Python/Rust to RunCloud/Modal instances. I don't think HTTP Actions can be upgraded to a WebSocket.
Barrel Of Lube
Barrel Of Lube•2w ago
You can use the fly.io GPU cloud to deploy a Python server/Docker image and autoscale it based on network or compute usage. You'll have to email them for GPU access, but they reply within a couple of hours. They have a suspend state with < 300ms resume, but for GPUs it's around 2s-5s on average.
