Siraj
Siraj•9mo ago

Streaming without realtime database updates.

is there a way to stream from an OpenAI response directly to the frontend app without using the realtime database? because the reactive database approach consumes a lot of bandwidth, which isn't ideal for us.
21 Replies
lee
lee•9mo ago
sorry about kapa.ai not being helpful. there is currently no way to stream data out of an action directly. i've worked on this and it's not done yet. the only ways to get intermediate data out of an action are to store it with a mutation or send it to some other server with a fetch (which does support streaming requests and responses)
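As a sketch of the "send it to some other server with a fetch" option: fetch response bodies are `ReadableStream`s, so the receiving side can consume chunks as they arrive. The helper below is illustrative (the endpoint and callback names are assumptions, not anything Convex-specific):

```typescript
// Sketch: consume a streamed HTTP response body chunk by chunk.
// Works with any server that streams its response (e.g. an
// OpenAI-compatible API); pass `res.body` from a fetch call.
async function consumeStream(
  body: ReadableStream<Uint8Array>,
  onChunk: (text: string) => void
): Promise<string> {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let full = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // { stream: true } handles multi-byte characters split across chunks
    const text = decoder.decode(value, { stream: true });
    full += text;
    onChunk(text);
  }
  return full;
}
```

Usage would look like `const res = await fetch(url); const full = await consumeStream(res.body!, (t) => render(t));` — rendering each chunk as it arrives, with the full text available at the end for a single DB write.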
ian
ian•9mo ago
Mitigations for a lot of data writes I can think of:
1. Stream it one line or sentence at a time rather than per token.
2. Stream it from the API using a Next.js API route, then write the result to the DB at the end (but accept that you might never write it if the request gets interrupted). You can use the streamed response as ephemeral data on the client until the message comes down after being written, like an optimistic update. Note: in this case you can't see the stream on multiple clients or after a page refresh.
3. Paginate the data from the frontend, so each query page isn't reloading as much data as the message refreshes. You could even have a separate subscription / query for the "in-progress" message, so it isn't in the default query fetching path.
#3 is related to concepts from https://stack.convex.dev/queries-that-scale
Queries that scale
As your app grows from tens to hundreds to thousands of users, there are some techniques that will keep your database queries snappy and efficient. I’...
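Mitigation #1 mostly comes down to buffering tokens and deciding when to flush. A minimal sketch of that boundary logic (how you persist the flushed text — e.g. via a Convex mutation — depends on your schema, so that part is left out):

```typescript
// Sketch: buffer streamed tokens and only persist at sentence
// boundaries, cutting writes from per-token to per-sentence.
function splitCompleteSentences(buffer: string): {
  complete: string; // everything up to and including the last . ! or ?
  rest: string;     // the still-unfinished tail, kept in the buffer
} {
  const idx = Math.max(
    buffer.lastIndexOf("."),
    buffer.lastIndexOf("!"),
    buffer.lastIndexOf("?")
  );
  if (idx === -1) return { complete: "", rest: buffer };
  return {
    complete: buffer.slice(0, idx + 1),
    rest: buffer.slice(idx + 1),
  };
}
```

In the action's token loop you would append each token to a buffer, and whenever `complete` is non-empty, write it with one mutation and keep `rest` as the new buffer.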
Siraj
SirajOP•9mo ago
Thanks @ian! I already tried #1, but it was still taking a lot of bandwidth, and I tried #3 too; unfortunately it affected UX and would require more frontend work to fix. Before trying #2 I wanted to confirm with you: since @lee mentioned that you're already working on this, do we have an estimate for when it could be ready?
ian
ian•9mo ago
I can't remember the timeline on streaming from http actions. @Indy might have a better idea. But to be clear, this would not be streaming from a regular action / mutation, and you'd want to store the result in the DB alongside it (either writing the incremental values or just the whole value at the end), and filter that result out of the client's query, using an index, until it's done.
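The "filter it out using an index" idea amounts to marking the row with a status flag and excluding unfinished rows from the client-facing query. In Convex you'd express the filter with an index, e.g. `ctx.db.query("messages").withIndex("by_complete", (q) => q.eq("isComplete", true))`; the field and index names here are illustrative. The same filter, shown as a pure function over fetched docs:

```typescript
// Sketch: only finished messages are visible to the client's
// subscription; the in-progress row stays hidden until the action
// flips isComplete. Field names are assumptions, not a real schema.
type MessageDoc = { _id: string; body: string; isComplete: boolean };

function finishedMessages(docs: MessageDoc[]): MessageDoc[] {
  return docs.filter((d) => d.isComplete);
}
```

Using the index (rather than a post-hoc `.filter()`) matters at scale, since it avoids reading the in-progress rows at all.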
Indy
Indy•9mo ago
No timeline on streaming http actions. I vaguely remember it being somewhat non-trivial when we discussed it so it'll take a moment before we figure this out. But as Ian said you can probably try this with a different http service. The important part is to save the result to the DB at some regular interval.
Siraj
SirajOP•9mo ago
okay! thank you 🙂
Michael Rea
Michael Rea•9mo ago
I've also been looking at this, as my bandwidth has been eaten up by this. Number 2 seems like what I'll go for.
ian
ian•9mo ago
We have streaming from http actions working internally on a branch as of today. No guarantee on when it'll ship, but it's progress!
Michael Rea
Michael Rea•9mo ago
Awesome to hear that it's in the pipeline. One of the reasons you guys are awesome is your awareness of generative AI, vector DBs, etc.
Abhishek
Abhishek•8mo ago
@ian @Indy will streaming be live soon? I'm revamping the Convex code and thinking of going with the 2nd approach, but if HTTP action streaming is coming soon then I can wait. Thanks
Indy
Indy•8mo ago
In progress! I'll check with the team and get back to you.
Abhishek
Abhishek•8mo ago
Great THANKS
Indy
Indy•8mo ago
There are some edge cases we're working through to make sure it's working well. Hopefully it'll ship in a few short weeks.
ian
ian•8mo ago
@Siraj @Michael Rea @Abhishek we now have http response streaming - so you can stream directly to a client and only periodically write to the db (or not at all). Check it out: https://news.convex.dev/announcing-convex-1-12/ And sample code: https://github.com/sshader/streaming-chat-gpt/blob/sshader-streaming/convex/http.ts
Convex News
Announcing Convex 1.12
We’ve had a busy month, and we have a bunch of different improvements to share! Support for Svelte, Bun, and Vue! We have a few more logos under our quickstarts section – we've added guides for Svelte, Bun, and Vue including our first community-maintained client library! HTTP action response streaming
GitHub
streaming-chat-gpt/convex/http.ts at sshader-streaming · sshader/st...
An example of streaming ChatGPT via the OpenAI v4.0 node SDK. - sshader/streaming-chat-gpt
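The linked sample is the authoritative version; as a rough sketch of the shape, the core of an HTTP streaming response is turning the model's chunk iterator into a `ReadableStream` while persisting snapshots only periodically. The helper below is illustrative (`persist` stands in for whatever mutation you'd call; this is not the Convex API itself):

```typescript
// Sketch: wrap an async iterator of text chunks (e.g. OpenAI deltas)
// in a ReadableStream suitable for an HTTP response body, calling
// `persist` with the accumulated text every `flushEvery` chunks so
// the DB sees periodic snapshots instead of per-token writes.
// (A production version would use pull() to respect backpressure.)
function streamWithPeriodicWrites(
  chunks: AsyncIterable<string>,
  persist: (soFar: string) => Promise<void>,
  flushEvery = 10
): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  let soFar = "";
  let count = 0;
  return new ReadableStream<Uint8Array>({
    async start(controller) {
      for await (const chunk of chunks) {
        soFar += chunk;
        controller.enqueue(encoder.encode(chunk));
        if (++count % flushEvery === 0) await persist(soFar);
      }
      // final write, unless the last chunk already triggered one
      if (count % flushEvery !== 0) await persist(soFar);
      controller.close();
    },
  });
}
```

Inside an http action you'd then return that stream in the response body, so the client sees tokens immediately while the DB only gets occasional writes (or just the final one).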
Abhishek
Abhishek•8mo ago
Letss gooo @ian, great work, thank you so much
ampp
ampp•8mo ago
I haven't had a chance to get into the llama-farm example yet, but does its streaming work differently, or the old way, since it was released before 1.12?
ian
ian•8mo ago
It's using the normal flow - it writes to the DB at the end of sentences/clauses, since the user request isn't being piped all the way to the worker. I wrote this post on the implementation: https://stack.convex.dev/implementing-work-stealing
Implementing work stealing with a reactive database
Implementing "work stealing" - a workload distribution strategy - using Convex's reactive database.
Siraj
SirajOP•8mo ago
That's supercool! Thanks @ian & team convex 🙂
Michael Rea
Michael Rea•7mo ago
Yuss, awesome! Been waiting for this!!
ian
ian•7mo ago
Sarah recently wrote a post on it: https://stack.convex.dev/ai-chat-with-http-streaming and a quick video of using the ai npm library: https://www.youtube.com/watch?v=kP0HYN6NpA0
AI Chat with HTTP Streaming
By leveraging HTTP actions with streaming, this chat app balances real-time responsiveness with efficient bandwidth usage. Users receive character-by-...
Convex
YouTube
Build your own ChatGPT in 5 Minutes
Convex recently launched its support for the Vercel AI SDK, so we wanted to show off what you could do with it. Sarah goes over setting up your own AI Chat bot and how to use Hono for CORS middleware.
Michael Rea
Michael Rea•7mo ago
Thanks guys, I'm looking to do something similar but with audio streams
