Advice on Convex & OpenAI Assistants API: Real-Time Reactivity, Data Redundancy, & Collaborative UX
I came across Ian Macartney’s post, "GPT Streaming With Persistent Reactivity," while exploring patterns for using Convex with OpenAI. Since the post is over a year old, I wanted to ask if the team has any new insights, particularly around collaborative user experiences powered by Convex and the OpenAI Assistants API.
While I was thinking about it, I felt uncomfortable about the idea of having two sources of truth. The Assistants API already stores a lot of information about threads, messages, and tool calls. It feels redundant to store the same information in Convex if I can access it through the API.
However, as Ian mentioned, browser-based HTTP streaming alone is unreliable for real-time reactivity, especially in a collaborative multi-user environment. A real-time database solution like Convex seems essential to achieve the required synchronization.
Another annoyance with the OpenAI Assistants API: the logic for processing messages and tool calls during streaming (as shown in the Quickstart (https://github.com/openai/openai-assistants-quickstart/blob/06fc2d444a5d41b574082080f4c7b2e48156b84f/app/components/chat.tsx#L191) ) can't be reused in later browser sessions, because messages and tool calls are served by different OpenAI API endpoints. During the stream they arrive interleaved; afterward they come back as separate lists. I managed to merge them by matching timestamps, but it feels wrong to maintain two distinct algorithms for handling the same output data.
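For illustration, here is a minimal sketch of that timestamp-merge approach. The type shapes are simplified assumptions (real OpenAI message and run-step objects carry many more fields); the only thing the merge actually relies on is `created_at`:

```typescript
// Simplified shapes for the two endpoint results (assumed, not the full API types).
type ThreadMessage = { id: string; created_at: number; role: string; content: string };
type RunStep = { id: string; created_at: number; type: "tool_calls" | "message_creation" };

type TimelineItem =
  | { kind: "message"; item: ThreadMessage }
  | { kind: "step"; item: RunStep };

// Merge the two lists into one chronological timeline, so a reloaded
// browser session can replay the same order the live stream produced.
function mergeTimeline(messages: ThreadMessage[], steps: RunStep[]): TimelineItem[] {
  const merged: TimelineItem[] = [
    ...messages.map((m) => ({ kind: "message" as const, item: m })),
    ...steps.map((s) => ({ kind: "step" as const, item: s })),
  ];
  merged.sort((a, b) => a.item.created_at - b.item.created_at);
  return merged;
}
```

A real implementation would also need to dedupe `message_creation` steps against the messages they created; this sketch only shows the ordering half of the problem.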
Therefore, I’m convinced that using Convex as a bridge between my client and OpenAI is the right choice for my use case.
(continues below)
Thanks for posting in <#1088161997662724167>.
Reminder: If you have a Convex Pro account, use the Convex Dashboard to file support tickets.
- Provide context: What are you trying to achieve, what is the end-user interaction, what are you seeing? (full error message, command output, etc.)
- Use search.convex.dev to search Docs, Stack, and Discord all at once.
- Additionally, you can post your questions in the Convex Community's <#1228095053885476985> channel to receive a response from AI.
- Avoid tagging staff unless specifically instructed.
Thank you!
I also noticed Sarah Shader's example, which seems to modify Ian's approach by using HTTP streaming for the requester while updating other users through Convex. Is the idea to use traditional HTTP streaming for the requesting user and Convex as the source of truth for everyone else? When I tested it, real-time reactivity felt equally fast across different browsers (if there was a difference, I couldn't notice it). However, after reloading the requester's browser, other users stopped receiving updates, and the completion didn't seem to be stored in the database. If a non-requester reloads, the stream resumes as expected and real-time reactivity continues. This is to be expected for the reasons discussed in Ian Macartney's article, but I'd recommend acknowledging the trade-off in the README of Sarah Shader's repository (and similarly on the http-streaming branch of Ian Macartney's repo).
This is a use case where I believe Convex really shines. I also came across Michal Srb’s post on "Build AI Chat With OpenAI’s Assistants API", but it feels outdated now that the Assistants API supports streaming (he used polling). Are there any newer resources or suggestions for handling similar scenarios?
In brief, any tips or insights would be greatly appreciated. I also think this is a valuable use case that could be highlighted in your content marketing. Best practices on reducing data redundancy and optimal schema design for Assistants API integration would be incredibly useful.
Also, I'd appreciate your thoughts on whether my concerns about data redundancy are misplaced. Should I be worried about storing too much data, or is it fine to store many columns in a table, or even the entire message object as serialized JSON in a single field? Would this hurt performance or make the database more expensive in the long run?
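One common compromise for the "whole object vs. many columns" question is a hybrid row: keep the raw serialized object in one field for fidelity and logging, and pull out only the handful of fields you actually query or index on. A hypothetical sketch (the row shape and field names are illustrative, not a recommended Convex schema):

```typescript
// Hypothetical hybrid row: a few queryable columns plus the raw payload.
// Only threadId/createdAt would ever be indexed; `raw` exists for
// fidelity and debugging, never for querying.
type MessageRow = {
  threadId: string;
  openaiMessageId: string;
  createdAt: number;
  raw: string; // the entire OpenAI message object, serialized as JSON
};

function toRow(message: { id: string; thread_id: string; created_at: number }): MessageRow {
  return {
    threadId: message.thread_id,
    openaiMessageId: message.id,
    createdAt: message.created_at,
    raw: JSON.stringify(message),
  };
}
```

The trade-off: `raw` inflates storage and bandwidth for every read that fetches the full row, so it pays off mainly when you'd otherwise re-fetch the object from OpenAI or lose debugging context.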
I don't think things have changed that much; besides real-time streaming, there are newer patterns like https://stack.convex.dev/implementing-work-stealing. My needs right now aren't real-time, so I can't answer this with confidence. But generally I don't see any option other than one row per message (or more). The only way to save on database costs is to never fetch more than you need, and to do it in as few functions as possible. Because of the good DX, you'll likely find plenty of reasons to refactor and few excuses not to.
My app language hopper dot com uses Sarah's approach for the translator. I could make it collaborative by adding logic to update the database throughout the stream, but I chose not to.
I insert the whole JSON into a single column. It essentially doubles as logging.
Thank you both for the advice! I implemented Ian's approach because I do want full collaborative reactivity that's resilient even to things like browser reloads. It's working very well so far, but it does require being careful about database bandwidth.
Yeah, I think the high bandwidth usage comes from calling a mutation on each iteration of the stream. That approach is easy to code, but the update frequency is unnecessarily high.