Issues with limits/timeouts when running cron jobs on large datasets; infrequent data aggregates
Hi, does anyone have a good solution for doing bulk data operations within Convex cron jobs? I keep running into either query document limits or action timeouts. I have a table closing in on 1 million rows, on which I want to run weekly-ish updates and store the aggregate results in another table (around 20 rows, so that part is easy). So far I've managed to group by a field in the data, reducing the dataset to maybe 50-100k rows, and run the aggregates on that smaller set, but I still keep hitting limits or timeouts.
The aggregates can be slow to compute; that's no problem, as long as the cron jobs are stable. Is there maybe a way to offload the work somewhere outside of a "regular" action?
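For concreteness, one common pattern that sidesteps per-transaction limits entirely is to page through the table in small batches and have each batch reschedule the next one via the scheduler. The sketch below assumes hypothetical names: an `events` table with `category`/`value` fields and a `weeklyAggregates` output table standing in for the real schema.

```ts
// convex/aggregates.ts -- a minimal sketch; table and field names are
// hypothetical placeholders, not the poster's actual schema.
import { v } from "convex/values";
import { internalMutation } from "./_generated/server";
import { internal } from "./_generated/api";

// Each invocation reads one page, folds it into running partial results,
// then reschedules itself with the continuation cursor, so no single
// transaction ever touches more than BATCH_SIZE documents.
const BATCH_SIZE = 1000;

export const processBatch = internalMutation({
  args: {
    cursor: v.union(v.string(), v.null()),
    // Running partial aggregates carried between batches.
    partials: v.record(v.string(), v.number()),
  },
  handler: async (ctx, { cursor, partials }) => {
    const { page, isDone, continueCursor } = await ctx.db
      .query("events")
      .paginate({ numItems: BATCH_SIZE, cursor });

    // Fold this page into the running totals.
    for (const doc of page) {
      partials[doc.category] = (partials[doc.category] ?? 0) + doc.value;
    }

    if (isDone) {
      // Only ~20 result rows: write them out in this final transaction.
      for (const [category, total] of Object.entries(partials)) {
        await ctx.db.insert("weeklyAggregates", { category, total });
      }
    } else {
      // Reschedule ourselves; each batch is its own short transaction,
      // so we never approach the per-transaction read or time limits.
      await ctx.scheduler.runAfter(0, internal.aggregates.processBatch, {
        cursor: continueCursor,
        partials,
      });
    }
  },
});
```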
Check out the workflow component: https://www.convex.dev/components/workflow
Workflows execute durably with configurable retries and delays, which makes long-running code flows like this much simpler.
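In case it helps, here's a rough sketch of what that could look like with the workflow component. The helper names (`readPage`, `writeResults`) and document shape are hypothetical placeholders; each step runs as its own short Convex function, and the component journals step results so the workflow survives restarts and failures.

```ts
// convex/aggregateWorkflow.ts -- a sketch assuming a hypothetical
// paginated internalQuery `readPage` and an internalMutation
// `writeResults` that stores the ~20 aggregate rows.
import { WorkflowManager } from "@convex-dev/workflow";
import { components, internal } from "./_generated/api";

export const workflow = new WorkflowManager(components.workflow);

export const weeklyAggregates = workflow.define({
  args: {},
  handler: async (step): Promise<void> => {
    let cursor: string | null = null;
    const partials: Record<string, number> = {};
    do {
      // Each page is read in its own short-lived query step, so no
      // single call approaches the read or execution-time limits.
      const { page, isDone, continueCursor } = await step.runQuery(
        internal.aggregates.readPage,
        { cursor },
      );
      for (const doc of page) {
        partials[doc.category] = (partials[doc.category] ?? 0) + doc.value;
      }
      cursor = isDone ? null : continueCursor;
    } while (cursor !== null);
    // The final result is tiny (~20 rows), so one mutation suffices.
    await step.runMutation(internal.aggregates.writeResults, { partials });
  },
});
```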
Aah, I didn't know about that, thanks for the suggestion! I think this could help. Do you think it could also work around query timeout issues, even for queries that stay within the max document limit?
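And for the weekly trigger itself, the cron then only needs to start the workflow, so the scheduled function returns immediately and cannot itself time out. A sketch, assuming the hypothetical `weeklyAggregates` workflow above:

```ts
// convex/aggregateWorkflow.ts (continued): a tiny mutation that only
// starts the workflow and returns immediately.
import { internalMutation } from "./_generated/server";

export const kickoff = internalMutation({
  args: {},
  handler: async (ctx) => {
    await workflow.start(ctx, internal.aggregateWorkflow.weeklyAggregates, {});
  },
});

// convex/crons.ts: the cron does no heavy work itself, so it cannot
// hit limits regardless of table size.
import { cronJobs } from "convex/server";
import { internal } from "./_generated/api";

const crons = cronJobs();
crons.weekly(
  "weekly aggregates",
  { dayOfWeek: "sunday", hourUTC: 3, minuteUTC: 0 },
  internal.aggregateWorkflow.kickoff,
);
export default crons;
```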