arbalada
arbalada•2mo ago

convex runMutation continuously times out

Hello, I'm currently hitting a wall while trying out Convex. I need to write quite a bit of data: millions of tiny documents. I haven't been able to finish the download yet, but from the size of the data it looks like around a GB of writes over a few minutes. The initial runMutation calls run fine, but at some point the download script just times out. At first, each runMutation was inserting 7,000 documents. I tried batching down to 500 docs, but I still run into timeout issues. This happened on both the free plan and the paid plan, in both development and production environments. Should I just retry the mutations when they time out?
8 Replies
Convex Bot
Convex Bot•2mo ago
Thanks for posting in <#1088161997662724167>. Reminder: If you have a Convex Pro account, use the Convex Dashboard to file support tickets. - Provide context: What are you trying to achieve, what is the end-user interaction, what are you seeing? (full error message, command output, etc.) - Use search.convex.dev to search Docs, Stack, and Discord all at once. - Additionally, you can post your questions in the Convex Community's <#1228095053885476985> channel to receive a response from AI. - Avoid tagging staff unless specifically instructed. Thank you!
arbalada
arbaladaOP•2mo ago
I'm also getting this screen many times on the dashboard.
arbalada
arbaladaOP•2mo ago
Hmm, I've tried batching with 100 docs and it runs better, but it's a bit slow. 300 docs seems better, but still slow. I wonder if Convex can handle heavy write spikes if I batch well enough and handle timeouts with exponential backoff?
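By backoff I mean something like this minimal wrapper (just a sketch; the attempt count and delays are illustrative, not tuned):
```ts
// Sketch of an exponential-backoff retry wrapper around a mutation call.
async function withBackoff<T>(run: () => Promise<T>, maxAttempts = 5): Promise<T> {
  let delayMs = 500
  for (let attempt = 1; ; attempt++) {
    try {
      return await run()
    } catch (err) {
      if (attempt >= maxAttempts) throw err
      console.warn(`attempt ${attempt} failed, retrying in ${delayMs}ms`)
      await new Promise((resolve) => setTimeout(resolve, delayMs))
      delayMs *= 2 // double the wait after each failure
    }
  }
}

// e.g. inside the action:
// await withBackoff(() =>
//   ctx.runMutation(api.functions.upsertFiles, { commitId, fileNames: batch }),
// )
```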
djbalin
djbalin•2mo ago
Mutations are limited to 1 second of execution time. Sounds like you may want to use an action? Are you performing the download logic inside your function as well? Calls to third-party APIs or services should be handled in actions. Actions can run for up to 10 minutes, and inside an action you can schedule/fire any number of mutations etc. So inside your action, you could prepare the data that you want to write, then perform many invocations of your mutation with small batch sizes (to keep each mutation under 1 sec).
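Roughly this shape (just a sketch: the names and batch size are made up, and I'm assuming your upsert mutation accepts a batch):
```ts
import { action } from "./_generated/server"
import { api } from "./_generated/api"
import { v } from "convex/values"

// Hypothetical helper that performs the slow third-party download.
declare const downloadFileNames: (commitSha: string) => Promise<string[]>

// Sketch: the action does the external fetch, then fires many small
// mutations so each write transaction stays well under the 1s limit.
export const importCommitFiles = action({
  args: { commitId: v.id("commits"), sha: v.string() },
  handler: async (ctx, { commitId, sha }) => {
    const fileNames = await downloadFileNames(sha)
    const BATCH = 100 // illustrative batch size
    for (let i = 0; i < fileNames.length; i += BATCH) {
      await ctx.runMutation(api.functions.upsertFiles, {
        commitId,
        fileNames: fileNames.slice(i, i + BATCH),
      })
    }
  },
})
```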
arbalada
arbaladaOP•2mo ago
I want to synchronize all filenames for a few large repos across many commits. The download part is quite fast; the part that's very slow is the writing. The mutations being limited to 1 second might be the issue, as some mutations are approaching 700ms+ of query time. I might want to change my schema, though; it's a bit inefficient, since each filename row has a key to the commit id.
arbalada
arbaladaOP•2mo ago
Stateful Online Migrations using Mutations
Online migrations in Convex using mutations, including a Convex Component to manage them.
djbalin
djbalin•2mo ago
If you could share your mutation code maybe I can help more 🙂 I don't 100% get what you mean by synchronizing filenames across repos, but it sounds interesting!
arbalada
arbaladaOP•2mo ago
So the mutation code is very simple @djbalin
```ts
// Action code: for each commit missing files, fetch its tree from GitHub,
// then write all of its filenames in a single mutation.
let commits = await ctx.runQuery(api.functions.getAllRepoCommitsWithoutFiles, {
  repoId: repoId,
})

for (let i = 0; i < commits.length; i++) {
  const commit = commits[i]!
  console.log(
    `processing commit ${i + 1}/${commits.length}: getting tree for`,
    owner,
    repoName,
    commit.sha,
  )
  // Third-party API call: the fast part.
  let allFiles = await githubClient.getRepoTree(owner, repoName, commit.sha)
  if (allFiles.error) {
    console.error(allFiles.error)
    continue
  }

  let fileNames = allFiles.data.tree.map((f) => f.path)
  console.log('upserting', fileNames.length, 'files for commit', commit.sha)

  // One unbatched mutation per commit: the slow part that can time out.
  await ctx.runMutation(api.functions.upsertFiles, {
    commitId: commit._id,
    fileNames: fileNames,
  })

  console.log(`finished upserting commit ${commit.sha}`)
}
```
I'm iterating through a few designs. Before, I saved each filename as its own row, associated with a commit. The problem with that is that the commit id string gets duplicated a lot, which wastes quite a bit of database space. I realized that most of the time I will need all the files at once anyway, e.g. to build the sidebar, so I might as well prejoin the filenames into an array of strings on the commit. So the fix for this problem is just to change the schema to store filenames as an array of strings.

If I were to stick with one filename per row, batching them at 100 rows per mutation also prevents the timeouts, but I still saw some mutations slow down a lot after a few commits were processed. Ideally, if I'm overwhelming Convex, I should get a rate limit error or something.
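The prejoined version of the schema could look roughly like this (a sketch; the table and field names are my assumptions):
```ts
import { defineSchema, defineTable } from "convex/server"
import { v } from "convex/values"

// Sketch of the prejoined design: one document per commit carrying all
// of its file paths, instead of one document per filename.
export default defineSchema({
  commits: defineTable({
    repoId: v.id("repos"),
    sha: v.string(),
    fileNames: v.array(v.string()), // all paths for this commit, prejoined
  }).index("by_repo", ["repoId"]),
})
```
One thing worth checking against the docs is Convex's per-document size limits, since a repo with a huge tree could overflow a single array field.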
