arbalada
arbalada•2mo ago

convex runMutation continuously times out

Hello, I'm currently hitting a wall while trying out Convex. I need to write quite a bit of data: millions of tiny documents. I haven't been able to finish the download yet, but from the size of the data it looks like around a GB of writes over a few minutes. The initial runMutation calls run fine, but at some point the download script just times out. At first, each runMutation was inserting 7,000 documents. I tried batching down to 500 docs, but I still run into timeout issues. This happened on both the free plan and the paid plan, in both development and production environments. Should I just retry the mutations when they time out?
8 Replies
Convex Bot
Convex Bot•2mo ago
Thanks for posting in <#1088161997662724167>. Reminder: If you have a Convex Pro account, use the Convex Dashboard to file support tickets. - Provide context: What are you trying to achieve, what is the end-user interaction, what are you seeing? (full error message, command output, etc.) - Use search.convex.dev to search Docs, Stack, and Discord all at once. - Additionally, you can post your questions in the Convex Community's <#1228095053885476985> channel to receive a response from AI. - Avoid tagging staff unless specifically instructed. Thank you!
arbalada
arbaladaOP•2mo ago
I'm also getting this screen many times on the dashboard.
arbalada
arbaladaOP•2mo ago
Hmm, I've tried batching with 100 docs and it runs better, but it's a bit slow. 300 docs seems better, but still slow. I wonder if Convex can handle heavy write spikes if I batch well enough and handle timeouts with exponential backoff?
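By backoff I mean something like this minimal wrapper (just a sketch; the attempt count and delays are illustrative, not tuned):
```ts
// Sketch of an exponential-backoff retry wrapper around a mutation call.
async function withBackoff<T>(run: () => Promise<T>, maxAttempts = 5): Promise<T> {
  let delayMs = 500
  for (let attempt = 1; ; attempt++) {
    try {
      return await run()
    } catch (err) {
      if (attempt >= maxAttempts) throw err
      console.warn(`attempt ${attempt} failed, retrying in ${delayMs}ms`)
      await new Promise((resolve) => setTimeout(resolve, delayMs))
      delayMs *= 2 // double the wait after each failure
    }
  }
}

// e.g. inside the action:
// await withBackoff(() =>
//   ctx.runMutation(api.functions.upsertFiles, { commitId, fileNames: batch }),
// )
```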
djbalin
djbalin•2mo ago
Mutations are limited to 1 second of execution time. Sounds like you may want to use an action? Are you performing the download logic inside your function as well? Calls to third-party APIs or services should be handled in actions. Actions can run for up to 10 minutes, and inside an action you can schedule/fire any number of mutations etc. So inside your action, you could prepare the data that you want to write, then perform many invocations of your mutation with small batch sizes (to keep each mutation under 1 sec).
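Roughly this shape (just a sketch: the names and batch size are made up, and I'm assuming your upsert mutation accepts a batch):
```ts
import { action } from "./_generated/server"
import { api } from "./_generated/api"
import { v } from "convex/values"

// Hypothetical helper that performs the slow third-party download.
declare const downloadFileNames: (commitSha: string) => Promise<string[]>

// Sketch: the action does the external fetch, then fires many small
// mutations so each write transaction stays well under the 1s limit.
export const importCommitFiles = action({
  args: { commitId: v.id("commits"), sha: v.string() },
  handler: async (ctx, { commitId, sha }) => {
    const fileNames = await downloadFileNames(sha)
    const BATCH = 100 // illustrative batch size
    for (let i = 0; i < fileNames.length; i += BATCH) {
      await ctx.runMutation(api.functions.upsertFiles, {
        commitId,
        fileNames: fileNames.slice(i, i + BATCH),
      })
    }
  },
})
```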
arbalada
arbaladaOP•2mo ago
I want to synchronize all filenames for a few large repos across many commits. The download part is quite fast; the part that's very slow is the writing. The mutations being limited to 1 second might be the issue, as some mutations are approaching 700ms+ of query time. I might want to change my schema, though; it's a bit inefficient, since each filename row has a key to the commit id.
arbalada
arbaladaOP•2mo ago
Stateful Online Migrations using Mutations
Online migrations in Convex using mutations, including a Convex Component to manage them.
djbalin
djbalin•2mo ago
If you could share your mutation code maybe I can help more 🙂 I don't 100% get what you mean by synchronizing filenames across repos, but it sounds interesting!
arbalada
arbaladaOP•2mo ago
So the mutation code is very simple @djbalin
```ts
// Action code: for each commit missing files, fetch its tree from GitHub,
// then write all of its filenames in a single mutation.
let commits = await ctx.runQuery(api.functions.getAllRepoCommitsWithoutFiles, {
  repoId: repoId,
})

for (let i = 0; i < commits.length; i++) {
  const commit = commits[i]!
  console.log(
    `processing commit ${i + 1}/${commits.length}: getting tree for`,
    owner,
    repoName,
    commit.sha,
  )
  // Third-party API call: the fast part.
  let allFiles = await githubClient.getRepoTree(owner, repoName, commit.sha)
  if (allFiles.error) {
    console.error(allFiles.error)
    continue
  }

  let fileNames = allFiles.data.tree.map((f) => f.path)
  console.log('upserting', fileNames.length, 'files for commit', commit.sha)

  // One unbatched mutation per commit: the slow part that can time out.
  await ctx.runMutation(api.functions.upsertFiles, {
    commitId: commit._id,
    fileNames: fileNames,
  })

  console.log(`finished upserting commit ${commit.sha}`)
}
```
I'm iterating through a few designs. Before, I saved each filename as its own row, associated with a commit. The problem with that is that the commit id string gets duplicated a lot, which wastes quite a bit of database space. I realized that most of the time I will need all the files at once anyway, e.g. to build the sidebar, so I might as well prejoin the filenames into an array of strings on the commit. So the fix for this problem is just to change the schema to store filenames as an array of strings.

If I were to stick with one filename per row, batching them at 100 rows per mutation also prevents the timeouts, but I still saw some mutations slow down a lot after a few commits were processed. Ideally, if I'm overwhelming Convex, I should get a rate limit error or something.
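The prejoined version of the schema could look roughly like this (a sketch; the table and field names are my assumptions):
```ts
import { defineSchema, defineTable } from "convex/server"
import { v } from "convex/values"

// Sketch of the prejoined design: one document per commit carrying all
// of its file paths, instead of one document per filename.
export default defineSchema({
  commits: defineTable({
    repoId: v.id("repos"),
    sha: v.string(),
    fileNames: v.array(v.string()), // all paths for this commit, prejoined
  }).index("by_repo", ["repoId"]),
})
```
One thing worth checking against the docs is Convex's per-document size limits, since a repo with a huge tree could overflow a single array field.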
