Database seed with 70,000 items
I've been trying to get past this error:
You have an outstanding query call. Operations should be awaited or they might not run. Not awaiting promises might result in unexpected failures. See https://docs.convex.dev/functions/actions#dangling-promises for more information.
This is my code, I've tried writing it in many ways
I have an npm script that seeds data into a database. The seed is an internal action that fetches a JSON response from an external source; the response is approximately 7 MB.
When this action runs, I want to run a mutation that adds each item to a table. I'm not able to get past this issue with Promises.
The seed function looks like this:
The scheduler calls this action with this code. No console log appears, but I can confirm it fetches the JSON data with 70,000+ results.
you have a ctx.runQuery call that is missing an await (at the bottom of the code you pasted)
@lee That was a typo. I still get the same error with await before ctx.runQuery
The asyncMap was a second approach I tried, the for loop was the first, they are meant to achieve the same thing. Both return the same error
you also need to await the asyncMap
@lee No luck still, error is the same
Interesting. Can you try simplifying to just have the loop? No .map or Promise.all or asyncMap
The "you have an outstanding query" thing isn't actually an error. It's a warning. It's happening because of a separate error (which unhelpfully appears to have the message "Error") which is short-circuiting the Promise.all. I would investigate by making everything serial (remove the Promise.all parallelism) and track down the error
@lee I only get the error when I run the convex query inside a loop / map...
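To illustrate the short-circuiting described above, here is a minimal sketch in plain TypeScript. It does not use Convex at all: `fakeQuery` is a hypothetical stand-in for a call like ctx.runQuery, and the failing item is made up. It only shows why one rejected promise inside Promise.all can leave the other calls outstanding, and why a serial loop makes the real error easier to find.

```typescript
// Hypothetical stand-in for a Convex call such as ctx.runQuery (illustration only).
// Item 2 fails quickly; every other item succeeds after a short delay.
async function fakeQuery(code: number): Promise<number> {
  const delayMs = code === 2 ? 1 : 20;
  await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  if (code === 2) {
    throw new Error(`query failed for item ${code}`);
  }
  return code;
}

// Parallel version: Promise.all rejects as soon as one call fails,
// while the remaining calls are still in flight (outstanding).
async function runParallel(codes: number[]): Promise<string> {
  try {
    await Promise.all(codes.map((code) => fakeQuery(code)));
    return "ok";
  } catch (err) {
    return `parallel failed: ${(err as Error).message}`;
  }
}

// Serial version: each call finishes before the next one starts, so the
// failing item is easy to identify and nothing is left outstanding.
async function runSerial(codes: number[]): Promise<string> {
  for (const code of codes) {
    try {
      await fakeQuery(code);
    } catch (err) {
      return `serial failed at item ${code}: ${(err as Error).message}`;
    }
  }
  return "ok";
}
```

In the parallel version the rejection arrives while items 1 and 3 are still pending, which is the situation the "outstanding query" warning is about. The serial version reports exactly which item failed.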
The first null value is the same query that was inside the loop.
Can you call the query inside the loop?
Like this
For my own debugging, is this action in a file with "use node" at the top?
@lee Yes, there is "use node" at the top. This ran without the error
Gotcha thanks for checking. If you remove the log line, does it execute successfully without erroring?
I think you're running into the limit on concurrent operations within node: https://docs.convex.dev/functions/actions#limits
(i'm creating internal tasks because this error message is not helpful)
@lee Yes, it's currently running. I'm seeding 70,000 items, and this action is triggered from a scheduler... It's currently at 2k plus..
I see from the link that:
In my case and from the code shared above, which of the operations were concurrent? Was it referring to the operation inside the loop?
when you do ctx.runQuery and await it in a Promise.all with other ctx.runQuery promises, then they are running concurrently
I get it, thanks for that information @lee
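As a rough sketch of what "running concurrently" means here, the snippet below counts how many calls are in flight at once. It is plain TypeScript with no Convex dependency; `makeTracker` and its `call` function are hypothetical helpers standing in for something like ctx.runQuery.

```typescript
// Hypothetical helper: wraps a fake async call and records the highest
// number of calls that were in flight at the same time.
function makeTracker() {
  let inFlight = 0;
  let maxInFlight = 0;

  async function call(value: number): Promise<number> {
    inFlight += 1;
    maxInFlight = Math.max(maxInFlight, inFlight);
    // Simulated I/O delay.
    await new Promise<void>((resolve) => setTimeout(resolve, 5));
    inFlight -= 1;
    return value;
  }

  return { call, max: () => maxInFlight };
}

// All calls are started before any of them is awaited, so they overlap.
async function concurrentMax(items: number[]): Promise<number> {
  const tracker = makeTracker();
  await Promise.all(items.map((item) => tracker.call(item)));
  return tracker.max();
}

// Each call is awaited before the next one starts, so at most one is in flight.
async function serialMax(items: number[]): Promise<number> {
  const tracker = makeTracker();
  for (const item of items) {
    await tracker.call(item);
  }
  return tracker.max();
}
```

With 70,000 items, the Promise.all form would try to start every call at once, while the plain loop keeps a single call in flight.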
I was able to seed 9000+ items before hitting this error
Seeding stopped
hmm i'm not sure what that transient error could be. I also don't really understand the flow here. Can you describe how seedDiagnosis works? It looks to me like it does a fetch request but doesn't save the data into Convex
@lee
I tried to run it today, I got this error
thanks for sharing, now i understand the flow. I would expect this to work, so i'm confused why you're getting this error. But here are some ideas to try:
- have addDiagnosis take in a batch of ~100 diagnosis documents, and insert them all in a loop. Having fewer mutations may help them avoid conflicts.
- check getDiagnosisByCode to make sure it's using an index. I wouldn't expect a query to contribute to the conflict, but making the query more efficient can't hurt
- since it's a one-time operation, you could try a different flow where you curl the endpoint from your computer, construct a csv or jsonl, and use npx convex import to upload the data. This will do the efficient patterns.
@lee getDiagnosisByCode uses an index by_code
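For the import-based flow suggested above, the fetched JSON array would first need to be turned into JSON Lines. A minimal sketch of that conversion follows. The field names `code` and `description` in the usage example are assumptions, since the actual shape of the diagnosis documents isn't shown in this thread.

```typescript
// Convert an array of plain objects to JSON Lines: one JSON document per line.
function toJsonl(records: Array<Record<string, unknown>>): string {
  if (records.length === 0) {
    return "";
  }
  return records.map((record) => JSON.stringify(record)).join("\n") + "\n";
}

// Usage example with made-up records (field names are assumptions).
const sampleJsonl = toJsonl([
  { code: "A00", description: "Cholera" },
  { code: "A01", description: "Typhoid and paratyphoid fevers" },
]);
console.log(sampleJsonl);
```

The resulting text could be written to a file and uploaded with something like `npx convex import --table diagnoses diagnoses.jsonl`, where the table name `diagnoses` and the file name are placeholders.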
The batch solution worked for me and it was way faster. Thank you very much. I batched with 1000, which creates 70 scheduled functions. Each function loops over its 1000 items and inserts them. All 70k items are in.
Are there any violations with using schedulers like this?
Awesome! Sounds like a good usage of the scheduler to me
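For anyone landing here later, the batching described above can be sketched roughly as follows. This is plain TypeScript rather than the actual seed code from this thread: `chunk` is a hypothetical helper, the items are made up, and the scheduling step is only indicated in a comment because the real function names weren't shared.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

// Made-up stand-in for the fetched JSON: 70,000 items.
const fetchedItems = Array.from({ length: 70000 }, (_, i) => ({ code: `D${i}` }));

// 70,000 items in batches of 1000 gives 70 batches.
const seedBatches = chunk(fetchedItems, 1000);
console.log(seedBatches.length);

// Inside the Convex action, each batch would then be handed to one scheduled
// function that inserts its items in a loop, for example (assumed names):
//   await ctx.scheduler.runAfter(0, internal.seed.addDiagnosisBatch, { items: batch });
```

Each scheduled function then performs its inserts one after another, so there are 70 mutations instead of 70,000.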