Rodrigo-R (7mo ago)

PDF Parsing - actions and mutations

Hi! I have this dilemma: I need to parse a PDF using pdf2json (or any other library) to store its contents, and the only way I got it semi-working is with an action marked "use node". But with this configuration I can't call mutations, since the compiler refuses to accept it. This effectively defeats the purpose of parsing a PDF in an action in the first place, forcing me to upload the file to store it, download it to parse it, and re-upload the contents again, which is very inefficient. Is there a better way?
9 Replies
v (7mo ago)
Maybe you could use https://pdf-lib.js.org/ but I'm not sure if this will work.
sshader (7mo ago)
"But when using this configuration I can't call mutations, since the compiler refuses to accept it."
Can you say a little more about what issue you're encountering here? Actions can't directly call methods like ctx.db.insert, but they can call them indirectly with ctx.runMutation. I'd imagine you'd parse your PDF in a Node action and then do something like ctx.runMutation(internal.myFunctions.savePdfMetadata, metadata)?
lee (7mo ago)
Note that the mutation cannot be defined in the same file that has "use node". It should be defined in a separate file; then you can call it as Sarah describes.
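A minimal sketch of the pattern from the two replies above: parse in the action, then hand the result to a mutation through ctx.runMutation. To keep it runnable on its own, it uses a small stand-in interface (ActionCtxLike) instead of Convex's real action context, and a plain string where Convex would take a function reference such as internal.documents.updateDocument. The helper names (extractText, saveParsedPdf), the "documents" file name, and the argument shape are all hypothetical. The PdfData shape assumes pdf2json output of the form Pages[].Texts[].R[].T with URI-encoded text, which may differ between library versions.

```typescript
// Assumed subset of pdf2json's parsed output (T is URI-encoded text).
interface PdfTextRun { T: string }
interface PdfText { R: PdfTextRun[] }
interface PdfPage { Texts: PdfText[] }
interface PdfData { Pages: PdfPage[] }

// Stand-in for the part of Convex's action ctx used here. In a real project
// this is the ctx passed to an internalAction handler, and `ref` would be a
// function reference such as internal.documents.updateDocument.
interface ActionCtxLike {
  runMutation(
    ref: string,
    args: { documentId: string; text: string },
  ): Promise<void>;
}

// Pure helper: flatten the parsed pages into one string, one line per page.
function extractText(pdf: PdfData): string {
  return pdf.Pages
    .map((page) =>
      page.Texts
        .map((t) => t.R.map((run) => decodeURIComponent(run.T)).join(""))
        .join(" "),
    )
    .join("\n");
}

// Hypothetical handler body for the action in the "use node" file.
// The action never touches the database itself; it delegates the write
// to a mutation that lives in a separate file without "use node".
async function saveParsedPdf(
  ctx: ActionCtxLike,
  documentId: string,
  pdf: PdfData,
): Promise<void> {
  const text = extractText(pdf);
  await ctx.runMutation("documents:updateDocument", { documentId, text });
}
```

In an actual Convex project, saveParsedPdf would be the handler of an internalAction in a file that starts with "use node", and updateDocument would be an internalMutation (calling ctx.db.patch) in a different file, which is the split lee describes.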
KinKon (7mo ago)
I had a similar use case a month ago. Maybe this can help: https://github.com/konradhy/casefold/blob/main/convex/ingest/extract.ts
Rodrigo-R (OP, 7mo ago)
Sure! The only way the compiler accepts a third-party library (pdf2json or pdf-lib) is with the "use node" directive at the top of the file. Once I have the contents of the file I need to store them using an internalMutation, and that mutation does exactly that: it mutates the data using patch. This is what the compiler complains about: updateDocument defined in documentActions.js is a Mutation function. Only actions can be defined in Node.js. See https://docs.convex.dev/functions/actions for more details. If I remove the "use node" directive I can use internal mutations, but I can't use the external library.
Rodrigo-R (OP, 7mo ago)
Thank you guys. I tried to define the internalMutation in a different file and it didn't work, but maybe I was dazed from trying so hard. I'll give it another try and report back! Thanks for your help!
@KinKon How did you solve the issue with the test file in pdf-parse? Uncaught Failed to analyze process/extract.js: ENOENT: no such file or directory, open './test/data/05-versions-space.pdf'
KinKon (7mo ago)
Not sure. I built this as a proof of concept so I didn't run any tests
Rodrigo-R (OP, 7mo ago)
Cool, thanks. I went back to using pdf2json with no problems, following your recommendations!
KinKon (7mo ago)
Awesome, glad it works.