Puppeteer in Convex
is it possible to use puppeteer in convex actions?
I am getting the following error
I'd be curious to hear the answer. Do you have "use node" at the top of the file?
yes
Basically, I want to load webpages in an <iframe> on the frontend. But some websites send a CSP header that prevents this, so I am trying to create a proxy server using Convex actions.
Puppeteer won't work currently (there's work in progress for this) but you should be able to create a proxy with fetch() in either Node.js or normal Convex runtime.
Alright, no problem. I don't know if I can use fetch or axios, because I want everything: CSS, fonts, images, etc.
Convex will likely support puppeteer before long, your approach of running the site and then grabbing the HTML is useful — just for now a more passthrough approach like a normal HTTP proxy is necessary
If you run this in an iframe you'd need to rewrite urls in your proxy in order to request the fonts, scripts etc. in an iframe too. You'd need to rewrite the headers to remove the CSP header. This might work for you but might be challenging depending on the site you're trying to iframe
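A minimal sketch of the two rewrites described above, not the code anyone in this thread actually ran. It assumes a hypothetical proxy endpoint that takes the target address in a `url` query parameter; the names `stripFramingHeaders`, `toProxyUrl`, and `BLOCKING_HEADERS` are made up for illustration. Dropping `X-Frame-Options` alongside CSP is an extra assumption, since that header can also block framing.

```typescript
// Response headers that can stop a page from rendering inside an <iframe>.
// Assumption: these three cover the common cases; a given site may need more.
const BLOCKING_HEADERS = [
  "content-security-policy",
  "content-security-policy-report-only",
  "x-frame-options",
];

// Return a copy of the upstream headers without the framing restrictions.
// The original Headers object is left untouched.
function stripFramingHeaders(upstream: Headers): Headers {
  const cleaned = new Headers(upstream);
  for (const name of BLOCKING_HEADERS) {
    cleaned.delete(name); // header names are matched case-insensitively
  }
  return cleaned;
}

// Rewrite a (possibly relative) resource URL so the browser requests it
// through the proxy. `proxyBase` is a hypothetical endpoint that reads the
// real target from its `url` query parameter.
function toProxyUrl(resourceUrl: string, pageUrl: string, proxyBase: string): string {
  const absolute = new URL(resourceUrl, pageUrl).href;
  return `${proxyBase}?url=${encodeURIComponent(absolute)}`;
}
```

In a proxy handler, `stripFramingHeaders` would be applied to the headers of the fetched response before passing the body along, and `toProxyUrl` to each font, script, stylesheet, and image URL found in the HTML.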
yes. makes sense
I worked on Tom's suggestion and created an action that works like a proxy to fetch the entire webpage with css fonts images and stuff
I used absolutify and cheerio to create that action
Code: https://github.com/ashuvssut/cookied/blob/dev/apps/convex/webContent.ts#L40
Thanks for suggesting the alternative solution!
Nice! Good to know about absolutify, I hadn't heard of it and it looks super useful
absolutify cannot convert relative URLs written in the srcset attribute of an <img> tag.
For example: <img srcset="/_next/image?url=%2Fimage1.jpg&w=100 1x, /_next/image?url=%2Fimage1.jpg&w=200 2x">
So you have to look for alternatives like cheerio to do this particular thing.
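For illustration, here is a rough sketch of how the srcset value itself could be made absolute once the attribute has been read out of the HTML (for example with cheerio). It is not the code from the linked repo. The function name `absolutifySrcset` is hypothetical, and it assumes candidates are separated by a comma followed by whitespace, so URLs that themselves contain ", " are not handled.

```typescript
// Resolve every URL in a srcset value against the page's base URL.
// Assumption: candidates are separated by a comma followed by whitespace.
function absolutifySrcset(srcset: string, baseUrl: string): string {
  return srcset
    .split(/,\s+/)
    .map((candidate) => candidate.trim())
    .filter((candidate) => candidate.length > 0)
    .map((candidate) => {
      // Each candidate is "<url> <descriptor>", e.g. "/a.jpg 2x" or "/a.jpg 200w".
      const [url, ...descriptors] = candidate.split(/\s+/);
      const absolute = new URL(url, baseUrl).href;
      return [absolute, ...descriptors].join(" ");
    })
    .join(", ");
}

// Usage with the srcset from the example above; example.com is a placeholder.
const rewritten = absolutifySrcset(
  "/_next/image?url=%2Fimage1.jpg&w=100 1x, /_next/image?url=%2Fimage1.jpg&w=200 2x",
  "https://example.com/page",
);
console.log(rewritten);
```

The result could then be written back to the `srcset` attribute before returning the HTML from the proxy.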
The fetch() solution that you mentioned will not work for scraping JS web apps like React apps. Since the HTML is only rendered on the client side, I would need Puppeteer to load the webpage and then do the web scraping.
I'll be waiting for Puppeteer support in Convex. For now, Vercel Serverless seems to do the job.
Ah! Today I saw an issue in prod.
Puppeteer doesn't work with Vercel serverless functions in a prod build.
It only worked locally on my PC in dev mode.
When I saw the logs (Production), it said
Actually, it's because of Vercel's 50MB file-size limit.
For now, I think there's no free serverless solution for scraping CSR web apps.
Going to stick with fetch
Ah good to know, thanks!
haha
Just wanted to clarify the misinformation I provided earlier. 😅
the local dev experience feels great but can bite you when the prod environment is different sadly
yup. happened many times😅
Hi Tom,
I have a question, and it's unrelated to our ongoing thread.
This question might sound a bit silly 😅
I noticed in the recent announcement that Convex will have first-class Python support. I'm curious about what "first-class" means in this context. Does it imply that we'll have fully typed support for both the frontend in TypeScript and the backend in Python? I'm trying to understand how the type system will work with Python functions.
Do we need to write specific schemas that the Convex generator understands to generate types for both TypeScript on the frontend and Python functions on the backend?
In this context "first-class" probably means subscriptions: right now the Convex Python client just does queries and mutations and actions, but not subscribing to a query.
But yeah we want types in Python too, this is close!
We can't use TypeScript type inference for this, we'll need to use { args: ..., handler: ..., output: ...}
We've already limited schemas in such a way that they'll work for Python, so you will not need to change the way you write schemas!
All Convex types will work in every language we support.
This is part of why you can't use e.g. a v.date() in Convex schema; every type in here needs to be representable in every client language Convex has types / codegen for
@ashuvssut (ashu) Are you thinking you'd like to use Python for something? Curious to hear what you'd like here.
I was discussing my Puppeteer issue with a friend of mine.
He said that in Python there is a library called "scrapy".
This library's bundle size will be less than 50MB, but implementing web scraping of a CSR web app may be a little tricky.
Either way, I will try out scrapy when Python support comes to Convex.
Ah so this would require submitting a Convex function written in Python. That's something we think about, but isn't a concrete plan yet. The announced plan is only to make a first-class Python client.
Oh... got it. I actually thought that there will be python convex functions lol