Rob
Convex Community
Created by Rob on 5/7/2024 in #support-community
Crawlee in Convex Action
Ah, I see. Thanks. Any plans to support it in the future?
5 replies
Convex Community
Created by Rob on 5/7/2024 in #support-community
Crawlee in Convex Action
Here is my action code:
"use node";
// For more information, see https://crawlee.dev/
import { Configuration, PlaywrightCrawler, ProxyConfiguration } from "crawlee";
import { router } from "./routes.js";
import { action } from "./_generated/server.js";

export const runCrawlee = action({
  args: {},
  handler: async () => {
    const startUrls = ["https://crawlee.dev/"];

    const crawler = new PlaywrightCrawler(
      {
        // proxyConfiguration: new ProxyConfiguration({ proxyUrls: ['...'] }),
        requestHandler: router,
        // Comment this option to scrape the full website.
        maxRequestsPerCrawl: 20,
      },
      new Configuration({ persistStorage: false })
    );

    await crawler.run(startUrls);

    return await crawler.getData();
  },
});
"use node";
import { createPlaywrightRouter } from "crawlee";

export const router = createPlaywrightRouter();

router.addDefaultHandler(async ({ enqueueLinks, log }) => {
  log.info(`enqueueing new URLs`);
  await enqueueLinks({
    globs: ["https://crawlee.dev/**"],
    label: "detail",
  });
});

router.addHandler("detail", async ({ request, page, log, pushData }) => {
  const title = await page.title();
  log.info(`${title}`, { url: request.loadedUrl });

  await pushData({
    url: request.loadedUrl,
    title,
  });
});
This code is essentially the Playwright + TypeScript template provided by Crawlee, adapted to run in a Convex action, with slight adjustments based on the documentation for deploying on AWS Lambda: https://crawlee.dev/docs/deployment/aws-cheerio
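For readers unfamiliar with the pattern, the control flow in the two files above can be sketched without any dependencies: a router picks a handler by request label, and a loop drains a queue until a request cap is reached. This is only an illustration under stated assumptions, not Crawlee's implementation. `MiniRouter`, `runMiniCrawl`, `CrawlRequest`, and `RouteContext` are hypothetical names, the handlers are synchronous for brevity, and no browser or network access is involved.

```typescript
// Hypothetical, dependency-free sketch of label-based routing.
// Not Crawlee's implementation; names and shapes are assumptions.
type CrawlRequest = { url: string; label?: string };
type DataItem = Record<string, string>;
type RouteContext = {
  request: CrawlRequest;
  enqueue: (req: CrawlRequest) => void;
  pushData: (item: DataItem) => void;
};
type RouteHandler = (ctx: RouteContext) => void;

class MiniRouter {
  private handlers = new Map<string, RouteHandler>();
  private defaultHandler?: RouteHandler;

  addDefaultHandler(handler: RouteHandler): void {
    this.defaultHandler = handler;
  }

  addHandler(label: string, handler: RouteHandler): void {
    this.handlers.set(label, handler);
  }

  // Use the handler registered for the request's label, else the default.
  route(ctx: RouteContext): void {
    const label = ctx.request.label;
    const labelled = label !== undefined ? this.handlers.get(label) : undefined;
    const handler = labelled ?? this.defaultHandler;
    if (!handler) {
      throw new Error(`No handler for ${ctx.request.url}`);
    }
    handler(ctx);
  }
}

// Drain the queue in FIFO order, stopping after maxRequests requests
// (playing the role of maxRequestsPerCrawl in the action above).
function runMiniCrawl(
  router: MiniRouter,
  startUrls: string[],
  maxRequests: number
): DataItem[] {
  const queue: CrawlRequest[] = startUrls.map((url) => ({ url }));
  const seen = new Set<string>(startUrls);
  const data: DataItem[] = [];
  let processed = 0;

  while (queue.length > 0 && processed < maxRequests) {
    const request = queue.shift() as CrawlRequest;
    processed += 1;
    router.route({
      request,
      // Skip URLs that were already queued so each is visited once.
      enqueue: (req) => {
        if (!seen.has(req.url)) {
          seen.add(req.url);
          queue.push(req);
        }
      },
      pushData: (item) => {
        data.push(item);
      },
    });
  }
  return data;
}
```

In this sketch the default handler plays the role of `addDefaultHandler` (discovering links and labelling them "detail"), and the labelled handler plays the role of the "detail" handler that records one item per page.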
5 replies
Convex Community
Created by thedevstockgirl on 2/13/2024 in #support-community
SOC 2, GDPR and HIPAA.
Thank you for the update! Looking forward to it 🙂
19 replies
Convex Community
Created by thedevstockgirl on 2/13/2024 in #support-community
SOC 2, GDPR and HIPAA.
bumping @jamwt
19 replies
Convex Community
Created by thedevstockgirl on 2/13/2024 in #support-community
SOC 2, GDPR and HIPAA.
@jamwt any update on estimated timelines for HIPAA compliance?
19 replies
Convex Community
Created by thedevstockgirl on 2/13/2024 in #support-community
SOC 2, GDPR and HIPAA.
+1 on HIPAA. I'm creating an app for the healthcare space and would love to use Convex
19 replies