greystark · 3y ago

Snapshot for analytics & ML

I’m evaluating Convex for a project that would require periodically snapshotting the data for analytics and for training ML models. Is there a recommended mechanism for this (i.e., does Convex regularly snapshot to S3, etc.)? I don’t really need to get new data/events in real time for now.
5 Replies
lee · 3y ago
Hi! If you want automated, regular snapshotting, you can use our Airbyte integration (https://docs.convex.dev/using/integrations/airbyte). If you want to do it manually, there's a button on the Convex dashboard (https://docs.convex.dev/using/export).
lee · 3y ago
I'd be really interested if you have feedback on either of these options, or if you're looking for something else. These features are new and we haven't gotten much feedback on them yet.
jamwt · 3y ago
@greystark Specifically, if you use the Airbyte integration, you can use the S3 destination connector: https://airbyte.com/connectors/s3
jamwt · 3y ago
This will keep your database in sync with an S3 bucket. You can then snapshot that bucket using any mechanism you like. (Also, if this is for analytics/ML, you could use Airbyte to connect Convex directly to Databricks or several other destinations rather than using S3 as an intermediary.)
greystark (OP) · 3y ago
Airbyte sounds good! I didn’t know it was there, thanks.
