Jonathan
Jonathan3w ago

Migrating to Convex from RDBMS/SQL

I would love to hear about someone's approach moving from a traditional SQL platform to Convex and any patterns or practices they took. We run a very large CRM application and would love to move to something like Convex, even if it takes some type of lift-and-shift approach. I feel like we could start by using Convex just for reading data on a small scale, but the big question is what happens when we want to shift to writing.
32 Replies
Convex Bot
Convex Bot3w ago
Thanks for posting in <#1088161997662724167>. Reminder: If you have a Convex Pro account, use the Convex Dashboard to file support tickets.
- Provide context: What are you trying to achieve, what is the end-user interaction, what are you seeing? (full error message, command output, etc.)
- Use search.convex.dev to search Docs, Stack, and Discord all at once.
- Additionally, you can post your questions in the Convex Community's <#1228095053885476985> channel to receive a response from AI.
- Avoid tagging staff unless specifically instructed.
Thank you!
Jonathan
JonathanOP3w ago
I feel like I could take this approach:
1. Get the schema set up.
2. Push data from our current database. (I can run serverless functions on the db that will allow me to push directly to Convex.)
3. Read from Convex for all queries.
4. Create actions (not mutations at first) to push updates to our old database. This would allow the Convex API to become our main API while all our legacy systems are eventually replaced.
5. Retire the legacy systems.
6. Replace the actions with mutations to fully move to Convex.
My thought is that my actions would be easily replaceable with mutations once we start to move completely out of our legacy system, but I would need to make sure the original db is being updated. Since the actions could essentially just be swapped to mutations, this could potentially be fairly smooth, at least in theory.
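A minimal sketch of what step 4 could look like, under invented assumptions: a hypothetical `contacts` table, a made-up legacy REST endpoint, and an invented `LEGACY_API_URL` environment variable. The point is that the action keeps the same argument shape a future mutation would use, so the swap in step 6 stays mechanical:

```typescript
// convex/contacts.ts — hedged sketch, not a drop-in implementation.
import { action } from "./_generated/server";
import { v } from "convex/values";

export const updateContact = action({
  args: { contactId: v.string(), firstName: v.string() },
  handler: async (_ctx, args) => {
    // Step 4: forward the write to the legacy system over HTTP.
    // Actions (unlike mutations) are allowed to call fetch.
    const res = await fetch(
      `${process.env.LEGACY_API_URL}/contacts/${args.contactId}`,
      {
        method: "PATCH",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ first_name: args.firstName }),
      }
    );
    if (!res.ok) throw new Error(`Legacy write failed: ${res.status}`);
    // Step 6 (later): replace the fetch with ctx.db.patch(...) inside a
    // mutation that takes the same args, so callers never change.
  },
});
```

Because clients would call this function either way, swapping the action for a mutation later doesn't ripple into the frontend.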
Jonathan
JonathanOP3w ago
I did not see that, let me check it out. So, going back to pushing the data into Convex from the source db: can actions use a static IP or anything if I needed to access the db directly, or would I need to do that via an API?
erquhart
erquhart3w ago
I don't believe there's a static ip
Jonathan
JonathanOP3w ago
If you create tables via the streaming import, is there an easy way to pull that schema back into schema.ts? Sorry, I know some of these are probably basic questions, just trying to decide how deep I am going to dive.
erquhart
erquhart3w ago
Schema is 100% code based, you'd have to make edits to it and then deploy
Jonathan
JonathanOP3w ago
gotcha, I will start off slow, there are about 500 tables
erquhart
erquhart3w ago
Yeah if you can make streaming imports work that's probably your best bet
Jonathan
JonathanOP3w ago
Even if for a while we were just reading off Convex, it would take a huge load off of our db and be worth it. Can it stream storage too? I see Airbyte has that as a source (Azure storage blobs) for us.
erquhart
erquhart3w ago
Hmm I actually don't know on that one
ballingt
ballingt3w ago
re schema, on the dashboard you can generate the code for the data currently in there after an import / streaming
ballingt
ballingt3w ago
re static IP: you can use a shared secret to give privileged access to certain APIs. You don't get to use the client IP in API functions to decide who it is, though.
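A hedged sketch of the shared-secret idea: a Convex HTTP endpoint that only servers holding the secret can call. The route path, header name, and `LEGACY_SYNC_SECRET` env var are all invented for illustration:

```typescript
// convex/http.ts — sketch only; wire the body to a real internal mutation.
import { httpRouter } from "convex/server";
import { httpAction } from "./_generated/server";

const http = httpRouter();

http.route({
  path: "/legacy/push",
  method: "POST",
  handler: httpAction(async (_ctx, request) => {
    // Authenticate by shared secret rather than by client IP.
    if (
      request.headers.get("x-sync-secret") !== process.env.LEGACY_SYNC_SECRET
    ) {
      return new Response("Unauthorized", { status: 401 });
    }
    const rows = await request.json();
    // ...apply `rows` here via an internal mutation...
    return new Response(JSON.stringify({ received: rows.length }), {
      status: 200,
    });
  }),
});

export default http;
```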
Jonathan
JonathanOP3w ago
Yeah I was more looking for a way to just go directly to our legacy db and open that firewall, I can find another way around that
ballingt
ballingt3w ago
@Jonathan your approach here sounds reasonable to me; moving the read path first seems nice because you get the reactivity right away, even while your old DB is the system of record. Ah gotcha, static IP in the other direction.
Jonathan
JonathanOP3w ago
Yep. My only big questions are more about building some of our complex queries to read from Convex. We have our own "query language", basically: we translate it to SQL in our current version. But I was going to start another thread around that.
ballingt
ballingt3w ago
Cool, some important details there: since Convex queries are less declarative than SQL, they may make a more difficult compilation target for your existing query language, but it all depends on how that works.
Jonathan
JonathanOP3w ago
yeah I think I could get it there, it would come down to limitations or optimizations on when to pull the documents and how to build indexes to handle that
ballingt
ballingt3w ago
Sometimes customers write really general APIs (e.g. CRUD utilities https://github.com/get-convex/convex-helpers/blob/main/packages/convex-helpers/README.md#crud-utilities) for grabbing data when those queries are coming from trusted servers, vs the way Convex is generally demonstrated, where all auth checks and business logic happen in the query functions. Sounds like you're along those lines. But yeah, eventually, to get all the benefits of Convex, it's useful to write the heavy queries in terms of the procedural Convex query language (i.e. TypeScript, with async).
Jonathan
JonathanOP3w ago
sometimes it is as simple as
```
first_name = 'John'
```
Or more complex like
```
first_name = 'John' AND
contributions(date_received last year).sum(amount_received) > 100
```
we essentially treat the relations as a function
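For instance, the more complex example above could compile to a procedural Convex query along these lines. This is a sketch only: the `contacts` and `contributions` tables, the index names, and the field names are all invented:

```typescript
// Hedged sketch of: first_name = 'John' AND
//   contributions(date_received last year).sum(amount_received) > 100
import { query } from "./_generated/server";
import { v } from "convex/values";

export const bigRecentContributors = query({
  args: { firstName: v.string(), since: v.number(), threshold: v.number() },
  handler: async (ctx, args) => {
    // Indexed lookup for the first_name condition.
    const people = await ctx.db
      .query("contacts")
      .withIndex("by_first_name", (q) => q.eq("firstName", args.firstName))
      .collect();

    const results = [];
    for (const person of people) {
      // The "relation as a function" part: fetch this person's
      // contributions since `since` and sum them in JavaScript.
      const contributions = await ctx.db
        .query("contributions")
        .withIndex("by_contact_date", (q) =>
          q.eq("contactId", person._id).gte("dateReceived", args.since)
        )
        .collect();
      const total = contributions.reduce((sum, c) => sum + c.amountReceived, 0);
      if (total > args.threshold) results.push(person);
    }
    return results;
  },
});
```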
ballingt
ballingt3w ago
Yeah, you could write a little query evaluator that did the right thing based on the sent-in query spec, so that you can receive exactly this and parse it there, but you need to reason about which indexes you'll need.
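A minimal sketch of the "little query evaluator" idea, using an invented condition shape: each condition names a field, an operator, and a value, and a spec is a list of conditions ANDed together:

```typescript
// Hedged sketch — the Condition shape and field names are made up.
type Condition = { field: string; op: "eq" | "gt"; value: unknown };

function matches(actual: unknown, op: Condition["op"], value: unknown): boolean {
  if (op === "eq") return actual === value;
  // "gt" only makes sense for numbers in this sketch.
  return typeof actual === "number" && typeof value === "number" && actual > value;
}

// A doc satisfies the spec when every condition holds (AND semantics).
function evaluate(doc: Record<string, unknown>, spec: Condition[]): boolean {
  return spec.every(({ field, op, value }) => matches(doc[field], op, value));
}

// first_name = 'John' AND total_contributions > 100
const spec: Condition[] = [
  { field: "first_name", op: "eq", value: "John" },
  { field: "total_contributions", op: "gt", value: 100 },
];

console.log(evaluate({ first_name: "John", total_contributions: 150 }, spec)); // true
```

A real version would additionally decide, per condition, whether an index can serve it or whether it must run in memory after an indexed narrowing step.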
Jonathan
JonathanOP3w ago
yeah I think once I get to that point is where I would have lots of questions, still in the early stage of seeing if I can make this work
ballingt
ballingt3w ago
SQL is good at evaluating all of these combinations even if they're not efficient because you're missing indexes. Convex code generally tends toward only making the efficient ones work; hence Convex really isn't for OLAP queries, it's more oriented toward transaction processing. But yeah, conceptually it's all doable. Would love to hear how it goes; if you're evaluating how possible it is, maybe getting some subset in and wiring up the queries you need would provide insight.
Jonathan
JonathanOP3w ago
Yeah, the hard thing is our user base does some pretty diverse queries, so figuring out how to optimize for that is going to be the challenge. Even in SQL it is a challenge.
ballingt
ballingt3w ago
Hm, that does sound tough. You might go down a path of "here are the indexed queries we can do, then for the long tail we paginate down one of the indexed queries."
Jonathan
JonathanOP3w ago
I would imagine in a lot of cases there is part of the query that reduces the return by like 90%.
ballingt
ballingt3w ago
What's the data size of the bit you don't know you'll have an index on? E.g. if you know you'll always be able to use an index to get down to <5,000 items, then it's very reasonable to do the rest in memory. If you know pagination is generally ok, then that could be 100,000. But truly arbitrary queries can get messy.
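A hedged sketch of that pattern in a Convex query: narrow with an index first, then run the arbitrary predicate in memory. The table, index, and field names here are invented:

```typescript
// Sketch only — assumes a "contacts" table with a "by_state" index.
import { query } from "./_generated/server";
import { v } from "convex/values";

export const searchContacts = query({
  args: { state: v.string(), namePrefix: v.string() },
  handler: async (ctx, args) => {
    // Indexed step: get down to a small candidate set (say, <5,000 docs).
    const candidates = await ctx.db
      .query("contacts")
      .withIndex("by_state", (q) => q.eq("state", args.state))
      .collect();
    // Long-tail step: the non-indexed predicate runs in memory, which
    // stays cheap because the indexed step already did the heavy lifting.
    return candidates.filter((c) => c.lastName?.startsWith(args.namePrefix));
  },
});
```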
Jonathan
JonathanOP3w ago
I would think in most cases I can get it down to less than 5k.
ballingt
ballingt3w ago
We have this same problem in the Convex dashboard: we allow developers to use arbitrary filters, but we push them toward using an index first. If they're using a very selective filter that there's no index on, we might need to paginate through millions of records to find the first match, and you can tell by the delay in seeing those items. Happy to share notes on that; also, the code is open source.
Jonathan
JonathanOP3w ago
yeah, if you point me in the right direction I will put it on my list to check out
