glucinater · 14mo ago

Is there a way to “organize” tables in Convex

I'm working on adding more schools to the site I recently posted in Showcase, and I'd like to organize my data into the following configuration:

School 1
- Courses
- Instructors
- Instances
…

School 2
- roughly the same data as School 1, with slight differences in column names/types

Is there a way to make some sort of "folder" to store each set of tables? I'm worried that if I keep expanding, my data will become a mess of school1_Courses, school2_Courses, etc.
11 Replies
erquhart · 14mo ago
You would generally have a single table for each kind of entity (schools, instructors, etc.), and where the schools differ in how their data looks, you'd find a way to normalize it so they can all still use the same tables and columns.
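One way to do that normalization is a per-school field map applied at ingest time, so all schools land in the same canonical `courses` shape. A minimal sketch; the school names, column names, and mappings below are invented for illustration, not the poster's actual schema:

```python
# Sketch: normalize per-school column names into one canonical shape.
# Each school's source data uses slightly different column names/types.
SCHOOL_FIELD_MAPS = {
    "school1": {"course_title": "name", "instructor": "instructorName"},
    "school2": {"CourseName": "name", "Teacher": "instructorName"},
}

def normalize_course(school: str, raw: dict) -> dict:
    """Map a school-specific row into the shared `courses` shape."""
    field_map = SCHOOL_FIELD_MAPS[school]
    row = {canonical: raw[source]
           for source, canonical in field_map.items()
           if source in raw}
    row["school"] = school  # provenance lives in a column, not the table name
    return row

print(normalize_course("school2", {"CourseName": "Calculus I", "Teacher": "Ada"}))
# {'name': 'Calculus I', 'instructorName': 'Ada', 'school': 'school2'}
```

The key design point is that "which school" becomes a column value rather than part of a table name, so one schema serves every school.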
glucinater (OP) · 14mo ago
Gotcha, is there no other way? After uploading the data I need to do a lot of ID linking between the entities, which already takes up a lot of bandwidth as is. I'm worried that adding more data to the already-big tables won't scale the way I'd like.
Indy · 14mo ago
Each table can handle a massive number of records, so it's unlikely you'll run into any scalability issues. Just make sure to use indexes — that'll be faster and use less database bandwidth.
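In Convex you declare indexes in your schema and query them with `withIndex`; the reason that saves bandwidth is the difference between scanning every row per query and resolving a key directly. A rough pure-Python analogy of that difference (data and names are made up):

```python
# Analogy for what an index buys you: a scan touches every row on every
# lookup, while an index resolves the key directly. Toy data below.
courses = [
    {"_id": f"c{i}", "school": f"school{i % 3}", "name": f"Course {i}"}
    for i in range(9)
]

def scan(school):                # O(n) per query: reads every row
    return [c for c in courses if c["school"] == school]

by_school = {}                   # built once, like a database index
for c in courses:
    by_school.setdefault(c["school"], []).append(c)

def indexed(school):             # jumps straight to the matching rows
    return by_school.get(school, [])

assert scan("school1") == indexed("school1")
```

In Convex itself the index is maintained by the database, so the query only reads the rows it returns rather than the whole table.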
Joseph_Ebuka · 14mo ago
How do I build relationships between my tables? I'm still getting used to Convex.
erquhart · 14mo ago
@Joseph_Ebuka are you used to traditional databases, or new to databases in general? Oh wait, I confused the OP with you, Joseph — sorry. Here's a Stack article that should help you wrap your head around relationships in Convex: https://stack.convex.dev/relationship-structures-let-s-talk-about-schemas

@glucinater I'm betting you'll run into far worse scaling issues with your current schema approach. I would build functions that handle ingesting your data and linking it consistently, and update them as new sources come into play. Let your schema be a constant and bend the data to fit.

Something I've taken to when ingesting outside data is keeping immutable records of the original data on top of mapping it into my own schema. So you have one or more tables — perhaps with no schema at all, since your sources vary — where you insert rows as-is with minimal mapping, for all your incoming data. The functions that map those rows into your actual tables can then reference the original row ID. The advantage is auditability: when you run into problems, you have a direct link to the original data.
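The raw-table pattern described above can be sketched with in-memory stand-ins for the two Convex tables — a schemaless `raw_rows` table holding source data untouched, and a normalized `courses` table whose rows keep an audit link back to their raw row. Table and field names here are hypothetical:

```python
import itertools

# Stand-ins for two tables: schemaless raw ingests, and canonical courses.
_ids = itertools.count()
raw_rows, courses = {}, {}

def ingest_raw(source: str, payload: dict) -> str:
    """Insert the original data as-is; treat it as immutable afterwards."""
    raw_id = f"raw_{next(_ids)}"
    raw_rows[raw_id] = {"source": source, "payload": payload}
    return raw_id

def map_to_courses(raw_id: str) -> str:
    """Map one raw row into the canonical table, keeping the audit link."""
    payload = raw_rows[raw_id]["payload"]
    course_id = f"course_{next(_ids)}"
    courses[course_id] = {
        # tolerate the varying source column names
        "name": payload.get("CourseName") or payload.get("course_title"),
        "rawId": raw_id,  # direct link back to the original data
    }
    return course_id

rid = ingest_raw("school2", {"CourseName": "Linear Algebra"})
cid = map_to_courses(rid)
assert courses[cid]["rawId"] == rid
```

If a mapping turns out to be wrong later, you re-run the mapping function over the untouched raw rows instead of trying to reverse-engineer the canonical table.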
Joseph_Ebuka · 14mo ago
Thanks a lot for this article @erquhart, it's really helpful.
glucinater (OP) · 14mo ago
I see — how do you set up your functions? I have my data in pandas DataFrames in Python and I'm not sure how to link the data as I upload it into Convex.
erquhart · 14mo ago
Hmm, my data has been coming in via webhooks, so I have functions creating rows, linking, etc. Are you uploading via npx convex import or streaming via Airbyte/Fivetran?
glucinater (OP) · 14mo ago
I did npx convex import originally and ran migrations on all columns to add IDs. The issue with this approach, though, is that as the tables grow it gets more and more inefficient, because the migration helpers I'm using go through every row in each table.
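One way to keep that linking step from degrading as tables grow is to build the key → ID map once and then make a single pass over the rows being linked, instead of re-scanning the referenced table per row. A hedged pure-Python sketch of the idea (the entity and field names are invented):

```python
# Link courses to instructors without a per-row table scan:
# build the instructor name -> id map once, then do one linking pass.
instructors = [
    {"_id": "i1", "name": "Ada"},
    {"_id": "i2", "name": "Grace"},
]
courses = [
    {"_id": "c1", "name": "Calculus I", "instructorName": "Ada"},
    {"_id": "c2", "name": "Compilers", "instructorName": "Grace"},
]

id_by_name = {i["name"]: i["_id"] for i in instructors}  # built once

for course in courses:               # single pass, O(1) lookups
    course["instructorId"] = id_by_name[course["instructorName"]]

assert courses[0]["instructorId"] == "i1"
```

Inside Convex, the equivalent of `id_by_name` would be an indexed lookup on the referenced table, so each linked row costs one index read rather than a scan.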
erquhart · 14mo ago
Yeah, I can see that. There isn't really a workaround for this that I know of currently (the team may have more ideas) — migrations just take time, and Convex has to generate the IDs.
glucinater (OP) · 14mo ago
My thoughts right now are to mess around with the Python client and set up a pipeline that uses indexes, but I'm not sure this will work or is the best approach.
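A rough shape for that pipeline: chunk the rows client-side and send each chunk to a mutation, so every call does a bounded amount of work. Only the batching logic below runs as-is; the `ConvexClient` lines are commented out, and the mutation name is hypothetical:

```python
def batches(rows, size):
    """Yield fixed-size chunks so each mutation call stays small."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

rows = [{"name": f"Course {i}"} for i in range(250)]

# from convex import ConvexClient               # pip install convex
# client = ConvexClient("https://<deployment>.convex.cloud")
for batch in batches(rows, 100):
    # client.mutation("courses:insertBatch", {"rows": batch})  # hypothetical mutation
    pass

assert [len(b) for b in batches(rows, 100)] == [100, 100, 50]
```

The server-side mutation can then do the indexed ID lookups per batch, which keeps both bandwidth and per-call work predictable as the tables grow.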