Skyblue (Matt)•5mo ago

Apply generated document id to a non-system field in schema

Hey all! I'm evaluating convex along with a few other services. The promises of convex are very exciting and what drew my attention to it. But I'm getting hung up on the inability to control document schema completely. Here's what I'm trying to do:

tags: defineTable({
    id: v.id("tags"),
    title: v.string(),
    description: v.optional(v.string()),
    createdAt: v.number(),
    lastModifiedAt: v.number(),
  })

I know that v.id() is meant for storing references. But I was really hoping to use it to also generate the document id under a different name. I would prefer not to pass the system fields on to the client or use underscores. Planning to use zod to strip those out of responses. Is there a way to apply the document id to a non-system field of my choosing?

25 Replies

lee•5mo ago

convex functions can transform data after reading it from ctx.db, before returning it to the client. So you can do

const results = await ctx.db.query("tags").collect();
return results.map((result) => ({
  id: result._id,
  createdAt: result._creationTime,
  title: result.title,
}));

Skyblue (Matt)OP•5mo ago

So no way to apply the document id to another field on insert? The other main tool I'm looking at expects the dev to define the id field...it's one of the differentiators. Won't share what that tool is...in case that's a party foul (new to using discord for dev stuff)

ballingt•5mo ago

@Skyblue (Matt) can you share more about what you want here? You could
1. copy _id to another property, or
2. create your own id property in the schema (which can be prettier, if you're expecting clients to use these for something visible), where lookups are just as efficient as on _id if you add an index for that new id property.

You're right that you can't prevent the system fields _id and _creationTime from existing on your documents, but you can add whatever other fields you like and, critically, what is in the schema does not need to be directly related to what you return from your query.

Skyblue (Matt)OP•5mo ago

Preference is to have control over the names of all the fields in my tables. But I'll settle functionally for #2

ballingt•5mo ago

I'd love more context on why this feels like a differentiating feature of DB systems, because in our view it's an implementation detail.

Got it, so it's about it being useful to have complete control over these tables? Is it the convenience of not needing to strip these?

Skyblue (Matt)OP•5mo ago

I get that. It's about control. Anytime I use a service or 3rd party lib, I'm generally averse to control limitations, especially when they aren't defined in a way that I would prefer.

ballingt•5mo ago

These are "system fields" that will exist in just about any database, but typically they're hidden. It sounds like it'd be preferable to be able to keep these hidden, which I can see.

If these were hidden, would you want these not to show up in the Convex dashboard as well, or is it more about the querying interface?

btw you're welcome to share the other system you're thinking about, but not required.

Skyblue (Matt)OP•5mo ago

I don't mind them in the dashboard. In fact, I think exposing them is a feature that signals to the dev about how things may be working behind the scenes. I think just having the ability to assign the id to a different field name as an alias would solve a bit for me. That way I don't have to transform every query or alias the property in the client just to avoid using underscores. I do want to avoid the habit of iterating on responses, just because that can very easily create some performance issues if not written carefully. The other tool is Triplit. Schema doc here: https://www.triplit.dev/docs/schemas

ballingt•5mo ago

Cool, yeah Triplit ties the entities in the DB more closely to the entities returned from queries.

I do want to avoid the habit of iterating on responses, just because that can very easily create some performance issues if not written carefully.

Could you say more about your concerns here? In Convex this is the typical pattern.

Skyblue (Matt)OP•5mo ago

It's 2 parts:
1. I enjoy the ability to "select" fields, alias them if I want, or even exclude at query time
2. Iterating on them after returning could be slow unintentionally (like spreading new items, instead of updating properties on larger arrays)

ballingt•5mo ago

The convenience argument I totally get: if you have CRUD queries that return exactly the document from the DB, then yeah, it's nice to use the names you'd like to expose to clients. Some folks use helpers to transform these. We generally put things like mappers in userspace, but this particular one, renaming or removing system fields, could be built in someday.

I enjoy the ability to "select" fields, alias them if I want, or even exclude at query time

You have complete control in Convex to do this! The language we use for it is JavaScript, but this code runs in the database. That's what Lee's query is doing above, this is code that runs in the database!
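A minimal sketch of the kind of userspace helper mentioned here, in plain TypeScript so it runs on its own. The names `toClient`, `SystemFields`, and `TagDoc` are hypothetical and not part of Convex, and `_id` is typed as a plain string instead of Convex's `Id<"tags">` to keep the example standalone. It aliases the two system fields and passes every other field through; inside a query function you would apply it to whatever `ctx.db.query(...).collect()` returned.

```typescript
// The two system fields every Convex document carries.
// Assumption: _id is modelled as a plain string here to avoid importing Convex types.
type SystemFields = { _id: string; _creationTime: number };

// Hypothetical helper: alias _id -> id and _creationTime -> createdAt,
// and pass every other field through unchanged.
function toClient<T extends SystemFields>(doc: T) {
  const { _id, _creationTime, ...rest } = doc;
  return { id: _id, createdAt: _creationTime, ...rest };
}

// Example document shape, loosely following the "tags" table from the question.
type TagDoc = SystemFields & {
  title: string;
  description?: string;
  lastModifiedAt: number;
};

const docs: TagDoc[] = [
  {
    _id: "abc123",
    _creationTime: 1700000000000,
    title: "news",
    lastModifiedAt: 1700000000500,
  },
];

// One linear pass; the result contains no underscore-prefixed fields.
const clientTags = docs.map(toClient);
```

Because the helper is generic over any document with the two system fields, one definition can serve every table, which is roughly the "select and alias" step expressed in JavaScript.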

Iterating on them after returning could be slow unintentionally (like spreading new items, instead of updating properties on larger arrays)

The CPU time to walk over every item in an array is very small; you're doing the same work a SQL or other declarative query would do. There are some optimizations you can do with a declarative language, but fundamentally the same work needs to be done, and V8 does a great job running this stuff quickly, including things like making spreads and maps very efficient.
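To make the "slow unintentionally" worry concrete, here is a small plain-TypeScript sketch contrasting two ways of renaming a field. The function names are made up for illustration. A `map` that builds one small object per item does a constant amount of work per item, while spreading the accumulator inside `reduce` copies the whole array on every step, so its cost grows quadratically with the array length. Both return the same result.

```typescript
type Tag = { _id: string; title: string };
type ClientTag = { id: string; title: string };

// Linear: a single pass that allocates one small object per item.
function renameWithMap(tags: Tag[]): ClientTag[] {
  return tags.map((t) => ({ id: t._id, title: t.title }));
}

// Accidentally quadratic: each step copies the entire accumulator so far.
function renameWithSpreadReduce(tags: Tag[]): ClientTag[] {
  return tags.reduce<ClientTag[]>(
    (acc, t) => [...acc, { id: t._id, title: t.title }],
    [],
  );
}

// Sample data: 1000 generated tags.
const sample: Tag[] = Array.from({ length: 1000 }, (_, i) => ({
  _id: `id${i}`,
  title: `tag ${i}`,
}));

const viaMap = renameWithMap(sample);
const viaReduce = renameWithSpreadReduce(sample);
```

The pattern to watch for is the growing spread inside a loop or reducer, not the per-item object in a `map`.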

Skyblue (Matt)OP•5mo ago

Perhaps this is hinting at the shallowness of my DB knowledge. I assumed that the select operation, especially only selecting a few is a way to improve the performance....and that querying the whole thing and THEN stripping out and renaming fields would be missing out on this performance. (I know this isn't a trad db) So I guess I've made a habit of seeing post query transformations as less than ideal

ballingt•5mo ago

The data for a record is generally stored contiguously in a traditional DB (and same in Convex, which is also traditional in this respect), so generally even for a select this whole thing needs to be read from disk/cache. Doing a read(), seek(), read() to jump over e.g. 20 bytes because you've selected them out isn't worth it.

Skyblue (Matt)OP•5mo ago

Oh....that is interesting.

ballingt•5mo ago

There are real optimizations you can do (and that cool SQL databases absolutely do) but unless the columns are very large (in which case a different kind of non-contiguous storage is used) select doesn't buy you a ton

Skyblue (Matt)OP•5mo ago

So the performance is perhaps not the query, but on the transfer payload.

ballingt•5mo ago

...UNTIL you get to shipping this data to the client, then it matters! Yeah (roughly).

The gist of Convex is that instead of writing declarative SQL that gets query-planned, you write imperative TypeScript, and just run it from top to bottom. There are some interesting tradeoffs here, but the reason we can claim they're worth it is that the cost of moving this stuff into JS isn't that high; overwhelmingly the speed issues are "are you walking over the correct index," not "is the SELECT happening in Rust vs JavaScript." (There are real perf multiples here, which is one reason we still have a .filter() in the query builder, instead of always pulling these out into JS and filtering there, but we'd like to remove that someday, when we get real efficient at pulling this stuff into JS zero-copy.)

Skyblue (Matt)OP•5mo ago

Wow, this is kind of exciting actually. I think I was applying some sunk cost fallacy to all the time I've spent trying to get better at SQL but really wishing I could just do it all in JS. And clearly that's what I can do with Convex and I wasn't liking it. Ha!

ballingt•5mo ago

I hear you on the convenience though, there's an elegance to returning exactly the record stored in a table. But eventually you end up adding some metadata that doesn't belong on the client, and things can get ugly. But like, that initial elegance is sure nice, and worth optimizing for a bit! That was some of the motivation for the underscore, to say "hey this is a field you probably don't care about" if you are returning these records directly to clients.

Skyblue (Matt)OP•5mo ago

I bet a lot of folks would enjoy a chainable method on the query that feels similar to a select. But perhaps you all are trying to move away from that sort of pattern altogether intentionally.

ballingt•5mo ago

The https://labs.convex.dev/convex-ents library shows some of what doing ORM-y things like this in JS could look like. Unlike an ORM it's not translating these to another query language, it's mostly just doing the things in JS, but using it absolutely shows that it's nice to combine these things.

Skyblue (Matt)OP•5mo ago

"hey this is a field you probably don't care about" if you are returning these records directly to clients

I guess in my case, the client is based on different entities being queryable in a custom way defined by the user, so I have a need to bring that there so I can include a reference in the related documents

ballingt•5mo ago

trying to move away from that sort of pattern altogether intentionally.

This would be nice. For the long term, if you need efficiency you fundamentally need to break up tables that have lots of things you don't want to query, because the scan time over them increases with the data stored, not the data SELECTed. But like, that's not a slam dunk for not including a .select(), and Ents shows how that could be nice.

Skyblue (Matt)OP•5mo ago

Thanks @Tom and @Lee, really appreciate the detailed explanations and scary quick responses. Maybe you are both reactive AI agents running on Convex cough cough
