RJ•3y ago

JSON in Schemas

There isn't a way to describe JSON as a SchemaType, is there? I mean something like the following—except of course that it doesn't loop indefinitely!

function json() {
  return s.union(
    s.boolean(),
    s.number(),
    s.string(),
    s.null(),
    jsonArray(),
    jsonObject()
  );
}
function jsonArray() {
  return s.array(json());
}
function jsonObject() {
  return s.map(s.string(), json());
}

19 Replies

alexcole•3y ago

Ooo good question. I don't think this is currently possible with our built-in schema builder because there is currently no way to write recursive types. That being said, you can do this by manually constructing the SchemaType like:

type JSONValue =
  | string
  | number
  | boolean
  | null
  | JSONValue[]
  | { [key: string]: JSONValue };

const json: SchemaType<JSONValue, string> = new SchemaType();

A couple of caveats:
- You can't actually store arbitrary JSON in Convex because we don't allow object fields to start with _ (that's reserved for system fields).
- This might break in future Convex versions if we change our SchemaType format (it's not really publicly documented).
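
To make the first caveat concrete, here is a minimal sketch of a pre-flight check that walks a JSON value and reports the first key starting with _. The helper name findReservedKey is hypothetical and not part of Convex; it only illustrates the constraint described above.

```typescript
type JSONValue =
  | string
  | number
  | boolean
  | null
  | JSONValue[]
  | { [key: string]: JSONValue };

// Hypothetical helper (not a Convex API): returns the path to the first
// object key that starts with "_", or null if no such key exists.
function findReservedKey(
  value: JSONValue,
  path: string[] = []
): string[] | null {
  // Primitives and null have no keys to inspect.
  if (value === null || typeof value !== "object") return null;

  if (Array.isArray(value)) {
    let index = 0;
    for (const item of value) {
      const hit = findReservedKey(item, [...path, String(index)]);
      if (hit) return hit;
      index++;
    }
    return null;
  }

  for (const key of Object.keys(value)) {
    if (key.charAt(0) === "_") return [...path, key];
    const hit = findReservedKey(value[key], [...path, key]);
    if (hit) return hit;
  }
  return null;
}
```

Running this before a write would tell you whether a given JSON payload trips the reserved-prefix rule, and where.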

RJOP•3y ago

Oh, neat! My use case is that I'm serializing/deserializing these JSON-encoded ProseMirror nodes and steps as strings, and it would be nice if I could convert only between JSON and Node/Step ProseMirror objects rather than needing to do string <-> JSON <-> Node/Step. It's not terribly inconvenient, though I suppose the fields containing the nodes/steps would also be easier to view in the Convex dashboard UI if they were JSON object schema types rather than strings.

You can't actually store arbitrary JSON in Convex because we don't allow object fields to start with _ (that's reserved for system fields).

I actually wonder if TypeScript template literal types can describe this invariant?

alexcole•3y ago

I actually wonder if TypeScript template literal types can describe this invariant?

Oh interesting idea. We're doing some fancy template literal stuff already but I haven't tried using them here. I think it works, but it's not the prettiest type I've written:

type LowerLetters = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z" 
type UpperLetters = Uppercase<LowerLetters>
type Identifier = `${UpperLetters | LowerLetters}${string}`
type SystemFields = `_${string}`

let x: Identifier = "_invalid"

The error message is a little terrible in https://www.typescriptlang.org/play?#code/C4TwDgpgBAMg9gdwgJxhYwUGcoF4oBEAhgVAD6EBGpFBAxjYQCaMESsBmrA5qwBasAlqwBWrANasANqwC2rAHas4rMKwCOrZKyytgrAK6sAbqwSsAHqxCsAXqQBQoSFACqYSKnSZkOfO886IiwIAB54JC8MbAA+J3BoAEkmCAVgQQ5BFDwoAAMAEgBvAJQ0aN9yWERS72wAXyKsYGRBBW463PiXAGUQJohZADEsqSY-PIB9RubW9s6HKXQoCwAuKGTU9Mzs-AIJ1uMiKUEWIA

alexcole•3y ago

Ideally there would be a way to "subtract" the SystemFields type from string so I didn't need to manually write the entire alphabet, but I don't think that's possible
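
One partial workaround, sketched here under the assumption that a type parameter is acceptable: a conditional type can reject _-prefixed string literals without spelling out the alphabet. The names NonSystemKey and userField are hypothetical. This only works where the key is inferred as a literal type at a call site, so it does not give you a standalone "string minus SystemFields" type that could be used as the key of a Record.

```typescript
// Resolves to K, unless K matches `_${string}`, in which case it is never.
type NonSystemKey<K extends string> = K extends `_${string}` ? never : K;

// Hypothetical helper: accepts a string literal key only if it does not
// start with "_". The check happens entirely at compile time.
function userField<K extends string>(key: NonSystemKey<K>): K {
  return key;
}

const title = userField("title"); // accepted, typed as "title"

// Expected to be rejected by the type checker, because the parameter
// type collapses to never for this literal:
// userField("_id");
```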

RJOP•3y ago

Nice! Yeah, looks like it works, but also yeah, that's a sad error message 😬 The whole Identifier type looks bad in e.g. an IDE (hover over x in let x: Identifier).

Yeah, I don't think you can negate types without extends, which means you need a type parameter, which I'd think causes other issues with trying to collect all those types in e.g. a Record type. I didn't discover anything better when playing around with it myself, anyways.

Actually, I was thinking about this JSON type as being the type of a field in a document, not a top-level document, e.g.

export default defineSchema({
  myTable: defineTable({
    myJson: s.json(),
  }),
})

Would JSON in that position have the same internal field name constraints as at the top-level (no _${string} field names)?

alexcole•3y ago

I think currently the plan is to disallow properties that start with _ on nested documents as well. But definitely down to rethink that if it's getting in the way!

RJOP•3y ago

I would find that surprising, FWIW. My mental model of Convex considers an object in a field as being represented in some fundamentally different way than an object at the top-level (specifically, it assumes that a unique ID and similar “metadata” is only assignable by Convex to top-level documents). Although the second usage example in https://docs.convex.dev/api/modules/schema#definetable contests that view of things!

RJOP•3y ago

That’s not to say that mental model is correct or that the system it’s modeling couldn’t change, of course! It is more conformant with a relational DB view of things, perhaps.

Anyways, just sharing in case that’s useful!

alexcole•3y ago

Yep, definitely useful feedback! cc @sujayakar who has been thinking about how we reserve identifiers.

Although the second usage example in https://docs.convex.dev/api/modules/schema#definetable contests that view of things!

Perhaps I should change that example. The main use case I was imagining is actually having a top level s.union in defineTable. This could be useful if a table stores a discriminated union of different document types.

RJOP•3y ago

Ah I see, very neat! Yes, I think that would be a great change to the documentation. I don’t think it will be obvious to most people otherwise that that sort of thing is possible, even if it could perhaps follow from the s.object usage.

ian•2y ago

@RJ in case you missed it- 0.19.0 added support for arbitrary keys in nested objects. It's not quite in the schema (you'd still do v.any() not a more specific type) but it'll allow you to dump keys which used to be disallowed. We're also planning a v.record type which would represent an object with known key and value types, like TypeScript Record.

RJOP•2y ago

Neat @ian! Some follow-up questions:
- Does this mean that the nested object depth limits have been lifted?
- How should I understand the domain of v.any() vs a hypothetical v.json()? Is v.any() a superset of v.json()? If so, what’s the difference?

ian•2y ago

The depth limit has not been lifted but we are considering making it larger. v.any() includes any valid Convex value - v.object, v.array, ...

A hypothetical v.json might:
- prevent you from adding bytes or things that aren't JSON primitives
- enforce that it's an object?
- only allow valid json keys

A planned v.record would:
- enforce that it's an Object, not an array, string, etc
- ensure the keys and value types match
- limit keys to strings, ids, and other valid JSON keys (not an array, e.g.)

Does that make sense?
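
As a rough illustration of the first v.json bullet, here is a sketch of a runtime check that accepts only values JSON can represent. The function isJsonValue is hypothetical and is not how Convex validates anything; it just shows the kind of values (bytes, undefined, non-finite numbers) that v.any() could admit and a JSON-only validator would presumably reject.

```typescript
// Hypothetical check (not a Convex API): true only for values that
// survive a JSON round trip unchanged in kind.
function isJsonValue(value: unknown): boolean {
  if (value === null) return true;

  switch (typeof value) {
    case "string":
    case "boolean":
      return true;
    case "number":
      // JSON has no representation for NaN or Infinity.
      return isFinite(value);
    case "object":
      break;
    default:
      // undefined, bigint, symbol, function
      return false;
  }

  if (Array.isArray(value)) return value.every((item) => isJsonValue(item));

  // Only plain objects count; ArrayBuffer, Date, Map, etc. are rejected.
  const proto = Object.getPrototypeOf(value);
  if (proto !== Object.prototype && proto !== null) return false;

  const record = value as { [key: string]: unknown };
  return Object.keys(record).every((key) => isJsonValue(record[key]));
}
```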

RJOP•2y ago

It does, thank you! My original motivation for asking this question was that I wanted to store ProseMirror documents serialized as JSON without stringifying said JSON first (and then parsing every time I read it out of the database). While I still definitely think these latest changes are great, it sounds like none of them are sufficient to allow for accomplishing that original goal.

ian•2y ago

Besides depth, what issues do you anticipate if you were to use v.any?

RJOP•2y ago

I think depth is the only one! Having a hypothetical v.json() could still be nice to ensure no mistakes are made in the serialization and storage process, but really the main thing is definitely the depth constraint. (I'm also not working on that app anymore and so have no need to store JSON directly in Convex... at least not yet!)

sujayakar•2y ago

@RJ, would increasing the depth limit from 16 to 64 work for your use case? we’d like to put some limit since we’d like to bound how deep we’d need to traverse documents for algorithms like schema inference — curious if 64 is good enough.
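
To get a feel for whether a limit of 16 or 64 is enough, one could measure the nesting depth of a serialized document. This is a minimal sketch; nestingDepth is a hypothetical helper, and counting every object and array as one level is an assumption, since the thread does not say how Convex counts depth.

```typescript
// Hypothetical helper: primitives have depth 0; each enclosing object or
// array adds one level. Convex's own counting rule may differ.
function nestingDepth(value: unknown): number {
  if (value === null || typeof value !== "object") return 0;

  let children: unknown[];
  if (Array.isArray(value)) {
    children = value;
  } else {
    const record = value as { [key: string]: unknown };
    children = Object.keys(record).map((key) => record[key]);
  }

  let deepest = 0;
  for (const child of children) {
    deepest = Math.max(deepest, nestingDepth(child));
  }
  return 1 + deepest;
}

// A small document in the general shape of ProseMirror's JSON output
// (nodes with "type" and a "content" array of child nodes).
const sampleDoc = {
  type: "doc",
  content: [
    { type: "paragraph", content: [{ type: "text", text: "hi" }] },
  ],
};

const sampleDepth = nestingDepth(sampleDoc);
```

Under this counting rule, each level of node nesting costs two levels of depth (the node object plus its content array), so deeply nested lists or tables would approach a limit faster than the node tree alone suggests.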

RJOP•2y ago

I don't remember how deep the ProseMirror documents were, but I think it probably would have been

sujayakar•2y ago

cool, I'll play with ProseMirror and make sure it works. we'd definitely want developers to be able to dump structured data like that directly into Convex.
