Michael Rea•10mo ago

Query Caching

I'm trying to understand which approach is best. Do queries cache only at the level of the whole query?

---------- Option A:

//Schema
export const Chats = Table("chats", {
    userId: v.id("users"),
});

export const Messages = Table("messages", {
    userId: v.id("users"),
    chatId: v.id("chats"),
    role: v.string(),
    content: v.string(),
});


.......messages: Messages.table.index("by_chatId_userId", ["chatId", "userId"]),

export const get = queryWithUser({
    args: { chatId: v.id("chats") },
    handler: async (ctx, { chatId }): Promise<Doc<"messages">[]> => {
        const messages = await ctx.db
            .query("messages")
            .withIndex("by_chatId_userId", (q) => q.eq("chatId", chatId).eq("userId", ctx.userId))
            .order("desc")
            .collect();
            // .take(50);

        if (messages.length === 0) {
            return [];
        }
        return messages.reverse();
    },
});

---------- Option B:

export const Chats = Table("chats", {
    userId: v.id("users"),
    messageIds: v.array(v.id("messages")),
});

export const Messages = Table("messages", {
    userId: v.id("users"),
    chatId: v.id("chats"),
    role: v.string(),
    content: v.string(),
});

export const getByIds = queryWithUser({
    args: { messageIds: v.array(v.id("messages")) },
    handler: async (ctx, { messageIds }): Promise<Doc<"messages">[]> => {
        // ctx.db.get resolves to null for a missing document, so drop those
        // to match the declared return type.
        const messages = await Promise.all(
            messageIds.map((id: Id<"messages">) => ctx.db.get(id))
        );
        return messages
            .filter((m): m is Doc<"messages"> => m !== null)
            .reverse();
    },
});

-----

Let's say a new message comes in. Will Option B use the cached result for each existing ID and do an uncached read only for the new message? And for Option A (where I'm unclear), does it fetch all the messages again without the cache, or does it actually use the cache under the hood?

6 Replies

erquhart•10mo ago

Option A is the way to go - queries are reactive and their results will always reflect the underlying data in realtime. I expect you'll want to flesh out your relations a bit more though, assuming you can have multiple users in a chat, so you'll likely have a chats table with no user ids, a users table with no chat info, and then a userChats table where each record joins a user id to a chat id. Finally a messages table would have a user id (sender) and chat id for each message. The joins required for this will be plenty performant.
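
To make the suggested relations concrete, here is a minimal sketch in plain TypeScript rather than a Convex schema. The field names `name` and `title`, the sample data, and the helper functions are hypothetical; only the table layout (chats, users, a userChats join table, and messages carrying a sender and a chat id) comes from the reply above.

```typescript
// Plain in-memory model of the described relations (not Convex code).
type UserId = string;
type ChatId = string;

interface User { id: UserId; name: string }        // no chat info
interface Chat { id: ChatId; title: string }       // no user ids
interface UserChat { userId: UserId; chatId: ChatId } // join record
interface Message { id: string; userId: UserId; chatId: ChatId; content: string }

// All chats a user belongs to, resolved through the join table.
function chatsForUser(userId: UserId, joins: UserChat[], allChats: Chat[]): Chat[] {
  const chatIds = joins.filter((j) => j.userId === userId).map((j) => j.chatId);
  return allChats.filter((c) => chatIds.indexOf(c.id) !== -1);
}

// All messages in a chat, regardless of sender.
function messagesForChat(chatId: ChatId, allMessages: Message[]): Message[] {
  return allMessages.filter((m) => m.chatId === chatId);
}

// Hypothetical sample data: two users sharing chat c1, one private chat c2.
const chats: Chat[] = [
  { id: "c1", title: "Shared" },
  { id: "c2", title: "Private" },
];
const userChats: UserChat[] = [
  { userId: "u1", chatId: "c1" },
  { userId: "u2", chatId: "c1" },
  { userId: "u1", chatId: "c2" },
];
const messages: Message[] = [
  { id: "m1", userId: "u1", chatId: "c1", content: "hi" },
  { id: "m2", userId: "u2", chatId: "c1", content: "hello" },
  { id: "m3", userId: "u1", chatId: "c2", content: "note to self" },
];
```

In a real Convex app each of these lookups would be an indexed query rather than an array filter; the sketch only shows which record points at which.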

Michael ReaOP•10mo ago

You can think of this as a ChatGPT clone essentially. I'm going down this path because I burnt 1 GB of bandwidth in one day. *It was not indexed though. I'm really just trying to understand caching in this question. At what level does caching occur? Is it just the whole query itself, keyed on its inputs? I was looking to use these helpers (the one-to-many ones) for this: https://stack.convex.dev/functional-relationships-helpers I was wondering whether doing it the way the article does utilizes the cache more than the Option A you recommend.

Michael ReaOP•10mo ago

My concern with Option A is that every time a new message comes in it reloads all the messages? Or am I overthinking it? I think it could be due to the mutation of the streaming, every word that is streamed might be calling the get messages query which returns all the messages.
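
The concern above can be put in rough numbers. This is a back-of-the-envelope sketch, assuming each streamed chunk is written by its own mutation and each write causes one full re-run of the get-messages query that reads every message in the chat. The function name and all figures are hypothetical, not measurements.

```typescript
// Estimate bytes read from the database while one reply is streamed.
// Assumption: every chunk write triggers one re-run that reads all
// existing messages plus the partially streamed one.
function bytesReadWhileStreaming(
  existingMessages: number, // messages already in the chat
  avgMessageBytes: number,  // average size of an existing message
  chunks: number,           // number of streamed writes
  chunkBytes: number        // bytes appended per write
): number {
  let total = 0;
  for (let i = 1; i <= chunks; i++) {
    const streamedSoFar = i * chunkBytes;
    total += existingMessages * avgMessageBytes + streamedSoFar;
  }
  return total;
}

// Hypothetical chat: 50 messages of about 1000 bytes, 1000-byte reply.
const perWord = bytesReadWhileStreaming(50, 1000, 200, 5);  // 10100500
const batched = bytesReadWhileStreaming(50, 1000, 20, 50);  // 1010500
```

Under these assumptions a single streamed reply reads about 10 MB when every small chunk is its own write, and about 1 MB when chunks are batched ten at a time, so write frequency matters more than reply length.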

erquhart•10mo ago

Yeah, every call is getting all messages for the given chat and user. I'm also not sure whether and how caching impacts billing/usage at this point. I know there's work underway in this area, but as far as I know caching is purely a performance thing at the moment. Work on this is mentioned under Plans and Billing in the Q1 retro: https://discord.com/channels/1019350475847499849/1220857675747823807

Michael ReaOP•10mo ago

Oh awesome thanks for pointing me that way. I'm fairly sure that cached queries are free.

Indy•10mo ago

Cached queries (as in query functions) don't count against database bandwidth, but each one still counts as a function call.
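
To tie the thread together, here is a toy model of caching at the level of the whole query, keyed on the function name plus its arguments. It is a sketch under stated assumptions, not Convex's actual implementation: `ToyBackend`, `cacheKey`, and the two counters are hypothetical, and the invalidation rule (a write to a chat drops the cached result for that chat) only approximates tracking what a query read.

```typescript
type Row = { id: string; chatId: string; content: string };

// Cache key: query name plus serialized arguments (assumed scheme).
function cacheKey(name: string, chatId: string): string {
  return name + ":" + JSON.stringify({ chatId });
}

class ToyBackend {
  private rows: Row[] = [];
  private cache: { [key: string]: Row[] } = {};
  functionCalls = 0; // every query call counts, cached or not
  documentsRead = 0; // only uncached runs touch the database

  insertMessage(row: Row): void {
    this.rows.push(row);
    // A write invalidates the cached result whose data it touches.
    delete this.cache[cacheKey("messages:get", row.chatId)];
  }

  getMessages(chatId: string): Row[] {
    this.functionCalls += 1;
    const key = cacheKey("messages:get", chatId);
    const hit = this.cache[key];
    if (hit !== undefined) return hit; // cache hit: no documents read
    const result = this.rows.filter((r) => r.chatId === chatId);
    this.documentsRead += result.length; // full re-read on a miss
    this.cache[key] = result;
    return result;
  }
}

const backend = new ToyBackend();
backend.insertMessage({ id: "m1", chatId: "c1", content: "hi" });
backend.insertMessage({ id: "m2", chatId: "c1", content: "hello" });
backend.getMessages("c1"); // miss: reads 2 documents
backend.getMessages("c1"); // hit: reads 0 documents
backend.insertMessage({ id: "m3", chatId: "c1", content: "new" });
backend.getMessages("c1"); // miss again: reads all 3 documents
```

In this model the cache is all-or-nothing per query call, so a new message makes Option A re-read the whole chat once. Option B would fare no better under the same model: its `messageIds` argument changes whenever a message is added, which produces a new cache key and a full miss.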
