Looking into the Airbyte integration
Looking into the Airbyte integration this morning, and it doesn't appear as a supported connector from the dashboard. Wondering if this is something the Convex team manages via Airbyte or if I should reach out to them directly.

29 Replies
After searching their Slack, it appears that perhaps it's only currently supported on the OSS installs and not the cloud offering.
I also see some chatter in that Slack about having Convex be a destination as well as a source. What is the ETA on that?
Hi Allen, we're not in their dashboard. Convex as a source is available on the OSS side. https://docs.convex.dev/using/integrations/airbyte
We wouldn't mind you nudging them as a user to get on their dashboard. 😉
Convex as a destination is still an open Pull Request on the airbyte project. We're still waiting on them to review it.
For future product planning for us: what integrations matter for you?
Honestly, I'm not strong on the data eng side and my understanding of that stack is limited. I only pursued Airbyte because it looked turnkey from your docs.
One of the appeals of Supabase was the psql backend, as it lent itself to a lot of existing integration points, as well as being able to do some more complex querying within Supabase itself.
I'm about 90% decided to use convex over supabase, but the querying limitations definitely have given me pause.
Seems to me if I could get data streaming into BigQuery, RedShift or a similar warehouse, that would unblock me for the most part, as it would open up a massive world of tooling.
Pushing aggregated views of data back to supabase so it could be consumed by my app would be ideal as well.
Having the ability to do basic partial text search and get counts back on queries would simplify things as well.
hey @allen ! the good news is all this is underway right now with the team. I'll address a few different points here, but feel free to ask follow ups, and/or I'm happy to jump on a call
1. I'm meeting with the airbyte PM in charge of integrations soon. we're working on getting into the cloud product for egress + ingress, I hope that will be solved soon so you don't have to run your own instance
2. we've discussed an "out of the box" simple OLAP-type thing, and this would probably amount to us running airbyte + clickhouse or something for pro accounts. but the complexity would be managed by convex so you'd have a slightly-delayed, read-only, but highly performant SQL engine to do whatever analysis you want on your convex data without having to set up your own system. we don't have a timeline on this though, we're still weighing it against other priorities. this would be labeled something like "Convex OLAP" and would be useful for sure so you don't have to string together your own airbyte + SQL solution
(we'd probably solve the "full SQL" situation this way so we can keep the OLTP core very fast and available, as opposed to directly exposing SQL on the convex data)
3. we have an in-house search system that's close to beta, but we're still discussing timelines on releasing it. may need a bit more work. this would let you do full text searches on in-convex values
3a. in the long run, our intention is that the "industrial grade" solution for search is once again predicated on smooth airbyte integration, which is why we've invested in that early. namely, you can do airbyte -> elasticsearch (via this connector: https://docs.airbyte.com/integrations/destinations/elasticsearch/ )
"Seems to me if I could get data streaming into BigQuery, RedShift or a similar warehouse, that would unblock me for the most part, as it would open up a massive world of tooling."
Definitely 💯 . This is the promise of airbyte for something like Convex: you can get into those systems, or PostgreSQL, or any other place, and use your Convex data basically anywhere
Thanks for the insights, @jamwt . Seems like you are aware of the gaps and filling them accordingly. Let me know if I can beta anything and offer feedback.
I'll be looking to go to market in the next ~60 days. Anything coming online in that timeframe, even as a preview release?
on the airbyte front, I just got this intro from the airbyte CEO yesterday, so I'll follow up with you when I have more info on the timeline to get into the cloud product and to land the destination connector. on search, I'll defer to @james who was chatting with the team about the state of built-in search yesterday
and thanks for the offer about feedback! definitely, keep it coming 😄
in-convex sql replica sounds very appealing
The elastic search approach makes sense at a high level, I'm just unsure how it would practically come together... Execute an action that returns document IDs that then executes a query to fetch the documents in a reactive state?
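The two-step flow asked about here can be sketched roughly as below. This is only an illustration of the idea, not a confirmed Convex or Elasticsearch integration: `store`, `searchIndex`, `searchIds` and `fetchByIds` are hypothetical in-memory stand-ins. The point it shows is that an externally synced index can lag, so the second step should skip IDs that no longer exist.

```typescript
// Hypothetical document shape; not a Convex or Elasticsearch type.
type Doc = { id: string; title: string };

// Stand-in for the primary store (the source of truth).
const store = new Map<string, Doc>();

// Stand-in for an external search index that is synced with some delay.
// It only holds IDs and searchable text, and may lag behind the store.
const searchIndex: Array<{ id: string; text: string }> = [];

// Step 1 (the "action"): ask the search index for matching IDs.
function searchIds(term: string): string[] {
  const needle = term.toLowerCase();
  return searchIndex
    .filter((entry) => entry.text.toLowerCase().includes(needle))
    .map((entry) => entry.id);
}

// Step 2 (the "query"): load the current documents by ID, keeping the
// order from step 1 and skipping IDs that are no longer in the store.
function fetchByIds(ids: string[]): Doc[] {
  const docs: Doc[] = [];
  ids.forEach((id) => {
    const doc = store.get(id);
    if (doc !== undefined) docs.push(doc);
  });
  return docs;
}
```

In this shape only step 2 reads from the primary store, so it is the part that could stay reactive; the search step would need to be re-run when the search term changes.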
yeah, so there's a longer consideration here because search means a lot of different things. part of why we've invested a bit into 1st party search is having simple "application search" just work out of the box and have a consistent + subscription-capable relationship to everything else in convex. for many apps, this is all they need by search. it's close to what postgres calls search
and then for other things, people want sophisticated stemming, ranking, multiple languages / locales, large documents, etc etc. something closer to algolia or elastic. and it's unlikely convex would build that in house
as opposed to recommending solving those kinds of use cases with an integration
That makes sense. Basic fuzzy field search would solve for a lot, leaving the heavier search functionality to ES.
What about a use case of say Posts that have a relationship to Tags and I want to get the top 10 tags by popularity (count of posts that have those tags).
Is this something that your in-convex search would support? Seems like a basic query, but right now I would be looking at a complex data pipeline to try and get that result set back to my app.
(or trying to maintain some TopTags table that increments/decrements in sync with mutations)
that particular case may best be served with just maintaining the counts as part of the mutation, yep.
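A minimal sketch of maintaining the counts as part of the mutation, assuming plain in-memory maps in place of real tables. Everything here (`posts`, `tagCounts`, `addPost`, `removePost`, `topTags`) is a hypothetical name for illustration and not the Convex API; in a real app the same updates would happen inside the function that writes the post, so the counts change together with the data.

```typescript
// In-memory stand-ins for a "posts" table and a "tag counts" table.
// All names are hypothetical; this is not the Convex API.
type Post = { id: string; tags: string[] };

const posts = new Map<string, Post>();
const tagCounts = new Map<string, number>();

// Adjust the stored count for one tag, dropping the row when it reaches zero.
function bumpTag(tag: string, delta: number): void {
  const next = (tagCounts.get(tag) ?? 0) + delta;
  if (next <= 0) {
    tagCounts.delete(tag);
  } else {
    tagCounts.set(tag, next);
  }
}

// "Mutation": delete a post and decrement the counts it contributed.
function removePost(id: string): void {
  const post = posts.get(id);
  if (post === undefined) return;
  posts.delete(id);
  Array.from(new Set(post.tags)).forEach((tag) => bumpTag(tag, -1));
}

// "Mutation": insert (or replace) a post and increment each distinct tag once.
function addPost(post: Post): void {
  removePost(post.id); // avoid double counting if the id already exists
  posts.set(post.id, post);
  Array.from(new Set(post.tags)).forEach((tag) => bumpTag(tag, 1));
}

// "Query": top N tags by post count, ties broken alphabetically.
function topTags(n: number): Array<[string, number]> {
  return Array.from(tagCounts.entries())
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, n);
}
```

With this shape, "top 10 tags by popularity" becomes `topTags(10)`, a read over a small counts table instead of a scan over every post.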
I got the Airbyte connection set up in a local container between Convex and BigQuery, however the sync is failing with "Failure Origin: normalization, Message: Something went wrong during normalization".
The tables show up in the BigQuery destination, but no data.
re. search, we have an internal implementation that allows in-convex search in a transactionally consistent way, and will likely launch in the near future. we have designs for fuzzy search and prefix matching and could add soon after
if something like this meets your needs that'd be great. there will still be a gap in functionality between built-in search and elastic, so there will be use cases that fall outside our featureset and will just want to stream to elastic search
we haven't directly tested Convex streaming into BigQuery. one would think it'd work since the airbyte destination connector on the BigQuery side should take care of it, but we can test this
Thanks @james . Let me know if the log dump from Airbyte would help.
Definitely seems to be something on the Convex side:
Sync worker failed.
No properties node in stream schema
Source did not output any state messages
State capture: No state retained.
Thanks Allen! We're looking at this now. We'll update you as soon as we know what's going on.
oh, one idea @allen ... is this a pro account? I think airbyte egress is pro accounts only
Ah, it is not. I saw that in the docs, but everything seemed to wire up fine.
yeah, granted, this is probably not the best error message to clarify if that's indeed what's going on. @Indy @Emma -- maybe that's the issue here?
Yep definitely not the right message if that's the issue (it probably is, but we'll verify for sure).
hey @allen! I'm looking into this - would you mind sending me the logs from the failed sync?
DMed you
Also, seems to be related as it started after my sync attempts -- my whole Airbyte install is nuked, with every route serving the attached because of the shown error.

@allen and I resolved this in DMs, but the summary is that:
1. Sync won't work if you don't have a pro account. This is stated in docs but it's confusing that it partially succeeds and doesn't have a good error message in the airbyte logs. (we can improve this!)
2. The source schema was stale, so Allen continued to get normalization errors after upgrading to a pro plan. Remember to refresh your source schema when it changes!
Helpful debugging tips:
- deselecting tables helped allen locate the table that wasn't normalizing successfully
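The deselecting tip above can be sketched as a small helper. This is an illustration only: `syncSucceeds` is a hypothetical callback standing in for "run a sync with just these tables selected and see whether it passes", and it is not an Airbyte API. A real check would be asynchronous and much slower; it is kept synchronous here to keep the sketch short.

```typescript
// Hypothetical callback: runs a sync limited to the given tables and
// reports whether it succeeded. Not an Airbyte API.
type SyncCheck = (tables: string[]) => boolean;

// Find the tables whose sync fails when each is selected on its own.
function findFailingTables(tables: string[], syncSucceeds: SyncCheck): string[] {
  // If everything syncs together, there is nothing to isolate.
  if (syncSucceeds(tables)) return [];
  // Otherwise select one table at a time and keep the ones that fail.
  return tables.filter((table) => !syncSucceeds([table]));
}
```

Checking one table at a time is the simplest form of the tip; with many tables, deselecting half at a time would need fewer sync runs.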
Thanks so much for your help @Emma !
@allen feel free to add anything I missed here!
you're welcome 🙂
Yey! Glad to hear it's up and running!
@allen btw in case you run into more convex-airbyte errors, I opened this PR to make them visible. Hopefully they can merge this soon and you can update! https://github.com/airbytehq/airbyte/pull/23797