Convex Community · 13mo ago
9 replies
pearcy

External Data Import to Convex with Dagster

I'm encountering issues when using the Convex Streaming Import API (/api/streaming_import/import_airbyte_records) to import data with relationships. I'm using Dagster to orchestrate the process, and I'm not using Airbyte directly.

Specifically, I am trying to upsert records into four tables (tools, articleTools, materials, and articleMaterials) from a single JSON file.

1. Initial Setup:
I'm using Dagster to manage the data import.

I'm directly calling the Convex Streaming Import API from a Python client.

I have a Convex schema with tables for articles, tools, articleTools, materials, and articleMaterials. The relationships are handled through the articleId field on the articleTools and articleMaterials tables.
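To make the table layout concrete, here is a minimal Python sketch of how one parsed JSON document might be split into per-table record lists. The input shape (a "tools" list and a "materials" list, each item with a "name") and the toolName / materialName link fields are hypothetical, chosen only for illustration. Only the four table names and the articleId field come from the description above.

```python
from typing import Any


def split_records(doc: dict[str, Any], article_id: str) -> dict[str, list[dict[str, Any]]]:
    """Split one parsed JSON document into record lists keyed by table name.

    Assumption: `doc` looks like {"tools": [{"name": ...}], "materials": [{"name": ...}]}.
    The link fields `toolName` and `materialName` are hypothetical.
    """
    out: dict[str, list[dict[str, Any]]] = {
        "tools": [],
        "articleTools": [],
        "materials": [],
        "articleMaterials": [],
    }
    for tool in doc.get("tools", []):
        out["tools"].append({"name": tool["name"]})
        # Join row: ties the tool back to the already-created article.
        out["articleTools"].append({"articleId": article_id, "toolName": tool["name"]})
    for material in doc.get("materials", []):
        out["materials"].append({"name": material["name"]})
        out["articleMaterials"].append(
            {"articleId": article_id, "materialName": material["name"]}
        )
    return out
```

Keeping the join rows separate from the base rows means each table can be sent with its own primary key definition.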

I have a function that creates the article first, and it works correctly.

2. Problem:
I successfully upload the base article record.

I'm getting errors when importing tools and materials with their relationships to articles.

The errors I've encountered have changed over the course of debugging, but the latest one is IndexNotFoundError: Index tools._by_airbyte_primary_key not found.

3. Debugging Attempts:
Initially, I was receiving "code":"MissingStream" errors and I resolved this by providing table schemas in the payload.

I then received BadJsonBody errors, and those were resolved by:
- Ensuring primaryKey is an array of strings.
- Adding a jsonSchema with properties to correctly define the types of the fields.
- Ensuring primaryKey in the schema is also defined as a list.
- Finally ensuring primaryKey is an array of arrays.
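The request body that results from the fixes above might look roughly like the following Python sketch. The primaryKey and jsonSchema fields come from the list above. The top-level keys "tables" and "messages" and the per-record keys "tableName" and "data" are assumptions about the payload shape, and the sample table contents are made up. The sketch only builds and prints the body and does not call the API.

```python
import json
from typing import Any


def build_import_body(
    table: str,
    records: list[dict[str, Any]],
    key_field: str,
    properties: dict[str, Any],
) -> dict[str, Any]:
    """Build a request body for one table (shape is an assumption, see lead-in)."""
    return {
        "tables": {
            table: {
                # Array of arrays: each inner list is a path to one key field.
                "primaryKey": [[key_field]],
                # JSON Schema with explicit property types for the table's fields.
                "jsonSchema": {"type": "object", "properties": properties},
            }
        },
        # One message per record, tagged with its destination table.
        "messages": [{"tableName": table, "data": record} for record in records],
    }


body = build_import_body(
    "tools",
    [{"name": "hammer"}],
    key_field="name",
    properties={"name": {"type": "string"}},
)
print(json.dumps(body, indent=2))
```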

After all these fixes, I'm back to getting IndexNotFoundError: Index tools._by_airbyte_primary_key not found.