Mugi · Convex Community · 9mo ago · 88 replies

Qwisine: Early Evaluation of a Fine-Tuned Model on Convex


TL;DR
Qwisine is a fine-tuned Qwen3-14B model trained on Convex documentation and synthetic reasoning data. After a first round of training (~700k tokens), it performs on par with GPT-4.1 across key Convex development tasks, beating models like DeepSeek R1, Grok 3 Mini (Beta), and GPT-4o. More practical data and larger-model evaluations are planned.



Dataset & Training


The initial dataset was built from the Convex documentation, forming question–answer pairs across types like “what,” “why,” “how to,” and “task.” Each pair was matched with relevant context chunks.
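To make that concrete, here's a rough sketch of what one such training record might look like as JSONL. This is purely my illustration: the field names, file name, and example content are assumptions, not the actual pipeline (only the question types come from this post):

```python
import json

# Illustrative sketch only: the real schema isn't published.
# Question types ("what", "why", "how to", "task") come from the post;
# the field names and example content below are hypothetical.
record = {
    "type": "how to",
    "question": "How do I define a query function in Convex?",
    "context": "Relevant chunk from the Convex docs goes here...",
    "answer": "Import `query` from convex/_generated/server and export "
              "your function from a file in the convex/ directory.",
}

# Append one record per line to a JSONL training file (name is made up).
with open("qwisine_train.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")
```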
Later, synthetic reasoning data generated with Claude 3.7 (thinking mode) was added to improve reasoning depth.
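For the synthetic part, here's a minimal sketch of how reasoning traces could be generated with the Anthropic API with extended thinking enabled. The prompt, model snapshot, and token budgets are my assumptions, not the actual setup:

```python
import anthropic

# Hypothetical sketch: generate a reasoning trace for one doc-derived question.
# The prompt wording, model snapshot, and budgets are assumptions on my part.
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=4096,
    thinking={"type": "enabled", "budget_tokens": 2048},
    messages=[{
        "role": "user",
        "content": "Using the Convex docs excerpt below, reason step by step, "
                   "then answer: How do I define a query function in Convex?\n\n"
                   "<docs excerpt here>",
    }],
)

# With thinking enabled, response.content holds "thinking" block(s)
# followed by the final "text" block with the answer.
for block in response.content:
    print(block.type)
```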

This first phase used ~700k tokens.

Evaluation – Phase 1


* Model: Qwisine (Qwen3-14B fine-tune)
* Evaluation scope: Core Convex categories
* Average score: 65.47%
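For reference, a tiny sketch of how a per-category average like that could be computed. The category names and scores below are placeholders, not the actual eval data or harness:

```python
from collections import defaultdict

# Placeholder results: categories and scores are made up for illustration.
eval_results = [
    {"category": "queries",   "score": 0.70},
    {"category": "queries",   "score": 0.60},
    {"category": "mutations", "score": 0.65},
    {"category": "schema",    "score": 0.68},
]

# Group scores by category, average within each, then average across categories.
by_category = defaultdict(list)
for item in eval_results:
    by_category[item["category"]].append(item["score"])

category_avgs = {c: sum(s) / len(s) for c, s in by_category.items()}
overall = sum(category_avgs.values()) / len(category_avgs)

for c, avg in category_avgs.items():
    print(f"{c}: {avg:.2%}")
print(f"Average score: {overall:.2%}")
```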

Next Steps


* Add ~500k tokens of real-world Convex code and tasks
* Incorporate more practical, implementation-based examples
* Fine-tune and evaluate a larger 32B model

Why Convex?


I chose Convex simply because a friend uses it and talks about it all the time (ultimate Convex glazer @v), and it seemed like a good opportunity to learn something new.



PS: I’ll share links to Hugging Face and Datasette later — they’re still rough, and since I’m new to this and learning as I go, I’m a bit embarrassed to put them out just yet. But they’re coming!
[Attached screenshot: Screenshot_from_2025-05-06_12-34-35.png]