Last November I spent three days debugging a text-classification pipeline that kept returning garbage results on Portuguese inputs. The model card said "multilingual." It did not mention that the training corpus contained fewer than 200 Portuguese samples. A single sentence in that card would have saved me seventy-two hours and a very awkward stand-up explanation.

The gap between "works on my machine" and "works in the world"

We treat model cards as afterthoughts: a template we copy from the last repo, fill in with vague adjectives, and push alongside the weights. But a model card is not just documentation. It is a specification of behavior, limitations, and failure modes. When you write "language: multilingual" without caveats, you are making a promise, and downstream systems will be built on that promise. The Embra Hub has over 400,000 models. Only about twelve percent of them have model cards that describe evaluation metrics in any detail. That number should disturb us.
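One way to keep a "multilingual" claim honest is to count before you claim. What follows is a minimal sketch, not any hub's actual tooling: it assumes you can produce one language code per training sample, and the function name `language_caveats`, the `MIN_SAMPLES` threshold, and the toy corpus are all hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical threshold: below this many training samples, a declared
# language gets an explicit caveat instead of a silent "multilingual" claim.
MIN_SAMPLES = 1000


def language_caveats(sample_langs, declared_langs, min_samples=MIN_SAMPLES):
    """Return one caveat line per declared language that is thinly covered."""
    counts = Counter(sample_langs)
    caveats = []
    for lang in sorted(declared_langs):
        n = counts.get(lang, 0)
        if n < min_samples:
            caveats.append(
                f"{lang}: only {n} training samples (threshold {min_samples})"
            )
    return caveats


# Illustrative, made-up corpus: plenty of English, very little Portuguese.
corpus_langs = ["en"] * 5000 + ["pt"] * 180

for line in language_caveats(corpus_langs, {"en", "pt"}):
    print(line)
```

The threshold is a judgment call, and raw sample counts say nothing about quality or domain coverage. The point is only that the caveat ends up in the card instead of in someone's debugger.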

A model card without honest limitations is like a map without cliff edges marked. Technically useful. Practically dangerous.

I started a personal practice this year: before publishing any model, I force myself to write the "Known Limitations" section first. Not the shiny benchmark table, not the architecture diagram — the part where I admit what breaks. It changed how I build. When you have to articulate a failure mode in prose, you start testing for it earlier. The model gets better because the documentation got honest first.
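If you want the habit enforced rather than remembered, a pre-publish check is one option. The sketch below is an assumption-heavy illustration: it supposes the card is a Markdown file with a "Known Limitations" heading, and the function name `has_known_limitations` and the `min_words` cutoff are hypothetical.

```python
import re


def has_known_limitations(card_text, min_words=20):
    """True if the card has a 'Known Limitations' heading with enough prose under it."""
    # Capture the text after the heading, up to the next Markdown heading
    # or the end of the card, whichever comes first.
    match = re.search(
        r"^#+[ \t]*Known Limitations[ \t]*$(.*?)(?=^#+\s|\Z)",
        card_text,
        flags=re.MULTILINE | re.DOTALL | re.IGNORECASE,
    )
    if match is None:
        return False
    return len(match.group(1).split()) >= min_words


# Made-up card text for illustration only.
draft_card = """# my-classifier

## Known Limitations

## Evaluation
Accuracy table goes here.
"""

if not has_known_limitations(draft_card):
    print("Refusing to publish: write the Known Limitations section first.")
```

A word count cannot tell an honest limitation from filler, so treat this as a nudge and not as a quality gate.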