Engineering · Issue 0042

The case for warehouse-native pipelines, after three years of moving them back.

We tore down a six-stage Kafka cathedral last winter and rebuilt it inside the warehouse. The latency got worse by twelve seconds. Almost everything else got better.

Mira Ostrowski · Staff data engineer · February 14, 2025 · 11 min read

For most of 2022 our analytics graph looked like a small European rail map: thirty-one Airflow DAGs feeding nine Kafka topics, two stream processors holding state on a Redis tier nobody owned, and a custom Go service that translated late-arriving events into idempotent merges. It was, by any honest measure, an achievement. It was also slowly killing the people who carried the pager for it.

Stop treating freshness as the only axis that matters.

The argument for streaming was always framed as latency, and latency was always framed in seconds. But the metric we actually optimized, the one our on-call rotation cared about at three in the morning, was the time between a bad row arriving and an engineer being able to prove which join produced it. That number was forty minutes on a good day. After we moved the entire flow into scheduled warehouse tasks and stopped chasing sub-second updates, it dropped to under four minutes. The pipeline got slower; debugging got an order of magnitude faster, because the lineage was finally written down in one place.