A construction-tech founder needed to read every door, outlet, and duct out of a 400-page plan set automatically and deliver a procurement BOM that estimators would actually trust.
Off-the-shelf VLMs hallucinated symbols. Pure CV pipelines missed any symbol they hadn't been hand-labeled for. The product had to combine both: classical detection for known symbols, vision-language reasoning for the long tail, and a verification loop that estimators could audit.
We built a hybrid engine: tile the plan, run a fast detector for the trade's high-frequency symbols, route the residual regions to a fine-tuned VLM, and reconcile against the legend on each sheet. Output goes straight into a quantity takeoff CSV that an estimator can sign off on in an hour.
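For readers who want to picture the tile-then-route step, here is a minimal sketch, not the production code. The tile size, overlap, confidence threshold, and the names `tile_boxes`, `Detection`, and `route` are all assumptions made for illustration. Treating "residual" as "anything the detector scored below a threshold" is also a simplification.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in sheet pixels


@dataclass
class Detection:
    """One candidate symbol from the fast detector (hypothetical shape)."""
    label: str
    box: Box
    score: float


def tile_boxes(width: int, height: int,
               tile: int = 1024, overlap: int = 128) -> Iterator[Box]:
    """Yield overlapping tiles that together cover a width x height sheet.

    The overlap keeps a symbol that straddles a tile border fully visible
    in at least one tile. Tile size and overlap are assumed values.
    """
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    step = tile - overlap
    for y0 in range(0, max(height - overlap, 1), step):
        for x0 in range(0, max(width - overlap, 1), step):
            yield (x0, y0, min(x0 + tile, width), min(y0 + tile, height))


def route(detections: List[Detection],
          threshold: float = 0.6) -> Tuple[List[Detection], List[Detection]]:
    """Split detections into confident hits and a residual set.

    Confident hits go straight to the takeoff; the residual set is what
    would be handed to the VLM. The 0.6 threshold is an assumption.
    """
    accepted = [d for d in detections if d.score >= threshold]
    residual = [d for d in detections if d.score < threshold]
    return accepted, residual
```

In a sketch like this the threshold is the main dial: raising it sends more regions to the slower VLM pass, lowering it leans harder on the detector.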
// what we built
Hybrid pipeline, one BOM.
[01] pdfminer · paddleocr
Sheet ingest & legend parsing
PDF tiling, OCR, sheet-level legend extraction so each page knows its own vocabulary.
[02] yolo · detr
Symbol detection
Per-trade detector trained on a curated corpus of MEP and architectural symbols.
[03] vlm · rag
VLM long-tail reasoning
Fine-tuned VLM handles ambiguous, novel, or partially occluded symbols against the live legend.
[04] py · pandas
BOM reconciliation
Cross-sheet rollup, dedup, audit trail so an estimator can sign the takeoff in an hour.
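To make the reconciliation step concrete, here is a small sketch of dedup plus cross-sheet rollup using only the Python standard library. It is an illustration under assumptions, not the shipped implementation: the row fields (`sheet`, `symbol`, `x`, `y`), the 10-pixel tolerance, and the function names are hypothetical, and the sheet IDs in the usage example are made-up sample data.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List


def dedup(rows: List[dict], tol: float = 10.0) -> List[dict]:
    """Drop repeat hits of the same symbol at (nearly) the same spot.

    Overlapping tiles can report one symbol twice; two rows count as the
    same symbol when sheet and label match and both coordinates differ
    by at most `tol` pixels (an assumed tolerance).
    """
    kept: List[dict] = []
    for r in rows:
        is_dup = any(
            k["sheet"] == r["sheet"] and k["symbol"] == r["symbol"]
            and abs(k["x"] - r["x"]) <= tol and abs(k["y"] - r["y"]) <= tol
            for k in kept
        )
        if not is_dup:
            kept.append(r)
    return kept


def rollup(rows: List[dict]) -> List[dict]:
    """Count each symbol across all sheets and record where it was seen.

    The `sheets` column is a simple audit trail: it tells the estimator
    which sheets contributed to each quantity.
    """
    totals: Dict[str, dict] = defaultdict(lambda: {"qty": 0, "sheets": set()})
    for r in rows:
        totals[r["symbol"]]["qty"] += 1
        totals[r["symbol"]]["sheets"].add(r["sheet"])
    return [
        {"symbol": s, "qty": t["qty"], "sheets": ";".join(sorted(t["sheets"]))}
        for s, t in sorted(totals.items())
    ]


def to_csv(bom: List[dict]) -> str:
    """Serialize the rolled-up BOM as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["symbol", "qty", "sheets"])
    writer.writeheader()
    writer.writerows(bom)
    return buf.getvalue()


# Usage with made-up sample rows: the second door is a tile-overlap repeat.
rows = [
    {"sheet": "A-101", "symbol": "door", "x": 100, "y": 100},
    {"sheet": "A-101", "symbol": "door", "x": 104, "y": 98},
    {"sheet": "A-102", "symbol": "door", "x": 100, "y": 100},
    {"sheet": "E-201", "symbol": "outlet", "x": 50, "y": 50},
]
bom = rollup(dedup(rows))
print(to_csv(bom))
```

The pairwise comparison in `dedup` is quadratic in the number of rows, which is fine for a sketch; a spatial index would be the natural swap-in for full plan sets.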
// outcomes
The receipts.
92%
symbol recall on real plan sets
8×
faster than manual takeoff
400p
plan sets, end to end
1h
estimator audit per project
8mo
concept to first paying GC
6
trades supported at launch
"
Everyone in the category showed me a demo. Bina shipped a system my estimators actually trust on real plan sets. That is a different conversation.
Oded — founder, Auto-QTO
// faq · auto-qto
What clients ask about Auto-QTO.
Auto-QTO reads every door, outlet, and duct out of a 400-page construction plan set automatically and produces a procurement BOM that estimators trust enough to sign within an hour.