Data Operations Lead
Head of Data Operations — Vietnam
Location: Remote
Contract: Full-time, with founding-team equity
Reports to: the CEO, and a Vietnamese co-founder for Vietnamese-operations strategy
Our mission
We're building the AI data infrastructure that makes Vietnamese-language models world-class. Every dataset we deliver makes Vietnamese AI stronger.
We believe Vietnamese AI deserves world-class infrastructure — and deserves to be built by Vietnamese people, for Vietnamese people. We want Vietnamese AI to lead, not follow other markets.
Tinh Lọc turns Vietnam's media archives (audio, video, documents, broadcasts — historical and modern) into verified, structured, model-ready datasets. Every dataset we ship expands Vietnamese AI capability across the industry.
We have an anchor customer with a multi-year scope of work, and we launch operations within the next 30 days. This is a chance to join at the founding-team stage rather than joining a settled company.
Why this role matters
At Tinh Lọc, operations IS the product, and the data pipeline IS the operation. You own the data operation end-to-end: from client raw content arriving in object storage, through automated extraction → human correction and structuring → expert evaluation and QA, to structured deliverables shipping back to the client. Every stage runs on time, on budget, and at quality because you built the infrastructure it runs on.
This is the operational backbone the whole company sits on. There's no production infrastructure in place — you stand it up. There are no SLAs — you design them. There's no incident runbook — you write it. You build and own the production infrastructure the rest of the org operates within — this is an ownership role, not a support role.
As we scale from a small founding team today to 10-15 people in Year 1, you lead the data-operations engine of the entire company. Your decisions on infrastructure, throughput, and cost discipline directly shape Tinh Lọc's unit economics, quality, and ability to scale. In 24-36 months, as the company expands to 25-50 people, you own the throughput economics of an operation several times larger than the one you start with.
What you own
- The end-to-end data flow — raw content in → automated extraction → human correction and structuring → QA → structured delivery back to the customer. One connected system, and it's yours.
- The production infrastructure — object storage, the metadata database, Label Studio deployment, workflow orchestration, and observability.
- The operational SLAs — turnaround per content-hour, accuracy targets, cost-per-processed-hour budgets — and the incident response when a stage breaks.
- Cost discipline — GPU, storage egress, and annotator-hours kept predictable as volume grows.
- Delivery — datasets shipped in the formats models actually consume (JSONL, Parquet, HuggingFace Datasets, custom client formats).
- The weekly operating picture — throughput, accuracy, cost, and queue depth, reported to the CEO.
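To make the weekly operating picture concrete, here is a minimal Python sketch of how the four reported numbers could be computed from per-batch records and checked against SLA targets. It is an illustration under stated assumptions, not our actual reporting code: every field name, function name, and target value below is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BatchRecord:
    """One processed batch. All field names are hypothetical."""
    content_hours: float       # hours of media in the batch
    turnaround_hours: float    # wall-clock hours from ingest to delivery
    items_checked: int         # items sampled by QA
    items_correct: int         # sampled items that passed QA
    gpu_cost_usd: float
    storage_cost_usd: float
    annotator_cost_usd: float


@dataclass
class SlaTargets:
    """Placeholder targets for illustration, not real Tinh Lọc SLAs."""
    max_turnaround_per_content_hour: float
    min_accuracy: float
    max_cost_per_processed_hour: float


def weekly_picture(batches, queue_depth_hours):
    """Aggregate one week of batches into the four reported metrics."""
    hours = sum(b.content_hours for b in batches)
    if hours <= 0:
        raise ValueError("no processed content hours this week")
    checked = sum(b.items_checked for b in batches)
    cost = sum(
        b.gpu_cost_usd + b.storage_cost_usd + b.annotator_cost_usd
        for b in batches
    )
    return {
        "throughput_content_hours": hours,
        "turnaround_per_content_hour":
            sum(b.turnaround_hours for b in batches) / hours,
        # Accuracy is undefined if QA sampled nothing this week.
        "accuracy":
            sum(b.items_correct for b in batches) / checked if checked else None,
        "cost_per_processed_hour": cost / hours,
        "queue_depth_hours": queue_depth_hours,
    }


def sla_breaches(picture, sla):
    """Return the names of the SLA dimensions that were missed."""
    breaches = []
    if picture["turnaround_per_content_hour"] > sla.max_turnaround_per_content_hour:
        breaches.append("turnaround")
    if picture["accuracy"] is not None and picture["accuracy"] < sla.min_accuracy:
        breaches.append("accuracy")
    if picture["cost_per_processed_hour"] > sla.max_cost_per_processed_hour:
        breaches.append("cost")
    return breaches
```

Calling `weekly_picture(batches, queue_depth_hours=...)` and passing the result to `sla_breaches` gives the list of dimensions that need attention in the weekly report. In practice these numbers would more likely come from queries against the metadata database than from in-memory records.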
Who we're looking for
You are someone who:
- Has 3-6 years in data / infrastructure / platform engineering or data-ops, with the appetite to grow into the full Data Operations Lead scope as the operation scales. You've built or owned production systems where throughput, cost, and quality had to hold at once. We care about the quality of what you've built, not tenure or title. We want someone younger, hungrier, and ready to grow into the role.
- Owns the end-to-end pipeline and data flow. You've taken raw content in at one end and shipped verified, structured deliverables out the other, and you owned every stage in between. You can reason about ingest, pre-processing, human-in-the-loop correction, QA, and delivery as one connected system, not a set of disconnected tools.
- Has real infrastructure depth. Object storage at scale (Backblaze B2 / Cloudflare R2), Postgres metadata DBs, vector search with pgvector or Qdrant, Docker / Kubernetes, and a workflow orchestrator (Prefect / Dagster / Airflow; expected at lead level). AWS or GCP. You've run Label Studio in production, and you instrument what you run with observability dashboards (Metabase + Sentry).
- Has genuine cloud cost discipline. You keep GPU spend, storage egress (especially video), and annotator-hours predictable as volume grows. You can tell us exactly what you instrument and what you alert on. Cost is a first-class metric to you, not an afterthought.
- Designs operational SLAs and runs incidents like an engineer. SLA frameworks (turnaround per content-hour by stage, accuracy targets, cost-per-processed-hour budgets), on-call rotations, postmortems, and capacity planning are second nature. When a stage fails, you're the escalation point, and you change the system so it never fails the same way twice.
- Ships deliverables in the formats models actually consume. JSONL, Parquet, HuggingFace Datasets, and custom client formats.
- Speaks Vietnamese fluently (strongly preferred). Our workforce and our supplier relationships are Vietnamese-native. You'll work with the CEO (English-native) daily AND coordinate the Vietnamese operation daily.
Who we're NOT looking for
To save your time and ours, here are profiles that won't fit:
- Someone who wants to build a custom annotation UI on Day 1. That's over-engineering. Label Studio in production gets you to a shipped first batch; a bespoke tool gets you to a missed deadline.
- Someone who can't articulate the difference between extraction and correction — between what AI pre-processing produces and what human correction and structuring add on top. If that distinction is fuzzy, the cost model and the QA model are both built on sand.
- Someone who has only run English-language operations and dismisses Vietnamese-language workforce nuance. The workforce is Vietnamese-native; the nuance is the job, not a footnote.
- A pure people-manager with no infrastructure or cost depth. This role lives at the intersection of operations and engineering. If you can manage people but can't reason about GPU spend, storage egress, or pipeline architecture, this isn't the seat.
- Someone who needs a settled org with ready-made processes. This is a founding-team role. There are no default processes — you design them. If you need a stable job description and a pre-built operation, this isn't the role for you.
- A senior director with 10+ years who needs a settled organisation and top-of-market cash. You'll do better at a larger company. This is a founding-stage, build-from-zero role where you grow with the company — not a seat for someone optimising for title and top-of-market compensation today.
What you'll do
First 2 weeks
- Absorb the business plan, exec plan, and QA system documents; 1-on-1s with the CEO + Vietnamese co-founder; tour the existing bench-test pipeline with the Pipeline / AI Engineer
- Stand up production-grade infrastructure: object storage, Postgres metadata DB, Label Studio production deployment, Sentry, and initial Metabase dashboards
- Define the first operational SLAs — turnaround per content-hour, accuracy targets, and the cost-per-processed-hour budget
By Day 30 — first sample batch end-to-end
- Operate the first end-to-end sample batch — and lead the incident response if anything breaks
- Stand up the workforce-scheduling and load-balancing model across the annotation tiers
- As one of the founding hires, kick off the searches for the Annotation Programme Manager and the QA Lead together with the Chief of Staff and CEO — you're building the operation, so you help choose the people who'll run its layers
Day 30-90 — pilot + production
- Support delivery of the first full pilot batch to the customer (~Day 60)
- Onboard the Annotation Programme Manager and QA Lead and wire their layers into the data flow
- Ship the production runbook; establish the weekly operational dashboards (throughput, accuracy, cost, queue depth)
6-12 months
- Own the throughput economics — cost per processed hour of media, processed-hours-per-day, batch turnaround — as the operation scales to 10-15 people
- Harden the infrastructure, observability, and on-call rotation for sustained production volume; keep GPU, storage, and annotator-hour costs predictable
- Develop the Vietnamese vendor + supplier network; as the operation scales toward 25-50 people you own its throughput economics and architect the expansion for customers #2 and #3
Compensation
| Component | Detail |
|---|---|
| Cash base | $2,200-3,200 USD/month (52-76M VND) — mid-level founding hire who grows into the full lead scope |
| Performance bonus | May apply at the company's discretion; no standing band |
| Founding-level equity | Meaningful founding-level equity, drafted by Vietnamese counsel and convertible into real equity if/when a holding entity is established |
| Vesting | 4 years, 1-year cliff (industry standard) |
| Statutory benefits | BHXH 17.5% + BHYT 3% + BHTN 1% (employer-side, fully compliant with Vietnamese law) |
| 13th-month salary (lương tháng 13) | Standard, accrued throughout the year |
| Tết bonus | Minimum 1 month salary; can be more based on performance |
| Learning budget | $1,000-2,000/year for courses, conferences, books |
| Equipment | Company-provided laptop; remote-work equipment budget |
How to apply
Send via email to contact@coreywilton.org (or through the advisor who introduced you to this role):
- CV — concise, focused on data-operations and infrastructure experience
- A short note (200-400 words, Vietnamese or English) answering these 3 questions:
  - Walk us through a data pipeline you owned end-to-end: its throughput, the SLA you held it to, the worst incident you handled, and what you changed so it never recurred.
  - Given object storage + Postgres + Kubernetes and a fixed cloud budget, how do you keep GPU, storage, and annotator-hour costs predictable as volume grows 5x? What do you instrument and alert on?
  - As a founding hire building the data-production operation: what's the first system you'd stand up in month one, and which part of the scope are you most (and least) comfortable owning?
We'll respond within 5 business days to all qualified candidates. Round-1 interview with the CEO (60 min, video). Round-2 with the CEO + Vietnamese co-founder (90 min, video). Offer extended within 7 days of Round-2 if there's a fit.