Layer 01 · Intelligence

Decision Intelligence, shipped.

ML systems that turn the signals you already collect into the actions your business actually takes — ranking, forecasting, propensity, anomaly. Production-grade, with the boring scaffolding (features, eval, drift, fairness) wired in from day one.

Engagement
8–16 weeks
scope → pilot → prod
Patterns
Rank · Forecast · Propensity · Anomaly
and the ones nobody named yet
Stack
Python · Feast · dbt · LightGBM · Torch
your warehouse, your VPC

Where models earn their keep

The patterns we see across pricing, risk, growth, and ops — and the operational shape of the answer.

01

Dashboards aren't decisions

A chart that tells you churn is up doesn't tell you who to call. Decision Intelligence turns the same data into a ranked queue your team can actually work.

02

Most ML never reaches a user

Models stuck in notebooks, features computed three different ways, no eval after launch. We build the production scaffolding the model lives inside, not just the model.

03

Wrong by 1%, wrong forever

Drift, leakage, and unfair segments don't show up at training time. We wire monitoring and fairness from day one, not as a 'phase 2'.

What we ship, by the person on the hook for the metric

persona

Head of Data Science

Pain
Brilliant notebooks. Zero of them shipping. We need a path from prototype to production.
What we ship
A feature store, model registry, eval harness, and serving stack your team can build on for the next ten models, not just this one.
persona

VP Risk / Fraud

Pain
Rules-based fraud detection is brittle and our analysts are drowning in false positives.
What we ship
A scored decisioning engine — model + policy — with auditable rules, segment-level fairness, and a workbench your investigators actually want to use.
persona

Head of Growth

Pain
Every customer gets the same offer. Half ignore it. We can't tell what's working.
What we ship
Propensity + uplift models wired to your campaign tooling, with live test/holdout discipline so you can prove what's incremental.
persona

COO / Ops

Pain
Forecasts are spreadsheets. Inventory is overstocked or out. Routing is by intuition.
What we ship
Forecasting + optimization models inside your planning workflow — with intervals you can plan around, not point estimates.

Six patterns. One platform underneath.

We compose these into the model your problem actually needs — and we build the platform once, so the next ten models cost a fraction.

Ranking & Recsys

Personalized order on the surfaces that drive revenue.

  • Two-stage retrieval + ranking
  • Bandits for exploration
  • Fairness across catalogue tiers
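
A minimal sketch of the two-stage shape, assuming a cheap retrieval score and a more expensive ranker score are both available as plain functions. The names `two_stage_rank`, `cheap_score`, and `ranker_score` are hypothetical, and a production system would typically use an approximate-nearest-neighbour index and a trained model in their place.

```python
import heapq


def two_stage_rank(items, cheap_score, ranker_score, k_retrieve=100, k_final=10):
    """Retrieve a candidate set cheaply, then reorder it with the ranker."""
    # Stage 1: keep only the k_retrieve best items under the cheap score.
    candidates = heapq.nlargest(k_retrieve, items, key=cheap_score)
    # Stage 2: the expensive ranker only ever sees the candidates.
    return sorted(candidates, key=ranker_score, reverse=True)[:k_final]


if __name__ == "__main__":
    catalogue = list(range(10))
    top = two_stage_rank(
        catalogue,
        cheap_score=lambda item: item,            # stand-in for a retrieval score
        ranker_score=lambda item: -abs(item - 7),  # stand-in for a per-user model
        k_retrieve=5,
        k_final=3,
    )
    print(top)
```

The split matters mostly for cost: the ranker runs on `k_retrieve` items rather than the whole catalogue, so it can afford richer features.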

Forecasting

Probabilistic forecasts your planning workflow can plan around.

  • Hierarchical · per-SKU · per-region
  • Intervals, not just point estimates
  • Backtest harness baked in

Propensity & Uplift

Who to target, and proof you wouldn't have got the outcome anyway.

  • Causal uplift trees
  • Holdout discipline
  • Wired to your campaign tooling
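
The holdout idea can be shown with plain arithmetic. This sketch assumes a randomised holdout that received no offer and 0/1 conversion outcomes; it computes a difference in rates, which is a much simpler estimator than the uplift trees listed above, and `incremental_lift` is a hypothetical name.

```python
def incremental_lift(treated, holdout):
    """Absolute and relative lift of a campaign against a randomised holdout.

    Both arguments are lists of 0/1 outcomes.
    """
    treated_rate = sum(treated) / len(treated)
    holdout_rate = sum(holdout) / len(holdout)
    absolute = treated_rate - holdout_rate
    # Relative lift is undefined when nobody in the holdout converted.
    relative = absolute / holdout_rate if holdout_rate else None
    return {
        "treated_rate": treated_rate,
        "holdout_rate": holdout_rate,
        "absolute": absolute,
        "relative": relative,
    }


if __name__ == "__main__":
    treated = [1] * 30 + [0] * 70   # 30% converted with the offer
    holdout = [1] * 20 + [0] * 80   # 20% converted without it
    print(incremental_lift(treated, holdout))
```

Only the gap between the two rates is attributable to the campaign; the holdout rate is what would have happened anyway.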

Anomaly & Risk

Catch the weird, before it becomes the expensive.

  • Streaming + batch detectors
  • Per-segment thresholds
  • Investigator workbench
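
A sketch of why thresholds are kept per segment, assuming a robust median ± k·MAD rule fitted on each segment's own history. The rule, the value of `k`, and the function names are illustrative assumptions; real detectors are usually richer.

```python
from statistics import median


def fit_segment_thresholds(history, k=3.0):
    """history: {segment: [values]} -> {segment: (lower, upper)}."""
    thresholds = {}
    for segment, values in history.items():
        centre = median(values)
        # Median absolute deviation: a spread estimate that outliers barely move.
        mad = median([abs(v - centre) for v in values])
        thresholds[segment] = (centre - k * mad, centre + k * mad)
    return thresholds


def is_anomalous(segment, value, thresholds):
    lower, upper = thresholds[segment]
    return value < lower or value > upper


if __name__ == "__main__":
    history = {
        "small": [10, 12, 11, 9, 10],
        "enterprise": [1000, 1200, 1100, 900, 1000],
    }
    t = fit_segment_thresholds(history)
    # The same amount is unusual for one segment and routine for the other.
    print(is_anomalous("small", 1150, t), is_anomalous("enterprise", 1150, t))
```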

Feature Platform

One source of truth for features, online and offline, point-in-time correct.

  • Feast / custom store
  • PIT-correct training joins
  • Per-feature lineage
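
Point-in-time correctness comes down to an as-of join: each training label may only see feature values that existed at the label's timestamp. The sketch below is a standard-library illustration of that rule, not Feast's API; `as_of_join` and the tuple layouts are assumptions made for the example.

```python
from bisect import bisect_right


def as_of_join(label_rows, feature_rows):
    """Attach to each label the latest feature value known at label time.

    label_rows:   list of (entity_id, event_ts, label)
    feature_rows: list of (entity_id, feature_ts, value)
    Timestamps can be any comparable type (ints here).
    """
    history = {}
    for entity, ts, value in sorted(feature_rows, key=lambda r: (r[0], r[1])):
        ts_list, values = history.setdefault(entity, ([], []))
        ts_list.append(ts)
        values.append(value)

    joined = []
    for entity, event_ts, label in label_rows:
        ts_list, values = history.get(entity, ([], []))
        # Rightmost feature row with feature_ts <= event_ts; anything later would leak.
        pos = bisect_right(ts_list, event_ts)
        feature = values[pos - 1] if pos else None
        joined.append((entity, event_ts, feature, label))
    return joined


if __name__ == "__main__":
    features = [("u1", 10, 1.0), ("u1", 20, 2.0), ("u1", 30, 3.0)]
    labels = [("u1", 25, 1), ("u1", 5, 0)]
    print(as_of_join(labels, features))
```

The label at time 25 gets the value written at 20, never the one written at 30, and the label at time 5 gets no value at all.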

Eval, Drift & Fairness

Proof the model still works — yesterday, today, and per segment.

  • PSI / KS drift dashboards
  • Group + individual fairness
  • Per-cohort error reports
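
For reference, PSI (Population Stability Index) compares the binned distribution of a feature or score at training time with its current distribution. This is a minimal sketch with hypothetical names, assuming bin edges are fixed from the baseline; the 0.1 and 0.25 cut-offs often quoted for PSI are conventions, not guarantees.

```python
import math


def psi(expected, actual, edges, eps=1e-6):
    """Population Stability Index between a baseline and a current sample.

    edges: inner bin boundaries, fixed from the baseline.
    """
    def proportions(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            # Bin index = number of edges strictly below v.
            counts[sum(1 for e in edges if v > e)] += 1
        # eps keeps empty bins from producing log(0).
        return [max(c / len(values), eps) for c in counts]

    p = proportions(expected)
    q = proportions(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    baseline = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
    shifted = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3]
    edges = [0.25, 0.5, 0.75]
    print(psi(baseline, baseline, edges), psi(baseline, shifted, edges))
```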

Online path. Offline loop.

Sources → features → model → policy → action on the top lane. Training, drift, and promotion on the bottom. Every model engagement starts from this shape.

Lane 01 · Sources & Features

Events, OLTP, warehouse, third-party — joined into a feature store with point-in-time correctness so training matches serving.

Lane 02 · Models

Champion in production, challenger in shadow. Versioned in a registry, served at p99 < 50ms, canary-deployed by default.
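
A rough sketch of shadow mode, assuming both models are plain callables and that a list stands in for whatever logging sink is actually used. `score_with_shadow` is a hypothetical name, and a real serving path would usually run the challenger off the request thread so it cannot add latency.

```python
def score_with_shadow(features, champion, challenger, shadow_log):
    """Serve the champion's score; record the challenger's for offline comparison."""
    served = champion(features)
    try:
        shadow = challenger(features)
    except Exception as exc:
        # A failing challenger must never change what the caller receives.
        shadow_log.append({"features": features, "error": repr(exc)})
    else:
        shadow_log.append(
            {"features": features, "champion": served, "challenger": shadow}
        )
    return served


if __name__ == "__main__":
    log = []
    result = score_with_shadow(
        {"amount": 120.0},
        champion=lambda f: 0.2,
        challenger=lambda f: 0.9,
        shadow_log=log,
    )
    print(result, log)
```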

Lane 03 · Policy

Rules layered on the score — thresholds, escalations, overrides. Audited and revertable, so the business can shape the decision without redeploying the model.
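
To make the separation concrete, here is a small sketch in which the policy is versioned data and the model score is just an input. The `Policy` fields, action names, and audit record layout are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    version: str
    review_threshold: float
    block_threshold: float


def decide(score, policy, override=None):
    """Map a model score to an action and return an audit record alongside it."""
    if override is not None:
        action = override
    elif score >= policy.block_threshold:
        action = "block"
    elif score >= policy.review_threshold:
        action = "review"
    else:
        action = "allow"
    audit = {
        "score": score,
        "policy_version": policy.version,
        "overridden": override is not None,
        "action": action,
    }
    return action, audit


if __name__ == "__main__":
    v1 = Policy("v1", review_threshold=0.5, block_threshold=0.9)
    v2 = Policy("v2", review_threshold=0.8, block_threshold=0.95)
    # Same score, different policy version, different action.
    print(decide(0.7, v1)[0], decide(0.7, v2)[0])
```

Reverting a policy change means pointing back at the previous `Policy` value; the model artefact is untouched.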

Lane 04 · Monitoring & Loop

Drift, fairness, per-segment error, PSI/KS. The bottom lane is what makes the top lane safe in production.

Frame, ship, instrument, expand

Four phases, each with a deliverable that survives without us. Most clients land model #1 in ~12 weeks; model #2 lands in 4.

01
Wk 1–2

Frame & data audit

We pick one decision, define the metric, and audit the data: leakage, label hygiene, point-in-time integrity, fairness segments.

Deliverables
Decision spec
Data audit memo
Eval rubric
02
Wk 3–6

Baseline + pilot model

A baseline (logistic / heuristic) we have to beat, then the pilot model. Built on features in your stack and evaluated against the rubric, not against vibes.

Deliverables
Baseline + pilot in registry
Backtest report
Per-segment error breakdown
03
Wk 7–12

Serving + policy

The model goes live in shadow, then canary, then champion. Policy layer wired. Monitoring on. Investigators / ops have a workbench.

Deliverables
Online serving stack
Drift dashboards
Policy rules (auditable)
04
Wk 13+

Handover & expand

Your team owns the loop. We stay on retainer for the second model — which costs a fraction of the first because the platform is in place.

Deliverables
Production runbook
Re-train pipeline
Model #2 backlog

Boring on purpose. Sharp where it counts.

We default to the durable open-source piece. We bring the cutting edge in only where it moves the metric.

Modeling
LightGBM / XGBoost · PyTorch · scikit-learn · Prophet · NeuralProphet · TFT · DeepAR
Feature Platform
Feast · dbt · Airflow · Materialize · Custom PIT joins
Serving
FastAPI · BentoML · Seldon · Triton · AWS SageMaker · Vertex Endpoints
Experimentation
MLflow · Weights & Biases · Optuna · Custom hyper-search
Monitoring & Drift
Evidently · WhyLabs · Custom PSI / KS · Datadog · Grafana
Warehouse & Compute
Snowflake · BigQuery · Redshift · Databricks · Spark · DuckDB

What changes after the model ships

Aggregated across decision-intelligence engagements over the last 18 months.

Lift on the primary metric
Median across ranking, propensity, and pricing models.
False positives in fraud / risk
At equal or better recall, vs. the rules baseline.
p50 scoring latency
At production scale, in your VPC.
Time-to-second-model
Once the platform is in place, the next model is a fraction of the cost.
OUTFITKART · COMMERCE

A two-stage ranker on the discovery surface.

Retrieval pulls candidates; an ML ranker reorders by per-user propensity. We instrumented uplift, not just CTR — so the lift we report is the lift you can bank.

2.8×
lift in CTR · validated against holdout
MERIDIAN BANK · RISK

Replacing 200 rules with a scored decisioning engine.

Champion model + thin policy layer for thresholds and overrides. Investigators kept their control; false positives dropped without reducing catch rate.

−68%
false positives at equal recall

What we get asked on the discovery call

Most data teams can build models. Few have shipped a production-grade serving stack with feature parity, drift monitoring, and a policy layer. We bring the platform shape; your team brings the domain. Most engagements end with us as a lighter retainer.

No, but you'll have one when we're done. Whether to use Feast, build a thin one, or extend what you have is one of the first decisions on the table — driven by your stack, not ours.

Per-segment evaluation in the eval rubric from week one — group fairness metrics, error parity checks, and a holdout that includes the rare cohorts. If a model trades performance for fairness, that's a decision the business makes explicitly, not one that's hidden in an aggregate AUC.
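
As one example of an error parity check, the sketch below compares false positive rates across groups instead of reading a single aggregate number. The record layout and function names are hypothetical, and false positive rate is only one of several group fairness metrics that could sit in the rubric.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """records: iterable of (group, y_true, y_pred) with 0/1 labels."""
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, y_true, y_pred in records:
        if y_true == 0:
            negatives[group] += 1
            false_positives[group] += y_pred
    return {g: false_positives[g] / n for g, n in negatives.items()}


def parity_gap(rates):
    """Largest difference in a per-group rate; 0 means perfect parity."""
    return max(rates.values()) - min(rates.values())


if __name__ == "__main__":
    records = [
        ("a", 0, 1), ("a", 0, 0), ("a", 0, 0), ("a", 0, 0),
        ("b", 0, 1), ("b", 0, 1), ("b", 0, 0), ("b", 0, 0),
        ("a", 1, 1), ("b", 1, 0),
    ]
    rates = false_positive_rate_by_group(records)
    print(rates, parity_gap(rates))
```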

PSI / KS on every input feature, score distribution monitoring, and per-segment AUC tracked daily. When drift breaches a threshold, the system raises an alert and triggers a re-train run automatically, with a human in the loop for promotion.

Yes. We deploy into your VPC (AWS, GCP, or Azure) and integrate with your existing observability and IAM. We never become a runtime dependency you can't replace.

A pilot model with platform foundations runs ₹50L–1.2Cr over 8–14 weeks depending on data complexity and serving SLAs. Subsequent models on the same platform are typically ~₹15–30L each.

Q2 2026 · two slots open for Decision Intelligence

Talk to a Decision Intelligence engineer.

Bring the messy bit. We come back with an architecture sketch and a discovery plan inside two business days — no sales theatre.

response within
48h