Ray Data for Batch AI
Batch inference with Ray Data
Load model state once and score large datasets efficiently.
Batch inference is one of Ray Data's most practical entry points. Teams keep their model code in Python while Ray Data spreads the work over large datasets and GPU workers.
Use callable classes for model state
Pass a callable class to map_batches so that each worker loads the model weights once, in __init__, and reuses them for every batch it handles in __call__.
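import ray

# load_model, FEATURES, and the features dataset are defined elsewhere.
class FraudScorer:
    def __init__(self):
        # Runs once per actor, so the weights are loaded once per worker.
        self.model = load_model("/mnt/models/fraud")

    def __call__(self, batch):
        # batch is a pandas DataFrame because batch_format="pandas".
        batch["score"] = self.model.predict_proba(batch[FEATURES])[:, 1]
        return batch

scored = features.map_batches(
    FraudScorer,
    compute=ray.data.ActorPoolStrategy(size=8),  # pool of 8 actors
    batch_format="pandas",
)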
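To see why the class form matters, the sketch below imitates in plain Python what a single worker does: it constructs the callable once and then reuses it for every batch. No Ray cluster is involved, and `fake_load_model`, `LOAD_CALLS`, `score_stateless`, `StatefulScorer`, and the tiny batches are hypothetical stand-ins used only for illustration.

```python
# Plain-Python imitation of one worker's lifecycle; no Ray required.
# fake_load_model and the batches are hypothetical stand-ins.

LOAD_CALLS = 0  # counts how many times the "expensive" load happens


def fake_load_model():
    """Stand-in for an expensive weight load."""
    global LOAD_CALLS
    LOAD_CALLS += 1
    return lambda rows: [0.5 for _ in rows]  # constant dummy score


def score_stateless(batch):
    """Function form: reloads the model for every batch."""
    model = fake_load_model()
    return model(batch)


class StatefulScorer:
    """Class form: loads in __init__, reuses the model in __call__."""

    def __init__(self):
        self.model = fake_load_model()

    def __call__(self, batch):
        return self.model(batch)


batches = [[1, 2], [3, 4], [5, 6]]

for b in batches:
    score_stateless(b)
stateless_loads = LOAD_CALLS  # one load per batch

LOAD_CALLS = 0
scorer = StatefulScorer()  # what a worker does once at startup
for b in batches:
    scorer(b)
stateful_loads = LOAD_CALLS  # a single load, however many batches

print(f"stateless loads: {stateless_loads}, stateful loads: {stateful_loads}")
```

With three batches the function form loads the model three times and the class form loads it once. The gap grows with the number of batches each worker processes.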
Think in throughput limits
Throughput depends on storage read speed, preprocessing cost, model latency, and write bandwidth. The first tuning pass should identify the current bottleneck before adding more workers.
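One rough way to find the bottleneck is to time each stage on a few sample batches in a single process before touching the cluster. The sketch below is framework-agnostic: `timed`, `bottleneck`, and the four stage functions are hypothetical helpers, not Ray Data APIs, and the stage bodies are trivial placeholders. Serial timings only approximate a pipelined run, so treat the result as a first hint. Ray Data also reports per-operator execution statistics through `Dataset.stats()`, which is worth checking on a real run.

```python
import time
from contextlib import contextmanager

stage_seconds = {}  # accumulated wall-clock seconds per stage name


@contextmanager
def timed(stage):
    """Add the elapsed time of the enclosed block to stage_seconds[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        stage_seconds[stage] = stage_seconds.get(stage, 0.0) + elapsed


def bottleneck(timings):
    """Return the stage name with the largest accumulated time."""
    return max(timings, key=timings.get)


# Hypothetical stand-ins for the four stages named above.
def read_batch():
    return list(range(1000))


def preprocess(rows):
    return [r * 2 for r in rows]


def predict(rows):
    return [r % 7 for r in rows]


def write_batch(rows):
    return len(rows)


# Time each stage over a handful of sample batches.
for _ in range(5):
    with timed("read"):
        rows = read_batch()
    with timed("preprocess"):
        rows = preprocess(rows)
    with timed("predict"):
        rows = predict(rows)
    with timed("write"):
        written = write_batch(rows)

slowest = bottleneck(stage_seconds)
print(f"slowest stage: {slowest}", stage_seconds)
```

If the slowest stage is reading or writing, adding more model workers is unlikely to raise throughput. If it is prediction, a larger actor pool or a larger batch size is the more promising change.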
Release checklist
- Pin model and feature versions.
- Emit row counts before and after filtering.
- Store prediction timestamps and model identifiers.
- Validate output schema before writing to the serving or analytics table.
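The checklist items can be sketched as small checks that run just before the write step. This is a minimal plain-Python illustration over a list of row dicts, not a Ray Data recipe. The column names, `MODEL_ID`, `FEATURE_VERSION`, `EXPECTED_SCHEMA`, `stamp`, and `validate_schema` are all assumptions made for the example.

```python
from datetime import datetime, timezone

# Hypothetical pinned versions; real values would come from a model registry.
MODEL_ID = "fraud-v1"
FEATURE_VERSION = "features-v1"

# Hypothetical output contract: column name -> required Python type.
EXPECTED_SCHEMA = {
    "account_id": str,
    "score": float,
    "model_id": str,
    "feature_version": str,
    "predicted_at": str,
}


def stamp(rows, model_id, feature_version, now=None):
    """Attach model identifier, feature version, and a UTC timestamp."""
    ts = (now or datetime.now(timezone.utc)).isoformat()
    return [
        {**r, "model_id": model_id, "feature_version": feature_version,
         "predicted_at": ts}
        for r in rows
    ]


def validate_schema(rows, expected):
    """Raise if any row has the wrong columns or a wrongly typed value."""
    for i, row in enumerate(rows):
        if set(row) != set(expected):
            raise ValueError(f"row {i}: columns {sorted(row)} do not match schema")
        for col, typ in expected.items():
            if not isinstance(row[col], typ):
                raise TypeError(f"row {i}: {col!r} is not {typ.__name__}")


# Stand-in predictions; one row has no score and is filtered out.
rows_in = [
    {"account_id": "a1", "score": 0.91},
    {"account_id": "a2", "score": None},
    {"account_id": "a3", "score": 0.07},
]
kept = [r for r in rows_in if r["score"] is not None]
print(f"rows before filter: {len(rows_in)}, after filter: {len(kept)}")

stamped = stamp(kept, MODEL_ID, FEATURE_VERSION)
validate_schema(stamped, EXPECTED_SCHEMA)  # raises before anything is written
```

Failing loudly before the write keeps a malformed batch out of the serving or analytics table. The logged row counts make silent data loss in a filter step visible.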