
clinops

Clinical ML Pipeline Toolkit — production-grade data loading, preprocessing, and time-series feature engineering for healthcare AI research.

Python 3.12+ · License: Apache 2.0


Every healthcare AI project starts with the same two weeks of plumbing: loading MIMIC tables without hitting memory limits, clipping physiologically impossible values before they corrupt your model, normalizing glucose from mmol/L to mg/dL across sites, building time-series windows that handle clinical missingness correctly, and splitting data without leaking patients across folds. clinops packages those hard-won patterns into a single, well-tested library so your first notebook is actual science.
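
To make two of those chores concrete, here is a minimal plain-Python sketch of bound clipping and glucose unit conversion. It does not use or describe the clinops API: the helper names and the plausibility bounds are assumptions chosen for illustration, and the conversion factor follows from glucose's molar mass of roughly 180.16 g/mol.

```python
# Illustrative only: the bounds and helper names below are assumptions,
# not values or functions taken from clinops.
MGDL_PER_MMOLL_GLUCOSE = 18.016  # molar mass ~180.16 g/mol, divided by 10

# Hypothetical plausibility bounds (low, high) per variable.
ASSUMED_BOUNDS = {
    "heart_rate": (0.0, 300.0),  # beats per minute
    "spo2": (0.0, 100.0),        # percent
}


def clip_value(name: str, value: float) -> float:
    """Clamp a measurement into its assumed plausible range."""
    low, high = ASSUMED_BOUNDS[name]
    return max(low, min(high, value))


def glucose_to_mgdl(value: float, unit: str) -> float:
    """Return a glucose reading in mg/dL, converting from mmol/L if needed."""
    unit = unit.strip().lower()
    if unit == "mg/dl":
        return value
    if unit == "mmol/l":
        return value * MGDL_PER_MMOLL_GLUCOSE
    raise ValueError(f"unsupported glucose unit: {unit!r}")


if __name__ == "__main__":
    print(clip_value("spo2", 104.0))                  # clamped to the upper bound
    print(round(glucose_to_mgdl(5.5, "mmol/L"), 1))   # same reading in mg/dL
```

Raising on an unknown unit, rather than passing the value through, keeps a mislabeled site from silently mixing scales.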

Modules

Module               What it does
clinops.ingest       Loaders for MIMIC-IV, MIMIC-III, FHIR R4, and flat CSV/Parquet with schema validation
clinops.preprocess   Outlier clipping with physiological bounds, unit normalization, ICD-9→10 mapping
clinops.temporal     Sliding/tumbling windows, gap-aware imputation, lag features, cohort alignment
clinops.split        Temporal, patient-level, and stratified patient train/test splitting
clinops.monitor      Distribution drift detection (PSI + KS) and data quality alerting for production pipelines
clinops.orchestrate  GCS/S3 artifact storage and AWS Step Functions pipeline builder
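
For readers unfamiliar with the two drift measures named for clinops.monitor, the sketch below computes a Population Stability Index (PSI) and a two-sample Kolmogorov-Smirnov (KS) statistic using only the standard library. It is not the clinops.monitor implementation: the function names, the equal-width binning, and the `eps` floor for empty bins are all assumptions made for illustration.

```python
import math
from bisect import bisect_right


def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index over equal-width bins of the reference range."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    width = (hi - lo) / n_bins

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            # Values outside the reference range fall into the edge bins.
            idx = max(0, min(n_bins - 1, int((v - lo) / width)))
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


def ks_statistic(reference, current):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    ref_sorted, cur_sorted = sorted(reference), sorted(current)
    gap = 0.0
    for x in ref_sorted + cur_sorted:
        ref_cdf = bisect_right(ref_sorted, x) / len(ref_sorted)
        cur_cdf = bisect_right(cur_sorted, x) / len(cur_sorted)
        gap = max(gap, abs(ref_cdf - cur_cdf))
    return gap


if __name__ == "__main__":
    baseline = [float(v) for v in range(100)]
    shifted = [v + 30.0 for v in baseline]
    print(psi(baseline, shifted), ks_statistic(baseline, shifted))
```

PSI summarizes how much mass has moved between bins, while KS reports the single largest CDF gap, so the two can disagree on small or localized shifts.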

Quickstart

pip install clinops

from clinops.ingest import MimicTableLoader
from clinops.preprocess import ClinicalOutlierClipper
from clinops.temporal import TemporalWindower, ImputationStrategy
from clinops.split import StratifiedPatientSplitter

# Load MIMIC-IV vitals
tbl = MimicTableLoader("/data/mimic-iv-2.2")
charts = tbl.chartevents(subject_ids=[10000032, 10000980])

# Clip physiologically impossible values
charts = ClinicalOutlierClipper(action="clip").fit_transform(charts)

# Build 24-hour windows with 6-hour stride
windows = TemporalWindower(window_hours=24, step_hours=6).fit_transform(
    df=charts,
    id_col="subject_id",
    time_col="charttime",
    feature_cols=["heart_rate", "spo2", "resp_rate"],
)

# Patient-stratified split — no leakage
result = StratifiedPatientSplitter(
    id_col="subject_id",
    outcome_col="hospital_expire_flag",
    test_size=0.2,
).split(windows)
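
As a rough mental model of the last two steps, the sketch below shows how 24-hour windows with a 6-hour stride can be enumerated, and how a split can be made at the patient level so that every row for one `subject_id` lands in the same fold. It is independent of clinops: `window_bounds` and `assign_fold` are hypothetical helpers, and the hash-based split is a plain patient-level split, not the outcome-stratified one that `StratifiedPatientSplitter` is described as performing.

```python
import hashlib
from datetime import datetime, timedelta


def window_bounds(start, end, window_hours=24, step_hours=6):
    """Yield (window_start, window_end) pairs that fit fully inside [start, end]."""
    window = timedelta(hours=window_hours)
    step = timedelta(hours=step_hours)
    cursor = start
    while cursor + window <= end:
        yield cursor, cursor + window
        cursor += step


def assign_fold(subject_id, test_size=0.2):
    """Deterministically map a patient to 'train' or 'test' from a hash of the ID."""
    digest = hashlib.sha256(str(subject_id).encode("utf-8")).digest()
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return "test" if fraction < test_size else "train"


if __name__ == "__main__":
    admit = datetime(2024, 1, 1)
    for w_start, w_end in window_bounds(admit, admit + timedelta(hours=48)):
        print(w_start, w_end)

    # Every window from the same patient inherits that patient's fold,
    # so no patient appears on both sides of the split.
    for subject_id in (10000032, 10000980):
        print(subject_id, assign_fold(subject_id))
```

Because the fold depends only on the patient identifier, overlapping windows from one stay can never straddle train and test, which is the leakage the quickstart comment refers to.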

Examples


Installation

Requires Python 3.12+.

pip install clinops             # core
pip install "clinops[fhir]"     # adds FHIR R4 loader
pip install "clinops[gcp]"      # adds GCP extras
pip install -e ".[dev]"         # development, run from a source checkout (includes docs, linting, tests)