This section describes the API and protocols related to batch inference using ODAHU.
The ODAHU Batch Inference feature allows users to get inferences from an ML model for large datasets that are delivered asynchronously, not via the HTTP API but through other mechanisms.
Currently Batch Inference supports the following ways to deliver data for forecasting:
- Object storage
In the future we are considering adding the ability to process data directly from Kafka topics and other asynchronous data sources.
Please also take a look at the example.
InferenceService describes the following required entities:
- Predictor docker image that contains the predictor code
- Model files location on object storage (a directory or a .zip / .tar.gz archive)
- Command and arguments that describe how to execute the image
After training a model, the user should build an image with code that follows the Predictor code protocol and register
this image, along with the appropriate model files, as an
InferenceService entity in the ODAHU Platform.
The user describes how inference should be triggered using the options in .spec.triggers of the InferenceService.
An InferenceJob describes a forecast process that was triggered by one of the triggers defined in the InferenceService.
If .spec.triggers.webhook is enabled, then it is possible to run an
InferenceJob by making a POST request as described
below. The webhook trigger is enabled by default. Note that currently this is the only way to trigger jobs.
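A minimal sketch of triggering a job via the webhook. The endpoint URL and the request body shown here are illustrative assumptions, not the actual ODAHU API contract; consult the API reference for the real route and schema:

```python
import json
import urllib.request

# Hypothetical endpoint -- the real route comes from the ODAHU API reference.
API_URL = "http://odahu.example.com/api/v1/batch/service/my-service/job"

# Hypothetical payload identifying the job to create.
payload = json.dumps({"id": "my-inference-job"}).encode("utf-8")

req = urllib.request.Request(
    API_URL,
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.full_url)

# Sending is left to the caller; uncomment when a real endpoint is available:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```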
Predictor code protocol
The ODAHU Platform launches the docker image provided by the user as
.spec.image (InferenceService) and guarantees the
following conventions about the input and model locations inside the container, as well as the format of the input and output data.
| Variable | Description |
| --- | --- |
| $ODAHU_MODEL | Path in the local filesystem that contains all model files synced from object storage |
| $ODAHU_MODEL_INPUT | Path in the local filesystem that contains all input files synced from object storage |
| $ODAHU_MODEL_OUTPUT | Path in the local filesystem whose contents will be uploaded to object storage |
Input and output formats
Predictor code must expect input as a set of JSON files with the
.json extension, located in the folder given by the
$ODAHU_MODEL_INPUT environment variable. These JSON files have the structure of
Kubeflow inference request objects.
Predictor code must save results as a set of JSON files with the
.json extension in the folder given by the
$ODAHU_MODEL_OUTPUT environment variable.
These JSON files must have the structure of
Kubeflow inference response objects.
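Putting these conventions together, a predictor entrypoint might look like the sketch below. The transformation inside `predict` is a placeholder (it echoes each input tensor back); only the file layout and the V2 request/response envelope follow the protocol described above:

```python
import json
import os
from pathlib import Path


def predict(request: dict) -> dict:
    """Placeholder model: echo each input tensor back as an output tensor."""
    outputs = [
        {
            "name": inp["name"],
            "shape": inp["shape"],
            "datatype": inp["datatype"],
            "data": inp["data"],  # a real model would compute predictions here
        }
        for inp in request.get("inputs", [])
    ]
    return {
        "model_name": "example-model",  # illustrative name
        "id": request.get("id"),
        "outputs": outputs,
    }


def main(input_dir: str, output_dir: str) -> None:
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    # One Kubeflow V2 inference response file is written per request file.
    for req_file in sorted(Path(input_dir).glob("*.json")):
        request = json.loads(req_file.read_text())
        response = predict(request)
        (out / req_file.name).write_text(json.dumps(response))


if __name__ == "__main__":
    main(os.environ["ODAHU_MODEL_INPUT"], os.environ["ODAHU_MODEL_OUTPUT"])
```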
This section helps with a deeper understanding of the underlying mechanisms.
InferenceJob is implemented as a TektonCD TaskRun with 9 steps:
- Configure rclone using ODAHU connections
- Sync input data from object storage to the local filesystem using rclone
- Sync the model from object storage to the local filesystem using rclone
- Validate the input against Predict Protocol - Version 2
- Log the model input to feedback storage
- Run the user container with the environment variables described above set
- Validate the output against Predict Protocol - Version 2
- Log the model output to feedback storage
- Upload data from $ODAHU_MODEL_OUTPUT to object storage
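The validation steps check that input and output files conform to the V2 protocol. A simplified sketch of such a check is shown below; it verifies only the required `inputs` tensors and their mandatory fields, while the real ODAHU validator is more thorough:

```python
def validate_v2_request(request: dict) -> list:
    """Return a list of problems found in a Predict Protocol V2 request.

    Simplified sketch: checks only the required 'inputs' tensors and their
    mandatory fields; the actual ODAHU validation step is more thorough.
    """
    problems = []
    inputs = request.get("inputs")
    if not isinstance(inputs, list) or not inputs:
        return ["request must contain a non-empty 'inputs' list"]
    for i, tensor in enumerate(inputs):
        for field in ("name", "shape", "datatype", "data"):
            if field not in tensor:
                problems.append(f"inputs[{i}] is missing required field '{field}'")
    return problems
```

An empty result means the request passed this check; otherwise the list describes each missing field.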