- MLflow artifact storage now works correctly with Google Cloud and Amazon cloud storage.
- Set model-name/model-version headers at the service mesh level (#496). This loosens the requirements on inference servers: previously, any inference server (typically, a model is packed into one at the Packaging stage) was required to include these headers in its response for the feedback loop to work properly. That rule ruled out third-party inference servers (such as NVIDIA Triton), because we cannot control their response headers.
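For a feedback-loop client, the change means the model identity can now be read from the response headers regardless of which inference server produced the response. A minimal sketch, assuming the headers are matched case-insensitively (the exact casing emitted by the mesh is an assumption):

```python
# Sketch: read model identity from inference-response headers.
# The header names come from the changelog entry above;
# case-insensitive matching is an assumption.

def model_identity(headers: dict) -> tuple:
    """Return (model name, model version) from response headers."""
    lowered = {k.lower(): v for k, v in headers.items()}
    return lowered.get("model-name"), lowered.get("model-version")

# Example headers as the mesh would now inject them, even for a
# third-party server such as NVIDIA Triton:
headers = {"Model-Name": "wine-quality", "Model-Version": "1.0"}
print(model_identity(headers))  # ('wine-quality', '1.0')
```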
- Removed the deprecated updatedAt/createdAt fields from core API entities (#394).
- Moved to the recommended, higher-level way of using Knative, which under the hood is responsible for a large part of the ModelDeployment functionality (#347).
- `Breaking change!`: Airflow plugin operators now expect a service account's secret in the `password` field of the Airflow Connection; previously they expected it in the `extra` field. You should recreate all Airflow connections for the ODAHU server by moving the secret from the `extra` field into the `password` field. Please do not forget to remove the secret from your `extra` field for security reasons.
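The migration amounts to moving the secret out of the connection's `extra` JSON into its `password` field and scrubbing it from `extra`. A hypothetical sketch of that transformation (the `client_secret` key inside `extra` is an assumption; use whatever key your connections actually store):

```python
import json

def migrate_connection(conn: dict, secret_key: str = "client_secret") -> dict:
    """Move a secret from the Airflow Connection's `extra` JSON into its
    `password` field, and remove it from `extra` afterwards."""
    extra = json.loads(conn.get("extra") or "{}")
    secret = extra.pop(secret_key, None)  # remove the secret from extra
    if secret is not None:
        conn["password"] = secret         # place it in the password field
    conn["extra"] = json.dumps(extra)
    return conn

conn = {"conn_id": "odahu_server", "password": "",
        "extra": '{"client_secret": "s3cr3t"}'}
migrated = migrate_connection(conn)
print(migrated["password"])  # s3cr3t
print(migrated["extra"])     # {}
```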
- Fix & add missing updatedAt/createdAt fields (#583, #600, #601, #602).
- Training result now contains the commit ID when using object storage as the algorithm source (#584).
- RunID is now present for model training with mlflow toolchain (#581).
- InferenceJob objects can now be deleted correctly (#555).
- Deployment roleName changes are now applied correctly (#533).
- The X-REQUEST-ID header is now handled correctly at the service mesh layer to support third-party inference servers (#525).
- Fix packaging deletion via bulk delete command (#416).
Odahu 1.4.0, 26 February 2021
- New Play tab on Deployment page provides a way to get deployed model metadata and make inference requests from the UI (#61).
- New Logs tab on Deployment page provides a way to browse logs of deployed model (#45).
- Users can now create packagings and deployments based on finished trainings and packagings (#38).
- `--disable-target` option added to the `odahuflowctl local pack run` command. It allows you to disable targets that will be passed to the packager process. You can use the option multiple times at once, for example: `odahuflowctl local pack run ... --disable-target=docker-pull --disable-target=docker-push`.
- The previous target-disabling option of the `odahuflowctl local pack run` command is deprecated.
- The `odahuflowctl local pack run` behavior that implicitly disables all targets by default is deprecated.
- Knative no longer creates multiple releases when using multiple node pools (#434).
- The lowest allowed value for liveness & readiness probes is now 0 instead of 1 (#442).
- The correct error code is now returned on failed deployment validation (#441).
- An empty `uri` param is no longer validated for the `ecr` connection type (#440).
- The correct error is now returned when the `uri` param is missing for the `git` connection type (#436).
- The correct error is now returned when the user has insufficient privileges (#444).
- The default branch is now used for a VCS connection if none is provided by the user (#148).
- The auto-generated predictor value no longer shows a warning on deployment creation (#80).
- Default deploy liveness & readiness delays are unified with server values (#74).
- Deployment no longer raises an error when a valid predictor value is passed (#46).
- Sorting for some columns fixed (#48).
- Secrets are now masked at the review stage of connection creation (#42).
- The interface now works as expected with long fields on the edit connection page (#65).
Odahu 1.3.0, 07 October 2020
- Persistence Agent added to synchronize k8s CRDs into the main storage (#268).
- All secrets passed to the ODAHU API must now be base64 encoded. Decrypted secrets retrieved from the ODAHU API via /connection/:id/decrypted are now also base64 encoded (#181, #308).
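In practice this means a client must encode secret values before submitting a connection payload and decode what the decrypted endpoint returns, e.g.:

```python
import base64

# A placeholder secret for illustration only.
raw_secret = "my-service-account-key"

# Encode before sending the connection payload to the ODAHU API:
encoded = base64.b64encode(raw_secret.encode()).decode()

# Decode what /connection/:id/decrypted returns:
decoded = base64.b64decode(encoded.encode()).decode()

print(decoded == raw_secret)  # True
```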
- Positive and negative (for 404 & 409 status codes) API tests via odahuflow SDK added (#247).
- Robot tests now output the pod state after each API call to simplify debugging.
- Refactoring: some abstractions & components were renamed and moved to separate packages to facilitate future development.
- For connection create/update operations, the ODAHU API now masks secrets in the response body.
- Rclone output no longer reveals secrets during the unit test setup stage.
- Output-dir option path is now absolute (#208).
- Respect artifactNameTemplate for local training result directory name (#193).
- Allow passing an Azure BLOB URI without a scheme on connection creation (#345).
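A sketch of what the relaxed validation implies: both forms below should now be accepted. The normalization logic and the `https` default are assumptions for illustration, not the actual validator:

```python
from urllib.parse import urlparse

def normalize_blob_uri(uri: str) -> str:
    """Accept an Azure Blob URI with or without a scheme and return a
    schemed URI. Defaulting to https is an assumption."""
    if "://" not in uri:
        uri = "https://" + uri
    parsed = urlparse(uri)
    if not parsed.netloc:
        raise ValueError("invalid Azure Blob URI: %s" % uri)
    return uri

# Both schemed and scheme-less forms normalize to the same URI:
print(normalize_blob_uri("myaccount.blob.core.windows.net/container"))
print(normalize_blob_uri("https://myaccount.blob.core.windows.net/container"))
```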
- Validate the model deployment ID to ensure it starts with an alphabetic character (#294).
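The source only states that an ID must start with an alphabetic character, so an ID like `1-model` is rejected while `model-1` passes. A minimal sketch of such a rule; the tail of the pattern (lowercase alphanumerics and dashes, modeled on Kubernetes naming) is an assumption:

```python
import re

# Must start with an alphabetic character (from the changelog entry);
# the rest of the pattern is an assumption.
DEPLOYMENT_ID_RE = re.compile(r"^[a-z][a-z0-9-]*$")

def is_valid_deployment_id(model_id: str) -> bool:
    return bool(DEPLOYMENT_ID_RE.match(model_id))

print(is_valid_deployment_id("model-1"))  # True
print(is_valid_deployment_id("1-model"))  # False
```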
- The state of resources now updates correctly after changes in the UI (#11).
- Users can no longer submit a training when the resource request is bigger than the limit (#355).
- Mask secrets on the review page during the connection creation process (#42).
- The UI now responds correctly in case of concurrent deletion of entities (#44).
- Additional validation added to prevent creation of resources with unsupported names (#342, #34).
- Sorting added for training & packaging views (#13, #48).
- The reference field is now optional for VCS connections (#50).
- Git connection hint fixed (#7).
- All API connection errors are now handled correctly and retried.
Odahu 1.2.0, 26 June 2020
- ODAHU UI:
- The ODAHU UI is now open-source software, available on GitHub under the Apache License, Version 2.0. The ODAHU UI is a web interface for ODAHU based on React and TypeScript. It provides an overview of and controls for ODAHU workflows, log browsing, and entity management.
- Training now fails if a wrong data path or a nonexistent storage bucket name is provided (#229).
- Training log streaming now works in the log view when using the native log viewer (#234).
- ODAHU pods are now redeployed during helm chart upgrade (#111).
- An ODAHU docker connection can now be created with a blank username & password to install from a public docker repo (#184).
- ODAHU UI:
- Fix the description of replicas in Model Deployment.
- Trim spaces from input values.
- Fix incorrect selection of VCS connections.
- Close the ‘ODAHU components’ menu after opening a link in it.
Odahu 1.1.0, 16 March 2020
- Added JupyterHub support to our deployment scripts. JupyterHub allows spawning multiple instances of the JupyterLab server. By default, we provide the prebuilt ODAHU JupyterLab plugin in the following Docker images: base-notebook, datascience-notebook, and tensorflow-notebook. To build a custom image, you can use our Docker image template or follow the instructions.
- Added the ability to run model training on GPU nodes. You can find an example of such a training here. This is one of the official MLflow examples that classifies flower species from photos.
- ODAHU-Flow has the Connection API that allows managing credentials for Git repositories, cloud storage, docker registries, and so on. The default backend for the Connection API is Kubernetes. We integrated Vault as a storage backend for the Connection API to manage your credentials securely.
- Helm 3:
- We migrated our Helm charts to Helm 3. The main goals were to simplify the deployment process to OpenShift and to get rid of the Tiller.
- ODAHU UI:
- ODAHU UI provides a user interface for the ODAHU components in a browser. It allows you to manage and view ODAHU Connections, Trainings, Deployments, and so on.
- Local training and packaging:
- You can train and package an ML model with the odahuflowctl utility using the same ODAHU manifests, as you use for the cluster training and packaging. The whole process is described here.
- Training and packaging performance improvements:
- We fixed multiple performance issues to speed up the training and packaging processes. For our model examples, the duration of training and packaging was reduced by 30%.
- We created the new odahu-infra Git repository, where we placed the following infra custom helm charts: Fluentd, Knative, monitoring, Open Policy Agent, Tekton.
- Preemptible nodes:
- Preemptible nodes are priced lower than standard virtual machines of the same types. But they provide no availability guarantees. We added new deployment options to allow training and packaging pods to be deployed on preemptible nodes.
- Third-party updates:
- Google Cloud Registry:
- We experienced multiple problems while using Nexus as our main dev Docker registry, so we migrated. The migration also brings additional advantages, such as in-depth vulnerability scanning.
- We switched to using Terragrunt for our deployment scripts. This reduces the complexity of our Terraform modules and deployment scripts.