Kubeflow, a toolkit for running artificial intelligence workloads on Kubernetes, has reached version 1.0. The first stable release arrives about three years after a team of engineers at Google open-sourced the project in 2017.
The toolkit lets organizations deploy AI workloads on infrastructure managed by the container-orchestration framework Kubernetes. Kubeflow 1.0 introduces the first stable versions of a number of the software's key components. These components now meet a “defined level of stability, supportability and upgradeability”, which means that the toolkit is now officially fit for production use.
For example, the built-in management console, which provides shortcuts to key functions, is now stable. So is the Jupyter Notebook controller, one of the functions reachable from that console, which lets AI teams create new machine learning models in Jupyter notebooks.
Once the model is ready, users can train it with TensorFlow or PyTorch. Kubeflow 1.0 supports both TFJob and PyTorch Operator, two ways for developers to set up an AI training workflow using a relatively simple script.
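To give a sense of what such a "relatively simple script" might look like, here is a minimal sketch in Python that builds a TFJob manifest and prints it as JSON. The job name, the container image, and the helper function `make_tfjob` are hypothetical and chosen only for illustration; the sketch assumes a cluster where the Kubeflow TFJob custom resource (`kubeflow.org/v1`) is installed.

```python
import json


def make_tfjob(name, image, workers=2):
    """Build a TFJob manifest as a plain dict (illustrative helper, not part of Kubeflow)."""
    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "TFJob",
        "metadata": {"name": name},
        "spec": {
            "tfReplicaSpecs": {
                # One replica group of identical workers running the training image.
                "Worker": {
                    "replicas": workers,
                    "restartPolicy": "OnFailure",
                    "template": {
                        "spec": {
                            "containers": [
                                # The training container is conventionally named "tensorflow".
                                {"name": "tensorflow", "image": image}
                            ]
                        }
                    },
                }
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical job name and image; replace with your own.
    manifest = make_tfjob("mnist-train", "example.com/mnist-train:latest")
    print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest could be piped to `kubectl apply -f -` on such a cluster. A PyTorch job would follow the same pattern with its own resource kind and replica layout.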
The release also includes features for the administrators who manage the infrastructure on which developers build their AI workloads. These include kfctl, a command-line tool that can automatically deploy Kubeflow in a cloud environment, and features for dividing the environment's resources among individual developers. Kubeflow 1.0 can also be deployed on Google’s Anthos platform.