A Data & AI Platform for the Hybrid Cloud


What is Open Data Hub?

The Open Data Hub is a machine-learning-as-a-service platform built on Red Hat's Kubernetes-based OpenShift® Container Platform, Ceph Object Storage, and Kafka/Strimzi, integrating a collection of open source projects. It inherits from upstream efforts such as Kubeflow and is the base of Red Hat's internal data-science and ML service. Data scientists can create models in Jupyter notebooks and choose from popular tools such as TensorFlow™, scikit-learn, and Apache Spark™ to develop them. With the Open Data Hub, teams can spend more time solving critical business needs and less time installing and maintaining infrastructure.

Open Data Hub is a meta-project that integrates open source projects into a practical solution. It aims to foster collaboration among communities, vendors, user enterprises, and academics, following open source best practices. The open source community can experiment with and develop intelligent applications without incurring high costs or having to master the complexity of modern machine learning and artificial intelligence software stacks.

Getting Started

For additional information about the Open Data Hub, read our blogs.

To set up the Open Data Hub, all you need is a running OpenShift® cluster. For storing data and models, we recommend using an S3-compatible object store such as Ceph.

Once your OpenShift and Ceph installations are running, deploy the Open Data Hub components using our Ansible playbooks and OpenShift® deployment templates.

Data Hub Parts