Installing Open Data Hub

Installing Open Data Hub version 2

You can install Open Data Hub version 2 on your OpenShift Container Platform cluster from the OpenShift web console. For information about upgrading the Open Data Hub Operator, see Upgrading Open Data Hub.

Installing Open Data Hub involves the following tasks:

  1. Installing the Open Data Hub Operator.

  2. Installing Open Data Hub components.

  3. Accessing the Open Data Hub dashboard.

Note

Version 2 of the Open Data Hub Operator represents an alpha release, accessible only on the fast channel. Later releases will change to the rolling channel when the Operator is more stable.

Installing the Open Data Hub Operator version 2

Prerequisites
  • You are using OpenShift Container Platform 4.12 or later.

  • Your OpenShift cluster has a minimum of 16 CPUs and 32 GB of memory across all OpenShift worker nodes.

  • You have cluster administrator privileges for your OpenShift Container Platform cluster.

Procedure
  1. Log in to your OpenShift Container Platform as a user with cluster-admin privileges. If you are performing a developer installation on try.openshift.com, you can log in as the kubeadmin user.

  2. Select Operators → OperatorHub.

  3. On the OperatorHub page, in the Filter by keyword field, enter Open Data Hub Operator.

  4. Click the Open Data Hub Operator tile.

  5. If the Show community Operator window opens, read the information and then click Continue.

  6. Read the information about the Operator and then click Install.

  7. On the Install Operator page, follow these steps:

    1. For Update channel, select fast.

    2. For Version, select the version of the Operator that you want to install.

    3. For Installation mode, leave All namespaces on the cluster (default) selected.

    4. For Installed Namespace, select the openshift-operators namespace.

    5. For Update approval, select automatic or manual updates.

      • Automatic: When a new version of the Operator is available, Operator Lifecycle Manager (OLM) automatically upgrades the running instance of your Operator.

      • Manual: When a new version of the Operator is available, OLM notifies you with an update request that you must manually approve to upgrade the running instance of your Operator.

  8. Click Install. The installation might take a few minutes.
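The web console choices in this procedure correspond to fields on an Operator Lifecycle Manager (OLM) Subscription object. The following manifest is a minimal sketch of an equivalent subscription, not a verified alternative procedure. It assumes that the Operator package is named opendatahub-operator and is served from the community-operators catalog source in the openshift-marketplace namespace; confirm both values in OperatorHub on your cluster before you apply it.

```yaml
# Sketch of an OLM Subscription that mirrors the Install Operator page.
# The package name and catalog source are assumptions; verify them first.
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: opendatahub-operator
  namespace: openshift-operators      # Installed Namespace
spec:
  name: opendatahub-operator          # assumed package name
  channel: fast                       # Update channel
  source: community-operators         # assumed catalog source
  sourceNamespace: openshift-marketplace
  installPlanApproval: Automatic      # or Manual, as described for Update approval
```

With installPlanApproval set to Manual, OLM creates an install plan that waits for approval, which matches the Manual option described in the procedure.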

Verification
  • Select Operators → Installed Operators to verify that the Open Data Hub Operator is listed with Succeeded status.

Next Step
  • Install Open Data Hub components.

Installing Open Data Hub components

You can use the OpenShift web console to install specific components of Open Data Hub on your cluster when version 2 of the Open Data Hub Operator is already installed on the cluster.

Prerequisites
  • You have installed version 2 of the Open Data Hub Operator.

  • You can log in as a user with cluster-admin privileges.

  • If you want to use the trustyai component, you must enable user workload monitoring as described in Enabling monitoring for user-defined projects.

  • If you want to use the kserve, datasciencepipeline, distributedworkloads, or modelmesh components, you must have already installed the following Operator or Operators for the component. For information about installing an Operator, see Adding Operators to a cluster.

Table 1. Required Operators for components
Component             Required Operators                                                              Catalog
kserve                Red Hat OpenShift Serverless Operator, Red Hat OpenShift Service Mesh Operator  Red Hat
datasciencepipeline   Red Hat OpenShift Pipelines Operator                                            Red Hat
distributedworkloads  CodeFlare Operator                                                              Community
modelmesh             Prometheus Operator                                                             Community
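
For the trustyai prerequisite, user workload monitoring is typically enabled through the cluster-monitoring-config ConfigMap in the openshift-monitoring namespace. The following is a minimal sketch based on the standard OpenShift Container Platform monitoring configuration. If this ConfigMap already exists on your cluster, add the enableUserWorkload key to the existing config.yaml content instead of replacing it. Enabling monitoring for user-defined projects remains the authoritative procedure.

```yaml
# Sketch: enable monitoring for user-defined projects.
# Merge into the existing ConfigMap if one is already present.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true
```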

Procedure
  1. Log in to your OpenShift Container Platform as a user with cluster-admin privileges. If you are performing a developer installation on try.openshift.com, you can log in as the kubeadmin user.

  2. Select Operators → Installed Operators, and then click the Open Data Hub Operator.

  3. On the Operator details page, click the Data Science Cluster tab, and then click Create DataScienceCluster.

  4. On the Create DataScienceCluster page, configure the DataScienceCluster by using Form view or YAML view. For general information about the supported components, see Tiered Components.

    • Configure by using Form view:

      1. In the Name field, enter a value.

      2. In the Components section, expand each component and set the managementState to Managed or Removed.

    • Configure by using YAML view:

      1. In the spec.components section, for each component shown, set the value of the managementState field to either Managed or Removed.

  5. Click Create.
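
As an illustration of what YAML view contains, the following is a minimal sketch of a DataScienceCluster object with two components enabled and one disabled. The apiVersion, the object name default-dsc, and the component keys shown here are assumptions that can differ between Operator versions; treat the defaults that YAML view displays on your cluster as authoritative.

```yaml
# Sketch of a DataScienceCluster. The apiVersion, the name, and the
# component keys are assumptions; compare them with the YAML view defaults.
apiVersion: datasciencecluster.opendatahub.io/v1
kind: DataScienceCluster
metadata:
  name: default-dsc                # illustrative name
spec:
  components:
    dashboard:
      managementState: Managed     # the Operator deploys and reconciles it
    workbenches:
      managementState: Managed
    kserve:
      managementState: Removed     # the Operator does not deploy it
```

Setting a component to Removed leaves the entry in place, which makes it easy to switch the component to Managed later.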

Verification
  1. Select Home → Projects, and then select the opendatahub project.

  2. On the Project details page, click the Workloads tab and confirm that the Open Data Hub core components are running. For more information, see Tiered Components.

Next Step
  • Access the Open Data Hub dashboard.

Accessing the Open Data Hub dashboard

You can access and share the URL for your Open Data Hub dashboard with other users to let them log in and work on their models.

Prerequisites
  • You have installed the Open Data Hub Operator.

Procedure
  1. In the OpenShift web console, select Networking → Routes.

  2. On the Routes page, click the Project list and select the opendatahub project, which is the project that contains the version 2 components. The page filters to display only the routes in that project.

  3. In the Location column, copy the URL for the odh-dashboard route.

  4. Give this URL to your users to let them log in to the Open Data Hub dashboard.

Verification
  • Confirm that you and your users can log in to the Open Data Hub dashboard by using the URL.

Installing Open Data Hub version 1

You can install Open Data Hub version 1 on your OpenShift Container Platform cluster from the OpenShift web console. For information about installing Open Data Hub version 2, see Installing Open Data Hub version 2.

Installing Open Data Hub involves the following tasks:

  1. Installing the Open Data Hub Operator.

  2. Creating a new project for your Open Data Hub instance.

  3. Adding an Open Data Hub instance.

  4. Accessing the Open Data Hub dashboard.

Installing the Open Data Hub Operator version 1

Prerequisites
  • You are using OpenShift Container Platform 4.12 or later.

  • Your OpenShift cluster has a minimum of 16 CPUs and 32 GB of memory across all OpenShift worker nodes.

  • You can log in as a user with cluster-admin privileges.

Procedure
  1. Log in to your OpenShift Container Platform as a user with cluster-admin privileges. If you are performing a developer installation on try.openshift.com, you can log in as the kubeadmin user.

  2. Select Operators → OperatorHub.

  3. On the OperatorHub page, in the Filter by keyword field, enter Open Data Hub Operator.

  4. Click the Open Data Hub Operator tile.

  5. If the Show community Operator window opens, read the information and then click Continue.

  6. Read the information about the Operator and then click Install.

  7. On the Install Operator page, follow these steps:

    1. For Update channel, select rolling.

    2. For Version, select the version of the Operator that you want to install.

    3. For Installation mode, leave All namespaces on the cluster (default) selected.

    4. For Installed Namespace, select the openshift-operators namespace.

    5. For Update approval, select automatic or manual updates.

      • Automatic: When a new version of the Operator is available, Operator Lifecycle Manager (OLM) automatically upgrades the running instance of your Operator.

      • Manual: When a new version of the Operator is available, OLM notifies you with an update request that you must manually approve to upgrade the running instance of your Operator.

  8. Click Install. The installation might take a few minutes.

Verification
  • Select Operators → Installed Operators to verify that the Open Data Hub Operator is listed with Succeeded status.

Next Step
  • Create a new project for your instance of Open Data Hub.

Creating a new project for your Open Data Hub instance

Create a new project for your Open Data Hub instance so that you can organize and manage your data science work in one place.

Prerequisites
  • You have installed the Open Data Hub Operator.

Procedure
  1. In the OpenShift web console, select Home → Projects.

  2. On the Projects page, click Create Project.

  3. In the Create Project box, follow these steps:

    1. For Name, enter odh.

    2. For Display Name, enter Open Data Hub.

    3. For Description, enter a description.

  4. Click Create.
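
Projects are backed by namespaces, so the values entered in the Create Project box can also be expressed declaratively. The following is a minimal sketch that assumes the standard openshift.io/display-name and openshift.io/description annotations; the description text is a placeholder.

```yaml
# Sketch: declarative equivalent of the Create Project box.
apiVersion: v1
kind: Namespace
metadata:
  name: odh                                        # Name
  annotations:
    openshift.io/display-name: Open Data Hub       # Display Name
    openshift.io/description: Example description  # placeholder text
```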

Verification
  • Select Home → Projects to verify that the odh project is listed with Active status.

Next Step
  • Add an Open Data Hub instance to your project.

Adding an Open Data Hub instance

By adding an Open Data Hub instance to your project, you can access the URL for your Open Data Hub dashboard and share it with data science users.

Prerequisites
  • You have installed the Open Data Hub Operator.

  • You have created a new project for your instance of Open Data Hub.

Procedure
  1. In the OpenShift web console, select Operators → Installed Operators.

  2. On the Installed Operators page, click the Project list and select the odh project. The page filters to only display installed operators in the odh project.

  3. Find and click the Open Data Hub Operator to display the details for the currently installed version.

  4. On the KfDef tile, click Create instance. A KfDef object is a specification designed to control provisioning and management of a Kubeflow deployment. A default KfDef object is created when you install Open Data Hub Operator. This default configuration deploys the required Open Data Hub core components. For more information, see Tiered Components.

  5. On the Create KfDef page, leave opendatahub as the name. Click Create to create an Open Data Hub KfDef object named opendatahub and begin the deployment of the components.
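
To make the shape of a KfDef object more concrete, the following is a heavily simplified sketch. The apiVersion, the two application entries, and the manifests repository URI are assumptions recalled from typical version 1 defaults and are not taken from this document; the default object that the Create KfDef page generates on your cluster is the authoritative source and lists more applications.

```yaml
# Simplified sketch of a KfDef object. Everything except the object name and
# the project is an assumption; prefer the default that the console generates.
apiVersion: kfdef.apps.kubeflow.org/v1
kind: KfDef
metadata:
  name: opendatahub
  namespace: odh
spec:
  applications:
    - name: odh-common               # assumed application entry
      kustomizeConfig:
        repoRef:
          name: manifests
          path: odh-common
    - name: odh-dashboard            # assumed application entry
      kustomizeConfig:
        repoRef:
          name: manifests
          path: odh-dashboard
  repos:
    - name: manifests                # assumed manifests repository
      uri: https://github.com/opendatahub-io/odh-manifests/tarball/master
```

Each applications entry points at a path inside a repository named under repos, which is how the object controls which components are deployed.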

Verification
  1. Select Operators → Installed Operators.

  2. On the Installed Operators page, click the Project list and select the odh project.

  3. Find and click Open Data Hub Operator.

  4. Click the KfDef tab and confirm that opendatahub appears.

  5. Select Home → Projects.

  6. On the Projects page, find and select the odh project.

  7. On the Project details page, click the Workloads tab and confirm that the Open Data Hub core components are running. For a description of the components, see Tiered Components.

Next Step
  • Access the Open Data Hub dashboard.

Accessing the Open Data Hub dashboard

You can access and share the URL for your Open Data Hub dashboard with other users to let them log in and work on their models.

Prerequisites
  • You have installed the Open Data Hub Operator.

Procedure
  1. In the OpenShift web console, select Networking → Routes.

  2. On the Routes page, click the Project list and select the odh project. The page filters to only display routes in the odh project.

  3. In the Location column, copy the URL for the odh-dashboard route.

  4. Give this URL to your users to let them log in to the Open Data Hub dashboard.

Verification
  • Confirm that you and your users can log in to the Open Data Hub dashboard by using the URL.

Working with certificates

Certificates are used by various components in OpenShift Container Platform to validate access to the cluster. For clusters that rely on self-signed certificates, you can add those self-signed certificates to a cluster-wide Certificate Authority (CA) bundle and use the CA bundle in Open Data Hub. You can also use self-signed certificates in a custom CA bundle that is separate from the cluster-wide bundle. Administrators can add a CA bundle, remove a CA bundle from all namespaces, remove a CA bundle from individual namespaces, or manage certificate changes manually instead of letting the Open Data Hub Operator manage them.

Understanding certificates in Open Data Hub

For OpenShift Container Platform clusters that rely on self-signed certificates, you can add those self-signed certificates to a cluster-wide Certificate Authority (CA) bundle (ca-bundle.crt) and use the CA bundle in Open Data Hub. You can also use self-signed certificates in a custom CA bundle (odh-ca-bundle.crt) that is separate from the cluster-wide bundle.

How CA bundles are injected

After you install Open Data Hub, the Open Data Hub Operator automatically creates an empty odh-trusted-ca-bundle configuration file (ConfigMap) that carries the config.openshift.io/inject-trusted-cabundle label. Because of that label, the Cluster Network Operator (CNO) injects the cluster-wide CA bundle into the odh-trusted-ca-bundle ConfigMap. The components deployed in the affected namespaces are responsible for mounting this ConfigMap as a volume in the deployment pods.

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/part-of: opendatahub-operator
    config.openshift.io/inject-trusted-cabundle: 'true'
  name: odh-trusted-ca-bundle

After the CNO injects the bundle, it updates the ConfigMap with the ca-bundle.crt file containing the certificates.

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/part-of: opendatahub-operator
    config.openshift.io/inject-trusted-cabundle: 'true'
  name: odh-trusted-ca-bundle
data:
  ca-bundle.crt: |
    <BUNDLE OF CLUSTER-WIDE CERTIFICATES>
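
Because each component is responsible for mounting the ConfigMap, a workload consumes the bundle through an ordinary ConfigMap volume. The following Pod is a minimal sketch of that pattern and is not how any specific Open Data Hub component is configured: the pod name, namespace, container image, and mount path are all illustrative assumptions.

```yaml
# Sketch: mount the injected CA bundle into a pod as a read-only volume.
# The pod name, namespace, image, and mount path are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: ca-bundle-demo
  namespace: example-namespace
spec:
  containers:
    - name: app
      image: registry.access.redhat.com/ubi9/ubi-minimal
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: trusted-ca
          mountPath: /etc/odh-trusted-ca   # illustrative path
          readOnly: true
  volumes:
    - name: trusted-ca
      configMap:
        name: odh-trusted-ca-bundle
        optional: true                     # the pod still starts if the ConfigMap is absent
        items:
          - key: ca-bundle.crt             # cluster-wide bundle injected by the CNO
            path: ca-bundle.crt
```

With this mount, the bundle is available inside the container at /etc/odh-trusted-ca/ca-bundle.crt.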

How the ConfigMap is managed

By default, the Open Data Hub Operator manages the odh-trusted-ca-bundle ConfigMap. If you want to manage or remove the odh-trusted-ca-bundle ConfigMap, or add a custom CA bundle (odh-ca-bundle.crt) separate from the cluster-wide CA bundle (ca-bundle.crt), you can use the trustedCABundle property in the Operator’s DSC Initialization (DSCI) object.

spec:
  trustedCABundle:
    managementState: Managed
    customCABundle: ""

In the Operator’s DSCI object, you can set the spec.trustedCABundle.managementState field to the following values:

  • Managed: The Open Data Hub Operator manages the odh-trusted-ca-bundle ConfigMap and adds it to all non-reserved existing and new namespaces (the ConfigMap is not added to any reserved or system namespaces, such as default, openshift-* or kube-*). The ConfigMap is automatically updated to reflect any changes made to the customCABundle field. This is the default value after installing Open Data Hub.

  • Removed: The Open Data Hub Operator removes the odh-trusted-ca-bundle ConfigMap (if present) and disables the creation of the ConfigMap in new namespaces. If you change this field from Managed to Removed, the odh-trusted-ca-bundle ConfigMap is also deleted from namespaces. This is the default value after upgrading Open Data Hub from 2.7 or earlier versions to 2.

  • Unmanaged: The Open Data Hub Operator does not manage the odh-trusted-ca-bundle ConfigMap, allowing for an administrator to manage it instead. Changing the managementState from Managed to Unmanaged does not remove the odh-trusted-ca-bundle ConfigMap, but the ConfigMap is not updated if you make changes to the customCABundle field.

In the Operator’s DSCI object, you can add a custom certificate to the spec.trustedCABundle.customCABundle field. This adds the odh-ca-bundle.crt file containing the certificates to the odh-trusted-ca-bundle ConfigMap, as shown in the following example:

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/part-of: opendatahub-operator
    config.openshift.io/inject-trusted-cabundle: 'true'
  name: odh-trusted-ca-bundle
data:
  ca-bundle.crt: |
    <BUNDLE OF CLUSTER-WIDE CERTIFICATES>
  odh-ca-bundle.crt: |
    <BUNDLE OF CUSTOM CERTIFICATES>

Adding a CA bundle

There are two ways to add a Certificate Authority (CA) bundle to Open Data Hub. You can use one or both of these methods:

  • For OpenShift Container Platform clusters that rely on self-signed certificates, you can add those self-signed certificates to a cluster-wide Certificate Authority (CA) bundle (ca-bundle.crt) and use the CA bundle in Open Data Hub. To use this method, log in to OpenShift Container Platform as a cluster administrator and follow the steps described in Configuring the cluster-wide proxy during installation.

  • You can use self-signed certificates in a custom CA bundle (odh-ca-bundle.crt) that is separate from the cluster-wide bundle. To use this method, follow the steps in this section.

Prerequisites
  • You have admin access to the DSCInitialization resources in the OpenShift Container Platform cluster.

  • You installed the OpenShift command line interface (oc) as described in Get Started with the CLI.

  • You are working in a new installation of Open Data Hub. If you upgraded Open Data Hub, see Adding a CA bundle after upgrading.

Procedure
  1. Log in to the OpenShift Container Platform.

  2. Click Operators → Installed Operators and then click the Open Data Hub Operator.

  3. Click the DSC Initialization tab.

  4. Click the default-dsci object.

  5. Click the YAML tab.

  6. In the spec section, add the custom certificate to the customCABundle field for trustedCABundle, as shown in the following example:

    spec:
      trustedCABundle:
        managementState: Managed
        customCABundle: |
          -----BEGIN CERTIFICATE-----
          examplebundle123
          -----END CERTIFICATE-----
  7. Click Save.

Verification
  • If you are using a cluster-wide CA bundle, run the following command to verify that all non-reserved namespaces contain the odh-trusted-ca-bundle ConfigMap:

    $ oc get configmaps --all-namespaces -l app.kubernetes.io/part-of=opendatahub-operator | grep odh-trusted-ca-bundle
  • If you are using a custom CA bundle, run the following command to verify that a non-reserved namespace contains the odh-trusted-ca-bundle ConfigMap and that the ConfigMap contains your customCABundle value. In the following command, example-namespace is the non-reserved namespace and examplebundle123 is the customCABundle value.

    $ oc get configmap odh-trusted-ca-bundle -n example-namespace -o yaml | grep examplebundle123

Removing a CA bundle

You can remove a Certificate Authority (CA) bundle from all non-reserved namespaces in Open Data Hub. This process changes the default configuration and disables the creation of the odh-trusted-ca-bundle configuration file (ConfigMap), as described in Understanding certificates in Open Data Hub.

Note
The odh-trusted-ca-bundle ConfigMaps are only deleted from namespaces when you set the managementState of trustedCABundle to Removed; deleting the DSC Initialization does not delete the ConfigMaps.

To remove a CA bundle from a single namespace only, see Removing a CA bundle from a namespace.

Prerequisites
  • You have cluster administrator privileges for your OpenShift Container Platform cluster.

  • You installed the OpenShift command line interface (oc) as described in Get Started with the CLI.

Procedure
  1. In the OpenShift Container Platform web console, click Operators → Installed Operators and then click the Open Data Hub Operator.

  2. Click the DSC Initialization tab.

  3. Click the default-dsci object.

  4. Click the YAML tab.

  5. In the spec section, change the value of the managementState field for trustedCABundle to Removed:

    spec:
      trustedCABundle:
        managementState: Removed
  6. Click Save.

Verification
  • Run the following command to verify that the odh-trusted-ca-bundle ConfigMap has been removed from all namespaces:

    $ oc get configmaps --all-namespaces | grep odh-trusted-ca-bundle

    The command should not return any ConfigMaps.

Removing a CA bundle from a namespace

You can remove a custom Certificate Authority (CA) bundle from individual namespaces in Open Data Hub. This process disables the creation of the odh-trusted-ca-bundle configuration file (ConfigMap) for the specified namespace only.

To remove a certificate bundle from all namespaces, see Removing a CA bundle.

Prerequisites
  • You have cluster administrator privileges for your OpenShift Container Platform cluster.

  • You installed the OpenShift command line interface (oc) as described in Get Started with the CLI.

Procedure
  • Run the following command to remove a CA bundle from a namespace. In the following command, example-namespace is the non-reserved namespace.

    $ oc annotate ns example-namespace security.opendatahub.io/inject-trusted-ca-bundle=false

Verification
  • Run the following command to verify that the CA bundle has been removed from the namespace. In the following command, example-namespace is the non-reserved namespace.

    $ oc get configmap odh-trusted-ca-bundle -n example-namespace

    The command should return configmaps "odh-trusted-ca-bundle" not found.
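
If you manage namespaces declaratively, the same opt-out can be recorded in the namespace manifest instead of being applied with oc annotate. The following is a minimal sketch that uses the annotation from the command above; example-namespace is a placeholder, and the annotation value is quoted because annotation values must be strings.

```yaml
# Sketch: declarative form of the oc annotate command in this procedure.
apiVersion: v1
kind: Namespace
metadata:
  name: example-namespace
  annotations:
    security.opendatahub.io/inject-trusted-ca-bundle: "false"
```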

Managing certificates

After installing Open Data Hub, the Open Data Hub Operator creates the odh-trusted-ca-bundle configuration file (ConfigMap) that contains the trusted CA bundle and adds it to all new and existing non-reserved namespaces in the cluster. By default, the Open Data Hub Operator manages the odh-trusted-ca-bundle ConfigMap and automatically updates it if any changes are made to the CA bundle. You can choose to manage the odh-trusted-ca-bundle ConfigMap instead of allowing the Open Data Hub Operator to manage it.

Prerequisites
  • You have cluster administrator privileges for your OpenShift Container Platform cluster.

Procedure
  1. In the OpenShift Container Platform web console, click Operators → Installed Operators and then click the Open Data Hub Operator.

  2. Click the DSC Initialization tab.

  3. Click the default-dsci object.

  4. Click the YAML tab.

  5. In the spec section, change the value of the managementState field for trustedCABundle to Unmanaged, as shown:

    spec:
      trustedCABundle:
        managementState: Unmanaged
  6. Click Save.

    Note that changing the managementState from Managed to Unmanaged does not remove the odh-trusted-ca-bundle ConfigMap, but the ConfigMap is not updated if you make changes to the customCABundle field.

Verification
  1. In the spec section, set or change the value of the customCABundle field for trustedCABundle, for example:

    spec:
      trustedCABundle:
        managementState: Unmanaged
        customCABundle: example123
  2. Click Save.

  3. Click Workloads → ConfigMaps.

  4. Select a project from the project list.

  5. Click the odh-trusted-ca-bundle ConfigMap.

  6. Click the YAML tab and verify that the odh-ca-bundle.crt entry in the ConfigMap does not contain the new customCABundle value.

Using self-signed certificates with Open Data Hub components

Some Open Data Hub components have additional options or required configuration for self-signed certificates.

Using certificates with data science pipelines

If you want to use self-signed certificates and you have added them to a central Certificate Authority (CA) bundle as described in Understanding certificates in Open Data Hub, no additional configuration is necessary to use those certificates with data science pipelines.

Providing a CA bundle only for data science pipelines

Perform the following steps to provide a Certificate Authority (CA) bundle just for data science pipelines.

Procedure
  1. Log in to OpenShift Container Platform.

  2. From Workloads → ConfigMaps, create a ConfigMap with the required bundle in the same data science project or namespace as the target data science pipeline:

    kind: ConfigMap
    apiVersion: v1
    metadata:
        name: custom-ca-bundle
    data:
        ca-bundle.crt: |
            <contents of ca-bundle.crt>
  3. Add the following snippet to the .spec.apiServer.cABundle field of the underlying Data Science Pipelines Application (DSPA):

    apiVersion: datasciencepipelinesapplications.opendatahub.io/v1alpha1
    kind: DataSciencePipelinesApplication
    metadata:
        name: data-science-pipelines-definition
    spec:
        ...
        apiServer:
            ...
            cABundle:
                configMapName: custom-ca-bundle
                configMapKey: ca-bundle.crt

The pipeline server pod redeploys with the updated bundle and uses it in the newly created pipeline pods.

Verification

Perform the following steps to confirm that your CA bundle was successfully mounted.

  1. Log in to the OpenShift Container Platform console.

  2. Go to the OpenShift Container Platform project that corresponds to the data science project.

  3. Click the Pods tab.

  4. Click the pipeline server pod with the ds-pipeline-pipelines-definition-<hash> prefix.

  5. Click Terminal.

  6. Enter cat /dsp-custom-certs/dsp-ca.crt.

  7. Verify that your CA bundle is present within this file.

You can also confirm that your CA bundle was successfully mounted by using the CLI:

  1. In a terminal window, log in to the OpenShift cluster where Open Data Hub is deployed.

    oc login
  2. Set the dspa value:

    dspa=pipelines-definition
  3. Set the dsProject value, replacing $YOUR_DS_PROJECT with the name of your data science project:

    dsProject=$YOUR_DS_PROJECT
  4. Set the pod value:

    pod=$(oc get pod -n ${dsProject} -l app=ds-pipeline-${dspa} --no-headers | awk '{print $1}')
  5. Display the contents of the /dsp-custom-certs/dsp-ca.crt file:

    oc -n ${dsProject} exec $pod -- cat /dsp-custom-certs/dsp-ca.crt
  6. Verify that your CA bundle is present within this file.

Using certificates with workbenches

Important

Self-signed certificates apply to workbenches that you create after configuring self-signed certificates centrally as described in Understanding certificates in Open Data Hub. There is no change to workbenches that you created before configuring self-signed certificates.

Creating data science pipelines with Elyra and self-signed certificates

To create pipelines by using a workbench that contains the Elyra extension and uses self-signed certificates, see the knowledgebase article Workbench workaround for executing a pipeline using Elyra in a disconnected environment.