
Important Notice

More information about previous v2 releases can be found here. You can use the "Find a release" search bar to search for a particular release.

Installing Open Data Hub

Installing Open Data Hub version 2

You can install Open Data Hub version 2 on OpenShift Container Platform from the OpenShift web console. For information about upgrading the Open Data Hub Operator, see Upgrading Open Data Hub.

Installing Open Data Hub involves the following tasks:

  1. Optional: Configuring custom namespaces.

  2. Installing the Open Data Hub Operator.

  3. Installing Open Data Hub components.

  4. Accessing the Open Data Hub dashboard.

Configuring custom namespaces

By default, Open Data Hub uses predefined namespaces, but you can define a custom namespace for the operator and DSCI.applicationNamespace as needed. Namespaces created by Open Data Hub typically include openshift or redhat in their name. Do not rename these system namespaces because they are required for Open Data Hub to function properly.

Prerequisites
  • You have access to an Open Data Hub cluster with cluster administrator privileges.

  • You have downloaded and installed the OpenShift command-line interface (CLI). See Installing the OpenShift CLI.

Procedure
  1. In a terminal window, if you are not already logged in to your OpenShift cluster as a cluster administrator, log in to the OpenShift CLI as shown in the following example:

    oc login <openshift_cluster_url> -u <admin_username> -p <password>
  2. Enter the following command to create the custom namespace:

    oc create namespace <custom_namespace>
  3. If you are creating a namespace for a DSCI.applicationNamespace, enter the following command to add the correct label:

    oc label namespace <application_namespace> opendatahub.io/application-namespace=true
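
Verification
  • Optional: Run the following commands to confirm that the namespace exists and, for an application namespace, that the opendatahub.io/application-namespace label was applied. These commands are a quick check only; use the namespace names from the previous steps:

    oc get namespace <custom_namespace>
    oc get namespace <application_namespace> --show-labels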

Installing the Open Data Hub Operator version 2

Prerequisites
  • You are using OpenShift Container Platform 4.14 or later.

  • Your OpenShift cluster has a minimum of 16 CPUs and 32 GB of memory across all OpenShift worker nodes.

  • You have cluster administrator privileges for your OpenShift Container Platform cluster.

  • If you are using custom namespaces, you have created and labeled them as required.

  • If you are installing Open Data Hub 2.10.0 or later with data science pipelines, ensure your cluster does not have a separate installation of Argo Workflows that was not installed by Open Data Hub.

    Important

    Data science pipelines 2.0 includes an installation of Argo Workflows. Red Hat does not support direct customer usage of this installation of Argo Workflows.

    If there is an existing installation of Argo Workflows that is not installed by data science pipelines on your cluster, data science pipelines will be disabled after you install Open Data Hub.

    To enable data science pipelines, remove the separate installation of Argo Workflows from your cluster. Data science pipelines will be enabled automatically.

    Argo Workflows resources that are created by Open Data Hub have the following labels in the OpenShift Console under Administration > CustomResourceDefinitions, in the argoproj.io group:

     labels:
        app.kubernetes.io/part-of: data-science-pipelines-operator
        app.opendatahub.io/data-science-pipelines-operator: 'true'
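
    One way to check for a separate Argo Workflows installation is to list the Argo CRDs on your cluster and inspect their labels. This is a suggested check, not a required step; if the Workflow CRD (workflows.argoproj.io) exists but does not carry the labels shown above, it was not created by Open Data Hub:

      oc get crds | grep argoproj.io
      oc get crd workflows.argoproj.io -o jsonpath='{.metadata.labels}'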
Procedure
  1. Log in to your OpenShift Container Platform cluster as a user with cluster-admin privileges. If you are performing a developer installation on try.openshift.com, you can log in as the kubeadmin user.

  2. Select Operators → OperatorHub.

  3. On the OperatorHub page, in the Filter by keyword field, enter Open Data Hub Operator.

  4. Click the Open Data Hub Operator tile.

  5. If the Show community Operator window opens, read the information and then click Continue.

  6. Read the information about the Operator and then click Install.

  7. On the Install Operator page, follow these steps:

    1. For Update channel, select fast.

      Note

      Version 2 of the Open Data Hub Operator is an alpha release and is available only on the fast channel. Later releases will move to the rolling channel as the Operator becomes more stable.

    2. For Version, select the version of the Operator that you want to install.

    3. For Installation mode, leave All namespaces on the cluster (default) selected.

    4. For Installed Namespace, select the openshift-operators namespace.

    5. For Update approval, select automatic or manual updates.

      • Automatic: When a new version of the Operator is available, Operator Lifecycle Manager (OLM) automatically upgrades the running instance of your Operator.

      • Manual: When a new version of the Operator is available, OLM notifies you with an update request that you must manually approve to upgrade the running instance of your Operator.

  8. Click Install. The installation might take a few minutes.

Verification
  • Select Operators → Installed Operators to verify that the Open Data Hub Operator is listed with Succeeded status.

Next Step
  • Install Open Data Hub components.

Installing Open Data Hub components

You can use the OpenShift web console to install specific components of Open Data Hub on your cluster when version 2 of the Open Data Hub Operator is already installed on the cluster.

Prerequisites
  • You have installed version 2 of the Open Data Hub Operator.

  • You can log in as a user with cluster-admin privileges.

  • If you want to use the trustyai component, you must enable user workload monitoring as described in Configuring monitoring for the multi-model serving platform.

  • If you want to use the kserve, modelmesh, or modelregistry components, you must have already installed the required Operator or Operators for that component, as shown in Table 1. For information about installing an Operator, see Adding Operators to a cluster.

  • If you want to use kserve, you have selected a deployment mode. For more information, see About KServe deployment modes.

Table 1. Required Operators for components

Component       Required Operators                                Catalog

kserve          Red Hat OpenShift Serverless Operator,            Red Hat
                Red Hat OpenShift Service Mesh Operator,
                Red Hat Authorino Operator

modelmesh       Prometheus Operator                               Community

modelregistry   Red Hat Authorino Operator,                       Red Hat
                Red Hat OpenShift Serverless Operator,
                Red Hat OpenShift Service Mesh Operator

NOTE: To use the model registry feature, you must install the required Operators in a specific order. For more information, see Configuring the model registry component.

Procedure
  1. Log in to your OpenShift Container Platform cluster as a user with cluster-admin privileges. If you are performing a developer installation on try.openshift.com, you can log in as the kubeadmin user.

  2. Select Operators → Installed Operators, and then click the Open Data Hub Operator.

  3. On the Operator details page, click the DSC Initialization tab, and then click Create DSCInitialization.

  4. On the Create DSCInitialization page, configure the DSCInitialization by using Form view or YAML view. For general information about the supported components, see Tiered Components.

    • Configure by using Form view:

      1. In the Name field, enter a value.

      2. In the Components section, expand each component and set the managementState to Managed or Removed.

    • Configure by using YAML view:

      1. In the spec.components section, for each component shown, set the value of the managementState field to either Managed or Removed.

  5. Click Create.

  6. Wait until the status of the DSCInitialization is Ready.

  7. Click the Data Science Cluster tab, and then click Create DataScienceCluster.

  8. On the Create DataScienceCluster page, configure the DataScienceCluster by using Form view or YAML view. For general information about the supported components, see Tiered Components.

    • Configure by using Form view:

      1. In the Name field, enter a value.

      2. In the Components section, expand each component and set the managementState to Managed or Removed.

    • Configure by using YAML view:

      1. In the spec.components section, for each component shown, set the value of the managementState field to either Managed or Removed.

  9. Click Create.
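
For reference, the following sketch shows roughly what a DataScienceCluster configured in YAML view might look like, with a few components enabled and one removed. The component names follow the tables in this document; the exact set of components and their defaults can vary between Open Data Hub releases, so treat this as an illustration rather than a complete specification:

apiVersion: datasciencecluster.opendatahub.io/v1
kind: DataScienceCluster
metadata:
  name: default-dsc
spec:
  components:
    dashboard:
      managementState: Managed
    workbenches:
      managementState: Managed
    datasciencepipelines:
      managementState: Managed
    kserve:
      managementState: Removed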

Verification
  1. Select Home → Projects, and then select the opendatahub project.

  2. On the Project details page, click the Workloads tab and confirm that the Open Data Hub core components are running. For more information, see Tiered Components.

Note: In the Open Data Hub dashboard, users can view the list of the installed Open Data Hub components, their corresponding source (upstream) components, and the versions of the installed components, as described in Viewing installed Open Data Hub components.

Installing Open Data Hub version 1

You can install Open Data Hub version 1 on OpenShift Container Platform from the OpenShift web console. For information about installing Open Data Hub version 2, see Installing Open Data Hub version 2.

Installing Open Data Hub involves the following tasks:

  1. Installing the Open Data Hub Operator.

  2. Creating a new project for your Open Data Hub instance.

  3. Adding an Open Data Hub instance.

  4. Accessing the Open Data Hub dashboard.

Installing the Open Data Hub Operator version 1

Prerequisites
  • You are using OpenShift Container Platform 4.14 or later.

  • Your OpenShift cluster has a minimum of 16 CPUs and 32 GB of memory across all OpenShift worker nodes.

  • You can log in as a user with cluster-admin privileges.

Procedure
  1. Log in to your OpenShift Container Platform cluster as a user with cluster-admin privileges. If you are performing a developer installation on try.openshift.com, you can log in as the kubeadmin user.

  2. Select Operators → OperatorHub.

  3. On the OperatorHub page, in the Filter by keyword field, enter Open Data Hub Operator.

  4. Click the Open Data Hub Operator tile.

  5. If the Show community Operator window opens, read the information and then click Continue.

  6. Read the information about the Operator and then click Install.

  7. On the Install Operator page, follow these steps:

    1. For Update channel, select rolling.

    2. For Version, select the version of the Operator that you want to install.

    3. For Installation mode, leave All namespaces on the cluster (default) selected.

    4. For Installed Namespace, select the openshift-operators namespace.

    5. For Update approval, select automatic or manual updates.

      • Automatic: When a new version of the Operator is available, Operator Lifecycle Manager (OLM) automatically upgrades the running instance of your Operator.

      • Manual: When a new version of the Operator is available, OLM notifies you with an update request that you must manually approve to upgrade the running instance of your Operator.

  8. Click Install. The installation might take a few minutes.

Verification
  • Select Operators → Installed Operators to verify that the Open Data Hub Operator is listed with Succeeded status.

Next Step
  • Create a new project for your instance of Open Data Hub.

Creating a new project for your Open Data Hub instance

Create a new project for your Open Data Hub instance so that you can organize and manage your data science work in one place.

Prerequisites
  • You have installed the Open Data Hub Operator.

Procedure
  1. In the OpenShift web console, select Home → Projects.

  2. On the Projects page, click Create Project.

  3. In the Create Project box, follow these steps:

    1. For Name, enter odh.

    2. For Display Name, enter Open Data Hub.

    3. For Description, enter a description.

  4. Click Create.

Verification
  • Select Home → Projects to verify that the odh project is listed with Active status.

Next Step
  • Add an Open Data Hub instance to your project.

Adding an Open Data Hub instance

By adding an Open Data Hub instance to your project, you can access the URL for your Open Data Hub dashboard and share it with data science users.

Prerequisites
  • You have installed the Open Data Hub Operator.

  • You have created a new project for your instance of Open Data Hub.

Procedure
  1. In the OpenShift web console, select Operators → Installed Operators.

  2. On the Installed Operators page, click the Project list and select the odh project. The page filters to only display installed operators in the odh project.

  3. Find and click the Open Data Hub Operator to display the details for the currently installed version.

  4. On the KfDef tile, click Create instance. A KfDef object is a specification that controls the provisioning and management of a Kubeflow deployment. A default KfDef object is created when you install the Open Data Hub Operator. This default configuration deploys the required Open Data Hub core components. For more information, see Tiered Components.

  5. On the Create KfDef page, leave opendatahub as the name. Click Create to create an Open Data Hub kfdef object named opendatahub and begin the deployment of the components.

Verification
  1. Select Operators → Installed Operators.

  2. On the Installed Operators page, click the Project list and select the odh project.

  3. Find and click Open Data Hub Operator.

  4. Click the Kf Def tab and confirm that opendatahub appears.

  5. Select Home → Projects.

  6. On the Projects page, find and select the odh project.

  7. On the Project details page, click the Workloads tab and confirm that the Open Data Hub core components are running. For a description of the components, see Tiered Components.

Next Step
  • Access the Open Data Hub dashboard.

Accessing the Open Data Hub dashboard

You can access and share the URL for your Open Data Hub dashboard with other users to let them log in and work on their models.

Prerequisites
  • You have installed the Open Data Hub Operator.

Procedure
  1. In the OpenShift web console, select Networking → Routes.

  2. On the Routes page, click the Project list and select the odh project. The page filters to only display routes in the odh project.

  3. In the Location column, copy the URL for the odh-dashboard route.

  4. Give this URL to your users to let them log in to the Open Data Hub dashboard.
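
If you prefer the command line, you can also retrieve the dashboard hostname with a command similar to the following, assuming the default odh project and the odh-dashboard route name used in this procedure. Prepend https:// to the returned hostname to form the URL:

    oc get route odh-dashboard -n odh -o jsonpath='{.spec.host}'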

Verification
  • Confirm that you and your users can log in to the Open Data Hub dashboard by using the URL.

Note: In the Open Data Hub dashboard, users can view the list of the installed Open Data Hub components, their corresponding source (upstream) components, and the versions of the installed components, as described in Viewing installed Open Data Hub components.

Installing the distributed workloads components

To use the distributed workloads feature in Open Data Hub, you must install several components.

Prerequisites
  • You have logged in to OpenShift Container Platform with the cluster-admin role and you can access the data science cluster.

  • You have installed Open Data Hub.

  • You have sufficient resources. In addition to the minimum Open Data Hub resources described in Installing the Open Data Hub Operator version 2, you need 1.6 vCPU and 2 GiB memory to deploy the distributed workloads infrastructure.

  • You have removed any previously installed instances of the CodeFlare Operator.

  • If you want to use graphics processing units (GPUs), you have enabled GPU support. This process includes installing the Node Feature Discovery Operator and the relevant GPU Operator. For more information, see NVIDIA GPU Operator on Red Hat OpenShift Container Platform in the NVIDIA documentation for NVIDIA GPUs and AMD GPU Operator on Red Hat OpenShift Container Platform in the AMD documentation for AMD GPUs.

  • If you want to use self-signed certificates, you have added them to a central Certificate Authority (CA) bundle as described in Understanding how Open Data Hub handles certificates. No additional configuration is necessary to use those certificates with distributed workloads. The centrally configured self-signed certificates are automatically available in the workload pods at the following mount points:

    • Cluster-wide CA bundle:

      /etc/pki/tls/certs/odh-trusted-ca-bundle.crt
      /etc/ssl/certs/odh-trusted-ca-bundle.crt
    • Custom CA bundle:

      /etc/pki/tls/certs/odh-ca-bundle.crt
      /etc/ssl/certs/odh-ca-bundle.crt
Procedure
  1. In the OpenShift Container Platform console, click Operators → Installed Operators.

  2. Search for the Open Data Hub Operator, and then click the Operator name to open the Operator details page.

  3. Click the Data Science Cluster tab.

  4. Click the default instance name (for example, default-dsc) to open the instance details page.

  5. Click the YAML tab to show the instance specifications.

  6. Enable the required distributed workloads components. In the spec.components section, set the managementState field correctly for the required components:

    • If you want to use the CodeFlare framework to tune models, enable the codeflare, kueue, and ray components.

    • If you want to use the Kubeflow Training Operator to tune models, enable the kueue and trainingoperator components.

    • The list of required components depends on whether the distributed workload is run from a pipeline, a notebook, or both, as shown in the following table.

    Table 2. Components required for distributed workloads

    Component              Pipelines only   Notebooks only   Pipelines and notebooks
    codeflare              Managed          Managed          Managed
    dashboard              Managed          Managed          Managed
    datasciencepipelines   Managed          Removed          Managed
    kueue                  Managed          Managed          Managed
    ray                    Managed          Managed          Managed
    trainingoperator       Managed          Managed          Managed
    workbenches            Removed          Managed          Managed

  7. Click Save. After a short time, the components with a Managed state are ready.
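
For reference, the following sketch shows how the spec.components section from step 6 might look when you run distributed workloads from both pipelines and notebooks. Only the components relevant to distributed workloads are shown; leave your other components unchanged:

spec:
  components:
    codeflare:
      managementState: Managed
    dashboard:
      managementState: Managed
    datasciencepipelines:
      managementState: Managed
    kueue:
      managementState: Managed
    ray:
      managementState: Managed
    trainingoperator:
      managementState: Managed
    workbenches:
      managementState: Managed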

Verification

Check the status of the codeflare-operator-manager, kubeflow-training-operator, kuberay-operator, and kueue-controller-manager pods, as follows:

  1. In the OpenShift Container Platform console, from the Project list, select odh.

  2. Click Workloads → Deployments.

  3. Search for the codeflare-operator-manager, kubeflow-training-operator, kuberay-operator, and kueue-controller-manager deployments. In each case, check the status as follows:

    1. Click the deployment name to open the deployment details page.

    2. Click the Pods tab.

    3. Check the pod status.

      When the status of the codeflare-operator-manager-<pod-id>, kubeflow-training-operator-<pod-id>, kuberay-operator-<pod-id>, and kueue-controller-manager-<pod-id> pods is Running, the pods are ready to use.

    4. To see more information about each pod, click the pod name to open the pod details page, and then click the Logs tab.
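
Alternatively, you can run a quick check of the same deployments from the CLI, assuming the default odh application namespace:

    oc get deployments -n odh | grep -E 'codeflare-operator-manager|kubeflow-training-operator|kuberay-operator|kueue-controller-manager'
    oc get pods -n odh | grep -E 'codeflare-operator-manager|kubeflow-training-operator|kuberay-operator|kueue-controller-manager'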

Next Step

Configure the distributed workloads feature as described in Managing distributed workloads.

Accessing the Open Data Hub dashboard

You can access and share the URL for your Open Data Hub dashboard with other users to let them log in and work on their models.

Prerequisites
  • You have installed the Open Data Hub Operator.

Procedure
  1. In the OpenShift web console, select Networking → Routes.

  2. On the Routes page, click the Project list and select the odh project. The page filters to only display routes in the odh project.

  3. In the Location column, copy the URL for the odh-dashboard route.

  4. Give this URL to your users to let them log in to the Open Data Hub dashboard.

Verification
  • Confirm that you and your users can log in to the Open Data Hub dashboard by using the URL.

Note: In the Open Data Hub dashboard, users can view the list of the installed Open Data Hub components, their corresponding source (upstream) components, and the versions of the installed components, as described in Viewing installed Open Data Hub components.

Working with certificates

When you install Open Data Hub, OpenShift automatically applies a default Certificate Authority (CA) bundle to manage authentication for most Open Data Hub components, such as workbenches and model servers. These certificates are trusted self-signed certificates that help secure communication. However, as a cluster administrator, you might need to configure additional self-signed certificates to use some components, such as the data science pipeline server and object storage solutions. If an Open Data Hub component uses a self-signed certificate that is not part of the existing cluster-wide CA bundle, you have the following options for including the certificate:

  • Add it to the OpenShift cluster-wide CA bundle.

  • Add it to a custom CA bundle, separate from the cluster-wide CA bundle.

As a cluster administrator, you can also change how to manage authentication for Open Data Hub as follows:

  • Manually manage certificate changes, instead of relying on the Open Data Hub Operator to handle them automatically.

  • Remove the cluster-wide CA bundle, either from all namespaces or specific ones. If you prefer to implement a different authentication approach, other than using CA bundles, you can override the default Open Data Hub behavior, as described in Removing the CA bundle from all namespaces and Removing the CA bundle from a single namespace.

Understanding how Open Data Hub handles certificates

After installing Open Data Hub, the Open Data Hub Operator automatically creates an empty odh-trusted-ca-bundle configuration file (ConfigMap). The Cluster Network Operator (CNO) injects the cluster-wide CA bundle into the odh-trusted-ca-bundle ConfigMap, which carries the config.openshift.io/inject-trusted-cabundle label.

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/part-of: opendatahub-operator
    config.openshift.io/inject-trusted-cabundle: 'true'
  name: odh-trusted-ca-bundle

After the CNO injects the bundle, it updates the ConfigMap with the contents of the ca-bundle.crt file:

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/part-of: opendatahub-operator
    config.openshift.io/inject-trusted-cabundle: 'true'
  name: odh-trusted-ca-bundle
data:
  ca-bundle.crt: |
    <BUNDLE OF CLUSTER-WIDE CERTIFICATES>

The management of CA bundles is configured through the DSCInitialization (DSCI) object. Within this object, you can set the spec.trustedCABundle.managementState field to one of the following values:

  • Managed: (Default) The Open Data Hub Operator manages the odh-trusted-ca-bundle ConfigMap and adds it to all non-reserved existing and new namespaces. It does not add the ConfigMap to any reserved or system namespaces, such as default, openshift-*, or kube-*. The Open Data Hub Operator automatically updates the ConfigMap to reflect any changes made to the customCABundle field.

  • Unmanaged: The Open Data Hub administrator manually manages the odh-trusted-ca-bundle ConfigMap, instead of allowing the Operator to manage it. Changing the managementState from Managed to Unmanaged does not remove the odh-trusted-ca-bundle ConfigMap. However, the ConfigMap is no longer automatically updated if changes are made to the customCABundle field.

    The Unmanaged setting is useful if your organization implements a different method for managing trusted CA bundles, such as Ansible automation, and does not want the Open Data Hub Operator to handle certificates automatically. This setting provides greater control, preventing the Operator from overwriting custom configurations.

Adding certificates

If you must use a self-signed certificate that is not part of the existing cluster-wide CA bundle, you have two options for configuring the certificate:

  • Add it to the cluster-wide CA bundle.

    This option is useful when the certificate is needed for secure communication across multiple services or when it’s required by security policies to be trusted cluster-wide. This option ensures that all services and components in the cluster trust the certificate automatically. It simplifies management because the certificate is trusted across the entire cluster, avoiding the need to configure the certificate separately for each service.

  • Add it to a custom CA bundle that is separate from the OpenShift cluster-wide bundle.

    Consider this option for the following scenarios:

    • Limit scope: Only specific services need the certificate, not the whole cluster.

    • Isolation: Keeps custom certificates separate, preventing changes to the global configuration.

    • Avoid global impact: Does not affect services that do not need the certificate.

    • Easier management: Makes it simpler to manage certificates for specific services.

Adding certificates to a cluster-wide CA bundle

You can add a self-signed certificate to a cluster-wide Certificate Authority (CA) bundle (ca-bundle.crt).

When the cluster-wide CA bundle is updated, the Cluster Network Operator (CNO) automatically detects the change and injects the updated bundle into the odh-trusted-ca-bundle ConfigMap, making the certificate available to Open Data Hub components.

Note: By default, the management state for the Trusted CA bundle is Managed (that is, the spec.trustedCABundle.managementState field in the Open Data Hub Operator’s DSCI object is set to Managed). If you change this setting to Unmanaged, you must manually update the odh-trusted-ca-bundle ConfigMap to include the updated cluster-wide CA bundle.

Alternatively, you can add certificates to a custom CA bundle, as described in Adding certificates to a custom CA bundle.

Prerequisites
  • You have created a self-signed certificate and saved the certificate to a file. For example, you have created a certificate using OpenSSL and saved it to a file named example-ca.crt.

  • You have cluster administrator access for the OpenShift cluster where Open Data Hub is installed.

  • You have installed the OpenShift command-line interface (CLI). See Installing the OpenShift CLI.
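
If you need a certificate for testing, the following OpenSSL command shows one way to create a self-signed CA certificate. The file name example-ca.crt matches the prerequisite above; the subject is an arbitrary example value:

    openssl req -x509 -newkey rsa:4096 -sha256 -days 365 -nodes \
      -keyout example-ca.key -out example-ca.crt \
      -subj '/CN=example-ca'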

Procedure
  1. Create a ConfigMap that includes the root CA certificate used to sign the certificate, where </path/to/example-ca.crt> is the path to the CA certificate bundle on your local file system:

    oc create configmap custom-ca \
      --from-file=ca-bundle.crt=</path/to/example-ca.crt> \
      -n openshift-config
  2. Update the cluster-wide proxy configuration with the newly-created ConfigMap:

    oc patch proxy/cluster \
      --type=merge \
      --patch='{"spec":{"trustedCA":{"name":"custom-ca"}}}'
Verification

Run the following command to verify that all non-reserved namespaces contain the odh-trusted-ca-bundle ConfigMap:

oc get configmaps --all-namespaces -l app.kubernetes.io/part-of=opendatahub-operator | grep odh-trusted-ca-bundle

Adding certificates to a custom CA bundle

You can add self-signed certificates to a custom CA bundle that is separate from the OpenShift cluster-wide bundle.

This method is ideal for scenarios where components need access to external resources that require a self-signed certificate. For example, you might need to add self-signed certificates to grant data science pipelines access to S3-compatible object storage.

Prerequisites
  • You have created a self-signed certificate and saved the certificate to a file. For example, you have created a certificate using OpenSSL and saved it to a file named example-ca.crt.

  • You have cluster administrator access for the OpenShift cluster where Open Data Hub is installed.

  • You have installed the OpenShift command-line interface (CLI). See Installing the OpenShift CLI.

Procedure
  1. Log in to OpenShift Container Platform.

  2. Click Operators → Installed Operators and then click the Open Data Hub Operator.

  3. Click the DSC Initialization tab.

  4. Click the default-dsci object.

  5. Click the YAML tab.

  6. In the spec.trustedCABundle section, add the custom certificate to the customCABundle field, as shown in the following example:

    spec:
      trustedCABundle:
        managementState: Managed
        customCABundle: |
          -----BEGIN CERTIFICATE-----
          examplebundle123
          -----END CERTIFICATE-----
  7. Click Save.

The Open Data Hub Operator automatically updates the ConfigMap to reflect any changes made to the customCABundle field. It adds the odh-ca-bundle.crt file containing the certificates to the odh-trusted-ca-bundle ConfigMap, as shown in the following example:

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/part-of: opendatahub-operator
    config.openshift.io/inject-trusted-cabundle: 'true'
  name: odh-trusted-ca-bundle
data:
  ca-bundle.crt: |
    <BUNDLE OF CLUSTER-WIDE CERTIFICATES>
  odh-ca-bundle.crt: |
    <BUNDLE OF CUSTOM CERTIFICATES>
Verification

Run the following command to verify that a non-reserved namespace contains the odh-trusted-ca-bundle ConfigMap and that the ConfigMap contains your customCABundle value. In the following command, example-namespace is the non-reserved namespace and examplebundle123 is the customCABundle value.

oc get configmap odh-trusted-ca-bundle -n example-namespace -o yaml | grep examplebundle123

Using self-signed certificates with Open Data Hub components

Some Open Data Hub components have additional options or required configuration for self-signed certificates.

Accessing S3-compatible object storage with self-signed certificates

To securely connect Open Data Hub components to object storage solutions or databases that are deployed within an OpenShift cluster that uses self-signed certificates, you must provide a certificate authority (CA) certificate. Each namespace includes a ConfigMap named kube-root-ca.crt, which contains the CA certificate of the internal API Server.

Prerequisites
  • You have cluster administrator privileges for your OpenShift Container Platform cluster.

  • You have installed the OpenShift command-line interface (CLI). See Installing the OpenShift CLI.

  • You have deployed an object storage solution or database in your OpenShift cluster.

Procedure
  1. In a terminal window, log in to the OpenShift CLI as shown in the following example:

    oc login api.<cluster_name>.<cluster_domain>:6443 --web
  2. Retrieve the current Open Data Hub trusted CA configuration and store it in a new file:

    oc get dscinitializations.dscinitialization.opendatahub.io default-dsci -o json | jq -r '.spec.trustedCABundle.customCABundle' > /tmp/my-custom-ca-bundles.crt
  3. Add the cluster’s kube-root-ca.crt ConfigMap to the Open Data Hub trusted CA configuration:

    oc get configmap kube-root-ca.crt -o jsonpath="{['data']['ca\.crt']}" >> /tmp/my-custom-ca-bundles.crt
  4. Update the Open Data Hub trusted CA configuration to trust certificates issued by the certificate authorities in kube-root-ca.crt:

    oc patch dscinitialization default-dsci --type='json' -p='[{"op":"replace","path":"/spec/trustedCABundle/customCABundle","value":"'"$(awk '{printf "%s\\n", $0}' /tmp/my-custom-ca-bundles.crt)"'"}]'
Verification
  • You can successfully deploy components that are configured to use object storage solutions or databases that are deployed in the OpenShift cluster. For example, a pipeline server that is configured to use a database deployed in the cluster starts successfully.

Configuring a certificate for data science pipelines

By default, Open Data Hub includes OpenShift cluster-wide certificates in the odh-trusted-ca-bundle ConfigMap. These cluster-wide certificates cover most components, such as workbenches and model servers. However, the pipeline server might require additional Certificate Authority (CA) configuration, especially when interacting with external systems that use self-signed or custom certificates.

To add the certificate for data science pipelines, create a ConfigMap that contains the certificate bundle and reference it from the pipeline server configuration, as described in the following procedure.

Prerequisites
  • You have cluster administrator access for the OpenShift cluster where Open Data Hub is installed.

  • You have created a self-signed certificate and saved the certificate to a file. For example, you have created a certificate using OpenSSL and saved it to a file named example-ca.crt.

  • You have configured a data science pipeline server.

Procedure
  1. Log in to the OpenShift console.

  2. From Workloads → ConfigMaps, create a ConfigMap with the required bundle in the same data science project as the target data science pipeline:

    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: custom-ca-bundle
    data:
      ca-bundle.crt: |
        # contents of ca-bundle.crt
  3. Add the following snippet to the .spec.apiServer.cABundle field of the underlying Data Science Pipelines Application (DSPA):

    apiVersion: datasciencepipelinesapplications.opendatahub.io/v1
    kind: DataSciencePipelinesApplication
    metadata:
      name: data-science-dspa
    spec:
      ...
      apiServer:
        ...
        cABundle:
          configMapName: custom-ca-bundle
          configMapKey: ca-bundle.crt
  4. Save your changes. The pipeline server pod automatically redeploys with the updated bundle.

Verification

Confirm that your CA bundle was successfully mounted:

  1. Log in to the OpenShift console.

  2. Go to the data science project that has the target data science pipeline.

  3. Click the Pods tab.

  4. Click the pipeline server pod with the ds-pipeline-dspa-<hash> prefix.

  5. Click Terminal.

  6. Enter cat /dsp-custom-certs/dsp-ca.crt.

  7. Verify that your CA bundle is present within this file.

Configuring a certificate for workbenches

Important

By default, self-signed certificates apply to workbenches that you create after configuring cluster-wide certificates. To apply cluster-wide certificates to an existing workbench, stop and then restart the workbench.

Self-signed certificates are stored in /etc/pki/tls/custom-certs/ca-bundle.crt. Workbenches preset an environment variable that points many popular HTTP client packages to these certificates. For packages that are not covered by the preset variable by default, you can provide this certificate path explicitly. For example, for the kfp package to connect to the data science pipeline server:

from kfp.client import Client

# Path to the service account token mounted in the workbench pod
# (the default Kubernetes location is shown here as an example)
sa_token_file_path = '/var/run/secrets/kubernetes.io/serviceaccount/token'

with open(sa_token_file_path, 'r') as token_file:
    bearer_token = token_file.read()

# Point the kfp client at the workbench certificate bundle
client = Client(
    host='https://<GO_TO_ROUTER_OF_DS_PROJECT>/',
    existing_token=bearer_token,
    ssl_ca_cert='/etc/pki/tls/custom-certs/ca-bundle.crt'
)
print(client.list_experiments())

Using the cluster-wide CA bundle for the single-model serving platform

By default, the single-model serving platform in Open Data Hub uses a self-signed certificate generated at installation for the endpoints that are created when deploying a server.

If you have configured cluster-wide certificates on your OpenShift cluster, they are used by default for other types of endpoints, such as endpoints for routes.

The following procedure explains how to use the same certificate that you already have for your OpenShift cluster.

Prerequisites
  • You have cluster administrator access for the OpenShift cluster where Open Data Hub is installed.

  • You have configured cluster-wide certificates in OpenShift.

Procedure
  1. Log in to the OpenShift console.

  2. From the list of projects, open the openshift-ingress project.

  3. Click YAML.

  4. Search for "cert" to find a secret with a name that includes "cert". For example, rhods-internal-primary-cert-bundle-secret. The contents of the secret should contain two items that are used for all OpenShift Routes: tls.crt (the certificate) and tls.key (the key).

  5. Copy the reference to the secret.

  6. From the list of projects, open the istio-system project.

  7. Create a YAML file and paste the reference to the secret that you copied from the openshift-ingress YAML file.

  8. Edit the YAML code to keep only the relevant content, as shown in the following example. Replace rhods-internal-primary-cert-bundle-secret with the name of your secret:

    kind: Secret
    apiVersion: v1
    metadata:
      name: rhods-internal-primary-cert-bundle-secret
    data:
      tls.crt: >-
        LS0tLS1CRUd...
      tls.key: >-
        LS0tLS1CRUd...
    type: kubernetes.io/tls
  9. Save the YAML file in the istio-system project.

  10. Navigate to Operators → Installed Operators → Open Data Hub.

  11. Click the Data Science Cluster tab, click default-dsc, and then click the YAML tab.

  12. Edit the kserve configuration section to refer to your secret as shown in the following example. Replace rhods-internal-primary-cert-bundle-secret with the name of the secret that you created in Step 8.

    kserve:
      devFlags: {}
      managementState: Managed
      serving:
        ingressGateway:
          certificate:
            secretName: rhods-internal-primary-cert-bundle-secret
            type: Provided
        managementState: Managed
        name: knative-serving

Managing certificates without the Open Data Hub Operator

By default, the Open Data Hub Operator manages the odh-trusted-ca-bundle ConfigMap, which contains the trusted CA bundle and is applied to all non-reserved namespaces in the cluster. The Operator automatically updates this ConfigMap whenever changes are made to the CA bundle.

If your organization prefers to manage trusted CA bundles independently, for example, by using Ansible automation, you can disable this default behavior to prevent automatic updates by the Open Data Hub Operator.

Prerequisites
  • You have cluster administrator privileges for your OpenShift Container Platform cluster.

Procedure
  1. In the OpenShift Container Platform web console, click Operators → Installed Operators and then click the Open Data Hub Operator.

  2. Click the DSC Initialization tab.

  3. Click the default-dsci object.

  4. Click the YAML tab.

  5. In the spec section, change the value of the managementState field for trustedCABundle to Unmanaged, as shown:

    spec:
      trustedCABundle:
        managementState: Unmanaged
  6. Click Save.

    Changing the managementState from Managed to Unmanaged prevents automatic updates when the customCABundle field is modified, but does not remove the odh-trusted-ca-bundle ConfigMap.

Verification
  1. In the spec section, set the value of the customCABundle field for trustedCABundle, for example:

    spec:
      trustedCABundle:
        managementState: Unmanaged
        customCABundle: example123
  2. Click Save.

  3. Click Workloads → ConfigMaps.

  4. Select a project from the project list.

  5. Click the odh-trusted-ca-bundle ConfigMap.

  6. Click the YAML tab and verify that the value of the customCABundle field did not update.

Removing the CA bundle

If you prefer to implement a different authentication approach for your Open Data Hub installation, you can override the default behavior by removing the CA bundle.

You have two options for removing the CA bundle:

  • Remove the CA bundle from all non-reserved projects in Open Data Hub.

  • Remove the CA bundle from a specific project.

Removing the CA bundle from all namespaces

You can remove a Certificate Authority (CA) bundle from all non-reserved namespaces in Open Data Hub. This process changes the default configuration and disables the creation of the odh-trusted-ca-bundle configuration file (ConfigMap), as described in Understanding how Open Data Hub handles certificates.

Note
The odh-trusted-ca-bundle ConfigMaps are only deleted from namespaces when you set the managementState of trustedCABundle to Removed; deleting the DSC Initialization does not delete the ConfigMaps.

To remove a CA bundle from a single namespace only, see Removing the CA bundle from a single namespace.

Prerequisites
  • You have cluster administrator privileges for your OpenShift Container Platform cluster.

  • You installed the OpenShift command line interface (oc) as described in Installing the OpenShift CLI.

Procedure
  1. In the OpenShift Container Platform web console, click Operators → Installed Operators and then click the Open Data Hub Operator.

  2. Click the DSC Initialization tab.

  3. Click the default-dsci object.

  4. Click the YAML tab.

  5. In the spec section, change the value of the managementState field for trustedCABundle to Removed:

    spec:
      trustedCABundle:
        managementState: Removed
  6. Click Save.

Verification
  • Run the following command to verify that the odh-trusted-ca-bundle ConfigMap has been removed from all namespaces:

    oc get configmaps --all-namespaces | grep odh-trusted-ca-bundle

    The command should not return any ConfigMaps.

Removing the CA bundle from a single namespace

You can remove a custom Certificate Authority (CA) bundle from individual namespaces in Open Data Hub. This process disables the creation of the odh-trusted-ca-bundle configuration file (ConfigMap) for the specified namespace only.

To remove a certificate bundle from all namespaces, see Removing the CA bundle from all namespaces.

Prerequisites
  • You have cluster administrator privileges for your OpenShift Container Platform cluster.

  • You installed the OpenShift command line interface (oc) as described in Installing the OpenShift CLI.

Procedure
  • Run the following command to remove a CA bundle from a namespace. In the following command, example-namespace is the non-reserved namespace.

    oc annotate ns example-namespace security.opendatahub.io/inject-trusted-ca-bundle=false
Verification
  • Run the following command to verify that the CA bundle has been removed from the namespace. In the following command, example-namespace is the non-reserved namespace.

    oc get configmap odh-trusted-ca-bundle -n example-namespace

    The command should return configmaps "odh-trusted-ca-bundle" not found.

Viewing logs and audit records

As a cluster administrator, you can use the Open Data Hub Operator logger to monitor and troubleshoot issues. You can also use OpenShift Container Platform audit records to review a history of changes made to the Open Data Hub Operator configuration.

Configuring the Open Data Hub Operator logger

You can change the log level for Open Data Hub Operator components by setting the .spec.devFlags.logmode flag in the DSCInitialization (DSCI) custom resource at runtime. If you do not set a logmode value, the logger uses the INFO log level by default.

The log level that you set with .spec.devFlags.logmode applies to all components, not just those in a Managed state.

The following table shows the available log levels:

Log level                        Stacktrace level   Verbosity   Output    Timestamp type
devel or development             WARN               INFO        Console   Epoch timestamps
"" (or no logmode value set)     ERROR              INFO        JSON      Human-readable timestamps
prod or production               ERROR              INFO        JSON      Human-readable timestamps

Logs set to devel or development are generated in a plain-text console format. Logs set to prod or production, or with no logmode value set, are generated in JSON format.

Prerequisites
  • You have administrator access to the DSCInitialization resources in the OpenShift Container Platform cluster.

  • You installed the OpenShift command line interface (oc) as described in Installing the OpenShift CLI.

Procedure
  1. Log in to the OpenShift Container Platform as a cluster administrator.

  2. Click Operators → Installed Operators and then click the Open Data Hub Operator.

  3. Click the DSC Initialization tab.

  4. Click the default-dsci object.

  5. Click the YAML tab.

  6. In the spec section, update the .spec.devFlags.logmode flag with the log level that you want to set.

    apiVersion: dscinitialization.opendatahub.io/v1
    kind: DSCInitialization
    metadata:
      name: default-dsci
    spec:
      devFlags:
        logmode: development
  7. Click Save.

You can also configure the log level from the OpenShift CLI by using the following command with the logmode value set to the log level that you want.

oc patch dsci default-dsci -p '{"spec":{"devFlags":{"logmode":"development"}}}' --type=merge
Verification
  • If you set the component log level to devel or development, logs generate more frequently and include logs at WARN level and above.

  • If you set the component log level to prod or production, or do not set a log level, logs generate less frequently and include logs at ERROR level or above.

Viewing the Open Data Hub Operator log

  1. Log in to the OpenShift CLI.

  2. Run the following command:

    oc get pods -l name=opendatahub-operator -o name -n openshift-operators | xargs -I {} oc logs -f {} -n openshift-operators

    The operator pod log opens.

Viewing audit records

Cluster administrators can use OpenShift Container Platform auditing to see changes made to the Open Data Hub Operator configuration by reviewing modifications to the DataScienceCluster (DSC) and DSCInitialization (DSCI) custom resources. Audit logging is enabled by default in standard OpenShift Container Platform cluster configurations. For more information, see Viewing audit logs in the OpenShift Container Platform documentation.

The following example shows how to use the OpenShift Container Platform audit logs to see the history of changes made (by users) to the DSC and DSCI custom resources.

Prerequisites
  • You have cluster administrator privileges for your OpenShift Container Platform cluster.

  • You installed the OpenShift command line interface (oc) as described in Installing the OpenShift CLI.

Procedure
  1. In a terminal window, if you are not already logged in to your OpenShift Container Platform cluster as a cluster administrator, log in to the OpenShift Container Platform CLI as shown in the following example:

    $ oc login <openshift_cluster_url> -u <admin_username> -p <password>
  2. To access the full content of the changed custom resources, set the OpenShift Container Platform audit log policy to WriteRequestBodies or a more comprehensive profile. For more information, see About audit log policy profiles.

  3. Fetch the audit log files that are available for the relevant control plane nodes. For example:

    oc adm node-logs --role=master --path=kube-apiserver/ \
      | awk '{ print $1 }' | sort -u \
      | while read node ; do
          oc adm node-logs $node --path=kube-apiserver/audit.log < /dev/null
        done \
      | grep opendatahub > /tmp/kube-apiserver-audit-opendatahub.log
  4. Search the files for the DSC and DSCI custom resources. For example:

    jq 'select((.objectRef.apiGroup == "dscinitialization.opendatahub.io"
                    or .objectRef.apiGroup == "datasciencecluster.opendatahub.io")
                  and .user.username != "system:serviceaccount:redhat-ods-operator:redhat-ods-operator-controller-manager"
                  and .verb != "get" and .verb != "watch" and .verb != "list")' < /tmp/kube-apiserver-audit-opendatahub.log
Verification
  • The commands return relevant log entries.