Installing Open Data Hub

Installing Open Data Hub version 2

You can install Open Data Hub version 2 on your OpenShift Container Platform from the OpenShift web console. For information about upgrading the Open Data Hub Operator, see Upgrading Open Data Hub.

Installing Open Data Hub involves the following tasks:

  1. Installing the Open Data Hub Operator.

  2. Installing Open Data Hub components.

  3. Accessing the Open Data Hub dashboard.

Installing the Open Data Hub Operator version 2

Prerequisites
  • You are using OpenShift Container Platform 4.14 or later.

  • Your OpenShift cluster has a minimum of 16 CPUs and 32GB of memory across all OpenShift worker nodes.

  • You have cluster administrator privileges for your OpenShift Container Platform cluster.

  • If you are installing Open Data Hub 2.10.0 or later with data science pipelines, ensure your cluster does not have a separate installation of Argo Workflows that was not installed by Open Data Hub.

    Important

    Data science pipelines 2.0 includes an installation of Argo Workflows. Open Data Hub does not support direct customer usage of this installation of Argo Workflows.

    If there is an existing installation of Argo Workflows that is not installed by data science pipelines on your cluster, data science pipelines will be disabled after you install Open Data Hub.

    To enable data science pipelines, remove the separate installation of Argo Workflows from your cluster. Data science pipelines will be enabled automatically.

    Argo Workflows resources that are created by Open Data Hub have the following labels in the OpenShift Console under Administration > CustomResourceDefinitions, in the argoproj.io group:

     labels:
        app.kubernetes.io/part-of: data-science-pipelines-operator
        app.opendatahub.io/data-science-pipelines-operator: 'true'
Procedure
  1. Log in to your OpenShift Container Platform as a user with cluster-admin privileges. If you are performing a developer installation on try.openshift.com, you can log in as the kubeadmin user.

  2. Select Operators > OperatorHub.

  3. On the OperatorHub page, in the Filter by keyword field, enter Open Data Hub Operator.

  4. Click the Open Data Hub Operator tile.

  5. If the Show community Operator window opens, read the information and then click Continue.

  6. Read the information about the Operator and then click Install.

  7. On the Install Operator page, follow these steps:

    1. For Update channel, select fast.

      Note

      Version 2 of the Open Data Hub Operator represents an alpha release, accessible only on the fast channel. Later releases will change to the rolling channel when the Operator is more stable.

    2. For Version, select the version of the Operator that you want to install.

    3. For Installation mode, leave All namespaces on the cluster (default) selected.

    4. For Installed Namespace, select the openshift-operators namespace.

    5. For Update approval, select automatic or manual updates.

      • Automatic: When a new version of the Operator is available, Operator Lifecycle Manager (OLM) automatically upgrades the running instance of your Operator.

      • Manual: When a new version of the Operator is available, OLM notifies you with an update request that you must manually approve to upgrade the running instance of your Operator.

  8. Click Install. The installation might take a few minutes.
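
If you prefer the CLI to the web console, the choices made on the Install Operator page (channel, namespace, update approval) can be expressed as an Operator Lifecycle Manager (OLM) Subscription. The following is a minimal sketch, not the documented installation path. The package name opendatahub-operator and the community-operators catalog source are assumptions that this guide does not state; confirm them on your cluster, for example with oc get packagemanifests -n openshift-marketplace, before applying anything.

```shell
# Sketch: print an OLM Subscription for the Open Data Hub Operator.
# Assumptions (not from this guide): package name "opendatahub-operator",
# catalog source "community-operators" in namespace "openshift-marketplace".
odh_subscription() {
  channel="${1:-fast}"          # update channel; this guide uses fast for version 2
  approval="${2:-Automatic}"    # update approval: Automatic or Manual
  cat <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: opendatahub-operator
  namespace: openshift-operators
spec:
  channel: ${channel}
  installPlanApproval: ${approval}
  name: opendatahub-operator
  source: community-operators
  sourceNamespace: openshift-marketplace
EOF
}

# Print the manifest for review.
odh_subscription fast Manual

# To apply it on a cluster (requires cluster-admin privileges):
#   odh_subscription fast Manual | oc apply -f -
```

Printing the manifest before applying it lets you review the channel and approval mode first. The sketch does not pin a specific Operator version.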

Verification
  • Select Operators > Installed Operators to verify that the Open Data Hub Operator is listed with Succeeded status.
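
The same check can be scripted. The sketch below filters the output of oc get csv for an Open Data Hub Operator entry whose PHASE column is Succeeded. It assumes that the ClusterServiceVersion name begins with opendatahub-operator; the sample line and its version number are hypothetical.

```shell
# Succeeds only if the input (output of `oc get csv --no-headers`) contains an
# Open Data Hub Operator CSV whose last column (PHASE) is Succeeded.
# Assumption: the CSV name starts with "opendatahub-operator".
odh_csv_succeeded() {
  awk '$1 ~ /^opendatahub-operator/ && $NF == "Succeeded" { ok = 1 }
       END { exit ok ? 0 : 1 }'
}

# On a cluster:
#   oc get csv -n openshift-operators --no-headers | odh_csv_succeeded && echo "Operator ready"

# Offline demonstration with a hypothetical sample line:
printf '%s\n' 'opendatahub-operator.v2.10.0  Open Data Hub Operator  2.10.0  Succeeded' \
  | odh_csv_succeeded && echo "Operator ready"
```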

Next Step
  • Install Open Data Hub components.

Installing Open Data Hub components

You can use the OpenShift web console to install specific components of Open Data Hub on your cluster when version 2 of the Open Data Hub Operator is already installed on the cluster.

Prerequisites
  • You have installed version 2 of the Open Data Hub Operator.

  • You can log in as a user with cluster-admin privileges.

  • If you want to use the trustyai component, you must enable user workload monitoring as described in Configuring monitoring for the multi-model serving platform.

  • If you want to use the kserve, modelmesh, or modelregistry components, you must have already installed the following Operator or Operators for the component. For information about installing an Operator, see Adding Operators to a cluster.

Table 1. Required Operators for components
Component     | Required Operators                                                                                         | Catalog
kserve        | Red Hat OpenShift Serverless Operator, Red Hat OpenShift Service Mesh Operator, Red Hat Authorino Operator | Red Hat
modelmesh     | Prometheus Operator                                                                                        | Community
modelregistry | Red Hat Authorino Operator, Red Hat OpenShift Serverless Operator, Red Hat OpenShift Service Mesh Operator | Red Hat

NOTE: To use the model registry feature, you must install the required Operators in a specific order. For more information, see Configuring the model registry component.

Procedure
  1. Log in to your OpenShift Container Platform as a user with cluster-admin privileges. If you are performing a developer installation on try.openshift.com, you can log in as the kubeadmin user.

  2. Select Operators > Installed Operators, and then click the Open Data Hub Operator.

  3. On the Operator details page, click the DSC Initialization tab, and then click Create DSCInitialization.

  4. On the Create DSCInitialization page, configure the DSCInitialization by using Form view or YAML view. For general information about the supported components, see Tiered Components.

    • Configure by using Form view:

      1. In the Name field, enter a value.

      2. In the Components section, expand each component and set the managementState to Managed or Removed.

    • Configure by using YAML view:

      1. In the spec.components section, for each component shown, set the value of the managementState field to either Managed or Removed.

  5. Click Create.

  6. Wait until the status of the DSCInitialization is Ready.

  7. Click the Data Science Cluster tab, and then click Create DataScienceCluster.

  8. On the Create DataScienceCluster page, configure the DataScienceCluster by using Form view or YAML view. For general information about the supported components, see Tiered Components.

    • Configure by using Form view:

      1. In the Name field, enter a value.

      2. In the Components section, expand each component and set the managementState to Managed or Removed.

    • Configure by using YAML view:

      1. In the spec.components section, for each component shown, set the value of the managementState field to either Managed or Removed.

  9. Click Create.
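
For reference, the spec.components section edited in YAML view has one entry per component, each with a managementState field. The sketch below generates such a DataScienceCluster manifest from component=state arguments. The object name default-dsc and the component names in the example call are illustrative; use the component names that the YAML view shows for your Operator version.

```shell
# Sketch: print a DataScienceCluster manifest from component=state arguments.
# Assumption: the object is named default-dsc.
dsc_manifest() {
  # Validate every pair before printing anything.
  for pair in "$@"; do
    case "${pair#*=}" in
      Managed|Removed) ;;
      *) echo "invalid managementState in '$pair' (use Managed or Removed)" >&2
         return 1 ;;
    esac
  done
  cat <<EOF
apiVersion: datasciencecluster.opendatahub.io/v1
kind: DataScienceCluster
metadata:
  name: default-dsc
spec:
  components:
EOF
  # Emit one component entry per argument.
  for pair in "$@"; do
    printf '    %s:\n      managementState: %s\n' "${pair%%=*}" "${pair#*=}"
  done
}

dsc_manifest dashboard=Managed workbenches=Managed kserve=Removed

# To create the object on a cluster:
#   dsc_manifest dashboard=Managed workbenches=Managed kserve=Removed | oc apply -f -
```

Validating the values first means that a typo such as Enabled fails before any manifest is printed.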

Verification
  1. Select Home > Projects, and then select the opendatahub project.

  2. On the Project details page, click the Workloads tab and confirm that the Open Data Hub core components are running. For more information, see Tiered Components.
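
As a CLI alternative to the Workloads tab, you can list the pods in the project and report any that are not yet running. The following sketch filters oc get pods output on its STATUS column; the pod names in the demonstration are hypothetical.

```shell
# Reads `oc get pods --no-headers` output on stdin and prints the names of
# pods whose STATUS (third column) is neither Running nor Completed.
pods_not_ready() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# On a cluster:
#   oc get pods -n opendatahub --no-headers | pods_not_ready

# Offline demonstration with hypothetical pod names:
printf '%s\n' \
  'odh-dashboard-abc   2/2   Running            0   5m' \
  'example-ctrl-xyz    0/1   CrashLoopBackOff   3   5m' | pods_not_ready
```

An empty result means that every listed pod is in the Running or Completed state.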

Next Step
  • Access the Open Data Hub dashboard.

Installing Open Data Hub version 1

You can install Open Data Hub version 1 on your OpenShift Container Platform from the OpenShift web console. For information about installing Open Data Hub version 2, see Installing Open Data Hub version 2.

Installing Open Data Hub involves the following tasks:

  1. Installing the Open Data Hub Operator.

  2. Creating a new project for your Open Data Hub instance.

  3. Adding an Open Data Hub instance.

  4. Accessing the Open Data Hub dashboard.

Installing the Open Data Hub Operator version 1

Prerequisites
  • You are using OpenShift Container Platform 4.14 or later.

  • Your OpenShift cluster has a minimum of 16 CPUs and 32GB of memory across all OpenShift worker nodes.

  • You can log in as a user with cluster-admin privileges.

Procedure
  1. Log in to your OpenShift Container Platform as a user with cluster-admin privileges. If you are performing a developer installation on try.openshift.com, you can log in as the kubeadmin user.

  2. Select Operators > OperatorHub.

  3. On the OperatorHub page, in the Filter by keyword field, enter Open Data Hub Operator.

  4. Click the Open Data Hub Operator tile.

  5. If the Show community Operator window opens, read the information and then click Continue.

  6. Read the information about the Operator and then click Install.

  7. On the Install Operator page, follow these steps:

    1. For Update channel, select rolling.

    2. For Version, select the version of the Operator that you want to install.

    3. For Installation mode, leave All namespaces on the cluster (default) selected.

    4. For Installed Namespace, select the openshift-operators namespace.

    5. For Update approval, select automatic or manual updates.

      • Automatic: When a new version of the Operator is available, Operator Lifecycle Manager (OLM) automatically upgrades the running instance of your Operator.

      • Manual: When a new version of the Operator is available, OLM notifies you with an update request that you must manually approve to upgrade the running instance of your Operator.

  8. Click Install. The installation might take a few minutes.

Verification
  • Select Operators > Installed Operators to verify that the Open Data Hub Operator is listed with Succeeded status.

Next Step
  • Create a new project for your instance of Open Data Hub.

Creating a new project for your Open Data Hub instance

Create a new project for your Open Data Hub instance so that you can organize and manage your data science work in one place.

Prerequisites
  • You have installed the Open Data Hub Operator.

Procedure
  1. In the OpenShift web console, select Home > Projects.

  2. On the Projects page, click Create Project.

  3. In the Create Project box, follow these steps:

    1. For Name, enter odh.

    2. For Display Name, enter Open Data Hub.

    3. For Description, enter a description.

  4. Click Create.
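
The project can also be requested from the CLI. The sketch below prints a ProjectRequest manifest with the same name and display name as the console steps; the description text is a placeholder. The values are written between double quotes, so the sketch assumes that they contain no double-quote characters.

```shell
# Sketch: print a ProjectRequest for the Open Data Hub project.
# The default description is a placeholder, not a value from this guide.
odh_project_request() {
  name="${1:-odh}"
  display="${2:-Open Data Hub}"
  description="${3:-Project for the Open Data Hub instance}"
  cat <<EOF
apiVersion: project.openshift.io/v1
kind: ProjectRequest
metadata:
  name: ${name}
displayName: "${display}"
description: "${description}"
EOF
}

odh_project_request odh "Open Data Hub"

# To create the project on a cluster:
#   odh_project_request odh "Open Data Hub" | oc create -f -
# A one-line equivalent is: oc new-project odh --display-name="Open Data Hub"
```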

Verification
  • Select Home > Projects to verify that the odh project is listed with Active status.

Next Step
  • Add an Open Data Hub instance to your project.

Adding an Open Data Hub instance

By adding an Open Data Hub instance to your project, you can access the URL for your Open Data Hub dashboard and share it with data science users.

Prerequisites
  • You have installed the Open Data Hub Operator.

  • You have created a new project for your instance of Open Data Hub.

Procedure
  1. In the OpenShift web console, select Operators > Installed Operators.

  2. On the Installed Operators page, click the Project list and select the odh project. The page filters to only display installed operators in the odh project.

  3. Find and click the Open Data Hub Operator to display the details for the currently installed version.

  4. On the KfDef tile, click Create instance. A KfDef object is a specification designed to control provisioning and management of a Kubeflow deployment. A default KfDef object is created when you install the Open Data Hub Operator. This default configuration deploys the required Open Data Hub core components. For more information, see Tiered Components.

  5. On the Create KfDef page, leave opendatahub as the name. Click Create to create an Open Data Hub KfDef object named opendatahub and begin the deployment of the components.

Verification
  1. Select Operators > Installed Operators.

  2. On the Installed Operators page, click the Project list and select the odh project.

  3. Find and click Open Data Hub Operator.

  4. Click the Kf Def tab and confirm that opendatahub appears.

  5. Select Home > Projects.

  6. On the Projects page, find and select the odh project.

  7. On the Project details page, click the Workloads tab and confirm that the Open Data Hub core components are running. For a description of the components, see Tiered Components.

Next Step
  • Access the Open Data Hub dashboard.

Accessing the Open Data Hub dashboard

You can access and share the URL for your Open Data Hub dashboard with other users to let them log in and work on their models.

Prerequisites
  • You have installed the Open Data Hub Operator.

Procedure
  1. In the OpenShift web console, select Networking > Routes.

  2. On the Routes page, click the Project list and select the odh project. The page filters to only display routes in the odh project.

  3. In the Location column, copy the URL for the odh-dashboard route.

  4. Give this URL to your users to let them log in to the Open Data Hub dashboard.
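
The route host can also be read from the CLI and turned into a shareable URL. The following sketch assumes that the odh-dashboard route is served over HTTPS, which this guide does not state; the host name in the demonstration is hypothetical.

```shell
# Build the dashboard URL from a route host name.
# Assumption: the route is TLS-terminated, so the URL scheme is https.
dashboard_url() {
  if [ -z "$1" ]; then
    echo "route host is empty; check that the odh-dashboard route exists" >&2
    return 1
  fi
  printf 'https://%s\n' "$1"
}

# On a cluster, read the host from the route first:
#   host="$(oc get route odh-dashboard -n odh -o jsonpath='{.spec.host}')"
#   dashboard_url "$host"

# Offline demonstration with a hypothetical host name:
dashboard_url "odh-dashboard-odh.apps.example.com"
```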

Verification
  • Confirm that you and your users can log in to the Open Data Hub dashboard by using the URL.

Installing the distributed workloads components

To use the distributed workloads feature in Open Data Hub, you must install several components.

Prerequisites
  • You have logged in to OpenShift Container Platform with the cluster-admin role and you can access the data science cluster.

  • You have installed Open Data Hub.

  • You have sufficient resources. In addition to the minimum Open Data Hub resources described in Installing the Open Data Hub Operator version 2, you need 1.6 vCPU and 2 GiB memory to deploy the distributed workloads infrastructure.

  • You have removed any previously installed instances of the CodeFlare Operator.

  • If you want to use graphics processing units (GPUs), you have enabled GPU support. This process includes installing the Node Feature Discovery Operator and the NVIDIA GPU Operator. For more information, see NVIDIA GPU Operator on Red Hat OpenShift Container Platform in the NVIDIA documentation.

  • If you want to use self-signed certificates, you have added them to a central Certificate Authority (CA) bundle as described in Understanding certificates in Open Data Hub. No additional configuration is necessary to use those certificates with distributed workloads. The centrally configured self-signed certificates are automatically available in the workload pods at the following mount points:

    • Cluster-wide CA bundle:

      /etc/pki/tls/certs/odh-trusted-ca-bundle.crt
      /etc/ssl/certs/odh-trusted-ca-bundle.crt
    • Custom CA bundle:

      /etc/pki/tls/certs/odh-ca-bundle.crt
      /etc/ssl/certs/odh-ca-bundle.crt
Procedure
  1. In the OpenShift Container Platform console, click Operators > Installed Operators.

  2. Search for the Open Data Hub Operator, and then click the Operator name to open the Operator details page.

  3. Click the Data Science Cluster tab.

  4. Click the default instance name (for example, default-dsc) to open the instance details page.

  5. Click the YAML tab to show the instance specifications.

  6. Enable the required distributed workloads components. In the spec.components section, set the managementState field correctly for the required components:

    • If you want to use the CodeFlare framework to tune models, enable the codeflare, kueue, and ray components.

    • If you want to use the Kubeflow Training Operator to tune models, enable the kueue and trainingoperator components.

    • The list of required components depends on whether the distributed workload is run from a pipeline or notebook or both, as shown in the following table.

    Table 2. Components required for distributed workloads
    Component            | Pipelines only | Notebooks only | Pipelines and notebooks
    codeflare            | Managed        | Managed        | Managed
    dashboard            | Managed        | Managed        | Managed
    datasciencepipelines | Managed        | Removed        | Managed
    kueue                | Managed        | Managed        | Managed
    ray                  | Managed        | Managed        | Managed
    trainingoperator     | Managed        | Managed        | Managed
    workbenches          | Removed        | Managed        | Managed

  7. Click Save. After a short time, the components with a Managed state are ready.
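
The component states in the table above can also be applied as a single merge patch instead of editing the YAML by hand. The sketch below prints the patch for one of the three scenarios; the scenario names pipelines, notebooks, and both are labels chosen for this example, and the object name default-dsc follows the example name used in the procedure.

```shell
# Sketch: print a JSON merge patch that sets the distributed workloads
# components for one scenario: pipelines, notebooks, or both.
dw_patch() {
  case "$1" in
    pipelines) dsp=Managed; wb=Removed ;;
    notebooks) dsp=Removed; wb=Managed ;;
    both)      dsp=Managed; wb=Managed ;;
    *) echo "usage: dw_patch pipelines|notebooks|both" >&2; return 1 ;;
  esac
  cat <<EOF
{"spec":{"components":{
  "codeflare":{"managementState":"Managed"},
  "dashboard":{"managementState":"Managed"},
  "datasciencepipelines":{"managementState":"${dsp}"},
  "kueue":{"managementState":"Managed"},
  "ray":{"managementState":"Managed"},
  "trainingoperator":{"managementState":"Managed"},
  "workbenches":{"managementState":"${wb}"}
}}}
EOF
}

dw_patch notebooks

# To apply it on a cluster:
#   oc patch datasciencecluster default-dsc --type=merge -p "$(dw_patch notebooks)"
```

Only datasciencepipelines and workbenches differ between scenarios, so they are the only values that the function varies.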

Verification

Check the status of the codeflare-operator-manager, kuberay-operator, and kueue-controller-manager pods, as follows:

  1. In the OpenShift Container Platform console, from the Project list, select odh.

  2. Click Workloads > Deployments.

  3. Search for the codeflare-operator-manager, kuberay-operator, and kueue-controller-manager deployments. In each case, check the status as follows:

    1. Click the deployment name to open the deployment details page.

    2. Click the Pods tab.

    3. Check the pod status.

      When the status of the codeflare-operator-manager-<pod-id>, kuberay-operator-<pod-id>, and kueue-controller-manager-<pod-id> pods is Running, the pods are ready to use.

    4. To see more information about each pod, click the pod name to open the pod details page, and then click the Logs tab.
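
The deployment check can be scripted as well. The sketch below reads oc get deployments output and prints any deployment whose READY column shows fewer ready replicas than desired; the sample rows are hypothetical.

```shell
# Reads `oc get deployments --no-headers` output on stdin. The READY column
# (second field) has the form ready/desired. Prints deployments that are not
# fully ready, or that have no desired replicas at all.
deployments_not_ready() {
  awk '{ split($2, r, "/"); if (r[1] != r[2] || r[2] == 0) print $1 }'
}

# On a cluster:
#   oc get deployments -n odh --no-headers | deployments_not_ready

# Offline demonstration with hypothetical sample rows:
printf '%s\n' \
  'codeflare-operator-manager   1/1   1   1   10m' \
  'kuberay-operator             1/1   1   1   10m' \
  'kueue-controller-manager     0/1   1   0   10m' | deployments_not_ready
```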

Next Step

Configure the distributed workloads feature as described in Managing distributed workloads.

Working with certificates

Certificates are used by various components in OpenShift Container Platform to validate access to the cluster. For clusters that rely on self-signed certificates, you can add those self-signed certificates to a cluster-wide Certificate Authority (CA) bundle and use the CA bundle in Open Data Hub. You can also use self-signed certificates in a custom CA bundle that is separate from the cluster-wide bundle. Administrators can add a CA bundle, remove a CA bundle from all namespaces, remove a CA bundle from individual namespaces, or manage certificate changes manually instead of letting the system manage them.

Understanding certificates in Open Data Hub

For OpenShift Container Platform clusters that rely on self-signed certificates, you can add those self-signed certificates to a cluster-wide Certificate Authority (CA) bundle (ca-bundle.crt) and use the CA bundle in Open Data Hub. You can also use self-signed certificates in a custom CA bundle (odh-ca-bundle.crt) that is separate from the cluster-wide bundle.

How CA bundles are injected

After installing Open Data Hub, the Open Data Hub Operator automatically creates an empty odh-trusted-ca-bundle configuration file (ConfigMap), and the Cluster Network Operator (CNO) injects the cluster-wide CA bundle into the odh-trusted-ca-bundle ConfigMap, which carries the label config.openshift.io/inject-trusted-cabundle. The components deployed in the affected namespaces are responsible for mounting this ConfigMap as a volume in the deployment pods.

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/part-of: opendatahub-operator
    config.openshift.io/inject-trusted-cabundle: 'true'
  name: odh-trusted-ca-bundle

After the CNO injects the bundle, it updates the ConfigMap with the ca-bundle.crt file containing the certificates.

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/part-of: opendatahub-operator
    config.openshift.io/inject-trusted-cabundle: 'true'
  name: odh-trusted-ca-bundle
data:
  ca-bundle.crt: |
    <BUNDLE OF CLUSTER-WIDE CERTIFICATES>

How the ConfigMap is managed

By default, the Open Data Hub Operator manages the odh-trusted-ca-bundle ConfigMap. If you want to manage or remove the odh-trusted-ca-bundle ConfigMap, or add a custom CA bundle (odh-ca-bundle.crt) separate from the cluster-wide CA bundle (ca-bundle.crt), you can use the trustedCABundle property in the Operator’s DSC Initialization (DSCI) object.

spec:
  trustedCABundle:
    managementState: Managed
    customCABundle: ""

In the Operator’s DSCI object, you can set the spec.trustedCABundle.managementState field to the following values:

  • Managed: The Open Data Hub Operator manages the odh-trusted-ca-bundle ConfigMap and adds it to all non-reserved existing and new namespaces (the ConfigMap is not added to any reserved or system namespaces, such as default, openshift-*, or kube-*). The ConfigMap is automatically updated to reflect any changes made to the customCABundle field. This is the default value after installing Open Data Hub.

  • Unmanaged: The Open Data Hub Operator does not manage the odh-trusted-ca-bundle ConfigMap, allowing for an administrator to manage it instead. Changing the managementState from Managed to Unmanaged does not remove the odh-trusted-ca-bundle ConfigMap, but the ConfigMap is not updated if you make changes to the customCABundle field.

In the Operator’s DSCI object, you can add a custom certificate to the spec.trustedCABundle.customCABundle field. This adds the odh-ca-bundle.crt file containing the certificates to the odh-trusted-ca-bundle ConfigMap, as shown in the following example:

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/part-of: opendatahub-operator
    config.openshift.io/inject-trusted-cabundle: 'true'
  name: odh-trusted-ca-bundle
data:
  ca-bundle.crt: |
    <BUNDLE OF CLUSTER-WIDE CERTIFICATES>
  odh-ca-bundle.crt: |
    <BUNDLE OF CUSTOM CERTIFICATES>

Adding a CA bundle

There are two ways to add a Certificate Authority (CA) bundle to Open Data Hub. You can use one or both of these methods:

  • For OpenShift Container Platform clusters that rely on self-signed certificates, you can add those self-signed certificates to a cluster-wide Certificate Authority (CA) bundle (ca-bundle.crt) and use the CA bundle in Open Data Hub. To use this method, log in to the OpenShift Container Platform as a cluster administrator and follow the steps as described in Configuring the cluster-wide proxy during installation.

  • You can use self-signed certificates in a custom CA bundle (odh-ca-bundle.crt) that is separate from the cluster-wide bundle. To use this method, follow the steps in this section.

Prerequisites
  • You have admin access to the DSCInitialization resources in the OpenShift Container Platform cluster.

  • You installed the OpenShift command line interface (oc) as described in Installing the OpenShift CLI.

  • You are working in a new installation of Open Data Hub. If you upgraded Open Data Hub, see Adding a CA bundle after upgrading.

Procedure
  1. Log in to the OpenShift Container Platform.

  2. Click Operators > Installed Operators and then click the Open Data Hub Operator.

  3. Click the DSC Initialization tab.

  4. Click the default-dsci object.

  5. Click the YAML tab.

  6. In the spec section, add the custom certificate to the customCABundle field for trustedCABundle, as shown in the following example:

    spec:
      trustedCABundle:
        managementState: Managed
        customCABundle: |
          -----BEGIN CERTIFICATE-----
          examplebundle123
          -----END CERTIFICATE-----
  7. Click Save.

Verification
  • If you are using a cluster-wide CA bundle, run the following command to verify that all non-reserved namespaces contain the odh-trusted-ca-bundle ConfigMap:

    $ oc get configmaps --all-namespaces -l app.kubernetes.io/part-of=opendatahub-operator | grep odh-trusted-ca-bundle
  • If you are using a custom CA bundle, run the following command to verify that a non-reserved namespace contains the odh-trusted-ca-bundle ConfigMap and that the ConfigMap contains your customCABundle value. In the following command, example-namespace is the non-reserved namespace and examplebundle123 is the customCABundle value.

    $ oc get configmap odh-trusted-ca-bundle -n example-namespace -o yaml | grep examplebundle123
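
To go one step beyond a grep match, you can count how many certificates a bundle key actually contains. The sketch below counts PEM certificate headers on standard input. The demonstration uses the placeholder bundle from the procedure rather than real certificate data.

```shell
# Count the certificates in a PEM bundle read from stdin by counting
# BEGIN CERTIFICATE header lines.
count_certs() {
  grep -c -e '-----BEGIN CERTIFICATE-----'
}

# On a cluster, count the certificates in the custom bundle key:
#   oc get configmap odh-trusted-ca-bundle -n example-namespace \
#     -o jsonpath="{['data']['odh-ca-bundle\.crt']}" | count_certs

# Offline demonstration with placeholder content:
printf '%s\n' \
  '-----BEGIN CERTIFICATE-----' 'examplebundle123' '-----END CERTIFICATE-----' \
  '-----BEGIN CERTIFICATE-----' 'examplebundle456' '-----END CERTIFICATE-----' \
  | count_certs
```

A count of 0 for the odh-ca-bundle.crt key suggests that the customCABundle value has not reached that namespace.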

Removing a CA bundle

You can remove a Certificate Authority (CA) bundle from all non-reserved namespaces in Open Data Hub. This process changes the default configuration and disables the creation of the odh-trusted-ca-bundle configuration file (ConfigMap), as described in Understanding certificates in Open Data Hub.

Note
The odh-trusted-ca-bundle ConfigMaps are only deleted from namespaces when you set the managementState of trustedCABundle to Removed; deleting the DSC Initialization does not delete the ConfigMaps.

To remove a CA bundle from a single namespace only, see Removing a CA bundle from a namespace.

Prerequisites
  • You have cluster administrator privileges for your OpenShift Container Platform cluster.

  • You installed the OpenShift command line interface (oc) as described in Installing the OpenShift CLI.

Procedure
  1. In the OpenShift Container Platform web console, click Operators > Installed Operators and then click the Open Data Hub Operator.

  2. Click the DSC Initialization tab.

  3. Click the default-dsci object.

  4. Click the YAML tab.

  5. In the spec section, change the value of the managementState field for trustedCABundle to Removed:

    spec:
      trustedCABundle:
        managementState: Removed
  6. Click Save.
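
The same change can be made without the web console by patching the DSCI object. The sketch below builds a JSON merge patch for the managementState field and rejects any value other than the three that this guide describes. It assumes the default object name default-dsci used elsewhere in this guide.

```shell
# Sketch: print a JSON merge patch for spec.trustedCABundle.managementState.
trusted_ca_patch() {
  case "$1" in
    Managed|Unmanaged|Removed)
      printf '{"spec":{"trustedCABundle":{"managementState":"%s"}}}\n' "$1" ;;
    *)
      echo "managementState must be Managed, Unmanaged, or Removed" >&2
      return 1 ;;
  esac
}

trusted_ca_patch Removed

# To apply it on a cluster:
#   oc patch dscinitialization default-dsci --type=merge -p "$(trusted_ca_patch Removed)"
```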

Verification
  • Run the following command to verify that the odh-trusted-ca-bundle ConfigMap has been removed from all namespaces:

    $ oc get configmaps --all-namespaces | grep odh-trusted-ca-bundle

    The command should not return any ConfigMaps.

Removing a CA bundle from a namespace

You can remove a custom Certificate Authority (CA) bundle from individual namespaces in Open Data Hub. This process disables the creation of the odh-trusted-ca-bundle configuration file (ConfigMap) for the specified namespace only.

To remove a certificate bundle from all namespaces, see Removing a CA bundle.

Prerequisites
  • You have cluster administrator privileges for your OpenShift Container Platform cluster.

  • You installed the OpenShift command line interface (oc) as described in Installing the OpenShift CLI.

Procedure
  • Run the following command to remove a CA bundle from a namespace. In the following command, example-namespace is the non-reserved namespace.

    $ oc annotate ns example-namespace security.opendatahub.io/inject-trusted-ca-bundle=false
Verification
  • Run the following command to verify that the CA bundle has been removed from the namespace. In the following command, example-namespace is the non-reserved namespace.

    $ oc get configmap odh-trusted-ca-bundle -n example-namespace

    The command should return configmaps "odh-trusted-ca-bundle" not found.

Managing certificates

After installing Open Data Hub, the Open Data Hub Operator creates the odh-trusted-ca-bundle configuration file (ConfigMap) that contains the trusted CA bundle and adds it to all new and existing non-reserved namespaces in the cluster. By default, the Open Data Hub Operator manages the odh-trusted-ca-bundle ConfigMap and automatically updates it if any changes are made to the CA bundle. You can choose to manage the odh-trusted-ca-bundle ConfigMap instead of allowing the Open Data Hub Operator to manage it.

Prerequisites
  • You have cluster administrator privileges for your OpenShift Container Platform cluster.

Procedure
  1. In the OpenShift Container Platform web console, click Operators > Installed Operators and then click the Open Data Hub Operator.

  2. Click the DSC Initialization tab.

  3. Click the default-dsci object.

  4. Click the YAML tab.

  5. In the spec section, change the value of the managementState field for trustedCABundle to Unmanaged, as shown:

    spec:
      trustedCABundle:
        managementState: Unmanaged
  6. Click Save.

    Note that changing the managementState from Managed to Unmanaged does not remove the odh-trusted-ca-bundle ConfigMap, but the ConfigMap is not updated if you make changes to the customCABundle field.

Verification
  1. In the spec section, set or change the value of the customCABundle field for trustedCABundle, for example:

    spec:
      trustedCABundle:
        managementState: Unmanaged
        customCABundle: example123
  2. Click Save.

  3. Click Workloads > ConfigMaps.

  4. Select a project from the project list.

  5. Click the odh-trusted-ca-bundle ConfigMap.

  6. Click the YAML tab and verify that the value of the customCABundle field did not update.

Accessing S3-compatible object storage with self-signed certificates

To use object storage solutions or databases that are deployed in an OpenShift cluster that uses self-signed certificates, you must configure Open Data Hub to trust the cluster’s certificate authority (CA).

Each namespace has a ConfigMap called kube-root-ca.crt that contains the CA certificates of the internal API Server. Use the following steps to configure Open Data Hub to trust the certificates issued by kube-root-ca.crt.

Alternatively, you can add a custom CA bundle by using the OpenShift console, as described in Adding a CA bundle.

Prerequisites
  • You have cluster administrator privileges for your OpenShift Container Platform cluster.

  • You have downloaded and installed the OpenShift command-line interface (CLI). See Installing the OpenShift CLI.

  • You have an object storage solution or database deployed in your OpenShift cluster.

Procedure
  1. In a terminal window, log in to the OpenShift CLI as shown in the following example:

    oc login api.<cluster_name>.<cluster_domain>:6443 --web
  2. Run the following command to fetch the current Open Data Hub trusted CA configuration and store it in a new file:

    oc get dscinitializations.dscinitialization.opendatahub.io default-dsci -o json | jq -r '.spec.trustedCABundle.customCABundle' > /tmp/my-custom-ca-bundles.crt
  3. Add the cluster’s kube-root-ca.crt ConfigMap to the Open Data Hub trusted CA configuration:

    oc get configmap kube-root-ca.crt -o jsonpath="{['data']['ca\.crt']}" >> /tmp/my-custom-ca-bundles.crt
  4. Update the Open Data Hub trusted CA configuration to trust certificates issued by the certificate authorities in kube-root-ca.crt:

    oc patch dscinitialization default-dsci --type='json' -p='[{"op":"replace","path":"/spec/trustedCABundle/customCABundle","value":"'"$(awk '{printf "%s\\n", $0}' /tmp/my-custom-ca-bundles.crt)"'"}]'
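
The awk expression in the last step is the part that is easiest to misread: it joins the lines of the bundle file into one string in which each line break becomes the two characters \n, which is the form a JSON string value requires. The sketch below isolates that step so that you can inspect its output before patching; the function name is chosen for this example.

```shell
# Join the lines of a PEM file (or stdin) into a single string, replacing
# each line break with a literal backslash followed by n.
pem_to_json_string() {
  awk '{ printf "%s\\n", $0 }' "$@"
}

# Demonstration with the placeholder certificate used in this guide:
printf '%s\n' \
  '-----BEGIN CERTIFICATE-----' 'examplebundle123' '-----END CERTIFICATE-----' \
  | pem_to_json_string

# With the file from the procedure:
#   pem_to_json_string /tmp/my-custom-ca-bundles.crt
```

PEM data consists of base64 text and header lines, so it contains no double quotes or backslashes that would need further escaping. Other content in the file would not be escaped by this step.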
Verification
  • You can successfully deploy components that are configured to use object storage solutions or databases that are deployed in the OpenShift cluster. For example, a pipeline server that is configured to use a database deployed in the cluster starts successfully.

Using self-signed certificates with Open Data Hub components

Some Open Data Hub components have additional options or required configuration for self-signed certificates.

Using certificates with data science pipelines

To use self-signed certificates with data science pipelines, first add them to a central Certificate Authority (CA) bundle as described in Understanding certificates in Open Data Hub.

No additional configuration is necessary to use those certificates with data science pipelines.

Providing a CA bundle only for data science pipelines

Perform the following steps to provide a Certificate Authority (CA) bundle just for data science pipelines.

Procedure
  1. Log in to OpenShift Container Platform.

  2. From Workloads > ConfigMaps, create a ConfigMap with the required bundle in the same data science project or namespace as the target data science pipeline:

    kind: ConfigMap
    apiVersion: v1
    metadata:
        name: custom-ca-bundle
    data:
        ca-bundle.crt: |
            <contents of ca-bundle.crt>
  3. Add the following snippet to the .spec.apiServer.cABundle field of the underlying Data Science Pipelines Application (DSPA):

    apiVersion: datasciencepipelinesapplications.opendatahub.io/v1
    kind: DataSciencePipelinesApplication
    metadata:
        name: data-science-dspa
    spec:
        ...
        apiServer:
            ...
            cABundle:
                configMapName: custom-ca-bundle
                configMapKey: ca-bundle.crt

The pipeline server pod redeploys with the updated bundle and uses it in the newly created pipeline pods.
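
The two resources in this procedure must agree on the ConfigMap name and key. The following is a minimal sketch, not an official tool, that builds both objects from the same arguments so that they cannot drift apart. The function names and the namespace argument are hypothetical; the field names are taken from the snippets above.

```python
def build_ca_configmap(namespace, pem_text,
                       name="custom-ca-bundle", key="ca-bundle.crt"):
    """Return a ConfigMap manifest (as a dict) holding the CA bundle."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        "data": {key: pem_text},
    }


def build_dspa_ca_patch(name="custom-ca-bundle", key="ca-bundle.crt"):
    """Return a merge-patch body that points the DSPA at the ConfigMap."""
    return {
        "spec": {
            "apiServer": {
                "cABundle": {"configMapName": name, "configMapKey": key}
            }
        }
    }
```

Serializing either dict with json.dumps yields a document that oc can consume, because JSON is also valid YAML.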

Verification

Perform the following steps to confirm that your CA bundle was successfully mounted.

  1. Log in to the OpenShift Container Platform console.

  2. Go to the OpenShift Container Platform project that corresponds to the data science project.

  3. Click the Pods tab.

  4. Click the pipeline server pod with the ds-pipeline-dspa-<hash> prefix.

  5. Click Terminal.

  6. Enter cat /dsp-custom-certs/dsp-ca.crt.

  7. Verify that your CA bundle is present within this file.

You can also confirm that your CA bundle was successfully mounted by using the CLI:

  1. In a terminal window, log in to the OpenShift cluster where Open Data Hub is deployed.

    oc login
  2. Set the dspa value:

    dspa=dspa
  3. Set the dsProject value, replacing $YOUR_DS_PROJECT with the name of your data science project:

    dsProject=$YOUR_DS_PROJECT
  4. Set the pod value:

    pod=$(oc get pod -n ${dsProject} -l app=ds-pipeline-${dspa} --no-headers | awk '{print $1}')
  5. Display the contents of the /dsp-custom-certs/dsp-ca.crt file:

    oc -n ${dsProject} exec $pod -- cat /dsp-custom-certs/dsp-ca.crt
  6. Verify that your CA bundle is present within this file.
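
Checking by eye that a bundle appears in the mounted file is error-prone when the file contains many certificates. As an illustrative sketch (the helper names are hypothetical), the comparison could be automated by extracting the PEM certificate blocks from both texts and testing for containment:

```python
import re

# Matches one PEM-encoded certificate, including its BEGIN/END markers.
_PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def pem_blocks(text):
    """Return the set of certificate blocks in text, ignoring whitespace."""
    return {"".join(block.split()) for block in _PEM_BLOCK.findall(text)}


def bundle_is_present(expected_bundle, mounted_file):
    """True if every certificate in expected_bundle is in mounted_file."""
    expected = pem_blocks(expected_bundle)
    # An empty expected set means nothing was parsed; treat it as a failure.
    return bool(expected) and expected <= pem_blocks(mounted_file)
```

Here mounted_file would be the output of the oc exec command in step 5, and expected_bundle the contents of your own CA bundle file.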

Using certificates with workbenches

Important

Self-signed certificates apply by default to workbenches that you create after configuring the certificates centrally as described in Understanding certificates in Open Data Hub. To apply centrally configured certificates to an existing workbench, stop and then restart the workbench.

Self-signed certificates are stored in /etc/pki/tls/custom-certs/ca-bundle.crt. Workbenches are preset with an environment variable that points packages to this path, and that covers many popular HTTP client packages. For packages that are not included by default, you can provide this certificate path. For example, for the kfp package to connect to the data science pipeline server:

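from kfp.client import Client

# Path to the service account token mounted in the workbench pod
# (standard Kubernetes location; adjust if your token is stored elsewhere).
sa_token_file_path = '/var/run/secrets/kubernetes.io/serviceaccount/token'

with open(sa_token_file_path, 'r') as token_file:
    bearer_token = token_file.read()

# Point kfp at the custom CA bundle so that TLS verification succeeds.
client = Client(
    host='https://<GO_TO_ROUTER_OF_DS_PROJECT>/',
    existing_token=bearer_token,
    ssl_ca_cert='/etc/pki/tls/custom-certs/ca-bundle.crt'
)
print(client.list_experiments())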
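
The same certificate path can be supplied to code that does not read the preset environment variable. The following sketch uses only the Python standard library; the helper name make_ssl_context is hypothetical, and falling back to the system trust store when the file is missing is a choice made for this example, not documented Open Data Hub behavior.

```python
import os
import ssl

# Path given in this section for the workbench CA bundle.
CA_BUNDLE_PATH = "/etc/pki/tls/custom-certs/ca-bundle.crt"


def make_ssl_context(ca_path=CA_BUNDLE_PATH):
    """Build a TLS context that trusts the CA bundle at ca_path if it exists."""
    if os.path.isfile(ca_path):
        # With cafile set, only the certificates in that file are loaded.
        return ssl.create_default_context(cafile=ca_path)
    # Otherwise fall back to the system default trust store.
    return ssl.create_default_context()
```

The resulting context can be passed to standard library clients, for example urllib.request.urlopen(url, context=make_ssl_context()).
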
Creating data science pipelines with Elyra and self-signed certificates

To create pipelines using a workbench that contains the Elyra extension and which uses self-signed certificates, see the Workbench workaround for executing a pipeline using Elyra in a disconnected environment knowledgebase article.

Configuring the Open Data Hub Operator logger

You can adjust the log level of Open Data Hub Operator components to increase or reduce log verbosity to suit your use case.

Configuring the Open Data Hub Operator logger

You can change the log level for Open Data Hub Operator components at runtime by setting the .spec.devFlags.logmode flag in the DSCInitialization (DSCI) custom resource. If you do not set a logmode value, the logger uses the INFO log level by default.

The log level that you set with .spec.devFlags.logmode applies to all components, not just those in a Managed state.

The following table shows the available log levels:

Log level                       Stacktrace level   Verbosity   Output    Timestamp type
devel or development            WARN               INFO        Console   Epoch timestamps
"" (or no logmode value set)    ERROR              INFO        JSON      Human-readable timestamps
prod or production              ERROR              INFO        JSON      Human-readable timestamps

When logmode is set to devel or development, logs are generated in a plain-text console format. When logmode is set to prod or production, or is not set, logs are generated in JSON format.
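
The log level settings described above can be captured as a small lookup, which is convenient when tooling needs to know whether to expect console or JSON output. This is a sketch only: the names LOG_MODES and describe_logmode are hypothetical, and raising an error for unlisted values is a choice of this example rather than documented Operator behavior.

```python
_DEVEL = {
    "stacktrace_level": "WARN",
    "verbosity": "INFO",
    "output": "Console",
    "timestamps": "Epoch",
}
_PROD = {
    "stacktrace_level": "ERROR",
    "verbosity": "INFO",
    "output": "JSON",
    "timestamps": "Human-readable",
}

# Each accepted logmode value maps to the settings listed for it above.
LOG_MODES = {
    "devel": _DEVEL,
    "development": _DEVEL,
    "": _PROD,  # no logmode value set
    "prod": _PROD,
    "production": _PROD,
}


def describe_logmode(logmode=""):
    """Return the logger settings for a logmode value."""
    if logmode not in LOG_MODES:
        raise ValueError(f"unknown logmode: {logmode!r}")
    return LOG_MODES[logmode]
```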

Prerequisites
  • You have admin access to the DSCInitialization resources in the OpenShift Container Platform cluster.

  • You installed the OpenShift command line interface (oc) as described in Installing the OpenShift CLI.

Procedure
  1. Log in to the OpenShift Container Platform as a cluster administrator.

  2. Click Operators > Installed Operators and then click the Open Data Hub Operator.

  3. Click the DSC Initialization tab.

  4. Click the default-dsci object.

  5. Click the YAML tab.

  6. In the spec section, update the .spec.devFlags.logmode flag with the log level that you want to set.

    apiVersion: dscinitialization.opendatahub.io/v1
    kind: DSCInitialization
    metadata:
      name: default-dsci
    spec:
      devFlags:
        logmode: development
  7. Click Save.

You can also configure the log level from the OpenShift CLI by using the following command with the logmode value set to the log level that you want.

oc patch dsci default-dsci -p '{"spec":{"devFlags":{"logmode":"development"}}}' --type=merge
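
When the log level is changed from a script, building the patch body programmatically avoids shell quoting mistakes. The following is a minimal sketch, assuming only the command shown above; the function name build_logmode_patch_command is hypothetical.

```python
import json

# Values accepted for logmode according to this section ("" means unset).
VALID_LOGMODES = ("devel", "development", "prod", "production", "")


def build_logmode_patch_command(logmode):
    """Return the oc argument list that sets .spec.devFlags.logmode."""
    if logmode not in VALID_LOGMODES:
        raise ValueError(f"unsupported logmode: {logmode!r}")
    body = json.dumps({"spec": {"devFlags": {"logmode": logmode}}})
    return ["oc", "patch", "dsci", "default-dsci", "-p", body, "--type=merge"]
```

The returned list can be handed to subprocess.run(..., check=True) on a machine where oc is installed and logged in.
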
Verification
  • If you set the component log level to devel or development, logs are generated more frequently and include entries at WARN level and above.

  • If you set the component log level to prod or production, or do not set a log level, logs are generated less frequently and include entries at ERROR level and above.

Viewing the Open Data Hub Operator log

  1. Log in to the OpenShift CLI.

  2. Run the following command:

    oc get pods -l name=opendatahub-operator -o name -n openshift-operators | xargs -I {} oc logs -f {} -n openshift-operators

    The operator pod log opens.
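
The pipeline above first lists the Operator pods and then follows the log of each one. As a sketch of the same two steps in Python (the function names are hypothetical, and the run parameter exists only so that the lookup can be exercised without a cluster):

```python
import subprocess

NAMESPACE = "openshift-operators"


def operator_pods(run=subprocess.run):
    """Return the Operator pod names, as printed by oc get pods -o name."""
    result = run(
        ["oc", "get", "pods", "-l", "name=opendatahub-operator",
         "-o", "name", "-n", NAMESPACE],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.split()


def log_command(pod):
    """Return the oc argument list that follows the log of one pod."""
    return ["oc", "logs", "-f", pod, "-n", NAMESPACE]
```

Calling subprocess.run(log_command(pod)) for a name returned by operator_pods() streams that pod's log until interrupted.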