Working in your data science IDE

Table of Contents

Accessing your workbench IDE
Working in JupyterLab
Working in code-server

In Open Data Hub, when you create a workbench, you select a workbench image that includes an IDE (integrated development environment) for developing your machine learning (ML) models.

You can use the following data science IDEs for developing ML models with Open Data Hub:

JupyterLab
code-server

Accessing your workbench IDE

To access a workbench IDE, use the link provided in the Open Data Hub interface.

Prerequisite

You have created a data science project and a workbench.

Procedure

From the Open Data Hub dashboard, click Data science projects.
Click the name of the project that contains the workbench.
Click the Workbenches tab.
If the status of the workbench is Running, skip to the next step.

If the status of the workbench is Stopped, in the Status column for the workbench, click Start.

The Status column changes from Stopped to Starting when the workbench server is starting, and then to Running when the workbench has successfully started.
Click the open icon next to the workbench.

Verification

A new browser window opens for the workbench IDE.

Working in JupyterLab

JupyterLab is a web-based interactive development environment for Jupyter notebooks, code, and data. You can configure and arrange workflows in data science and machine learning. JupyterLab is an open source web application that supports over 40 programming languages, including Python and R.

Creating and importing Jupyter notebooks

You can create a blank Jupyter notebook or import a Jupyter notebook in JupyterLab from several different sources.

Creating a Jupyter notebook

You can create a Jupyter notebook from an existing notebook container image to access its resources and properties. The Workbench control panel contains a list of available container images that you can run as a single-user workbench.

Prerequisites

Ensure that you have logged in to Open Data Hub.
Ensure that you have launched your workbench and logged in to JupyterLab.
The workbench image exists in a registry or image stream and is accessible.

Procedure

Click File → New → Notebook.
If prompted, select a kernel for your Jupyter notebook from the list.

If you want to use a kernel, click Select. If you do not want to use a kernel, click No Kernel.

Verification

Check that the notebook file is visible in the JupyterLab interface.

Uploading an existing notebook file to JupyterLab from local storage

You can load an existing notebook file from local storage into JupyterLab to continue work, or adapt a project for a new use case.

Prerequisites

You have credentials for logging in to JupyterLab.
You have a launched and running workbench based on a JupyterLab image.
A notebook file exists in your local storage.

Procedure

In the File Browser in the left sidebar of the JupyterLab interface, click Upload Files.
Locate and select the notebook file and then click Open.

The file is displayed in the File Browser.

Verification

The notebook file appears in the File Browser in the left sidebar of the JupyterLab interface.
You can open the notebook file in JupyterLab.

Additional resources

Collaborating on Jupyter notebooks by using Git

Collaborating on Jupyter notebooks by using Git

If your files are stored in Git version control, you can clone a Git repository to work with them in JupyterLab. When you are ready, you can push your changes back to the Git repository so that others can review or use your models.

Uploading an existing notebook file from a Git repository by using JupyterLab

You can use the JupyterLab user interface to clone a Git repository into your workspace to continue your work or integrate files from an external project.

Prerequisites

You have a launched and running workbench based on a JupyterLab image.
You have read access for the Git repository you want to clone.

Procedure

Copy the HTTPS URL for the Git repository.
- In GitHub, click ⤓ Code → HTTPS and then click the Copy URL to clipboard icon.
- In GitLab, click Code and then click the Copy URL icon under Clone with HTTPS.
In the JupyterLab interface, click the Git Clone button.

You can also click Git → Clone a repository in the menu, or click the Git icon and click the Clone a repository button.

The Clone a repo dialog appears.
Enter the HTTPS URL of the repository that contains your notebook file.
Click CLONE.
If prompted, enter your username and password for the Git repository.

Verification

Check that the contents of the repository are visible in the file browser in JupyterLab, or run the ls command in the terminal to verify that the repository shows as a directory.

Uploading an existing notebook file to JupyterLab from a Git repository by using the CLI

You can use the command line interface to clone a Git repository into your workspace to continue your work or integrate files from an external project.

Prerequisites

You have a launched and running workbench based on a JupyterLab image.

Procedure

Copy the HTTPS URL for the Git repository.
- In GitHub, click ⤓ Code → HTTPS and then click the Copy URL to clipboard icon.
- In GitLab, click Code and then click the Copy URL icon under Clone with HTTPS.
In JupyterLab, click File → New → Terminal to open a terminal window.

Enter the git clone command:

git clone <git-clone-URL>

Replace <git-clone-URL> with the HTTPS URL, for example:

[1234567890@jupyter-nb-jdoe ~]$ git clone https://github.com/example/myrepo.git
Cloning into myrepo...
remote: Enumerating objects: 11, done.
remote: Counting objects: 100% (11/11), done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 2821 (delta 1), reused 5 (delta 1), pack-reused 2810
Receiving objects: 100% (2821/2821), 39.17 MiB | 23.89 MiB/s, done.
Resolving deltas: 100% (1416/1416), done.

Verification

Check that the contents of the repository are visible in the file browser in JupyterLab, or run the ls command in the terminal to verify that the repository shows as a directory.
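
The verification step can also be scripted from a notebook cell. The following is a minimal sketch, not part of the product: the helper names `repo_dir_from_url` and `clone_exists` are hypothetical, and the sketch assumes that the repository was cloned with the default directory name, which Git derives from the last segment of the URL.

```python
from pathlib import Path
from urllib.parse import urlparse


def repo_dir_from_url(clone_url: str) -> str:
    """Return the directory name that `git clone <url>` creates by default."""
    # Take the last path segment of the URL and drop a trailing ".git".
    last_segment = urlparse(clone_url).path.rstrip("/").split("/")[-1]
    return last_segment[:-4] if last_segment.endswith(".git") else last_segment


def clone_exists(clone_url: str, parent: str = ".") -> bool:
    """Check that the cloned repository shows up as a directory under `parent`."""
    return (Path(parent) / repo_dir_from_url(clone_url)).is_dir()


if __name__ == "__main__":
    url = "https://github.com/example/myrepo.git"  # example URL from the procedure
    print(repo_dir_from_url(url), clone_exists(url))
```

If you cloned the repository into a custom directory name, pass that location to a check of your own instead of relying on the derived name.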

Updating your project with changes from a remote Git repository

You can pull changes made by other users into your data science project from a remote Git repository.

Prerequisites

You have a launched and running workbench based on a JupyterLab image.
You have credentials for logging in to Jupyter.
You have configured the remote Git repository.
You have permissions to pull files from the remote Git repository to your local repository.
You have imported the Git repository into JupyterLab, and the contents of the repository are visible in the file browser in JupyterLab.

Procedure

In the JupyterLab interface, click the Git button.
Click the Pull latest changes button.

Verification

You can view the changes pulled from the remote repository on the History tab in the Git pane.

Pushing project changes to a Git repository

To build and deploy your application in a production environment, upload your work to a remote Git repository.

Prerequisites

You have opened a Jupyter notebook in the JupyterLab interface.
You have added the relevant Git repository to your workbench.
You have permission to push changes to the relevant Git repository.
You have installed the Git version control extension.

Procedure

Click File → Save All to save any unsaved changes.
Click the Git icon to open the Git pane in the JupyterLab interface.
Confirm that your changed files appear under Changed.

If your changed files appear under Untracked, click Git → Simple Staging to enable a simplified Git process.
Commit your changes.
1. Ensure that all files under Changed have a blue checkmark beside them.
2. In the Summary field, enter a brief description of the changes you made.
3. Click Commit.
Click Git → Push to Remote to push your changes to the remote repository.
When prompted, enter your Git credentials and click OK.

Verification

Your most recently pushed changes are visible in the remote Git repository.
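
For comparison, the stage, commit, and push steps above correspond roughly to three Git commands. The sketch below builds and runs them from Python. It is an illustration only: the function names `push_commands` and `run_push` are hypothetical, `git add --all` also stages untracked files (which might be more than you want), and `git push` assumes that a remote and upstream branch are already configured.

```python
import subprocess


def push_commands(summary: str) -> list[list[str]]:
    """Build the Git commands that stage, commit, and push all changes."""
    return [
        ["git", "add", "--all"],           # stage changed and untracked files
        ["git", "commit", "-m", summary],  # commit with the summary message
        ["git", "push"],                   # push to the configured remote
    ]


def run_push(summary: str, repo_dir: str = ".") -> None:
    """Run each command in the repository directory and stop at the first failure."""
    for command in push_commands(summary):
        subprocess.run(command, cwd=repo_dir, check=True)
```

Calling `run_push("Update notebook", repo_dir="myrepo")` from a terminal session is preferable to a notebook cell if Git needs to prompt for credentials.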

Managing Python packages

In JupyterLab, you can view the Python packages that are installed on your workbench image and install additional packages.

Viewing Python packages installed on your workbench

You can check which Python packages are installed on your workbench, and which version of each package you have, by running the pip tool in a notebook cell.

Prerequisites

You have a launched and running workbench based on a JupyterLab image.

Procedure

Enter the following in a new cell in your Jupyter notebook:
```
!pip list
```
Run the cell.

Verification

The output shows an alphabetical list of all installed Python packages and their versions. For example, if you use the pip list command immediately after creating a workbench that uses the Minimal image, the first packages shown are similar to the following:

Package                           Version
--------------------------------- ----------
aiohttp                           3.7.3
alembic                           1.5.2
appdirs                           1.4.4
argo-workflows                    3.6.1
argon2-cffi                       20.1.0
async-generator                   1.10
async-timeout                     3.0.1
attrdict                          2.0.1
attrs                             20.3.0
backcall                          0.2.0
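
If you need the same information as data rather than as printed text, the Python standard library can enumerate installed distributions. This is a minimal sketch that assumes Python 3.9 or later; the function name `installed_packages` is hypothetical, and the result might differ slightly from `pip list` output for packages with incomplete metadata.

```python
from importlib.metadata import distributions


def installed_packages() -> dict[str, str]:
    """Return installed distribution names mapped to versions, sorted by name."""
    found = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip entries with incomplete metadata
            found[name] = version
    return dict(sorted(found.items(), key=lambda item: item[0].lower()))


# Print a two-column listing similar to the pip output.
for name, version in installed_packages().items():
    print(f"{name:<34}{version}")
```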

Installing Python packages on your workbench

You can install Python packages that are not part of the default workbench image by adding the package and the version to a requirements.txt file and then running the pip install command in a notebook cell.

Note: Although you can install packages directly, it is recommended that you use a requirements.txt file so that the packages listed in the file can be easily reused across different workbenches.

Prerequisites

You have a launched and running workbench based on a JupyterLab image.

Procedure

Create a new text file using one of the following methods:
- Click + to open a new launcher and then click Text file.
- Click File → New → Text File.
Rename the text file to requirements.txt.
1. Right-click the name of the file and then click Rename Text. The Rename File dialog opens.
2. Enter requirements.txt in the New Name field and then click Rename.
Add the packages to install to the requirements.txt file.
```
altair
```
You can specify the exact version to install by using the == (equal to) operator, for example:
```
altair==4.1.0
```
Specify exact package versions to keep your workbench stable over time, because new package versions can introduce undesirable or unexpected changes in your environment's behavior. To install multiple packages at the same time, place each package on a separate line.
Install the packages in requirements.txt to your server by using a notebook cell.
1. Create a new notebook cell and enter the following command:
  !pip install -r requirements.txt
2. Run the cell by pressing Shift and Enter.
Important

The pip install command installs the package on your workbench. However, you must run the import statement in a code cell to use the package in your code.

import altair

Verification

Confirm that the packages in the requirements.txt file appear in the list of packages installed on the workbench. See Viewing Python packages installed on your workbench for details.
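
The confirmation can be automated with a short script. The sketch below is an assumption-laden illustration, not a replacement for pip: the helper names `parse_requirements` and `check_installed` are hypothetical, and the parser handles only bare package names and exact `==` pins, not extras, environment markers, or other version operators.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_requirements(text):
    """Map each package name to its pinned version, or None if unpinned."""
    pins = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, _, pinned = line.partition("==")
        pins[name.strip()] = pinned.strip() or None
    return pins


def check_installed(pins):
    """Map each package to True if it is installed and matches its pin, if any."""
    results = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            results[name] = False
            continue
        results[name] = pinned is None or installed == pinned
    return results
```

For example, `check_installed(parse_requirements(open("requirements.txt").read()))` returns a dictionary in which any `False` value marks a package that is missing or installed at a different version.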

Troubleshooting common problems in workbenches for users

If you are seeing errors in Open Data Hub related to Jupyter, your Jupyter notebooks, or your workbench, read this section to understand what could be causing the problem.

I see a 403: Forbidden error when I log in to Jupyter

Problem

If your cluster administrator has configured Open Data Hub user groups, your username might not be added to the default user group or the default administrator group for Open Data Hub.

Resolution

Contact your cluster administrator so that they can add you to the correct group or groups.

My workbench does not start

Problem

The OpenShift Container Platform cluster that hosts your workbench might not have access to enough resources, or the workbench pod may have failed.

Resolution

Check the logs in the Events section in OpenShift for error messages associated with the problem. For example:

Server requested
2021-10-28T13:31:29.830991Z [Warning] 0/7 nodes are available: 2 Insufficient memory,
2 node(s) had taint {node-role.kubernetes.io/infra: }, that the pod didn't tolerate, 3 node(s) had taint {node-role.kubernetes.io/master: },
that the pod didn't tolerate.

Contact your cluster administrator with details of any relevant error messages so that they can perform further checks.

I see a database or disk is full error or a no space left on device error when I run my notebook cells

Problem

You might have run out of storage space on your workbench.

Resolution

Contact your cluster administrator so that they can perform further checks.
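
Before you contact your administrator, it can help to record how much space is left. The following sketch uses the standard library to report usage for the file system that holds a given path. The function name `storage_report` is hypothetical, and the sketch assumes that the path you pass (the current directory by default) is on the workbench's persistent storage.

```python
import shutil


def storage_report(path: str = ".") -> dict[str, float]:
    """Return total, used, and free space in GiB for the file system at `path`."""
    usage = shutil.disk_usage(path)
    gib = 2**30  # bytes per gibibyte
    return {
        "total_gib": usage.total / gib,
        "used_gib": usage.used / gib,
        "free_gib": usage.free / gib,
    }


report = storage_report()
print({key: round(value, 2) for key, value in report.items()})
```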

Working in code-server

Code-server is a web-based interactive development environment that supports multiple programming languages, including Python, and can be used to work with Jupyter notebooks. With the code-server workbench image, you can customize your workbench environment by using extensions to add new languages, themes, and debuggers, and to connect to additional services. For more information, see code-server in GitHub.

Note: Elyra-based pipelines are not available with the code-server workbench image.

Creating code-server workbenches

You can create a blank Jupyter notebook or import a Jupyter notebook in code-server from several different sources.

Creating a workbench

When you create a workbench, you specify an image (an IDE, packages, and other dependencies). You can also configure connections and cluster storage, and add container storage.

Prerequisites

You have logged in to Open Data Hub.
You have created a project.

Procedure

From the Open Data Hub dashboard, click Data science projects.

The Data science projects page opens.
Click the name of the project that you want to add the workbench to.

A project details page opens.
Click the Workbenches tab.
Click Create workbench.

The Create workbench page opens.
In the Name field, enter a unique name for your workbench.
Optional: If you want to change the default resource name for your workbench, click Edit resource name.

The resource name is what your resource is labeled in OpenShift. Valid characters include lowercase letters, numbers, and hyphens (-). The resource name cannot exceed 30 characters, and it must start with a letter and end with a letter or number.

Note: You cannot change the resource name after the workbench is created. You can edit only the display name and the description.
Optional: In the Description field, enter a description for your workbench.
In the Workbench image section, complete the fields to specify the workbench image to use with your workbench.

From the Image selection list, select a workbench image that suits your use case. A workbench image includes an IDE and Python packages (reusable code). If project-scoped images exist, the Image selection list includes subheadings to distinguish between global images and project-scoped images.

Optionally, click View package information to view a list of packages that are included in the image that you selected.

If the workbench image has multiple versions available, select the workbench image version to use from the Version selection list. To use the latest package versions, Red Hat recommends that you use the most recently added image.

Note
You can change the workbench image after you create the workbench.

In the Deployment size section, select one of the following options, depending on whether the hardware profiles feature is enabled.

If the hardware profiles feature is not enabled:
1. From the Container size list, select the appropriate size for the size of the model that you want to train or tune.
  
  For example, to run the example fine-tuning job described in Fine-tuning a model by using Kubeflow Training, select Medium.
2. From the Accelerator list, select a suitable accelerator profile for your workbench.
  
  If project-scoped accelerator profiles exist, the Accelerator list includes subheadings to distinguish between global accelerator profiles and project-scoped accelerator profiles.

If the hardware profiles feature is enabled:

From the Hardware profile list, select a suitable hardware profile for your workbench.

If project-scoped hardware profiles exist, the Hardware profile list includes subheadings to distinguish between global hardware profiles and project-scoped hardware profiles.

The hardware profile specifies the number of CPUs and the amount of memory allocated to the container, setting the guaranteed minimum (request) and maximum (limit) for both.

If you want to change the default values, click Customize resource requests and limit and enter new minimum (request) and maximum (limit) values.

Important

By default, the hardware profiles feature is not enabled: hardware profiles are not shown in the dashboard navigation menu or elsewhere in the user interface. In addition, user interface components associated with the deprecated accelerator profiles functionality are still displayed. To show the Settings → Hardware profiles option in the dashboard navigation menu, and the user interface components associated with hardware profiles, set the disableHardwareProfiles value to false in the OdhDashboardConfig custom resource (CR) in OpenShift Container Platform. For more information about setting dashboard configuration options, see Customizing the dashboard.

Optional: In the Environment variables section, select and specify values for any environment variables.

Setting environment variables during the workbench configuration helps you save time later because you do not need to define them in the body of your workbenches, or with the IDE command line interface.

If you are using S3-compatible storage, add these recommended environment variables:
- AWS_ACCESS_KEY_ID specifies your Access Key ID for Amazon Web Services.
- AWS_SECRET_ACCESS_KEY specifies your Secret access key for the account specified in AWS_ACCESS_KEY_ID.
Open Data Hub stores the credentials as Kubernetes secrets in a protected namespace if you select Secret when you add the variable.
In the Cluster storage section, configure the storage for your workbench. Select one of the following options:
- Create new persistent storage to create storage that is retained after you shut down your workbench. Complete the relevant fields to define the storage:
  1. Enter a name for the cluster storage.
  2. Enter a description for the cluster storage.
  3. Select a storage class for the cluster storage.
    
    Note
    You cannot change the storage class after you add the cluster storage to the workbench.
  4. For storage classes that support multiple access modes, select an Access mode to define how the volume can be accessed. For more information, see About persistent storage.
    
    Only the access modes that have been enabled for the storage class by your cluster and Open Data Hub administrators are visible.
  5. Under Persistent storage size, enter a new size in gibibytes or mebibytes.
- Use existing persistent storage to reuse existing storage and select the storage from the Persistent storage list.
Optional: You can add a connection to your workbench. A connection is a resource that contains the configuration parameters needed to connect to a data source or sink, such as an object storage bucket. You can use storage buckets for storing data, models, and pipeline artifacts. You can also use a connection to specify the location of a model that you want to deploy.

In the Connections section, use an existing connection or create a new connection:
- Use an existing connection as follows:
  
  Click Attach existing connections.
  
  From the Connection list, select a connection that you previously defined.
- Create a new connection as follows:
  
  Click Create connection. The Add connection dialog appears.
  
  From the Connection type drop-down list, select the type of connection. The Connection details section appears.
  
  If you selected S3 compatible object storage in the preceding step, configure the connection details:
  
  In the Connection name field, enter a unique name for the connection.
  
  Optional: In the Description field, enter a description for the connection.
  
  In the Access key field, enter the access key ID for the S3-compatible object storage provider.
  
  In the Secret key field, enter the secret access key for the S3-compatible object storage account that you specified.
  
  In the Endpoint field, enter the endpoint of your S3-compatible object storage bucket.
  
  In the Region field, enter the default region of your S3-compatible object storage account.
  
  In the Bucket field, enter the name of your S3-compatible object storage bucket.
  
  Click Create.
  
  If you selected URI in the preceding step, configure the connection details:
  
  In the Connection name field, enter a unique name for the connection.
  
  Optional: In the Description field, enter a description for the connection.
  
  In the URI field, enter the Uniform Resource Identifier (URI).
  
  Click Create.
Click Create workbench.
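
The resource name rules stated in the procedure (lowercase letters, numbers, and hyphens; at most 30 characters; starts with a letter; ends with a letter or number) can be checked before you submit the form. This is a sketch of those stated rules only. The name `is_valid_resource_name` is hypothetical, and the dashboard might apply additional checks that are not described here.

```python
import re

# One leading letter, then up to 28 middle characters and one final
# letter or digit, for a maximum length of 30 characters.
RESOURCE_NAME = re.compile(r"[a-z]([a-z0-9-]{0,28}[a-z0-9])?")


def is_valid_resource_name(name: str) -> bool:
    """Return True if `name` satisfies the documented resource name rules."""
    return RESOURCE_NAME.fullmatch(name) is not None


for candidate in ("my-workbench-1", "1st-workbench", "workbench-"):
    print(candidate, is_valid_resource_name(candidate))
```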

Verification

The workbench that you created appears on the Workbenches tab for the project.
Any cluster storage that you associated with the workbench during the creation process appears on the Cluster storage tab for the project.
The Status column on the Workbenches tab displays a status of Starting when the workbench server is starting, and Running when the workbench has successfully started.
Optional: Click the open icon to open the IDE in a new window.
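
After the workbench starts, code running in the IDE can read the environment variables that you set during creation. The sketch below reads the two recommended S3 variables and fails early if either is missing. The function name `s3_credentials` is hypothetical, and the sketch assumes that you added both variables in the Environment variables section.

```python
import os
from typing import Mapping


def s3_credentials(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the S3 access key ID and secret key from the environment."""
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Report only the variable names, never the secret values.
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in required}
```

Passing the mapping as a parameter keeps the function easy to test with a plain dictionary. Avoid printing the returned values in notebook output, because saved notebooks retain cell output.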

Uploading an existing notebook file to code-server from local storage

You can load an existing notebook file from local storage into code-server to continue work, or adapt a project for a new use case.

Prerequisites

You have a running code-server workbench.
You have a notebook file in your local storage.

Procedure

In your code-server window, from the Activity Bar, select the menu icon → File → Open File.
In the Open File dialog, click the Show Local button.
Locate and select the notebook file and then click Open.

The file is displayed in the code-server window.
Save the file and then push the changes to your repository.

Verification

The notebook file appears in the code-server Explorer view.
You can open the notebook file in the code-server window.

Collaborating on workbenches in code-server by using Git

If your files are stored in Git version control, you can clone a Git repository to work with them in code-server. When you are ready, you can push your changes back to the Git repository so that others can review or use your models.

Uploading an existing notebook file from a Git repository by using code-server

You can use the code-server user interface to clone a Git repository into your workspace to continue your work or integrate files from an external project.

Prerequisites

You have a running code-server workbench.
You have read access for the Git repository you want to clone.

Procedure

Copy the HTTPS URL for the Git repository.
- In GitHub, click ⤓ Code → HTTPS and then click the Copy URL to clipboard icon.
- In GitLab, click Code and then click the Copy URL icon under Clone with HTTPS.
In your code-server window, from the Activity Bar, select the menu icon → View → Command Palette.
In the Command Palette, enter Git: Clone, and then select Git: Clone from the list.
Paste the HTTPS URL of the repository that contains your notebook file, and then press Enter.
If prompted, enter your username and password for the Git repository.
Select a folder to clone the repository into, and then click OK.
When the repository is cloned, a dialog appears asking if you want to open the cloned repository. Click Open in the dialog.

Verification

Check that the contents of the repository are visible in the code-server Explorer view, or run the ls command in the terminal to verify that the repository shows as a directory.

Uploading an existing notebook file to code-server from a Git repository by using the CLI

You can use the command line interface to clone a Git repository into your workspace to continue your work or integrate files from an external project.

Prerequisites

You have a running code-server workbench.

Procedure

Copy the HTTPS URL for the Git repository.
- In GitHub, click ⤓ Code → HTTPS and then click the Copy URL to clipboard icon.
- In GitLab, click Code and then click the Copy URL icon under Clone with HTTPS.
In your code-server window, from the Activity Bar, select the menu icon → Terminal → New Terminal to open a terminal window.

Enter the git clone command:

git clone <git-clone-URL>

Replace <git-clone-URL> with the HTTPS URL, for example:

$ git clone https://github.com/example/myrepo.git
Cloning into myrepo...
remote: Enumerating objects: 11, done.
remote: Counting objects: 100% (11/11), done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 2821 (delta 1), reused 5 (delta 1), pack-reused 2810
Receiving objects: 100% (2821/2821), 39.17 MiB | 23.89 MiB/s, done.
Resolving deltas: 100% (1416/1416), done.

Verification

Check that the contents of the repository are visible in the code-server Explorer view, or run the ls command in the terminal to verify that the repository shows as a directory.
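
A slightly stronger check than listing the directory is to confirm that it contains Git metadata. The sketch below relies on the fact that a standard (non-bare) clone has a `.git` entry at its top level. The function name `looks_like_git_clone` is hypothetical.

```python
from pathlib import Path


def looks_like_git_clone(path: str) -> bool:
    """Return True if `path` is a directory that contains a `.git` entry."""
    repo = Path(path)
    return repo.is_dir() and (repo / ".git").exists()


print(looks_like_git_clone("myrepo"))  # directory name from the example clone
```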

Updating your project in code-server with changes from a remote Git repository

You can pull changes made by other users into your workbench from a remote Git repository.

Prerequisites

You have configured the remote Git repository.
You have imported the Git repository into code-server, and the contents of the repository are visible in the Explorer view in code-server.
You have permissions to pull files from the remote Git repository to your local repository.
You have a running code-server workbench.

Procedure

In your code-server window, from the Activity Bar, click the Source Control icon.
Click the Views and More Actions button (…), and then select Pull.

Verification

You can view the changes pulled from the remote repository in the Source Control pane.

Pushing project changes in code-server to a Git repository

To build and deploy your application in a production environment, upload your work to a remote Git repository.

Prerequisites

You have a running code-server workbench.
You have added the relevant Git repository in code-server.
You have permission to push changes to the relevant Git repository.
You have installed the Git version control extension.

Procedure

In your code-server window, from the Activity Bar, select the menu icon → File → Save All to save any unsaved changes.
Click the Source Control icon to open the Source Control pane.
Confirm that your changed files appear under Changes.
Next to the Changes heading, click the Stage All Changes button (+).

The staged files move to the Staged Changes section.
In the Message field, enter a brief description of the changes you made.
Next to the Commit button, click the More Actions… button, and then click Commit & Sync.
If prompted, enter your Git credentials and click OK.

Verification

Your most recently pushed changes are visible in the remote Git repository.

Managing Python packages in code-server

In code-server, you can view the Python packages that are installed on your workbench image and install additional packages.

Viewing Python packages installed on your code-server workbench

You can check which Python packages are installed on your workbench, and which version of each package you have, by running the pip tool in a terminal window.

Prerequisites

You have a running code-server workbench.

Procedure

In your code-server window, from the Activity Bar, select the menu icon → Terminal → New Terminal to open a terminal window.
Enter the pip list command.
```
pip list
```

Verification

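The output shows an alphabetical list of the installed Python packages and their versions, similar to the following:

Package                  Version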
------------------------ ----------
asttokens                2.4.1
boto3                    1.34.162
botocore                 1.34.162
cachetools               5.5.0
certifi                  2024.8.30
charset-normalizer       3.4.0
comm                     0.2.2
contourpy                1.3.0
cycler                   0.12.1
debugpy                  1.8.7

Installing Python packages on your code-server workbench

You can install Python packages that are not part of the default workbench image by adding the package and the version to a requirements.txt file and then running the pip install command in a terminal window.

Note: Although you can install packages directly, it is recommended that you use a requirements.txt file so that the packages listed in the file can be easily reused across different workbenches.

Prerequisites

You have a running code-server workbench.

Procedure

In your code-server window, from the Activity Bar, select the menu icon → File → New Text File to create a new text file.
Add the packages to install to the text file.
```
altair
```
You can specify the exact version to install by using the == (equal to) operator, for example:
```
altair==4.1.0
```
Specify exact package versions to keep your workbench stable over time, because new package versions can introduce undesirable or unexpected changes in your environment's behavior. To install multiple packages at the same time, place each package on a separate line.
Save the text file as requirements.txt.
From the Activity Bar, select the menu icon → Terminal → New Terminal to open a terminal window.
Install the packages in requirements.txt to your server by using the following command:
```
pip install -r requirements.txt
```
Important

The pip install command installs the package on your workbench. However, you must run the import statement to use the package in your code.

import altair

Verification

Confirm that the packages in the requirements.txt file appear in the list of packages installed on the workbench. See Viewing Python packages installed on your code-server workbench for details.

Installing extensions with code-server

With the code-server workbench image, you can customize your code-server environment by using extensions to add new languages, themes, and debuggers, and to connect to additional services. You can also enhance the efficiency of your data science work with extensions for syntax highlighting, auto-indentation, and bracket matching.

For details about the third-party extensions that you can install with code-server, see the Open VSX Registry.

Prerequisites

You are logged in to Open Data Hub.
You have created a data science project that has a code-server workbench.

Procedure

From the Open Data Hub dashboard, click Data science projects.

The Data science projects page opens.
Click the name of the project containing the code-server workbench you want to start.

A project details page opens.
Click the Workbenches tab.
If the status of the workbench that you want to use is Running, skip to the next step.

If the status of the workbench is Stopped, in the Status column for the workbench, click Start.

The Status column changes from Stopped to Starting when the workbench server is starting, and then to Running when the workbench has successfully started.
Click the open icon next to the workbench.

The code-server window opens.
In the Activity Bar, click the Extensions icon.
Search for the name of the extension you want to install.
Click Install to add the extension to your code-server environment.

Verification

In the Browser - Installed list on the Extensions panel, confirm that the extension you installed is listed.
