Working in your data science IDE
In Open Data Hub, when you create a workbench, you select a workbench image that includes an IDE (integrated development environment) for developing your machine learning models.
Open Data Hub supports the following data science IDEs for developing ML models:
-
JupyterLab
-
code-server
Accessing your workbench IDE
To access a workbench IDE, use the link provided in the Open Data Hub interface.
-
You have created a data science project and a workbench.
-
From the Open Data Hub dashboard, click Data Science Projects.
-
Click the name of the project that contains the workbench.
-
If the status of the workbench is Running, skip to Step 4.
Otherwise, click the action menu (⋮) beside the workbench that you want to start, and click Start.
The Status column changes from Stopped to Starting when the workbench server is starting, and then to Running when the workbench has successfully started.
-
Click Open.
-
A new browser window opens for the workbench IDE.
Working in JupyterLab
JupyterLab is the latest web-based interactive development environment for notebooks, code, and data. You can configure and arrange workflows in data science and machine learning. JupyterLab is an open-source web application that supports over 40 programming languages, including Python and R.
Creating and importing notebooks
You can create a blank notebook or import a notebook from several different sources.
Creating a new notebook
You can create a new Jupyter notebook from an existing notebook container image to access its resources and properties. The Notebook server control panel contains a list of available container images that you can run as a single-user notebook server.
-
Ensure that you have logged in to Open Data Hub.
-
Ensure that you have launched your notebook server and logged in to JupyterLab.
-
The notebook image exists in a registry or image stream and is accessible.
-
Click File → New → Notebook.
-
If prompted, select a kernel for your notebook from the list.
If you want to use a kernel, click Select. If you do not want to use a kernel, click No Kernel.
-
Check that the notebook file is visible in the JupyterLab interface.
Uploading an existing notebook file from local storage
You can load an existing notebook from local storage into JupyterLab to continue work, or adapt a project for a new use case.
-
Credentials for logging in to JupyterLab.
-
A launched and running Jupyter notebook server.
-
A notebook file exists in your local storage.
-
In the File Browser in the left sidebar of the JupyterLab interface, click Upload Files.
-
Locate and select the notebook file and click Open.
The file is displayed in the File Browser.
-
The notebook file displays in the File Browser in the left sidebar of the JupyterLab interface.
-
You can open the notebook file in JupyterLab.
Collaborating on notebooks by using Git
If your notebooks or other files are stored in Git version control, you can import them from a Git repository onto your notebook server to work with them in JupyterLab. When you are ready, you can push your changes back to the Git repository so that others can review or use your models.
Uploading an existing notebook file from a Git repository by using JupyterLab
You can use the JupyterLab user interface to clone a Git repository into your workspace to continue your work or integrate files from an external project.
-
A launched and running Jupyter notebook server.
-
Read access for the Git repository you want to clone.
-
Copy the HTTPS URL for the Git repository.
-
On GitHub, click ⤓ Code → HTTPS and click the Clipboard button.
-
On GitLab, click Clone and click the Clipboard button under Clone with HTTPS.
-
-
In the JupyterLab interface, click the Git Clone button.
You can also click Git → Clone a repository in the menu, or click the Git icon and click the Clone a repository button.
The Clone a repo dialog appears.
-
Enter the HTTPS URL of the repository that contains your notebook.
-
Click CLONE.
-
If prompted, enter your username and password for the Git repository.
-
Check that the contents of the repository are visible in the file browser in JupyterLab, or run the ls command in the terminal to verify that the repository is shown as a directory.
Uploading an existing notebook file from a Git repository by using the command line interface
You can use the command line interface to clone a Git repository into your workspace to continue your work or integrate files from an external project.
-
A launched and running Jupyter notebook server.
-
Copy the HTTPS URL for the Git repository.
-
On GitHub, click ⤓ Code → HTTPS and click the Clipboard button.
-
On GitLab, click Clone and click the Clipboard button under Clone with HTTPS.
-
-
In JupyterLab, click File → New → Terminal to open a terminal window.
-
Enter the git clone command:
git clone <git-clone-URL>
Replace `<git-clone-URL>` with the HTTPS URL, for example:
[1234567890@jupyter-nb-jdoe ~]$ git clone https://github.com/example/myrepo.git
Cloning into myrepo...
remote: Enumerating objects: 11, done.
remote: Counting objects: 100% (11/11), done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 2821 (delta 1), reused 5 (delta 1), pack-reused 2810
Receiving objects: 100% (2821/2821), 39.17 MiB | 23.89 MiB/s, done.
Resolving deltas: 100% (1416/1416), done.
-
Check that the contents of the repository are visible in the file browser in JupyterLab, or run the ls command in the terminal to verify that the repository is shown as a directory.
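The clone-and-verify steps above can also be scripted. The following is a minimal sketch, not part of Open Data Hub: the helper names `repo_dir_name` and `clone_repository` are hypothetical, and the sketch assumes that the `git` executable is on the PATH of your notebook server and that the repository does not prompt for credentials.

```python
import subprocess
from pathlib import Path
from urllib.parse import urlparse


def repo_dir_name(clone_url: str) -> str:
    """Return the directory name that `git clone` creates by default for a URL."""
    last_segment = urlparse(clone_url).path.rstrip("/").split("/")[-1]
    # Git drops a trailing ".git" when it derives the directory name.
    return last_segment[:-4] if last_segment.endswith(".git") else last_segment


def clone_repository(clone_url: str, workdir: str = ".") -> Path:
    """Clone the repository into workdir and return the path of the new directory."""
    # check=True raises CalledProcessError if git exits with an error.
    subprocess.run(["git", "clone", clone_url], cwd=workdir, check=True)
    target = Path(workdir) / repo_dir_name(clone_url)
    # Equivalent to running `ls` and looking for the repository directory.
    if not target.is_dir():
        raise FileNotFoundError(f"Expected cloned directory at {target}")
    return target
```

For example, `clone_repository("https://github.com/example/myrepo.git")` would run the same command as the terminal example and then check for a `myrepo` directory.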
Updating your project with changes from a remote Git repository
You can pull changes made by other users into your data science project from a remote Git repository.
-
You have configured the remote Git repository.
-
You have already imported the Git repository into JupyterLab, and the contents of the repository are visible in the file browser in JupyterLab.
-
You have permissions to pull files from the remote Git repository to your local repository.
-
You have credentials for logging in to Jupyter.
-
You have a launched and running Jupyter server.
-
In the JupyterLab interface, click the Git button.
-
Click the Pull latest changes button.
-
You can view the changes pulled from the remote repository on the History tab in the Git pane.
Pushing project changes to a Git repository
To build and deploy your application in a production environment, upload your work to a remote Git repository.
-
You have opened a notebook in the JupyterLab interface.
-
You have already added the relevant Git repository to your notebook server.
-
You have permission to push changes to the relevant Git repository.
-
You have installed the Git version control extension.
-
Click File → Save All to save any unsaved changes.
-
Click the Git icon to open the Git pane in the JupyterLab interface.
-
Confirm that your changed files appear under Changed.
If your changed files appear under Untracked, click Git → Simple Staging to enable a simplified Git process.
-
Commit your changes.
-
Ensure that all files under Changed have a blue checkmark beside them.
-
In the Summary field, enter a brief description of the changes you made.
-
Click Commit.
-
-
Click Git → Push to Remote to push your changes to the remote repository.
-
When prompted, enter your Git credentials and click OK.
-
Your most recently pushed changes are visible in the remote Git repository.
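For reference, the stage, commit, and push steps in the Git pane correspond roughly to three `git` commands. The sketch below builds and runs them from Python. It is an illustration under assumptions rather than what the JupyterLab extension does internally: `push_commands` and `run_push` are hypothetical helper names, the current branch is assumed to already track a remote branch, and credentials are assumed to be available without an interactive prompt (for example, through a configured credential helper).

```python
import subprocess
from typing import List


def push_commands(summary: str, paths: List[str]) -> List[List[str]]:
    """Build the git commands that stage, commit, and push the given paths."""
    if not summary.strip():
        raise ValueError("A commit summary is required")
    return [
        ["git", "add", "--", *paths],      # stage the changed files
        ["git", "commit", "-m", summary],  # commit with the summary text
        ["git", "push"],                   # push to the configured upstream
    ]


def run_push(summary: str, paths: List[str], repo_dir: str = ".") -> None:
    """Run each command in order and stop at the first failure."""
    for command in push_commands(summary, paths):
        subprocess.run(command, cwd=repo_dir, check=True)
```

Building the command list separately from running it makes it easy to print and review the commands before anything is pushed.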
Managing Python packages
In JupyterLab, you can view the Python packages that are installed on your notebook image and install additional packages.
Viewing Python packages installed on your notebook server
You can check which Python packages are installed on your notebook server, and which version of each package you have, by running the pip tool in a notebook cell.
-
Log in to JupyterLab and open a notebook.
-
Enter the following in a new cell in your notebook:
!pip list
-
Run the cell.
-
The output shows an alphabetical list of all installed Python packages and their versions. For example, if you use this command immediately after creating a notebook server that uses the Minimal image, the first packages shown are similar to the following:
Package                           Version
--------------------------------- ----------
aiohttp                           3.7.3
alembic                           1.5.2
appdirs                           1.4.4
argo-workflows                    3.6.1
argon2-cffi                       20.1.0
async-generator                   1.10
async-timeout                     3.0.1
attrdict                          2.0.1
attrs                             20.3.0
backcall                          0.2.0
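If you need the package list as data rather than as printed text, a notebook cell can read it from the Python standard library instead of calling pip. The following is a minimal sketch that assumes Python 3.8 or later; `installed_packages` is a hypothetical helper name, and the ordering and name formatting might differ slightly from the `pip list` output.

```python
from importlib import metadata


def installed_packages():
    """Return (name, version) pairs for installed distributions, sorted by name."""
    packages = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue  # skip distributions with broken or missing metadata
        version = dist.metadata.get("Version") or "unknown"
        packages.add((name, version))
    # Sort case-insensitively by package name.
    return sorted(packages, key=lambda item: item[0].lower())


# Print the first ten entries in a two-column layout.
for name, version in installed_packages()[:10]:
    print(f"{name:<34}{version}")
```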
Installing Python packages on your notebook server
You can install Python packages that are not part of the default notebook server by adding the package and the version to a requirements.txt file and then running the pip install command in a notebook cell.
Note
You can also install packages directly, but using a requirements.txt file is recommended so that the packages stated in the file can be easily reused across different notebooks. A requirements.txt file is also useful when you use an S2I build to deploy a model.
-
Log in to JupyterLab and open a notebook.
-
Create a new text file using one of the following methods:
-
Click + to open a new launcher and click Text file.
-
Click File → New → Text File.
-
-
Rename the text file to requirements.txt.
-
Right-click the name of the file and click Rename Text. The Rename File dialog opens.
-
Enter requirements.txt in the New Name field and click Rename.
-
-
Add the packages to install to the requirements.txt file, for example:
altair
You can specify the exact version to install by using the == (equal to) operator, for example:
altair==4.1.0
Specifying exact package versions is recommended because it improves the stability of your notebook server over time. New package versions can introduce undesirable or unexpected changes in your environment’s behavior. To install multiple packages at the same time, place each package on a separate line.
-
Install the packages in requirements.txt to your server by using a notebook cell.
-
Create a new cell in your notebook and enter the following command:
!pip install -r requirements.txt
-
Run the cell by pressing Shift and Enter.
Important
This command installs the package on your notebook server, but you must still run the import directive in a code cell to use the package in your code.
import altair
-
-
Confirm that the packages in requirements.txt appear in the list of packages installed on the notebook server. See Viewing Python packages installed on your notebook server for details.
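The final confirmation step can be sketched in code as well. The example below compares the entries in a requirements.txt file with what is installed. It is a simplified illustration: the function names are hypothetical, and the parser only understands plain `name` and `name==version` lines, not other version specifiers, extras, or pip options.

```python
from importlib import metadata
from typing import List, Optional, Tuple


def parse_requirements(text: str) -> List[Tuple[str, Optional[str]]]:
    """Parse simple `name` or `name==version` lines; ignore blanks and comments."""
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        name, _, version = line.partition("==")
        entries.append((name.strip(), version.strip() or None))
    return entries


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_requirements(text: str) -> List[Tuple[str, Optional[str], Optional[str], str]]:
    """Return (name, pinned, installed, status) for each requirement line."""
    report = []
    for name, pinned in parse_requirements(text):
        found = installed_version(name)
        if found is None:
            status = "missing"
        elif pinned is None:
            status = "installed, not pinned"
        elif found == pinned:
            status = "ok"
        else:
            status = "version mismatch"
        report.append((name, pinned, found, status))
    return report
```

In a notebook cell, you could call `check_requirements(open("requirements.txt").read())` after running the install command and look for any entry whose status is not `ok`.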
Troubleshooting common problems in Jupyter for users
If you are seeing errors in Open Data Hub related to Jupyter, your notebooks, or your notebook server, read this section to understand what could be causing the problem.
- I see a 403: Forbidden error when I log in to Jupyter
-
Problem
If your administrator has configured specialized user groups for Open Data Hub, your username might not be added to the default user group or the default administrator group for Open Data Hub.
Resolution
Contact your administrator so that they can add you to the correct group or groups.
- My notebook server does not start
-
Problem
The OpenShift Container Platform cluster that hosts your notebook server might not have access to enough resources, or the Jupyter pod may have failed.
Resolution
Check the logs in the Events section in OpenShift for error messages associated with the problem. For example:
Server requested
2021-10-28T13:31:29.830991Z [Warning] 0/7 nodes are available: 2 Insufficient memory, 2 node(s) had taint {node-role.kubernetes.io/infra: }, that the pod didn't tolerate, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
Contact your administrator with details of any relevant error messages so that they can perform further checks.
- I see a database or disk is full error or a no space left on device error when I run my notebook cells
-
Problem
You might have run out of storage space on your notebook server.
Resolution
Contact your administrator so that they can perform further checks.
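Before you contact your administrator about a storage error, it can help to note how much space is left. The following is a minimal sketch that uses the Python standard library; `storage_report` is a hypothetical helper name, and the assumption that the current working directory sits on the storage that is filling up might not hold for your workbench.

```python
import shutil
from pathlib import Path


def storage_report(path: str = ".") -> dict:
    """Summarize disk usage for the file system that contains path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    gib = 1024 ** 3
    percent_used = round(100 * usage.used / usage.total, 1) if usage.total else 0.0
    return {
        "path": str(Path(path).resolve()),
        "total_gib": round(usage.total / gib, 2),
        "used_gib": round(usage.used / gib, 2),
        "free_gib": round(usage.free / gib, 2),
        "percent_used": percent_used,
    }


print(storage_report("."))
```

Including this output in your report gives the administrator a concrete starting point for further checks.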
Working in code-server
Open Data Hub includes the code-server workbench image.
For more information on code-server, see code-server in GitHub.
Note
Elyra-based pipelines are not available with the code-server workbench image.
Installing extensions with code-server
With the code-server workbench image, you can customize your code-server environment by using extensions to add new languages, themes, and debuggers, and to connect to additional services. You can also enhance the efficiency of your data science work with extensions for syntax highlighting, auto-indentation, and bracket matching.
For details about the third-party extensions that you can install with code-server, see the Open VSX Registry.
-
You are logged in to Open Data Hub.
-
If you use Open Data Hub groups, you are part of the user group or admin group (for example, odh-users or odh-admins) in OpenShift.
-
You have created a data science project that has a code-server workbench.
-
From the Open Data Hub dashboard, click Data Science Projects.
The Data Science Projects page opens.
-
Click the name of the project containing the code-server workbench you want to start.
A project details page opens.
-
Click the Workbenches tab.
-
If the workbench that you want to use is not already running, click the action menu (⋮) beside the workbench, and click Start.
The Status column changes from Stopped to Starting when the workbench server is starting, and then to Running when the workbench has successfully started.
-
After the workbench has started, click Open to open the workbench notebook.
-
In the Activity Bar, click the Extensions icon.
-
Search for the name of the extension you want to install.
-
Click Install to add the extension to your code-server environment.
-
In the Browser - Installed list on the Extensions panel, you see the extension that you installed.