Installation

Prerequisites

As a component of Open Data Hub, the AI Library has the same prerequisites as Open Data Hub itself.

Installing ODH requires OpenShift 3.11 or 4.x; documentation for OpenShift can be found (here). All screenshots and instructions are from OpenShift 4.1. For the purposes of this quick start, we used try.openshift.com on AWS. The tutorials have also been tested on CodeReady Containers (CRC) with 16 GB of RAM.

External Components:

AI Library uses S3-compatible object storage and has currently been tested with Ceph.

  • Ceph Storage

Open Data Hub Components

In addition, several ODH components are required and can be installed simultaneously or beforehand.

  • Seldon
  • Argo
  • JupyterHub (recommended but not required)

Enabling the AI Library

The AI Library can be enabled during the initial installation of ODH or afterwards.

  1. Navigate to the ODH deployment.
    • Navigate to Installed Operators
    • Select Open Data Hub Operator
    • Click the Open Data Hub Tab
    • Under Open Data Hubs, select your deployment. [screenshot: ODH List]
  2. Edit the ODH deployment’s settings YAML. [screenshot: ODH YAML]

  3. Set odh_deploy: true for the prerequisite components and for ai-library, and set the S3 credentials appropriately. You can leave the other settings at their defaults or customize them as you wish. The required settings should look something like the following:
    aicoe-jupyterhub:
      odh_deploy: true
    seldon:
      odh_deploy: true
    argo:
      odh_deploy: true
    ai-library:
      odh_deploy: true
      s3_endpoint: 'https://ceph.storage'
      s3_access: 'access-key'
      s3_secret: 'secret-key'
      s3_bucket: 'my-bucket'
      s3_region: 'blank-for-ceph'
    
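If you prefer the CLI to the web console, the same YAML can be opened with oc edit. The resource kind, namespace, and deployment name below are assumptions (the CRD name can vary between ODH releases) — check yours with oc get crd | grep opendatahub. A minimal sketch, which only prints the command so nothing is changed by accident:

```shell
# Assumed names — adjust to your cluster. Remove the `echo` to actually
# open the deployment's YAML in your editor against a live cluster.
NAMESPACE="odh"            # hypothetical project name
DEPLOYMENT="example-odh"   # hypothetical ODH deployment name
echo "oc edit opendatahub ${DEPLOYMENT} -n ${NAMESPACE}"
```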
  4. Verify the installation.
    1. Navigate to your project’s status
    2. You should see several deployments, including the Seldon and AI Library services. [screenshot: project status]
    3. Using curl, a browser, or another client, test a route to one of the AI Library services. For example, to test the linear regression service on CRC:
      curl -k https://linear-regression-odh.apps-crc.testing/
      

      A response of Hello World!! indicates the service is alive and ready.
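The same check can be scripted. The sketch below factors the "Hello World!!" match into a small helper function; the route URL in the commented-out lines is the CRC example from above and should be adjusted to your cluster.

```shell
# Returns success (0) if the response body contains the liveness message.
is_alive() {
  case "$1" in
    *"Hello World!!"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Fetch the route (-k skips TLS verification for CRC's self-signed cert)
# and test the body. Uncomment to run against a live cluster:
# body="$(curl -sk https://linear-regression-odh.apps-crc.testing/)"
# is_alive "$body" && echo "service is alive"
```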

Installing Sample Models and Data

The AI Library deploys the appropriate endpoints, but it does not deploy sample models and data by default. Sample data and models are kept in a sample-models GitLab repository. To try some of these models, they must be copied to the Ceph storage location accessible to the AI Library. Use your favorite method to do so; the steps below show how with the s3cmd tool.

  1. Install the s3cmd CLI
    pip3 install s3cmd
    
  2. Configure the credentials either as environment variables or in the s3cmd config file
    export ACCESS_KEY=
    export SECRET_KEY=
    export HOST=
    

    (or)

    s3cmd --configure
    
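As a non-interactive alternative to s3cmd --configure, a minimal config file can be generated from the same environment variables. This is a sketch: the fallback placeholder values and the ./s3cfg path are assumptions, and for Ceph the host_bucket entry is set to the same endpoint as host_base.

```shell
# Use the exported credentials, falling back to placeholder values (assumptions).
ACCESS_KEY="${ACCESS_KEY:-access-key}"
SECRET_KEY="${SECRET_KEY:-secret-key}"
HOST="${HOST:-ceph.storage}"

# Write a minimal s3cmd config file; pass it to s3cmd with `-c ./s3cfg`.
cat > ./s3cfg <<EOF
[default]
access_key = ${ACCESS_KEY}
secret_key = ${SECRET_KEY}
host_base = ${HOST}
host_bucket = ${HOST}
use_https = True
EOF
```

With this file in place, the listing in the next step can also be written as s3cmd -c ./s3cfg ls.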
  3. Check connectivity by using the following command to list existing buckets
    s3cmd ls --host=$HOST --access_key=$ACCESS_KEY --secret_key=$SECRET_KEY
    
  4. Create a new bucket to copy the data into:
    s3cmd mb s3://AI-LIBRARY --host=$HOST --access_key=$ACCESS_KEY --secret_key=$SECRET_KEY
    
  5. Clone the sample data set locally
    git clone https://gitlab.com/opendatahub/sample-models.git
    cd sample-models
    
  6. Sync the required directory/files to your s3 backend.
    s3cmd sync <MODEL-DIRECTORY> s3://AI-LIBRARY/ --host=$HOST --access_key=$ACCESS_KEY --secret_key=$SECRET_KEY
    
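Before syncing, it can help to preview which object keys the sync will create. The helper below is a purely local sketch (no S3 calls; names are illustrative) and assumes rsync-like behavior: with no trailing slash on the source path, the directory name becomes a key prefix in the bucket.

```shell
# Preview the keys `s3cmd sync <MODEL-DIRECTORY> s3://AI-LIBRARY/` would create,
# by walking the local directory and prefixing each file with the bucket path.
preview_keys() {
  dir="$1"
  bucket="${2:-AI-LIBRARY}"
  base="$(basename "$dir")"
  ( cd "$dir" && find . -type f ) | sed "s|^\./|s3://${bucket}/${base}/|" | sort
}
```

For example, preview_keys decision-tree (a hypothetical model directory) lists the s3://AI-LIBRARY/decision-tree/... keys that step 7 should then show.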
  7. Once copied, list the contents of the AI-LIBRARY bucket to verify that the files are there
    s3cmd ls s3://AI-LIBRARY --host=$HOST --access_key=$ACCESS_KEY --secret_key=$SECRET_KEY --recursive