Important Notice

The Open Data Hub documentation and the opendatahub-documentation repository are archived as of March 2026. To see the latest documentation, go to: Red Hat OpenShift AI Self-Managed documentation.

Customize models to build gen AI applications

Learn how to customize a model, from setting up your development environment to building and deploying a model for your domain-specific use case.

Overview of the model customization workflow

Red Hat AI model customization empowers you to tailor artificial intelligence models to your unique data and operational requirements. The model customization process involves the training or fine-tuning of pre-existing models with proprietary datasets, followed by their deployment with specific configurations on the Open Data Hub platform. This comprehensive approach is facilitated by a powerful suite of integrated toolkits that streamline and accelerate the development of generative AI applications.

The workflow for customizing models includes the following tasks:

Set up your working environment

Ensure reliable and secure access to supported libraries with the Red Hat Hosted Python index. For details, see Set up your working environment.

Prepare your data for AI consumption

To prepare your data, use Docling, a powerful Python library to transform unstructured data (such as text documents, images, and audio files) into structured formats that models can consume. For details, see Prepare your data for AI consumption.

To automate data processing tasks, you can build Kubeflow Pipelines (KFP). For details, see Automate data processing steps by building AI pipelines.

Generate synthetic data

Use the Red Hat AI Synthetic Data Generation (SDG) Hub framework to build, compose, and scale synthetic data pipelines with modular blocks. With the SDG Hub, you can extend your synthetic data pipelines with custom blocks to fit your domain, replace ad hoc scripts with the SDG Hub repeatable framework, and scale data generation with asynchronous execution and monitoring. For details, see Generate synthetic data.

Train a model by using your prepared data

After you prepare your data, use the Red Hat AI Training Hub to simplify and accelerate the process of fine-tuning and customizing a foundation model by using your own data.

You can extend a base notebook to use distributed training across multiple nodes by using the Kubeflow Training Operator (KFTO). The KFTO abstracts the underlying infrastructure complexity of distributed training and fine-tuning of models. The iterative process of fine-tuning significantly reduces the time and resources required compared to training models from scratch.

Serve and consume a customized model

After you customize a model, you can serve your customized models as APIs (Application Programming Interfaces). Serving a model as an API enables seamless integration into existing or newly developed applications.

Learn more about serving and consuming a customized model in Deploying models on the model serving platform.

Set up your working environment

To set up your working environment for customizing models, complete these tasks:

  1. For disconnected environments, mirror the Red Hat index.

  2. Create a custom workbench image that is based on a base image that is configured to use the Red Hat Python index and install packages. Install JupyterLab in your custom workbench image so that you can run example notebooks.

  3. From your running workbench, import example notebooks.

About the Red Hat Python index

Red Hat AI includes a maintained Python package index that provides secure and reliable access to supported libraries, with full support for disconnected environments. For details about Red Hat support for the Python packages, see Support philosophy: A secure platform.

Table 1 lists the images that are configured to use the Red Hat Python index.

Table 1. Images configured to use the Red Hat Python index

CPU

  • UBI9 list of packages: https://console.redhat.com/api/pypi/public-rhai/rhoai/3.3/cpu-ubi9/simple/

  • Registry URL: registry.redhat.io/rhai/base-image-cpu-rhel9:3.3.0-1771879589

  • Catalog URL: https://catalog.redhat.com/en/software/containers/rhai/base-image-cpu-rhel9/690377f9d1c73dd1e81192f0

CUDA 12.9

  • UBI9 list of packages: https://console.redhat.com/api/pypi/public-rhai/rhoai/3.3/cuda12.9-ubi9/simple/

  • Registry URL: registry.redhat.io/rhai/base-image-cuda-12.9-rhel9:3.3.0-1771879204

  • Catalog URL: https://catalog.redhat.com/en/software/containers/rhai/base-image-cuda-12.9-rhel9/697afa2896a99d8531fb3212

CUDA 13.0

  • UBI9 list of packages: https://console.redhat.com/api/pypi/public-rhai/rhoai/3.3/cuda13.0-ubi9/simple/

  • Registry URL: registry.redhat.io/rhai/base-image-cuda-13.0-rhel9:3.3.0-1771879618

  • Catalog URL: https://catalog.redhat.com/en/software/containers/rhai/base-image-cuda-13.0-rhel9/697afa2871e1ee36cfe24ce5

ROCm

  • UBI9 list of packages: https://console.redhat.com/api/pypi/public-rhai/rhoai/3.3/rocm6.4-ubi9/simple/

  • Registry URL: registry.redhat.io/rhai/base-image-rocm-6.4-rhel9:3.3.0-1771879436

  • Catalog URL: https://catalog.redhat.com/en/software/containers/rhai/base-image-rocm-6.4-rhel9/697afa29b684f43afdb760ef

Notes:

  • NVIDIA CUDA, AMD GPU, and AMD ROCm RPM repositories are configured, but disabled.

  • The images listed in Table 1 have RHEL RPM repositories enabled. A RHEL RPM is a package file used for the Red Hat Package Manager system on Red Hat Enterprise Linux (RHEL). An RPM file contains all the necessary components for an application, such as executable files, configuration files, and documentation. It simplifies the process of distributing, installing, and managing software by bundling everything into a single, standalone file.

    You can install additional RPMs, but you must have a Red Hat Extended Update Support (EUS) subscription and you must run your container image in root mode (for example, podman run --user 0).

    For more information about Red Hat Package Manager, see Introduction to RPM.

Mirror the Python index for your disconnected environment

If you are using a disconnected environment, use the following code example to access the Red Hat Python index content and copy it locally. You can then upload the packages into your own internal hosting service:

#!/bin/bash -x

URL=https://console.redhat.com/api/pypi/public-rhai/rhoai/3.3/cuda12.9-ubi9/simple/

# Options (kept here because a comment after a trailing backslash breaks the command):
#   --verbose              Show detailed progress
#   --mirror               Mirror the directory structure
#   --continue             Resume partial downloads
#   --no-host-directories  Don't create host-named directories
#   --cut-dirs=4           Remove first 4 path segments
wget \
    --verbose \
    --mirror \
    --continue \
    --no-host-directories \
    --cut-dirs=4 \
    "$URL"
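
To see where the mirrored files land, the following minimal Python sketch reproduces the path mapping that --no-host-directories and --cut-dirs=4 apply to the index URL. The mirrored_path helper is hypothetical and exists only for illustration. It is a simplification that treats every path segment the same way, whereas wget cuts directory components only.

```python
from urllib.parse import urlparse

def mirrored_path(url: str, cut_dirs: int) -> str:
    """Return the relative local directory that a mirror of `url` would use
    when the host directory is dropped and the first `cut_dirs` path
    segments are removed."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return "/".join(segments[cut_dirs:])

URL = "https://console.redhat.com/api/pypi/public-rhai/rhoai/3.3/cuda12.9-ubi9/simple/"

# The first four segments (api, pypi, public-rhai, rhoai) are removed.
print(mirrored_path(URL, 4))
```

With the URL from the script, the remaining path starts at the version directory, so the local copy keeps the version, accelerator, and simple directory levels.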

Install packages and JupyterLab

To ensure reliable and secure access to supported libraries, start your model customization workflow by creating a workbench image that is based on a Red Hat base image that is configured to use the Red Hat Python index. These base images are listed in Table 1.

Note: When you create a custom workbench image that is based on one of the images listed in Table 1, install JupyterLab. You can use JupyterLab to run the example model customization notebooks.

For guidance on custom workbenches, see Creating a custom workbench image from your own image.

When you use one of the images listed in Table 1 as a base image, both pip and uv commands are pre-configured to use the Red Hat Python index and system trust store for HTTPS.

When you run a pip install command, it installs the package version referenced in the Red Hat Python index, ensuring that you are installing a version of the library that is secure and reliably accessible.

For example, use the following commands to install the model customization libraries:

  • Install the data processing library:

    pip install docling
  • Install the synthetic data generation library:

    pip install sdg-hub
  • Install the model training library:

    pip install training-hub

    Install the model training library with CUDA support:

    pip install "training-hub[cuda]"

    Note: For additional options and details for installing the model training library, see Training Hub installation guidelines.
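
After you run the install commands, you might want to confirm which versions were installed. The following sketch uses only the Python standard library and assumes the distribution names match the pip commands above (docling, sdg-hub, training-hub). The installed_versions helper is hypothetical.

```python
from importlib import metadata

# Distribution names as used in the pip install commands.
PACKAGES = ("docling", "sdg-hub", "training-hub")

def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```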

Import example notebooks

To get started with customizing your models, you can run provided example notebooks and scripts. The following table lists the Git repositories that provide example notebooks for each model customization component.

For a comprehensive tutorial that demonstrates an AI/ML workflow, see the Knowledge Tuning example on the Red Hat AI examples site.

The Knowledge Tuning tutorial is a curated collection of Jupyter notebooks that includes examples of using Docling to process data, Training Hub to fine-tune a model on that data, and KServe to deploy the final model for a Question and Answer application.

Table 2. Model customization example notebooks

Data processing using Docling

  • Git clone example repository: https://github.com/opendatahub-io/data-processing.git

  • Branch: stable

  • Directory: notebooks/

Synthetic data generation

  • Git clone example repository: https://github.com/Red-Hat-AI-Innovation-Team/sdg_hub.git

  • Branch: main

  • Directory: examples

Training

  • Git clone example repository: https://github.com/Red-Hat-AI-Innovation-Team/training_hub.git

  • Branch: main

  • Directory: examples

End-to-end example for model customization with these components

  • Git clone example repository: https://github.com/red-hat-data-services/red-hat-ai-examples.git

  • Branch: main

  • Directory: knowledge-tuning

Clone an example Git repository

Follow these steps to clone a Git repository from the JupyterLab environment provided with your Open Data Hub workbench.

Prerequisites
  • You have the HTTPS URL and branch for one of the example Git repositories listed in Table 2.

Procedure
  1. From the Open Data Hub dashboard, go to the project where you created a workbench.

  2. Click the link for your workbench. If prompted, log in and allow JupyterLab to authorize your user.

    Your JupyterLab environment window opens.

    The file-browser window shows the files and directories that are saved inside your own personal space in Open Data Hub.

  3. Bring the content of an example Git repo inside your JupyterLab environment:

    1. On the toolbar, click the Git Clone icon.

    2. Enter a Git https URL.

    3. Select the Include submodules option, and then click Clone.

  4. If you want to use a branch other than main (for example, the data processing example repo uses the stable-3.0 branch), change the branch:

    1. In the left navigation bar, click the Git icon, and then click Current Branch to expand the branches and tags selector panel.

    2. On the Branches tab, in the Filter field, enter the branch name.

    3. Select the branch.

      The current branch changes to the branch that you selected.

Verification

  • In the file browser, double-click the newly created directory to see the example files.
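
If you prefer a terminal or a script to the JupyterLab Git Clone dialog, the following sketch builds a roughly equivalent git clone command. The clone_command and clone helpers are hypothetical. The example uses the repository URL and branch listed for data processing in Table 2, and --recurse-submodules stands in for the Include submodules option.

```python
import subprocess

def clone_command(url, branch="main", include_submodules=True):
    """Build the git clone argument list for a repository and branch."""
    cmd = ["git", "clone", "--branch", branch]
    if include_submodules:
        cmd.append("--recurse-submodules")
    cmd.append(url)
    return cmd

def clone(url, branch="main", include_submodules=True):
    """Run the clone. This needs git and network access."""
    subprocess.run(clone_command(url, branch, include_submodules), check=True)

cmd = clone_command(
    "https://github.com/opendatahub-io/data-processing.git", branch="stable"
)
print(" ".join(cmd))
```

Call clone(...) with the same arguments to run the command instead of printing it.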

Prepare your data for AI consumption

To prepare your data, use Docling to transform unstructured data (such as text documents, images, and audio files) into structured formats that models can consume.

To automate data processing tasks, you can build Kubeflow Pipelines (KFP). For examples of pre-built pipelines for unstructured data processing with Docling, see https://github.com/opendatahub-io/data-processing.

Process data by using Docling

Docling is the Python library that you use to prepare unstructured data (like PDFs and images) for consumption by large language models.

Explore the data processing examples

To get started with data processing with Docling, explore the provided examples.

Prerequisites
Procedure
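  1. To access the data processing examples, clone the data processing Git repository (https://github.com/opendatahub-io/data-processing.git, stable branch), either from JupyterLab by following the steps in Clone an example Git repository or locally by running git clone.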

  2. Go to the notebooks directory to learn how to use Docling for the following tasks:

    • Convert - Change unstructured documents (PDF files) to structured format (Markdown), with and without vision-language model (VLM)

    • Chunk - Split documents into smaller, semantically meaningful pieces

    • Extract information - Use template formats to extract specific data fields from documents like invoices.

    • Select subsets - Reduce the size of your dataset. The algorithm analyzes an input dataset and reduces it in size, while ensuring data diversity and coverage.

    • Tutorials - An example notebook that provides a complete, end-to-end workflow for preparing a dataset of documents for a RAG (Retrieval-Augmented Generation) system.
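
As a rough illustration of the convert and chunk tasks, the following sketch converts a document to Markdown with Docling and then splits the Markdown at headings. The convert_to_markdown function uses the Docling DocumentConverter class and requires the docling package. The chunk_by_heading function is a hypothetical, simplified stand-in for chunking and is not the chunking algorithm that Docling or the example notebooks use.

```python
import re

def convert_to_markdown(source):
    """Convert a document (local path or URL) to Markdown with Docling."""
    # Imported lazily so the rest of this sketch runs without Docling installed.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(source)
    return result.document.export_to_markdown()

def chunk_by_heading(markdown):
    """Illustrative chunker: start a new chunk at every Markdown heading."""
    chunks, current = [], []
    for line in markdown.splitlines():
        if re.match(r"#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [chunk for chunk in chunks if chunk]

# Sample Markdown stands in for the output of convert_to_markdown().
sample = (
    "# Title\nIntro text.\n\n"
    "## Section A\nDetails about A.\n\n"
    "## Section B\nDetails about B."
)
for chunk in chunk_by_heading(sample):
    print(repr(chunk))
```

For production chunking, see the Chunk notebooks in the repository, which use the chunkers that Docling provides.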

Automate data processing steps by building AI pipelines

With Kubeflow Pipelines (KFP), you can automate complex, multi-step Docling data processing tasks into scalable workflows.

With the KFP Software Development Kit (SDK), you can define custom components and stitch them together into a complete pipeline. The SDK allows you to fully control and automate Docling conversion tasks with specific parameters.

Note: You can build a custom runtime image to ensure that all required Docling dependencies are present for pipeline execution. For information on how to run a Docling pipeline with a custom image, see the Docling Pipeline documentation.

Explore the Kubeflow Pipeline examples

To get started with Kubeflow Pipelines, explore the provided examples. You can download and modify the example code to quickly create a Docling data processing or model training pipeline.

Prerequisites
Procedure
  1. To access the Kubeflow Pipeline examples, run the following command to clone the data processing Git repository:

    git clone https://github.com/opendatahub-io/data-processing -b stable
  2. Go to the kubeflow-pipelines directory, which contains the following tested examples for running Docling as a scalable pipeline. For instructions on how to import, configure, and run the examples, see the README file and the Red Hat AI Working with AI pipelines guide.

    • Standard Pipeline: For converting standard documents that contain text and structured elements. For more information, see the Standard Conversion Pipelines documentation.

    • VLM (Vision Language Model): For converting highly complex or difficult-to-parse documents, such as those with custom instructions or complex layouts, or to add image descriptors. For more information, see the VLM Pipelines documentation.

NOTE: If you want to use a Red Hat container image in one of the above pipelines, replace the image on this line with the URL for the Red Hat Docling container image and recompile the pipeline that you want to use.

Generate synthetic data

When you customize a model for your enterprise, you must generate high-quality synthetic data to augment your dataset, improve model robustness, and cover edge cases.

Red Hat provides the Synthetic Data Generation (SDG) Hub, a modular Python framework for building synthetic data generation pipelines by using composable blocks and flows. Each block performs a specific task, such as LLM chat, parse text, evaluate, or transform data. Flows chain blocks together to create complex data generation pipelines that include validation and parameter management. A flow (data generation pipeline) is a YAML specification that defines an instance of a data generation algorithm.
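
To make the block and flow idea concrete, the following sketch chains three small functions into a pipeline. It is a conceptual illustration only and does not use the SDG Hub API: the function names, the record format, and the run_flow helper are all hypothetical, and real SDG Hub flows are defined in YAML.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Block = Callable[[List[Record]], List[Record]]

def parse_text(records: List[Record]) -> List[Record]:
    """Parse block: normalize whitespace around the text field."""
    return [{**r, "text": r["text"].strip()} for r in records]

def drop_short(records: List[Record], min_words: int = 3) -> List[Record]:
    """Evaluate block: keep only records with enough words to be useful."""
    return [r for r in records if len(r["text"].split()) >= min_words]

def add_prompt(records: List[Record]) -> List[Record]:
    """Transform block: attach the prompt that an LLM block would consume."""
    return [{**r, "prompt": f"Write one question about: {r['text']}"} for r in records]

def run_flow(blocks: List[Block], records: List[Record]) -> List[Record]:
    """A flow applies each block, in order, to the output of the previous one."""
    for block in blocks:
        records = block(records)
    return records

flow = [parse_text, drop_short, add_prompt]
data = [
    {"text": "  Docling converts PDF files to Markdown.  "},
    {"text": "Too short"},
]
print(run_flow(flow, data))
```

Because each block has the same input and output shape, you can reorder, remove, or add blocks without changing the others, which is the property that makes flows composable.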

Explore the SDG Hub examples

To get started with SDG Hub, explore the provided examples.

Prerequisites
Procedure
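  1. To access the SDG Hub examples, clone the SDG Hub Git repository (https://github.com/Red-Hat-AI-Innovation-Team/sdg_hub.git), either from JupyterLab by following the steps in Clone an example Git repository or locally by running git clone.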

  2. Go to the examples directory to view the notebooks and YAML files for these use cases:

    • Knowledge tuning - Generate data to fine-tune a model on enterprise documents so that the resulting trained model can accurately recall relevant content and facts in response to user queries. This example provides a complete walkthrough of data generation and preparation for training.

    • Enhanced Knowledge Tuning - This version of the knowledge pipeline introduces data scaling and improved prompting strategies. While the previous pipeline focused on scaling the number of questions per summary, this iteration focuses on generating n summaries per document, where n is a configurable parameter. Increasing summary token count leads to superior memorization.

      This pipeline formalizes data generation by converting documents into Augmentations. It implements the following augmentation instances:

      • Thematic Summaries - for capturing high-level ideas.

      • Knowledge Relationships - for identifying connections between segments.

      • Atomic Facts - for isolating granular details.

        For each summary, the system generates three question-answer pairs; excess pairs are discarded during post-processing. This architecture is modular; you can integrate additional augmentation types by editing existing flows or adding new ones.

    • Text analysis - Generate data for teaching models to extract meaningful insights from text in structured format. Create custom blocks and extend existing flows for new applications.

      Each use case directory includes a README file that provides details for the use case, such as instructions, performance notes, and configuration tips.

    • RAG Evaluation - Reliable, repeatable evaluation is essential before shipping RAG or agentic updates to production. SDG Hub offers a flow for generating data for evaluating RAG systems at scale. The input of the flow is user documents and the output dataset is post-processed to work with the RAGAS framework. The flow creates question-answer pairs with ground truth context for evaluating RAG systems. This flow simplifies data generation for RAG workflows.

  3. When you run the example notebooks, consider the following information:

    • Data generation time and statistics: The total time to generate data depends on both the maximum concurrency supported by your endpoint and the complexity of the running flow. Longer flows, such as the flows in the Knowledge Generation notebooks, take more time to complete because they produce a large number of summaries and Q&A pairs, each of which undergoes verification within the pipeline.

    • LLM endpoint requirements: For running flows in the Knowledge Generation notebooks, Red Hat recommends that you set the following values:

      • Set NUMBER_OF_SUMMARIES to a minimum of 10.

      • To achieve reasonable data generation times and avoid timeouts, use an endpoint that supports a maximum concurrency of at least 50.

      • Extend LiteLLM’s request timeout by setting the environment variable LITELLM_REQUEST_TIMEOUT.
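
The recommendations above can be captured in a small setup cell. This is a sketch under assumptions: the timeout value of 600 is an arbitrary placeholder, the check_endpoint_settings helper is hypothetical, and how NUMBER_OF_SUMMARIES is defined depends on the notebook that you run.

```python
import os

# Assumed placeholder value. Set the variable before the flow starts so that
# it is already in the environment when requests are made.
os.environ.setdefault("LITELLM_REQUEST_TIMEOUT", "600")

def check_endpoint_settings(number_of_summaries, max_concurrency):
    """Return warnings for settings below the recommended minimums."""
    warnings = []
    if number_of_summaries < 10:
        warnings.append("NUMBER_OF_SUMMARIES is below the recommended minimum of 10")
    if max_concurrency < 50:
        warnings.append("endpoint concurrency is below the recommended minimum of 50")
    return warnings

print(check_endpoint_settings(number_of_summaries=10, max_concurrency=50))
```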

Performance benchmarks for knowledge tuning

To get an estimate of the total time a flow will take, you can run the dry_run function and set enable_time_estimation to true.

For example, tests that use the gpt-oss-120b LLM on 4x H100 GPUs with the QuALITY dataset (266 articles) showed significant variance between flows.

  • The estimated generation times for the full dataset were approximately 15.12 hours for Extractive Summary and 12.99 hours for Detailed Summary, both of which were evaluated with 50 completions per summary (N=50).

  • In contrast, the Key Facts and Document Based flows, which generated only a single summary per document, completed in approximately 0.35 and 0.44 hours, respectively.

  • Additionally, analysis of the Extractive Summary flow highlights that the steepest time reductions occurred between concurrency levels 10 and 30, with returns observed to diminish significantly beyond 50 in this specific configuration.

Guided example - Build a KFP pipeline for SDG

You can generate synthetic data for domain-specific model customization by using a Kubeflow Pipeline (KFP) on Open Data Hub. The Domain Customization Data Generation using Kubeflow Pipelines (KFP) guided example walks through this process.

Prerequisites
Procedure
  1. Run the following command to clone the Red Hat AI examples repository, which includes the KFP pipeline example for knowledge tuning:

    git clone https://github.com/red-hat-data-services/red-hat-ai-examples
  2. Navigate to the examples/domain_customization_kfp_pipeline directory.

  3. Follow the instructions in the README file to run the example:

    1. Configure an environment variable (.env) file, provide your model endpoint, and store the file as a Kubernetes secret. The KFP pipeline consumes the secret as environment variables.

    2. Generate the KFP pipeline YAML file.

    3. Upload the YAML file to Open Data Hub and deploy the pipeline.

Verification

The example pipeline generates three types of document augmentations and four types of question-answer (QA) pairs, based on the three augmentations and the original document. It stores the generated data in the Cloud Object Storage (COS) bucket that is linked through the pipeline server.

Train the model by using your prepared data

To train the model, you can use the Red Hat Training Hub and the Kubeflow Training Operator (KFTO).

You can simplify and accelerate the process of fine-tuning and customizing a foundation model by using your own data. The Red Hat Training Hub is an algorithm-focused interface for common LLM training, continual learning, and reinforcement learning techniques.

Explore Training Hub examples

The Training Hub repository hosts multiple cookbooks for using different LLM algorithms such as Supervised Fine-tuning (SFT), Orthogonal Subspace Fine-tuning (OSFT)/Continual Learning, and Low-Rank Adaptation (LoRA)/Quantized Low-Rank Adaptation (QLoRA). OSFT is a training algorithm built by the Red Hat AI Innovation team. With OSFT, you can continually post-train a fine-tuned model to expand its knowledge on new data. You can tinker with the Training Hub cookbooks from a workbench within your Open Data Hub project.

To get started with Training Hub, explore the provided examples.

Prerequisites
Procedure
  1. To access Training Hub examples, clone the Training Hub Git repository:

    • To clone the https://github.com/Red-Hat-AI-Innovation-Team/training_hub.git repository from JupyterLab, follow the steps in Clone an example Git repository.

    • To create a local clone of the repository, run the following command:

      git clone https://github.com/Red-Hat-AI-Innovation-Team/training_hub
  2. Go to the examples directory to view Training Hub notebooks, Python scripts, and documentation.

    • For a quick overview and descriptions of the supported algorithms and features, with links to examples and getting started code, see the top-level README file.

    • For detailed parameter documentation, see the docs directory.

    • For hands-on learning with the interactive notebooks, see the notebooks directory.

    • For pre-written, configurable Python scripts to run training algorithms with various language models, see the scripts directory.
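
For orientation before you open the notebooks, the following sketch shows the general shape of a supervised fine-tuning call. It assumes that the training_hub package exposes an sft function that accepts the keyword arguments shown. The argument names and values are assumptions based on the upstream examples, so confirm them in the docs directory before you rely on them. The build_sft_args and run_sft helpers are hypothetical.

```python
def build_sft_args(model_path, data_path, ckpt_output_dir):
    """Collect the arguments for a small supervised fine-tuning run.

    The parameter names are assumptions; verify them against the
    Training Hub docs directory.
    """
    return {
        "model_path": model_path,            # base model name or local path
        "data_path": data_path,              # training data file
        "ckpt_output_dir": ckpt_output_dir,  # where checkpoints are written
        "num_epochs": 1,
        "effective_batch_size": 8,
        "learning_rate": 1e-5,
    }

def run_sft(args):
    """Start training. This requires the training-hub package and GPUs."""
    # Imported lazily so that the arguments can be prepared without the library.
    from training_hub import sft

    return sft(**args)

args = build_sft_args(
    model_path="/path/to/base-model",
    data_path="/path/to/train.jsonl",
    ckpt_output_dir="/path/to/checkpoints",
)
print(sorted(args))
```

Keeping argument construction separate from the training call lets you review or log the configuration from a workbench before you commit GPU time.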

Training Hub algorithm and model support matrix

To simplify tuning for enterprise customers, Training Hub supports multiple backends and exposes a unified API surface to access the latest training algorithms from different backends.

The following table lists the algorithms, backends, and model architectures that Training Hub supports.

Table 3. Training Hub algorithm and model support matrix

Algorithm: Supervised Fine-tuning (SFT)

Backend: instructlab.training

Supported model architectures:

  • GPTOssForCausalLM (GPT OSS 20B/120B)

  • LlamaForCausalLM (Llama 3 models)

  • Qwen2ForCausalLM (Qwen 2.5 models)

  • Qwen3ForCausalLM (Qwen 3 models)

  • GraniteForCausalLM (Granite 3 models)

  • GraniteMoeHybridForCausalLM (Granite 4 models)

  • Phi3ForCausalLM (Phi 3 and 4 models)

  • MistralForCausalLM (Mistral models)

Algorithm: Orthogonal Subspace Fine-tuning (OSFT)

Backend: mini-trainer

Supported model architectures: Same as SFT

Algorithm: Low-Rank Adaptation (LoRA) / Quantized Low-Rank Adaptation (QLoRA)

Backend: Unsloth

Supported model architectures:

  • GPTOssForCausalLM (GPT OSS 20B/120B) (QLoRA only)

  • LlamaForCausalLM (Llama 3 models)

  • Qwen2ForCausalLM (Qwen 2.5 models)

  • Qwen3ForCausalLM (Qwen 3 models)

  • GraniteForCausalLM (Granite 3 models)

  • GraniteMoeHybridForCausalLM (Granite 4 models)

  • MistralForCausalLM (Mistral models)

NOTE: If you experience an issue with the model classes listed for OSFT with use_liger=True, try setting use_liger=False. Liger kernels are supported for most model architectures, but some newer architectures might experience errors or instability if not fully supported. For up-to-date support information, see the Liger-Kernel GitHub repository.
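
To check a model against the support matrix, you can compare the architectures field from the model's Hugging Face config.json file with the class names in Table 3. The following sketch encodes the table as Python sets. The SUPPORT mapping and the supported_algorithms helper are hypothetical and reflect only what the table states.

```python
SFT_ARCHS = {
    "GPTOssForCausalLM",
    "LlamaForCausalLM",
    "Qwen2ForCausalLM",
    "Qwen3ForCausalLM",
    "GraniteForCausalLM",
    "GraniteMoeHybridForCausalLM",
    "Phi3ForCausalLM",
    "MistralForCausalLM",
}
# The LoRA/QLoRA row does not list Phi3; GPT OSS is QLoRA only.
QLORA_ARCHS = SFT_ARCHS - {"Phi3ForCausalLM"}
LORA_ARCHS = QLORA_ARCHS - {"GPTOssForCausalLM"}

SUPPORT = {
    "sft": SFT_ARCHS,
    "osft": SFT_ARCHS,  # "Same as SFT" in the table
    "lora": LORA_ARCHS,
    "qlora": QLORA_ARCHS,
}

def supported_algorithms(config):
    """Return the algorithms whose row lists one of the model's architectures.

    `config` is the parsed config.json of a model, for example the result of
    json.load() on that file.
    """
    archs = set(config.get("architectures", []))
    return sorted(name for name, allowed in SUPPORT.items() if archs & allowed)

print(supported_algorithms({"architectures": ["GPTOssForCausalLM"]}))
print(supported_algorithms({"architectures": ["Phi3ForCausalLM"]}))
```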

Estimate memory usage

To learn how to estimate the amount of memory you need for running and training a specific model, as well as whether your configured GPUs can support the model, use the memory estimator. The memory estimator currently supports only Supervised Fine-tuning (SFT) and Orthogonal Subspace Fine-tuning (OSFT) algorithms. See the following example files in the Training Hub Git repository:

  • For the Memory Estimator API, see the src/training_hub/profiling/memory_estimator.py file.

  • For an example notebook that uses the API, see the notebooks/memory_estimator_example.ipynb file.

Compare the performance of OSFT, SFT, and LoRA training algorithms

You can use the Orthogonal Subspace Fine-Tuning (OSFT), Supervised Fine-Tuning (SFT), and Low-Rank Adaptation (LoRA) algorithms in Training Hub.

Use SFT to fine-tune a model on supervised datasets with support for:

  • Single-node and multi-node distributed training

  • Configurable training parameters (for example, epochs, batch size, learning rate)

  • InstructLab-Training backend integration

Use OSFT to fine-tune a model while controlling how much of its existing behavior to preserve, with support for:

  • Single-node and multi-node distributed training

  • Configurable training parameters (for example, epochs, batch size, learning rate)

  • RHAI Innovation Mini-Trainer backend integration

Use LoRA for parameter-efficient fine-tuning with significantly reduced memory requirements, with support for:

  • Training low-rank adaptation matrices instead of full model weights

  • Unsloth backend integration

  • QLoRA variant for further memory reduction (Float4)

The examples/docs directory contains information and examples for how to use each algorithm.

Here is a performance comparison of using OSFT, SFT, and LoRA in Training Hub.

NOTE: When scaling the usage of Liger Kernels for all methods, some amount of fixed overhead memory is added to all methods that do not use Liger Kernels.

  • Memory scaling: OSFT adds memory overhead to the model storage due to its unique matrices, roughly 1.25-1.5x that of the normal model storage in SFT. However, the rest of OSFT memory scales linearly with the unfreeze rank ratio (URR). The URR is a hyperparameter for OSFT that is a value between 0 and 1. It represents the fraction of the matrix rank that is unfrozen and updated during fine-tuning.

    A rough comparison is: OSFT Memory ~ 3 x r x SFT Memory, where r is the unfreeze rank ratio (URR), the fraction of the matrix rank being fine-tuned. At URR = 1/3, OSFT and SFT have similar memory usage.

    In most post-training setups, URR values below 1/3 are sufficient for learning new tasks, making OSFT notably lighter in memory.

    Like SFT, LoRA requires a fixed amount of overhead memory to store the base model, intermediate activations, and outputs. The rest of the memory needed for LoRA scales linearly based on the LoRA rank (lora_r) parameter. The lora_r value is an integer, ideally no more than the size of any of the model’s weight dimensions, that determines how many rows should be used in each of LoRA’s approximated matrices.

    You should keep the lora_r value as low as possible. As lora_r approaches 0, the memory that LoRA uses should approach 1/4 of the SFT memory. While it is difficult to compare SFT and LoRA precisely, LoRA memory usage should begin to reach or exceed that of SFT if the value of lora_r is more than 3/8 of the hidden dimension size. The memory used by LoRA in Training Hub is further reduced because LoRA uses Float16 as its main datatype, and QLoRA uses Float4 instead. Note that when using QLoRA, you must briefly place the Float16 model onto the GPU, which can bottleneck memory usage.

  • Training time: On datasets of equal size, OSFT typically takes about twice as long per phase as SFT. However, because OSFT, unlike SFT, does not require replay buffers from past tasks, the total training time across multiple phases or tasks is lower, with clear benefits as the number of tasks grows. Because OSFT supports continual learning without maintaining or reusing old data, it enables lighter, single-pass end-to-end runs.
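
The rough memory comparisons above can be turned into back-of-the-envelope estimates. The following sketch is an illustration, not the Training Hub memory estimator. The osft_memory function applies the 3 x r x SFT rule directly and ignores the model storage overhead. The lora_memory function is an assumed linear interpolation between the two points that the text gives (1/4 of SFT as lora_r approaches 0, and parity with SFT at lora_r equal to 3/8 of the hidden size). Both function names are hypothetical.

```python
def osft_memory(sft_memory_gb, urr):
    """Rough OSFT estimate: about 3 x URR x the SFT memory."""
    if not 0 <= urr <= 1:
        raise ValueError("urr must be between 0 and 1")
    return 3 * urr * sft_memory_gb

def lora_memory(sft_memory_gb, lora_r, hidden_size):
    """Assumed linear interpolation for LoRA memory.

    Passes through 0.25 x SFT at lora_r = 0 and 1.0 x SFT at
    lora_r = 3/8 x hidden_size, so the slope is 2 / hidden_size.
    """
    return sft_memory_gb * (0.25 + 2.0 * lora_r / hidden_size)

SFT_GB = 80.0  # placeholder SFT memory figure for illustration

# At URR = 1/3, the OSFT estimate matches SFT; lower values use less memory.
print(osft_memory(SFT_GB, 1 / 3), osft_memory(SFT_GB, 0.25))

# A small rank stays close to the 1/4 floor.
print(lora_memory(SFT_GB, lora_r=16, hidden_size=4096))
```

Treat these numbers as a first approximation only, and use the memory estimator in the Training Hub repository for SFT and OSFT planning.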

Distribute training jobs by using the Kubeflow Training Operator

If you want to implement distributed training across multiple nodes to meet the needs of your training workloads, you can use the Kubeflow Training Operator (KFTO). KFTO abstracts the underlying infrastructure complexity of distributed training and fine-tuning of models. The iterative process of fine-tuning significantly reduces the time and resources required compared to training models from scratch.

Learn more about KFTO in the following Open Data Hub documentation:

Distributed fine-tuning with Training Hub and Kubeflow Training Operator

The Kubeflow Training Operator (KFTO) supports distributed fine-tuning by using Training Hub, abstracting the complexity of distributed training. It seamlessly manages scaling and orchestration for you, allowing you to focus on your domain-specific fine-tuning logic by using the simplified Training Hub APIs.

For a comprehensive tutorial on fine-tuning with Training Hub across distributed nodes with KFTO, follow these guided examples:

End-to-end model customization workflow

You can implement end-to-end workflows by using notebooks and the Kubeflow Training Operator for distributed training or by using AI pipelines.

  • Notebook workflow examples

    For a comprehensive notebook tutorial that demonstrates an AI/ML workflow, see the Knowledge Tuning example on the Red Hat AI examples site.

    The Knowledge Tuning tutorial is a curated collection of Jupyter notebooks that includes examples of using Docling to process data, Training Hub to fine-tune a model on that data, and KServe to deploy the final model for a Question and Answer application.

  • AI pipeline example

    You can run an end-to-end model customization workflow by using a fine-tuning AI pipeline, as shown in the Fine-tuning pipelines on Red Hat OpenShift AI guided example. The AI pipelines in this example use Training Hub algorithms to fine-tune a model, evaluate it, and register it.

Support philosophy: A secure platform

Our primary goal is to provide a secure and reliable platform for serving and customizing models on Open Data Hub.

The Python packages for model customization (such as docling, sdg-hub, and training-hub) are key components of this platform.

Our support strategy is focused on the integrity of the platform and the secure delivery of these tools, rather than providing direct, standalone support for the individual Python packages themselves.

What is supported

  • Installation on Open Data Hub: We fully support the successful installation of these packages from the Red Hat AI Python index onto a supported Open Data Hub environment when you use the provided base images.

  • The Platform: The underlying Open Data Hub platform, including its components and infrastructure, is fully supported according to its own lifecycle policy.

What is not supported

  • Issues arising from the use of these packages, for example, to build custom flows or applications.

  • Mixing in packages from outside the set of packages provided with the Red Hat AI Python Index base images.

The primary benefit of this strategy is a secure software supply chain. By using the Red Hat AI Python Index, you are guaranteed:

  • Red Hat Builds: You are using Red Hat builds of Python libraries built and delivered by Red Hat and our partners. These builds ensure provenance because Red Hat pulls, scans, and builds all dependencies for the packages.

  • Trusted Source: The index provides a trusted, secure, and reliable source for your generative AI workflows, especially critical for disconnected (air-gapped) environments.

  • Platform Integrity: You can be confident that the tools are tested and intended for use on the Open Data Hub platform.

For deeper technical questions or contributions related to the packages themselves, we encourage users to engage with the upstream open-source communities.