This repository features an example of how you can set up 3scale and Red Hat SSO in front of models served by OpenShift AI to offer your users a portal through which they can register and get access keys to the models' endpoints.
Although not a reference architecture (there are many ways to implement this type of solution), it can serve as a starting point for creating such a service in your environment.
Further implementations could add quotas, rate limits, different plans, billing, and more.
Screenshots: Portal, Services, Service detail, Statistics.
The following is an example of how to copy and serve models using OpenShift AI. Adapt it to the models you want to use.
- In OpenShift, create your project, in this example `llm-hosting` (a CLI sketch covering these first steps is shown after this list).
- In the namespace YAML definition of the project, add the label `modelmesh-enabled: 'false'`.
- In the project, create an RGW Object Bucket Claim. This will create the S3 storage space to store the models. Adapt to your own S3 storage if needed.
- Switch to the OpenShift AI dashboard and create a Data Connection `models` with the information from the OBC.
- In OpenShift AI, under any of your projects, create and launch an ODH-TEC workbench using the above data connection:
- Using ODH-TEC, import the following models from HuggingFace (don't forget to enter your HuggingFace Token in the ODH-TEC Settings!):
- From the OpenShift Console, deploy the different model servers using the following ServingRuntimes and InferenceServices:
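For reference, these first steps can also be done from the CLI. This is a minimal sketch assuming the `llm-hosting` project name and the ODF RGW storage class `ocs-storagecluster-ceph-rgw`; adapt the names to your environment.

```bash
# Create the project and disable ModelMesh for it (single-model serving)
oc new-project llm-hosting
oc label namespace llm-hosting modelmesh-enabled=false --overwrite

# Object Bucket Claim providing the S3 space where the models will be stored
cat <<'EOF' | oc apply -n llm-hosting -f -
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: models
spec:
  generateBucketName: models
  storageClassName: ocs-storagecluster-ceph-rgw
EOF
```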
We have shared a repository with the different configurations we use to manage our own OpenShift AI cluster through GitOps. In this folder of the repository you will find the YAML files used to create the different InferenceService instances we use to serve models.
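As an illustration only, an InferenceService pointing at a model copied into the `models` Data Connection could look like the sketch below. The model name, runtime name, secret name and path are hypothetical; the YAML files in the repository mentioned above contain the real configurations.

```bash
cat <<'EOF' | oc apply -n llm-hosting -f -
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: granite-7b-instruct            # hypothetical model name
spec:
  predictor:
    model:
      modelFormat:
        name: vLLM                      # assumes a vLLM-based ServingRuntime
      runtime: vllm-runtime             # must match the name of your ServingRuntime
      storage:
        key: aws-connection-models      # secret backing the `models` Data Connection (assumed name)
        path: granite-7b-instruct/      # folder of the model inside the bucket
EOF
```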
In this example we are using Red Hat SSO as the authentication backend for the 3scale Developer Portal. Other backends (GitHub and Auth0) are also supported if you prefer.
- Create the project `rh-sso`.
- Deploy the Red Hat Single Sign-On operator in the `rh-sso` namespace.
- Create a Keycloak instance using keycloak.yaml.
- Create a `rhoai` Keycloak Realm using keycloakrealm-maas.yaml.
- Open the Red Hat Single Sign-On console (route in the Routes section, access credentials in Secrets->`credentials-rh-sso`; see the CLI snippet after this list).
- Switch to the `rhoai` realm:
- In the Clients section, create a new client named `3scale`, of type `openid-connect`:
- Adjust the following:
  - Access Type: `confidential`
  - Enable only Standard Flow, leave all other toggles off.
  - For the moment, set Valid Redirect URLs to `*`.
- From the Credentials section, take note of the Secret.
- In the Mappers section, create two new mappers:
  - In this configuration, the organization name for a user will be the same as the user email. This is to achieve full separation of the accounts. Adjust to your liking.
- Create an Identity Provider to connect your Realm to the Red Hat authentication system. The important settings are `Trust Email` to enable, and `Sync Mode` set to import.
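If you prefer the CLI, the console hostname and the initial admin credentials can be retrieved as follows (the `keycloak` route name is the default created by the operator; adjust if yours differs):

```bash
# Hostname of the Red Hat Single Sign-On console
oc get route keycloak -n rh-sso -o jsonpath='{.spec.host}{"\n"}'

# Admin username and password generated by the operator
oc extract secret/credentials-rh-sso -n rh-sso --to=-
```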
- OpenShift Data Foundation must be deployed to be able to create an RWX volume for the 3scale system storage.
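A quick way to check that ODF is available and provides an RWX-capable storage class (the names below are the ODF defaults and may differ in your cluster):

```bash
# The ODF storage cluster should report a Ready phase
oc get storagecluster -n openshift-storage

# The CephFS storage class supports ReadWriteMany volumes
oc get storageclass ocs-storagecluster-cephfs
```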
We will start by creating the project and setting up the policy artifacts needed for token counting with LLMs.
- Create the project `3scale`.
- Open a Terminal and log in to OpenShift.
- Switch to the folder `deployment/3scale/llm_metrics_policy` and run the following command:

```bash
oc create secret generic llm-metrics \
  -n 3scale \
  --from-file=./apicast-policy.json \
  --from-file=./custom_metrics.lua \
  --from-file=./init.lua \
  --from-file=./llm.lua \
  --from-file=./portal_client.lua \
  --from-file=./response.lua \
  && oc label secret llm-metrics -n 3scale apimanager.apps.3scale.net/watched-by=apimanager
```
- Deploy the Red Hat Integration - 3scale operator in the `3scale` namespace only!
- Using the deployed operator, create a Custom Policy Definition instance using `deployment/3scale/llm-metrics-policy.yaml`.
- Using the deployed operator, create an APIManager instance using `deployment/3scale/apimanager.yaml`.
- Wait for all the Deployments (15) to finish (you can watch them with the command shown below).
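A minimal way to follow the rollout from the CLI:

```bash
# Watch the 3scale component Deployments until they are all available
oc get deployments -n 3scale -w
```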
- Open the 3scale administration portal for the RHOAI provider. It will be the Route starting with `https://maas-admin-apps...`.
- The credentials are stored in the Secret `system-seed` (`ADMIN_USER` and `ADMIN_PASSWORD`).
- You will be greeted by the Wizard, which you can directly close:
- In the Account Settings section (top menu):
- Let's start by doing some cleanup:
- We will start by adding the different `Backends` for our models:
- We can now create the `Products`. There will be one for each Backend.
- For each Product, apply the following configurations:
  - In `Integration->Settings`, change the `Auth user key` field content to `Authorization` and the `Credentials location` field to `As HTTP Basic Authentication`, then click on `Update Product` at the bottom to save (an example client call with these settings is sketched after this list):
  - Link the corresponding Backend.
  - Add the Policies in this order:
    - CORS Request Handling:
      - ALLOW_HEADERS: `Authorization`, `Content-type`, `Accept`
      - allow_origin: `*`
      - allow_credentials: checked
    - Optionally, LLM Monitor for OpenAI-compatible token usage. See the Readme for information and configuration.
    - 3scale APIcast
  - Add the Methods and the corresponding Mapping Rules: create one pair for each API method/path.
  - From the Integration->Configuration menu, promote the configuration to staging, then production.
  - Along the way you can clean up the unwanted default Products and Backends.
- For each Product, from the Applications->Application Plans menu, create a new Application Plan.
  - Once created, leave the Default plan set to "No plan selected" so that users can choose their services for their applications, and publish it:
- In Applications->Settings->Usage Rules, set the Default Plan to `Default`. This will allow the users to see the different available Products.
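With the `Auth user key` renamed to `Authorization` and the credentials expected in the Authorization header, a typical OpenAI-style call through the gateway would look roughly like the sketch below. The route and model name are hypothetical; check your Developer Portal for the exact endpoint and for whether the key must be sent as a `Bearer` token or as Basic credentials.

```bash
# API key generated for the user's application in the Developer Portal
API_KEY=<your-application-key>

# Hypothetical production gateway route for one of the Products
curl -s https://granite-7b-instruct-maas.apps.example.com/v1/completions \
  -H "Authorization: Bearer ${API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"model": "granite-7b-instruct", "prompt": "Hello", "max_tokens": 20}'
```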
- Switch to the Audience section from the top menu.
- In Developer Portal->Settings->Domains and Access, remove the Developer Portal Access Code.
- In Developer Portal->Settings->SSO Integrations, create a new SSO Integration of type Red Hat Single Sign-On:
  - Client: `3scale`
  - Client secret: ************
  - Realm: `https://keycloak-rh-sso.apps.prod.rhoai.rh-aiservices-bu.com/auth/realms/maas` (adjust to your cluster domain name).
  - `Published` ticked.
- Once created, edit the RH-SSO integration to tick the checkbox `Always approve accounts...`
- You can now test the authentication flow.
This implementation of Models as a Service requires some customization of the Developer Portal, for example to display the endpoint URLs or the model name to use, not only the API key. Some additional pages have also been added to the portal, like configuration information or usage examples. Those modifications usually require additional or modified resources, like images, CSS and HTML snippets, that must be uploaded for this purpose.
To automate the process, we are using the unofficial 3scale CMS CLI to apply the configuration that has been exported in `deployment/3scale/portal`.
As access to the 3scale Admin REST APIs is protected, we first need to get an access token as well as the admin host:
```bash
export ACCESS_TOKEN=`oc get secret system-seed -o json -n 3scale | jq -r '.data.ADMIN_ACCESS_TOKEN' | base64 -d`
export ADMIN_HOST=`oc get route -n 3scale | grep maas-admin | awk '{print $2}'`
```
For convenience, we set an alias first and then launch the 3scale CMS tool:
```bash
alias cms='podman run --userns=keep-id:uid=185 -it --rm -v ./deployment/3scale/portal:/cms:Z ghcr.io/fwmotion/3scale-cms:latest'
cms -k --access-token=${ACCESS_TOKEN} ${ACCESS_TOKEN} https://${ADMIN_HOST}/ upload -u
```
There is also a download option in case you want to make changes manually in the 3scale CMS.
In this folder of our cluster configuration repository, you will find the different configuration files (subfolders organized by model) that you can use with the 3scale operator to automatically create and configure Backends and Products.
Adjustments may be needed after this automated configuration, as not all parameters are handled by the operator (for example, the Default plan applied to a Product).
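For reference, a Backend managed through the 3scale operator is a small custom resource. The sketch below uses hypothetical names and an assumed internal predictor URL; the files in the repository contain the complete, real definitions.

```bash
cat <<'EOF' | oc apply -n 3scale -f -
apiVersion: capabilities.3scale.net/v1beta1
kind: Backend
metadata:
  name: granite-7b-instruct            # hypothetical resource name
spec:
  name: "Granite 7B Instruct"
  systemName: granite-7b-instruct
  # Internal URL of the InferenceService predictor serving this model
  privateBaseURL: "https://granite-7b-instruct-predictor.llm-hosting.svc.cluster.local:8443"
EOF
```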