This repository features an example of how you can set up 3scale and Red Hat SSO in front of models served by OpenShift AI to offer your users a portal through which they can register and get access keys to the models' endpoints.
Although not a reference architecture (there are many ways to implement this type of solution), it can serve as a starting point for creating such a service in your environment.
Further implementations could add quotas, rate limits, different plans, billing, and more.
Screenshots: Portal, Services, Service detail, Statistics.
The following is an example of how to copy and serve models using OpenShift AI. Adapt it to the models you want to use.
- In OpenShift, create your project, in this example `llm-hosting` (a CLI sketch covering these first steps is shown after this list).
- In the namespace YAML definition of the project, add the label `modelmesh-enabled: 'false'`.
- In the project, create an RGW Object Bucket Claim. This will create the S3 storage space to store the models. Adapt to your own S3 storage if needed.
- Switch to the OpenShift AI dashboard and create a Data Connection `models` with the information from the OBC.
- In OpenShift AI, under any of your projects, create and launch an ODH-TEC workbench using the above data connection:
- Using ODH-TEC, import the following models from HuggingFace (don't forget to enter your HuggingFace Token in the ODH-TEC Settings!):
- From the OpenShift Console, deploy the different model servers using the following ServingRuntimes and InferenceServices:
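For reference, these first steps can also be done from the CLI. This is a minimal sketch assuming the `llm-hosting` project name and the ODF RGW storage class `ocs-storagecluster-ceph-rgw`; adapt the names to your environment.

```bash
# Create the project and disable ModelMesh for it (single-model serving)
oc new-project llm-hosting
oc label namespace llm-hosting modelmesh-enabled=false --overwrite

# Object Bucket Claim providing the S3 space where the models will be stored
cat <<'EOF' | oc apply -n llm-hosting -f -
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: models
spec:
  generateBucketName: models
  storageClassName: ocs-storagecluster-ceph-rgw
EOF
```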
We have shared a repository with the different configurations we use to manage our own OpenShift AI cluster through GitOps. In this folder of the repository you will find the YAML files used to create the different InferenceService instances we use to serve models.
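As an illustration only, an InferenceService pointing at a model copied into the `models` Data Connection could look like the sketch below. The model name, runtime name, secret name and path are hypothetical; the YAML files in the repository mentioned above contain the real configurations.

```bash
cat <<'EOF' | oc apply -n llm-hosting -f -
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: granite-7b-instruct            # hypothetical model name
spec:
  predictor:
    model:
      modelFormat:
        name: vLLM                      # assumes a vLLM-based ServingRuntime
      runtime: vllm-runtime             # must match the name of your ServingRuntime
      storage:
        key: aws-connection-models      # secret backing the `models` Data Connection (assumed name)
        path: granite-7b-instruct/      # folder of the model inside the bucket
EOF
```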
In this example we are using Red Hat SSO as the authentication backend for the 3scale Developer Portal. Other backends (GitHub and Auth0) are also supported if you prefer.
- Create the project `rh-sso`.
- Deploy the Red Hat Single Sign-On operator in the `rh-sso` namespace.
- Create a Keycloak instance using keycloak.yaml.
- Create a `rhoai` Keycloak Realm using keycloakrealm-maas.yaml.
- Open the Red Hat Single Sign-On console (route in the Routes section, access credentials in Secrets->`credentials-rh-sso`; see the CLI snippet after this list).
- Switch to the `rhoai` realm:
- In the Clients section, create a new client named `3scale`, of type `openid-connect`:
- Adjust the following:
  - Access Type: `confidential`
  - Enable only Standard Flow, leave all other toggles off.
  - For the moment, set Valid Redirect URLs to `*`.
- From the Credentials section, take note of the Secret.
- In the Mappers section, create two new mappers:
  - In this configuration, the organization name for a user will be the same as the user email. This is to achieve full separation of the accounts. Adjust to your liking.
- Create an Identity Provider to connect your Realm to the Red Hat authentication system. The important settings are `Trust Email` to enable, and `Sync Mode` set to import.
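If you prefer the CLI, the console hostname and the initial admin credentials can be retrieved as follows (the `keycloak` route name is the default created by the operator; adjust if yours differs):

```bash
# Hostname of the Red Hat Single Sign-On console
oc get route keycloak -n rh-sso -o jsonpath='{.spec.host}{"\n"}'

# Admin username and password generated by the operator
oc extract secret/credentials-rh-sso -n rh-sso --to=-
```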
- OpenShift Data Foundation must be deployed to be able to create an RWX volume for the 3scale system storage.
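A quick way to check that ODF is available and provides an RWX-capable storage class (the names below are the ODF defaults and may differ in your cluster):

```bash
# The ODF storage cluster should report a Ready phase
oc get storagecluster -n openshift-storage

# The CephFS storage class supports ReadWriteMany volumes
oc get storageclass ocs-storagecluster-cephfs
```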
We will start by creating the project and setting up the policy artifacts needed for token counting with LLMs.
- Create the project `3scale`.
- Open a Terminal and log in to OpenShift.
- Switch to the folder `deployment/3scale/llm_metrics_policy` and run the following command:

```bash
oc create secret generic llm-metrics \
  -n 3scale \
  --from-file=./apicast-policy.json \
  --from-file=./custom_metrics.lua \
  --from-file=./init.lua \
  --from-file=./llm.lua \
  --from-file=./portal_client.lua \
  --from-file=./response.lua \
  && oc label secret llm-metrics -n 3scale apimanager.apps.3scale.net/watched-by=apimanager
```
- Deploy the Red Hat Integration - 3scale operator in the `3scale` namespace only!
- Using the deployed operator, create a Custom Policy Definition instance using `deployment/3scale/llm-metrics-policy.yaml`.
- Using the deployed operator, create an APIManager instance using `deployment/3scale/apimanager.yaml`.
- Wait for all the Deployments (15) to finish (you can watch them with the command shown below).
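A minimal way to follow the rollout from the CLI:

```bash
# Watch the 3scale component Deployments until they are all available
oc get deployments -n 3scale -w
```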
- Open the 3scale administration portal for the RHOAI provider. It will be the Route starting with `https://maas-admin-apps...`.
- The credentials are stored in the Secret `system-seed` (`ADMIN_USER` and `ADMIN_PASSWORD`).
- You will be greeted by the Wizard, which you can directly close:
- In the Account Settings section (top menu):
- Let's start by doing some cleanup:
- We will start by adding the different `Backends` for our models:
- We can now create the `Products`. There will be one for each Backend.
- For each Product, apply the following configurations:
  - In `Integration->Settings`, change the `Auth user key` field content to `Authorization` and the `Credentials location` field to `As HTTP Basic Authentication`, then click on `Update Product` at the bottom to save (an example client call with these settings is sketched after this list):
  - Link the corresponding Backend.
  - Add the Policies in this order:
    - CORS Request Handling:
      - ALLOW_HEADERS: `Authorization`, `Content-type`, `Accept`
      - allow_origin: `*`
      - allow_credentials: checked
    - Optionally, LLM Monitor for OpenAI-compatible token usage. See the Readme for information and configuration.
    - 3scale APIcast
  - Add the Methods and the corresponding Mapping Rules: create one pair for each API method/path.
  - From the Integration->Configuration menu, promote the configuration to staging, then production.
  - Along the way you can clean up the unwanted default Products and Backends.
- For each Product, from the Applications->Application Plans menu, create a new Application Plan.
  - Once created, leave the Default plan set to "No plan selected" so that users can choose their services for their applications, and publish it:
- In Applications->Settings->Usage Rules, set the Default Plan to `Default`. This will allow the users to see the different available Products.
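With the `Auth user key` renamed to `Authorization` and the credentials expected in the Authorization header, a typical OpenAI-style call through the gateway would look roughly like the sketch below. The route and model name are hypothetical; check your Developer Portal for the exact endpoint and for whether the key must be sent as a `Bearer` token or as Basic credentials.

```bash
# API key generated for the user's application in the Developer Portal
API_KEY=<your-application-key>

# Hypothetical production gateway route for one of the Products
curl -s https://granite-7b-instruct-maas.apps.example.com/v1/completions \
  -H "Authorization: Bearer ${API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"model": "granite-7b-instruct", "prompt": "Hello", "max_tokens": 20}'
```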
- Switch to the Audience section from the top menu.
- In Developer Portal->Settings->Domains and Access, remove the Developer Portal Access Code.
- In Developer Portal->Settings->SSO Integrations, create a new SSO Integration of type Red Hat Single Sign-On:
  - Client: `3scale`
  - Client secret: ************
  - Realm: `https://keycloak-rh-sso.apps.prod.rhoai.rh-aiservices-bu.com/auth/realms/maas` (adjust to your cluster domain name).
  - `Published` ticked.
- Once created, edit the RH-SSO integration to tick the checkbox `Always approve accounts...`
- You can now test the authentication flow.
This implementation of Models as a Service requires some customization of the Developer Portal, for example to display the endpoint URLs or the model name to use, not only the API key. Some additional pages have also been added to the portal, like configuration information or usage examples. Those modifications usually require additional or modified resources, like images, CSS and HTML snippets, that must be uploaded for this purpose.
To automate the process, we are using the unofficial 3scale CMS CLI to apply the configuration that has been exported in `deployment/3scale/portal`.
As access to the 3scale Admin REST APIs is protected, we first need to get an access token as well as the admin host:
```bash
export ACCESS_TOKEN=`oc get secret system-seed -o json -n 3scale | jq -r '.data.ADMIN_ACCESS_TOKEN' | base64 -d`
export ADMIN_HOST=`oc get route -n 3scale | grep maas-admin | awk '{print $2}'`
```
For convenience, we set an alias first and then launch the 3scale CMS tool:
```bash
alias cms='podman run --userns=keep-id:uid=185 -it --rm -v ./deployment/3scale/portal:/cms:Z ghcr.io/fwmotion/3scale-cms:latest'
cms -k --access-token=${ACCESS_TOKEN} ${ACCESS_TOKEN} https://${ADMIN_HOST}/ upload -u
```
There is also a download option in case you want to make changes manually in the 3scale CMS.
In this folder of our cluster configuration repository, you will find the different configuration files (subfolders organized by model) that you can use with the 3scale operator to automatically create and configure Backends and Products.
Adjustments may be needed after this automated configuration, as not all parameters are handled by the operator (for example, the Default plan applied to a Product).
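For reference, a Backend managed through the 3scale operator is a small custom resource. The sketch below uses hypothetical names and an assumed internal predictor URL; the files in the repository contain the complete, real definitions.

```bash
cat <<'EOF' | oc apply -n 3scale -f -
apiVersion: capabilities.3scale.net/v1beta1
kind: Backend
metadata:
  name: granite-7b-instruct            # hypothetical resource name
spec:
  name: "Granite 7B Instruct"
  systemName: granite-7b-instruct
  # Internal URL of the InferenceService predictor serving this model
  privateBaseURL: "https://granite-7b-instruct-predictor.llm-hosting.svc.cluster.local:8443"
EOF
```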