OCI Generative AI On-Demand Models – From Setup to Chat App

Generative AI is transforming how organizations build intelligent applications, from interactive assistants to automated knowledge systems. Oracle Cloud Infrastructure (OCI) makes this power accessible through its Generative AI On-Demand Models, including options like Cohere’s Command R+ and Meta’s Llama 3.3.

On-demand models are economical: you pay only for the inference calls you make, with no dedicated AI clusters to provision or manage.

In this guide, we’ll take you through the complete journey — starting with configuring access and locating the right model OCID, and ending with a fully functional chat application built using the OCI Python SDK and Streamlit. By the end, you’ll know exactly how to move from setup to implementation and bring Generative AI into your own applications.


Getting Started

Before writing any code, you must configure OCI credentials to allow your application to call Generative AI services.

1. Generate an API Key

  1. Log in to the OCI Console.

  2. Click your profile icon → User Settings.

  3. Under Resources, select API Keys.

  4. Click Add API Key and either:

    • Generate a new key pair in OCI (download the private key .pem), or

    • Upload your own public key (if you already created one with openssl; see the commands below).
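
If you bring your own key pair, the standard openssl commands look like this (the ~/.oci/ paths are a convention, not a requirement):

openssl genrsa -out ~/.oci/oci_api_key.pem 2048
openssl rsa -pubout -in ~/.oci/oci_api_key.pem -out ~/.oci/oci_api_key_public.pem

Paste the contents of the public key file into the Add API Key dialog.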

After adding the key, OCI shows a configuration file preview containing:

  • User OCID

  • Tenancy OCID

  • Fingerprint

  • Region

Copy these values.

2. Save the Private Key

If OCI generated the key, download the .pem file and place it under ~/.oci/oci_api_key.pem.
Restrict access:

chmod 600 ~/.oci/oci_api_key.pem

3. Update the OCI Config File

Create or edit ~/.oci/config and add a section, for example:

[DEFAULT]
user=ocid1.user.oc1..aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
fingerprint=60:11:15:19:15:11:11:11:11:11:11:11:11:11:11:11
tenancy=ocid1.tenancy.oc1..aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
region=us-chicago-1
key_file=/root/.oci/oci_api_key.pem
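
To confirm the profile works before building anything, here is a minimal sanity check using the Python SDK (it just loads the config and fetches your own user record):

import oci

# Load the DEFAULT profile and check the required fields are present
config = oci.config.from_file("~/.oci/config", "DEFAULT")
oci.config.validate_config(config)

# Round-trip to OCI: fetching your own user record proves the key works
identity = oci.identity.IdentityClient(config)
print("Authenticated as:", identity.get_user(config["user"]).data.name)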

4. Fix File Permissions (if needed)

If you see warnings like:

Permissions on ~/.oci/config are too open

fix with:

oci setup repair-file-permissions --file ~/.oci/config
oci setup repair-file-permissions --file ~/.oci/oci_api_key.pem

Find a Model OCID

Each Generative AI model (e.g. Cohere Command R+, Meta Llama 3.3) has a unique OCID. You’ll need this OCID in your application.

Run the following command to list available models in your region:

oci generative-ai model-collection list-models \
  --compartment-id <your_compartment_ocid> \
  --region us-chicago-1

Sample output (truncated):

{
  "data": {
    "items": [
      {
        "base-model-id": null,
        "capabilities": [
          "UNKNOWN_ENUM_VALUE"
        ],
        "compartment-id": null,
        "defined-tags": {},
        "display-name": "meta.llama-guard-4-12b",
        "fine-tune-details": null,
        "freeform-tags": {},
        "id": "ocid1.generativeaimodel.oc1.us-chicago-1.amaaaaaask7dceyaf4q5ji7iw7k3h6ol4a6lgpk7jnjuc5xlq55z4kxyfecq",
        "is-long-term-supported": true,
        "lifecycle-details": "Creating Base Model",
        "lifecycle-state": "ACTIVE",
        "model-metrics": null,
        "system-tags": {},
        "time-created": "2025-08-19T19:44:09.634000+00:00",
        "time-dedicated-retired": null,
        "time-deprecated": "2025-08-01T00:00:00+00:00",
        "time-on-demand-retired": null,
        "type": "BASE",
        "vendor": "meta",
        "version": "1.0.0"
      },
      {
        "base-model-id": null,
        "capabilities": [
          "CHAT"
        ],
        "compartment-id": null,
        "defined-tags": {},
        "display-name": "xai.grok-4",
        "fine-tune-details": null,
        "freeform-tags": {},
        "id": "ocid1.generativeaimodel.oc1.us-chicago-1.amaaaaaask7dceya3bsfz4ogiuv3yc7gcnlry7gi3zzx6tnikg6jltqszm2q",
        "is-long-term-supported": true,
        "lifecycle-details": "Base Model created",
        "lifecycle-state": "ACTIVE",
        "model-metrics": null,
        "system-tags": {},
        "time-created": "2025-07-22T02:38:53.272000+00:00",
        "time-dedicated-retired": null,
        "time-deprecated": null,
        "time-on-demand-retired": null,
        "type": "BASE",
        "vendor": "xai",
        "version": "1.0.0"
      },
      ... (cohere.command-latest, cohere.command-plus-latest, and other entries omitted) ...
    ]
  }
}

From the output, copy the id value for the model you want. For example:

  • Cohere Command R+

    "display-name": "cohere.command-r-plus-08-2024", "id": "ocid1.generativeaimodel.oc1.us-chicago-1.amaaaaaask7dceyaodm6rdyxmdzlddweh4amobzoo4fatlao2pwnekexmosq"
  • Meta Llama 3.3 70B Instruct

    "display-name": "meta.llama-3.3-70b-instruct", "id": "ocid1.generativeaimodel.oc1.us-chicago-1.amaaaaaask7dceyazz5xnau6rie75wc2imyk4z54b6rg3z6rpbdlhox4cm7a"

You’ll use this OCID in the Python app.
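
The same listing is available programmatically through the Python SDK's control-plane client. A minimal sketch (the compartment OCID is a placeholder; here we filter for chat-capable models only):

import oci

config = oci.config.from_file("~/.oci/config", "DEFAULT")
client = oci.generative_ai.GenerativeAiClient(config)

# Placeholder -- use your own compartment (or tenancy) OCID
compartment_ocid = "<your_compartment_ocid>"

# Print only chat-capable models with their OCIDs
for model in client.list_models(compartment_id=compartment_ocid).data.items:
    if "CHAT" in (model.capabilities or []):
        print(f"{model.display_name}: {model.id}")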


Try the OCI Playground and Get Sample Code


Before writing your own application, it’s a good idea to experiment with the OCI Generative AI Playground:

  1. Go to the OCI Console → Analytics & AI → Generative AI → Playground.

  2. Select a model (e.g., Cohere Command R+ or Meta Llama 3.3).

  3. Enter a sample prompt and test your use case (chat, Q&A, summarization, etc.).

  4. Adjust parameters such as Max Tokens, Temperature, Top P, and Top K.

  5. Once you’re satisfied with the results, click View Code.


The Playground lets you download the equivalent code in:

Languages: Java, Python, TypeScript

Frameworks: Python-LangChain, Python-LlamaIndex
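
For example, the Python-LangChain route looks roughly like this. This is a sketch, not the exact Playground export, assuming the langchain-community package (whose ChatOCIGenAI wrapper targets this service); the parameter values are illustrative:

# pip install oci langchain-community
from langchain_community.chat_models import ChatOCIGenAI

llm = ChatOCIGenAI(
    model_id="cohere.command-r-plus-08-2024",  # a model name or OCID
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="<your_compartment_ocid>",
    model_kwargs={"temperature": 0.7, "max_tokens": 400},
)

print(llm.invoke("Summarize OCI Generative AI in one sentence.").content)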


🐍 Application Code

Here’s the Streamlit-based sample chat app (chat_app.py):

(update compartment_id, the endpoint URL for your region, and model_id in the code below)


https://github.com/narasimharaok-cloud9/Reusable_GenerativeAI/blob/main/chat_app.py


import streamlit as st
import oci

# ---------------------------
# OCI Config Setup
# ---------------------------
compartment_id = "ocid1.tenancy.oc1..aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
CONFIG_PROFILE = "DEFAULT"
config = oci.config.from_file("~/.oci/config", CONFIG_PROFILE)

endpoint = "https://inference.generativeai.us-chicago-1.oci.oraclecloud.com"
generative_ai_inference_client = oci.generative_ai_inference.GenerativeAiInferenceClient(
    config=config,
    service_endpoint=endpoint,
    retry_strategy=oci.retry.NoneRetryStrategy(),
    timeout=(10, 240),
)

# ---------------------------
# Streamlit UI Setup
# ---------------------------
st.set_page_config(page_title="Oracle Generative AI Chat", page_icon="💬")
st.title("💬 Oracle Generative AI - Chat")

# Keep the conversation history across Streamlit reruns
if "messages" not in st.session_state:
    st.session_state["messages"] = []

# Replay earlier turns so the full conversation stays on screen
for msg in st.session_state["messages"]:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])

if prompt := st.chat_input("Type your message here..."):
    st.session_state["messages"].append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    with st.chat_message("assistant"):
        with st.spinner("Thinking..."):
            # Build a Cohere-format chat request (for Cohere models)
            chat_request = oci.generative_ai_inference.models.CohereChatRequest()
            chat_request.message = prompt
            chat_request.max_tokens = 800
            chat_request.temperature = 0.7
            chat_request.frequency_penalty = 1
            chat_request.top_p = 0.75
            chat_request.top_k = 0

            chat_detail = oci.generative_ai_inference.models.ChatDetails()
            chat_detail.serving_mode = oci.generative_ai_inference.models.OnDemandServingMode(
                model_id="ocid1.generativeaimodel.oc1.us-chicago-1.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"  # Cohere Command R+ (replace with your choice)
            )
            chat_detail.chat_request = chat_request
            chat_detail.compartment_id = compartment_id

            response = generative_ai_inference_client.chat(chat_detail)
            reply = response.data.chat_response.text

            st.markdown(reply)
            st.session_state["messages"].append({"role": "assistant", "content": reply})
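
Note that CohereChatRequest only works with Cohere models. If you chose a Meta Llama model OCID instead, swap in the SDK's generic chat format. A minimal sketch of just the parts that change:

# Generic (role/message) chat format for Meta Llama models
content = oci.generative_ai_inference.models.TextContent()
content.text = prompt
message = oci.generative_ai_inference.models.Message()
message.role = "USER"
message.content = [content]

chat_request = oci.generative_ai_inference.models.GenericChatRequest()
chat_request.api_format = oci.generative_ai_inference.models.BaseChatRequest.API_FORMAT_GENERIC
chat_request.messages = [message]
chat_request.max_tokens = 800
chat_request.temperature = 0.7

# The generic response shape also differs from Cohere's:
reply = response.data.chat_response.choices[0].message.content[0].text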


Running the Application

  1. Install dependencies:

    pip install oci streamlit
  2. Run the app:

    streamlit run chat_app.py
  3. Open http://localhost:8501 (or the port you configured) in your browser to interact with your chatbot.
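
Streamlit serves on port 8501 by default; to use a different port, pass Streamlit's standard --server.port flag:

streamlit run chat_app.py --server.port 8080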


Conclusion

With OCI Generative AI’s on-demand models, developers can quickly prototype enterprise-ready AI assistants without managing infrastructure.

By combining the OCI Python SDK with Streamlit, you get an interactive chat UI that supports multiple turns, conversation history, and flexible model selection.


Author: Narasimharao Karanam


