Import Your Own Models into OCI Generative AI - A Major Leap Forward for Enterprise AI

Oracle Cloud Infrastructure (OCI) continues to accelerate innovation in the AI space. With the latest release, "Import Your Own Models into OCI Generative AI," enterprises and developers now have unprecedented flexibility to bring their preferred open-source or third-party AI models directly into OCI Generative AI and operationalize them at scale.

This capability makes OCI one of the few hyperscalers that allows you to import, host, and serve Hugging Face–format models, create fully managed endpoints, and run them on optimized AI clusters in the cloud.

If you’ve been waiting for a simple way to run your own LLMs (Qwen, Gemma, Llama, Phi, GPT-OSS, and many more) securely inside your OCI tenancy, this release is a game changer.

Let’s dive in.

Why This Release Matters

Generative AI adoption is expanding rapidly, but many organizations still prefer:

  • Model flexibility (use their preferred open-source models)

  • Data privacy and security (run models in their tenancy)

  • Performance tuning (choose compute shapes and optimize cost)

  • Vendor independence (avoid lock-in with a single model provider)

With OCI’s new import capability, customers can now:

  • Bring models from Hugging Face or OCI Object Storage

  • Create scalable, secure inference endpoints

  • Leverage high-performance AI clusters (A10, A100, H100, H200)

  • Use models in the OCI Generative AI Playground, API, or SDK

Whether you're building chatbots, enterprise copilots, search & retrieval systems, or embedding pipelines — this feature unlocks full control over your AI stack.

Supported Model Architectures

OCI Generative AI now supports a wide range of state-of-the-art model families:

🔹 Chat Models

These enable conversational AI experiences. Supported architectures include:

  • Alibaba Qwen 2 & Qwen 3 — multilingual, multimodal capabilities

  • Google Gemma — lightweight yet powerful for broad language tasks

  • Meta Llama (Llama 2, 3, 3.1, 3.2, 3.3, 4) — industry-leading open LLMs

  • Microsoft Phi — efficient, compact, and cost-optimized

  • OpenAI GPT-OSS — open-weight MoE architecture with strong reasoning

🔹 Embedding Models

  • Mistral — delivers high-performance embeddings for vector search, RAG, and semantic matching.
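Embedding models map text to fixed-length vectors, and downstream tasks such as vector search and semantic matching typically rank candidates by cosine similarity between those vectors. The sketch below illustrates the idea with tiny hand-made vectors standing in for real model output; in practice the embeddings would come from your hosted embedding endpoint.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" standing in for real model output.
query = [0.9, 0.1, 0.0]
docs = {
    "invoice policy": [0.8, 0.2, 0.1],
    "lunch menu": [0.0, 0.1, 0.9],
}

# Semantic matching: pick the document whose vector is closest to the query.
best = max(docs, key=lambda name: cosine_similarity(query, docs[name]))
print(best)  # → invoice policy
```

The same ranking step is the core of a RAG retrieval stage: embed the query, score it against stored document vectors, and pass the top hits to the chat model.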

Prerequisites Before Importing a Model

1. Importing from Hugging Face

You need:

  • The model ID of any supported model

  • (If required) a Hugging Face access token with read permissions

    • Needed for gated models like Llama 3 / Llama 3.1

2. Importing from Object Storage

Ensure:

  • IAM policy allowing access to Object Storage

  • Model files stored in Hugging Face format, including:

    • config.json (must be exactly this filename)

    • tokenizer files

    • model weights

  • Model capability must be one of:

    • TEXT_TO_TEXT

    • IMAGE_TEXT_TO_TEXT

    • EMBEDDING

    • RERANK
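The Object Storage requirements above can be captured in a small pre-flight check. The helper below is hypothetical (the OCI import service performs its own validation), but it mirrors the rules just listed: an exact `config.json` filename, tokenizer files, model weights, and one of the four supported capabilities.

```python
# Hypothetical pre-flight check for a Hugging Face-format model folder.
# The actual validation is done by the OCI import service; this only mirrors
# the documented requirements so you can catch problems before uploading.
ALLOWED_CAPABILITIES = {"TEXT_TO_TEXT", "IMAGE_TEXT_TO_TEXT", "EMBEDDING", "RERANK"}

def check_model_folder(filenames, capability):
    """Return a list of problems; an empty list means the layout looks valid."""
    problems = []
    if "config.json" not in filenames:  # must be exactly this filename
        problems.append("missing config.json")
    if not any(f.startswith("tokenizer") for f in filenames):
        problems.append("missing tokenizer files")
    if not any(f.endswith((".safetensors", ".bin")) for f in filenames):
        problems.append("missing model weights")
    if capability not in ALLOWED_CAPABILITIES:
        problems.append(f"unsupported capability: {capability}")
    return problems

print(check_model_folder(
    ["config.json", "tokenizer.json", "model.safetensors"], "TEXT_TO_TEXT"))  # → []
```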

Dedicated AI Cluster Requirements

OCI provides a variety of GPU cluster options depending on the model size.
Examples include:

| Cluster Unit | GPU Type | Units | AI Unit Count |
| --- | --- | --- | --- |
| A10_X1 | NVIDIA A10 | 1 | 1.77 |
| A100_80G_X4 | NVIDIA A100 80GB | 4 | 12.96 |
| H100_X8 | NVIDIA H100 | 8 | 48.08 |
| H200_X8 | NVIDIA H200 | 8 | 49.76 |

Pricing = AI Unit Count × Price per AI Unit Hour (shown on OCI pricing page).
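The pricing formula is a straightforward multiplication. The sketch below uses the AI unit counts from the table; the per-unit price is a placeholder, so substitute the current figure from the OCI pricing page.

```python
# Hourly cost = AI Unit Count for the cluster unit x price per AI Unit hour.
# Unit counts come from the table above; the price used below is purely
# illustrative -- check the OCI pricing page for the real rate.
AI_UNIT_COUNTS = {
    "A10_X1": 1.77,
    "A100_80G_X4": 12.96,
    "H100_X8": 48.08,
    "H200_X8": 49.76,
}

def hourly_cost(cluster_unit, price_per_ai_unit_hour):
    return AI_UNIT_COUNTS[cluster_unit] * price_per_ai_unit_hour

# Example with an illustrative $2.00 per AI-unit-hour price:
print(f"${hourly_cost('H100_X8', 2.00):.2f}/hour")  # → $96.16/hour
```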

Step-by-Step: How to Import and Deploy Your Model

This workflow makes the process simple and intuitive for teams new to model deployment.

Step 1: Import the Model

Choose one of two options:

Option A: Directly from Hugging Face

Provide:

  • Model name

  • Optional: HF token

OCI automatically fetches and validates the model files.

Option B: From OCI Object Storage

Upload your Hugging Face–format model to a bucket and initiate the import.

Step 2: Create a Hosting Dedicated AI Cluster

  • Select the compartment

  • Choose the model architecture

  • Pick the recommended cluster unit size

  • Acknowledge the compute-hour commitment

  • Deploy the cluster

Within minutes, the cluster becomes active.

Step 3: Create an Endpoint

Endpoints let you interact with the model securely.

Configure:

  • Compartment

  • Endpoint name

  • Model & version

  • Hosting cluster

  • Networking (public endpoint for imported models)

  • Tags (optional)

Once active, your endpoint is ready for use.

Step 4: Use the Model

You can now use your imported model via:

  • OCI Generative AI Playground

  • API calls

  • SDKs (Python, Java, OCI CLI, etc.)

This means instant integration with:

  • Chat apps

  • RAG systems

  • Enterprise copilots

  • Embedding pipelines

  • Backend services
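To make the integration concrete, here is a sketch of the kind of chat request body sent to a dedicated endpoint. The field names follow the general shape of the Generative AI chat API, but treat them as illustrative: in practice the OCI SDKs (Python, Java) or the OCI CLI construct and sign the request for you, and the endpoint OCID below is a placeholder.

```python
import json

# Placeholder OCID -- substitute the OCID of your own active endpoint.
ENDPOINT_OCID = "ocid1.generativeaiendpoint.oc1..example"

def build_chat_request(prompt, max_tokens=512, temperature=0.2):
    """Assemble an illustrative chat request body for a dedicated endpoint.

    A DEDICATED serving mode targets the endpoint created in Step 3 rather
    than an on-demand pretrained model.
    """
    return {
        "servingMode": {
            "servingType": "DEDICATED",
            "endpointId": ENDPOINT_OCID,
        },
        "chatRequest": {
            "apiFormat": "GENERIC",
            "messages": [
                {"role": "USER", "content": [{"type": "TEXT", "text": prompt}]}
            ],
            "maxTokens": max_tokens,
            "temperature": temperature,
        },
    }

body = build_chat_request("Summarize our Q3 support tickets.")
print(json.dumps(body, indent=2))
```

The same body shape works from any backend service; only the signing and transport differ between the SDKs and raw signed HTTP calls.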

Using the Model in the Playground

Once the endpoint is active:

  1. Navigate to Endpoints

  2. Select the endpoint

  3. Click “View in Playground”

  4. Start sending messages or prompts

Playground displays your model as: `<model-name> (<endpoint-name>)`

This helps teams test and compare models before deploying them into production.

Enterprise-Ready Controls

Imported models support:

  • Public endpoints

  • Monitoring

  • Logging

  • On-demand scaling

  • Fine-grained IAM security

  • Network isolation options

Note:
Guardrails (Content Moderation, PII Protection, Prompt Injection Protection) currently apply only to pretrained & custom models — not imported models.

Who Should Use This Feature?

This release is ideal for:

  • Enterprises building RAG or conversational AI

  • AI/ML teams wanting to host LLMs inside their own cloud boundary

  • Developers needing full control over model selection

  • Organizations migrating from Hugging Face, OpenAI, or on-prem LLM deployments

  • Teams optimizing cost using flexible GPU cluster options

The Future of Bring-Your-Own-Model on OCI

This release marks the beginning of a new era in how organizations deploy AI on OCI. Combined with OCI’s high-performance GPUs, low-cost network, and enterprise-grade security, customers can now:

  • Train → Fine-tune → Import → Host → Deploy → Integrate
    all within a single cloud ecosystem.

OCI is quickly becoming a top choice for scalable, secure, and flexible enterprise generative AI deployment.

Conclusion

The Import Your Own Model capability in OCI Generative AI empowers businesses to bring the best of the open-source AI ecosystem into their secure cloud environment. It blends flexibility, performance, and cost efficiency — giving companies complete control over their AI strategy.

Whether you're developing an enterprise chatbot, powering a search engine with embeddings, or running multimodal use cases, OCI now gives you everything you need end-to-end.
