Oracle AI Data Platform: AI + ML in a Single Workflow: Intelligent Cash Forecasting
How I Built a First-of-its-Kind Treasury Intelligence Solution on OCI AI Data Platform
Oracle EBS • OCI AI Data Platform • PySpark • Prophet • Google Gemini
Oracle EBS is the backbone of financial operations for
thousands of enterprises. It contains everything: accounts payable outflows,
accounts receivable inflows, bank statement balances from Cash Management, and
transaction-level detail across dozens of operating units. But that data is
locked in normalized relational tables, designed for transactional processing —
not analytical intelligence.
Extracting meaningful cash forecasts traditionally means:
• Manual SQL queries across AP, AR, and CE modules
• Export to spreadsheets for time-series aggregation
• Separate Python or R environments for forecasting models
• Human interpretation of results before distribution
Each hand-off introduces delays, errors, and inconsistency. We
set out to eliminate every single one of them.
The Architecture: One Platform, Two Disciplines
OCI AI Data Platform provides a managed, serverless
environment where PySpark notebooks, AI model inference, and ML libraries
coexist in the same runtime. This is the critical enabler — you are not
stitching together separate services. Everything runs in one place.
KEY: This is the first solution we are aware of that combines real-time AI narrative generation (via a large language model) and time-series ML forecasting (via Meta’s Prophet) in a single Oracle EBS cash management workflow on OCI AI Data Platform.
The architecture has three distinct layers:
| Layer | Technology & Purpose |
|---|---|
| Data Layer | Oracle EBS views (ai_cm_cash_forecast, ai_cm_bank_balance) built on AP, AR, and CE tables — federated via OCI Data Catalog as oracleebs_internal |
| AI Layer | Google Gemini 2.5 Flash Lite invoked via OCI’s query_model() SQL function — generates per-org treasury narrative inline in Spark SQL |
| ML Layer | Meta Prophet time-series forecasting — trains on historical inflow/outflow patterns and projects 90-day rolling cash positions |
Step 1: Building the Oracle EBS Data Foundation
The foundation of the solution is a set of purpose-built views
in Oracle EBS that abstract away the complexity of the underlying schema. These
were designed following the principle of least privilege — a dedicated AI user
is granted select access only on specific views, with no direct table access.
Cash Outflow from Accounts Payable
The ai_cm_ap_cash_outflow view surfaces all unpaid supplier
invoices (payment_status_flag = 'N') with their due dates, giving us the
forward-looking cash demand picture from the AP module. It joins across
ap_invoices_all, ap_suppliers, ap_payment_schedules_all, and
hr_all_organization_units_tl to provide org-resolved, vendor-named outflow
data.
Sample Code
CREATE OR REPLACE VIEW ai_cm_ap_cash_outflow AS
SELECT
    ai.org_id,
    hao.name AS org_name,
    ai.invoice_num,
    aps.vendor_name,
    ai.invoice_amount,
    ai.invoice_currency_code,
    apsa.due_date,
    ai.payment_status_flag,
    ai.invoice_date
FROM ap_invoices_all ai
JOIN ap_suppliers aps
    ON ai.vendor_id = aps.vendor_id
JOIN ap_payment_schedules_all apsa
    ON ai.invoice_id = apsa.invoice_id
JOIN hr_all_organization_units_tl hao
    ON hao.organization_id = ai.org_id
    AND hao.language = 'US'
WHERE ai.payment_status_flag = 'N';

GRANT SELECT ON ai_cm_ap_cash_outflow TO ai;
Cash Inflow from Accounts Receivable
The ai_cm_ar_cash_inflow view pulls all open customer invoices
with remaining balances from AR, joined to hz_cust_accounts for customer name
resolution. The due_date and amount_due_remaining fields provide the forward
inflow timeline.
Sample Code
CREATE OR REPLACE VIEW ai_cm_ar_cash_inflow AS
SELECT
    rcta.org_id,
    hao.name AS org_name,
    rcta.trx_number,
    rcta.trx_date,
    aps.amount_due_original,
    aps.amount_due_remaining,
    hca.account_name AS customer_name,
    rcta.invoice_currency_code,
    aps.due_date
FROM ra_customer_trx_all rcta
JOIN ar_payment_schedules_all aps
    ON rcta.customer_trx_id = aps.customer_trx_id
JOIN hz_cust_accounts hca
    ON rcta.bill_to_customer_id = hca.cust_account_id
JOIN hr_all_organization_units_tl hao
    ON hao.organization_id = rcta.org_id
    AND hao.language = 'US'
WHERE aps.amount_due_remaining > 0;

GRANT SELECT ON ai_cm_ar_cash_inflow TO ai;
Current Bank Balance from Cash Management
The ai_cm_bank_balance view retrieves the latest bank
statement closing balance for each account using a correlated subquery on
ce_statement_headers. This ensures we always start the forecast from the most
recent known position — not a stale or average balance.
Sample Code
CREATE OR REPLACE VIEW ai_cm_bank_balance AS
SELECT
    ba.account_owner_org_id AS org_id,
    hao.name AS org_name,
    ba.bank_account_name,
    ba.bank_account_num,
    ba.currency_code,
    csh.statement_number,
    csh.statement_date,
    csh.control_end_balance AS bank_balance
FROM ce_bank_accounts ba
JOIN ce_statement_headers csh
    ON ba.bank_account_id = csh.bank_account_id
JOIN hr_all_organization_units_tl hao
    ON hao.organization_id = ba.account_owner_org_id
    AND hao.language = 'US'
WHERE csh.statement_date = (
    SELECT MAX(csh2.statement_date)
    FROM ce_statement_headers csh2
    WHERE csh2.bank_account_id = csh.bank_account_id
);

GRANT SELECT ON ai_cm_bank_balance TO ai;
Unified Cash Forecast View
Finally, ai_cm_cash_forecast is a UNION ALL view that combines
inflows and outflows into a single typed dataset with columns: org_id,
org_name, cash_type (INFLOW/OUTFLOW), reference_number, party_name, amount, and
cash_date. This single view is the entry point for all analytics upstream.
Sample Code
CREATE OR REPLACE VIEW ai_cm_cash_forecast AS
SELECT
    org_id,
    org_name,
    'OUTFLOW' AS cash_type,
    invoice_num AS reference_number,
    vendor_name AS party_name,
    invoice_amount AS amount,
    due_date AS cash_date
FROM ai_cm_ap_cash_outflow
UNION ALL
SELECT
    org_id,
    org_name,
    'INFLOW' AS cash_type,
    trx_number AS reference_number,
    customer_name AS party_name,
    amount_due_remaining AS amount,
    due_date AS cash_date
FROM ai_cm_ar_cash_inflow;

-- Same least-privilege pattern as the other views
GRANT SELECT ON ai_cm_cash_forecast TO ai;
Step 2: Spark SQL Aggregation and Position Calculation
The PySpark notebook ingests from the federated catalog
(oracleebs_internal.apps) and immediately builds a current cash position
summary using a UNION ALL pattern inside Spark SQL:
SQL:
SELECT org_id, org_name,
    SUM(INFLOW) AS inflow,
    SUM(OUTFLOW) AS outflow,
    SUM(BANKBALANCE) AS bank_balance,
    (bank_balance + inflow - outflow) AS net_position
FROM (… UNION ALL …)
GROUP BY org_id, org_name
The result is a clean, org-level cash position table
registered as a Spark temporary view (cash_position_view) that feeds directly
into the AI analysis step. This is deliberate — the temporary view acts as the
bridge between the data transformation layer and the generative AI layer.
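The position calculation itself is simple enough to sketch in plain Python. The rows below are hypothetical stand-ins for the federated EBS data; in the notebook, the equivalent aggregation runs in Spark SQL over the UNION ALL result:

```python
from collections import defaultdict

# Hypothetical rows mirroring the unioned inflow/outflow/balance dataset:
# (org_id, org_name, cash_type, amount)
rows = [
    (101, "Vision Operations", "INFLOW", 120_000.0),
    (101, "Vision Operations", "OUTFLOW", 85_000.0),
    (101, "Vision Operations", "BANKBALANCE", 40_000.0),
    (102, "Vision Services", "INFLOW", 30_000.0),
    (102, "Vision Services", "OUTFLOW", 55_000.0),
    (102, "Vision Services", "BANKBALANCE", 10_000.0),
]

# Pivot the UNION ALL-style rows into per-org totals, as the Spark SQL does.
totals = defaultdict(lambda: {"INFLOW": 0.0, "OUTFLOW": 0.0, "BANKBALANCE": 0.0})
names = {}
for org_id, org_name, cash_type, amount in rows:
    totals[org_id][cash_type] += amount
    names[org_id] = org_name

# net_position = bank_balance + inflow - outflow, per organization
positions = {
    org_id: {
        "org_name": names[org_id],
        "net_position": t["BANKBALANCE"] + t["INFLOW"] - t["OUTFLOW"],
        **t,
    }
    for org_id, t in totals.items()
}

print(positions[101]["net_position"])  # 75000.0
print(positions[102]["net_position"])  # -15000.0
```

Note how org 102 surfaces a negative net position, which is exactly the kind of signal the downstream AI narrative step is asked to interpret.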
Step 3: AI Narrative Generation via query_model()
This is where the solution becomes genuinely novel. OCI AI
Data Platform exposes a SQL function called query_model() that invokes a hosted
large language model — in our case, Google Gemini 2.5 Flash Lite — inline
within a Spark SQL SELECT statement.
We constructed a dynamic prompt per organization using
CONCAT(), passing the bank balance, inflow, outflow, and net position as
context. The LLM was instructed to act as an Oracle EBS Treasury Analyst and
produce a concise 2-3 sentence treasury narrative covering:
• Current liquidity position assessment
• Inflow vs. outflow risk interpretation
• Recommended treasury action
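A rough sketch of how the per-org statement might be assembled in the notebook follows. The query_model() argument order and the model identifier string are assumptions for illustration; the exact signature should be taken from the OCI AI Data Platform documentation:

```python
# Build the Spark SQL statement that embeds the dynamic CONCAT() prompt.
# MODEL is a hypothetical identifier -- check your platform's model catalog.
MODEL = "google.gemini-2.5-flash-lite"

prompt_sql = """
SELECT
  org_id,
  org_name,
  query_model(
    '{model}',
    CONCAT(
      'You are an Oracle EBS Treasury Analyst. In 2-3 sentences, assess ',
      'current liquidity, inflow vs. outflow risk, and a recommended action. ',
      'Bank balance: ', CAST(bank_balance AS STRING),
      ', Inflow: ', CAST(inflow AS STRING),
      ', Outflow: ', CAST(outflow AS STRING),
      ', Net position: ', CAST(net_position AS STRING)
    )
  ) AS treasury_analysis
FROM cash_position_view
""".format(model=MODEL)

# In the notebook this would run as: ai_df = spark.sql(prompt_sql)
print("query_model" in prompt_sql and "cash_position_view" in prompt_sql)  # True
```

Because the prompt is just another SQL expression, Spark evaluates it once per row, which is what lets the inference fan out across all organizations in parallel.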
INSIGHT: No Python code. No API calls from application logic. The AI inference happens entirely inside Spark SQL, processed in parallel across all organisations simultaneously. This is the architectural breakthrough.
The resulting ai_df DataFrame contains one row per
organisation with a treasury_analysis column containing a human-readable,
context-aware financial narrative. This is persisted to a Delta table
(cm_treasury_explanations) for downstream consumption by dashboards, reports,
or email distribution.
Step 4: ML Time-Series Forecasting with Meta Prophet
While the AI layer handles current-state narrative, the ML
layer handles forward-looking prediction. Meta’s Prophet library — designed
specifically for business time-series data with seasonality and trend
components — is installed at runtime and used to train separate models for
inflows and outflows.
Data Preparation
The raw cash forecast data is grouped by org_id, org_name, and
cash_date, aggregating inflows and outflows separately. This produces a daily
time series per organisation. The PySpark DataFrame is then converted to Pandas
for Prophet consumption — a deliberate boundary between distributed Spark
processing (for scale) and single-node ML training (for algorithmic richness).
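The shape of that preparation step can be sketched with plain Python below (hypothetical rows; the real pipeline does the grouping in PySpark and then crosses the toPandas() boundary before handing the series to Prophet, which expects ds/y columns):

```python
from collections import defaultdict
from datetime import date

# Hypothetical ai_cm_cash_forecast rows: (org_id, cash_date, cash_type, amount)
rows = [
    (101, date(2024, 6, 1), "INFLOW", 1000.0),
    (101, date(2024, 6, 1), "INFLOW", 500.0),
    (101, date(2024, 6, 1), "OUTFLOW", 300.0),
    (101, date(2024, 6, 2), "OUTFLOW", 200.0),
]

# Aggregate inflows and outflows separately per org per day.
daily = defaultdict(lambda: {"inflow": 0.0, "outflow": 0.0})
for org_id, cash_date, cash_type, amount in rows:
    key = (org_id, cash_date)
    daily[key]["inflow" if cash_type == "INFLOW" else "outflow"] += amount

# One series per model: Prophet's convention is a ds (date) and y (value) pair.
inflow_series = [
    {"ds": d, "y": v["inflow"]} for (org, d), v in sorted(daily.items())
]
print(inflow_series[0])  # {'ds': datetime.date(2024, 6, 1), 'y': 1500.0}
```

The inflow and outflow series are kept separate on purpose: each feeds its own Prophet model in the next step.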
Model Training and Forecasting
Two separate Prophet models are trained per organisation: one
on historical inflow patterns, one on outflows. Both generate 90-day forward
projections (yhat). The forecast logic then simulates a rolling balance:
LOGIC: for each future day, balance = balance + forecast_inflow(day) − forecast_outflow(day), starting from the latest actual bank balance retrieved from Cash Management.
This produces a forecast_bank_balance series — a day-by-day
projected cash position over the next 90 days — which is the core deliverable
of the ML pipeline.
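The rolling-balance simulation itself is a short loop. The yhat values below are hypothetical stand-ins for Prophet's daily forecasts, and the starting balance plays the role of the latest ai_cm_bank_balance figure:

```python
# Latest actual closing balance from Cash Management (hypothetical value).
starting_balance = 40_000.0

# Hypothetical Prophet yhat outputs for the first three future days.
forecast_inflow = [1_200.0, 900.0, 1_500.0]   # inflow model
forecast_outflow = [800.0, 1_100.0, 600.0]    # outflow model

# Roll the balance forward: each day's position feeds the next.
forecast_bank_balance = []
balance = starting_balance
for inflow, outflow in zip(forecast_inflow, forecast_outflow):
    balance = balance + inflow - outflow
    forecast_bank_balance.append(balance)

print(forecast_bank_balance)  # [40400.0, 40200.0, 41100.0]
```

Extended over the full 90-day horizon, this list is the forecast_bank_balance series persisted in the next step.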
Persistence
The 90-day forecast is converted back to a Spark DataFrame and
written to a Delta table (cm_treasury_forecast) alongside the source org
metadata. This table is available for BI reporting, alerting, or further AI
annotation in subsequent pipeline runs.
Why This Is Genuinely First-of-its-Kind
There are many cash forecasting solutions. There are many
LLM-powered analytics tools. But the combination of these capabilities — in the
same notebook, on the same platform, operating on live Oracle EBS data — is
what makes this architecture distinctive.
| Capability | What Makes It Unique |
|---|---|
| Federated EBS Access | Live Oracle EBS data via OCI Data Catalog — no ETL, no staging database, no manual export |
| In-SQL AI Inference | query_model() runs Gemini inside Spark SQL — no external API calls, no application middleware |
| Same-Notebook AI + ML | LLM narrative generation and Prophet time-series forecasting in a single PySpark workflow |
| Zero Operational Gap | From raw EBS transactions to AI narrative + 90-day forecast in one automated execution |
| Multi-Org Parallel | All operating units processed simultaneously — Spark distributes the AI inference at scale |
Technical Considerations and Lessons Learned
Building this solution surfaced several technical nuances
worth capturing for others on a similar path.
Column Ambiguity in Multi-Source Joins
When joining PySpark DataFrames derived from multiple sources
that share column names (such as org_name appearing in both ai_cm_cash_forecast
and ai_cm_bank_balance), Spark retains all copies and raises an
AMBIGUOUS_REFERENCE error at resolution time. The fix is to explicitly select
only the columns needed from the joining DataFrame before the join, eliminating
the duplicate at source rather than at query time.
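The same failure mode can be reproduced outside Spark. The sqlite3 sketch below (hypothetical tables and data) shows an unqualified shared column failing at resolution time, and the fix of projecting only the needed columns from the joining side before the join:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cash_forecast (org_id INTEGER, org_name TEXT, amount REAL);
    CREATE TABLE bank_balance  (org_id INTEGER, org_name TEXT, balance REAL);
    INSERT INTO cash_forecast VALUES (101, 'Vision Operations', 5000.0);
    INSERT INTO bank_balance  VALUES (101, 'Vision Operations', 20000.0);
""")

# Unqualified reference to a column present on both sides fails -- the
# SQLite analogue of Spark's AMBIGUOUS_REFERENCE.
ambiguous = False
try:
    conn.execute(
        "SELECT org_name FROM cash_forecast cf "
        "JOIN bank_balance bb ON cf.org_id = bb.org_id"
    )
except sqlite3.OperationalError:
    ambiguous = True

# Fix: project only the columns needed from the joining side before the join,
# eliminating the duplicate org_name at source.
rows = conn.execute("""
    SELECT cf.org_id, cf.org_name, cf.amount, bb.balance
    FROM cash_forecast cf
    JOIN (SELECT org_id, balance FROM bank_balance) bb
      ON cf.org_id = bb.org_id
""").fetchall()
print(ambiguous, rows)  # True [(101, 'Vision Operations', 5000.0, 20000.0)]
```

In PySpark the equivalent is a .select() on the second DataFrame that drops org_name before the .join() call.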
Pandas-Spark Boundary
Prophet requires Pandas DataFrames. The conversion from Spark
to Pandas (toPandas()) is the correct approach for single-org ML training but
would not scale for very large organisations or high-frequency data. For
production deployments at scale, consider Spark’s Pandas UDF or distributed
forecasting libraries.
Bank Balance Anchor
The forecast accuracy depends heavily on the quality of the
bank balance starting point. The ai_cm_bank_balance view uses a correlated
subquery to retrieve the most recent statement date per account — ensuring the
rolling balance simulation always starts from the latest known position.
Prompt Engineering for Financial Narratives
The quality of Gemini’s treasury analysis is directly
correlated to prompt specificity. Instructing the model with a persona (Oracle
EBS Treasury Analyst), output constraints (2-3 sentences, under 300 words), and
structured financial context (balance, inflow, outflow, net position) produces
consistently professional and actionable narratives.
Conclusion
This solution is a demonstration that enterprise AI and enterprise ML no
longer need to live in separate worlds — separate platforms, separate teams,
separate pipelines. OCI AI Data Platform makes it possible to ask the same
question in two fundamentally different ways in the same workflow: What does
the data mean right now? (AI) and What will the data look like in 90 days? (ML)
— and deliver both answers in a single automated execution against live Oracle
EBS data.
For Oracle EBS customers exploring AI adoption, this
architecture offers a compelling blueprint: start with clean federated views,
leverage in-platform model inference, and combine generative AI with proven ML
forecasting to deliver treasury intelligence that is accurate, timely, and
genuinely useful.
NOTE: The full solution, including Oracle EBS view scripts, the PySpark forecasting notebook, and OCI AI Data Platform configuration guidance, is available for reference. Contact the author for implementation details or to discuss adapting this pattern to other Oracle EBS modules.
Built on OCI AI Data Platform • Oracle EBS R12 • PySpark • Meta Prophet • Google Gemini 2.5