MARS-835: Required External Tables for Smart Column Mapping
Overview
This document lists all external tables required for MARS-835 data exports using DATA_EXPORTER v2.4.0 with the Smart Column Mapping feature.
Purpose: Smart Column Mapping ensures CSV files are generated with columns in the EXACT order expected by external tables, preventing NULL values due to Oracle's positional CSV mapping.
Required External Tables
Group 1: DATA Bucket (CSV Format) - CRITICAL
1. ODS.CSDB_DEBT_DATA_ODS
- Source Table: OU_CSDB.LEGACY_DEBT
- Format: CSV
- Bucket: DATA (mrds_data_dev/ODS/CSDB/CSDB_DEBT/)
- Key Column Mapping: A_ETL_LOAD_SET_FK → A_WORKFLOW_HISTORY_KEY (must be at position 2 for CSV)
- Critical: Must use Smart Column Mapping to avoid NULL values in A_WORKFLOW_HISTORY_KEY
2. ODS.CSDB_DEBT_DAILY_DATA_ODS
- Source Table: OU_CSDB.LEGACY_DEBT_DAILY
- Format: CSV
- Bucket: DATA (mrds_data_dev/ODS/CSDB/CSDB_DEBT_DAILY/)
- Key Column Mapping: A_ETL_LOAD_SET_FK → A_WORKFLOW_HISTORY_KEY (must be at position 2 for CSV)
- Critical: Must use Smart Column Mapping to avoid NULL values in A_WORKFLOW_HISTORY_KEY
Group 2: ARCHIVE Bucket (Parquet Format) - RECOMMENDED
3. ODS.CSDB_DEBT_ARCHIVE
- Source Table: OU_CSDB.LEGACY_DEBT
- Format: Parquet with Hive partitioning
- Bucket: ARCHIVE (mrds_hist_dev/ARCHIVE/CSDB/CSDB_DEBT/)
- Key Column Mapping: A_ETL_LOAD_SET_FK → A_WORKFLOW_HISTORY_KEY
- Note: Parquet uses schema-based mapping (column order less critical but Smart Column Mapping ensures consistency)
4. ODS.CSDB_DEBT_DAILY_ARCHIVE
- Source Table: OU_CSDB.LEGACY_DEBT_DAILY
- Format: Parquet with Hive partitioning
- Bucket: ARCHIVE (mrds_hist_dev/ARCHIVE/CSDB/CSDB_DEBT_DAILY/)
- Key Column Mapping: A_ETL_LOAD_SET_FK → A_WORKFLOW_HISTORY_KEY
5. ODS.CSDB_INSTR_RAT_FULL_ARCHIVE
- Source Table: OU_CSDB.LEGACY_INSTR_RAT_FULL
- Format: Parquet with Hive partitioning
- Bucket: ARCHIVE (mrds_hist_dev/ARCHIVE/CSDB/CSDB_INSTR_RAT_FULL/)
- Key Column Mapping: A_ETL_LOAD_SET_FK → A_WORKFLOW_HISTORY_KEY
6. ODS.CSDB_INSTR_DESC_FULL_ARCHIVE
- Source Table: OU_CSDB.LEGACY_INSTR_DESC_FULL
- Format: Parquet with Hive partitioning
- Bucket: ARCHIVE (mrds_hist_dev/ARCHIVE/CSDB/CSDB_INSTR_DESC_FULL/)
- Key Column Mapping: A_ETL_LOAD_SET_FK → A_WORKFLOW_HISTORY_KEY
7. ODS.CSDB_ISSUER_RAT_FULL_ARCHIVE
- Source Table: OU_CSDB.LEGACY_ISSUER_RAT_FULL
- Format: Parquet with Hive partitioning
- Bucket: ARCHIVE (mrds_hist_dev/ARCHIVE/CSDB/CSDB_ISSUER_RAT_FULL/)
- Key Column Mapping: A_ETL_LOAD_SET_FK → A_WORKFLOW_HISTORY_KEY
8. ODS.CSDB_ISSUER_DESC_FULL_ARCHIVE
- Source Table: OU_CSDB.LEGACY_ISSUER_DESC_FULL
- Format: Parquet with Hive partitioning
- Bucket: ARCHIVE (mrds_hist_dev/ARCHIVE/CSDB/CSDB_ISSUER_DESC_FULL/)
- Key Column Mapping: A_ETL_LOAD_SET_FK → A_WORKFLOW_HISTORY_KEY
External Table Column Order Requirements
CRITICAL for CSV Tables (DATA bucket):
All CSV external tables MUST have A_WORKFLOW_HISTORY_KEY at position 2:
Position 1: A_KEY (NUMBER)
Position 2: A_WORKFLOW_HISTORY_KEY (NUMBER) ← MUST BE HERE!
Position 3+: Other columns in any order
Reason: Oracle external tables in CSV format map fields by position and ignore the header row. If the source table has A_ETL_LOAD_SET_FK at position 72 and the CSV keeps it there, while the external table expects A_WORKFLOW_HISTORY_KEY at position 2, the external table reads whatever field sits at position 2 (possibly a DATE value) as NUMBER. The conversion fails and the column is returned as NULL.
Solution: Smart Column Mapping (v2.4.0) generates CSV columns in EXTERNAL TABLE order, ensuring position 2 has the correct NUMBER value.
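The failure mode and the fix described above can be illustrated with a small simulation. This is only a sketch in Python, not DATA_EXPORTER code: the three-column layouts, the ISSUE_DATE column, the sample values, and all function names are hypothetical, and a failed conversion is modelled as None to match the NULL behaviour described in this section.

```python
import csv
import io

# Hypothetical, heavily shortened column lists (the real tables are much wider).
SOURCE_COLUMNS = ["A_KEY", "ISSUE_DATE", "A_ETL_LOAD_SET_FK"]
EXTERNAL_COLUMNS = ["A_KEY", "A_WORKFLOW_HISTORY_KEY", "ISSUE_DATE"]
EXTERNAL_TYPES = {"A_KEY": int, "A_WORKFLOW_HISTORY_KEY": int, "ISSUE_DATE": str}
# Source-to-target rename taken from the table list in this document.
RENAMES = {"A_ETL_LOAD_SET_FK": "A_WORKFLOW_HISTORY_KEY"}


def write_csv(columns, rows):
    """Write rows (dicts keyed by column name) as CSV text in the given column order."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        writer.writerow([row[c] for c in columns])
    return buf.getvalue()


def read_positionally(csv_text):
    """Skip the header and map field N to external column N, as a positional reader does."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # the header row is ignored
    records = []
    for fields in reader:
        record = {}
        for name, raw in zip(EXTERNAL_COLUMNS, fields):
            try:
                record[name] = EXTERNAL_TYPES[name](raw)
            except ValueError:
                record[name] = None  # failed conversion, modelled as NULL
        records.append(record)
    return records


def rename_for_external(rows):
    """Rename source columns to the names used by the external table."""
    return [{RENAMES.get(k, k): v for k, v in row.items()} for row in rows]


source_rows = [{"A_KEY": 1, "ISSUE_DATE": "2024-01-31", "A_ETL_LOAD_SET_FK": 9001}]

# Export in source order: the date lands in position 2.
naive = read_positionally(write_csv(SOURCE_COLUMNS, source_rows))
# Export in external table order: the workflow key lands in position 2.
mapped = read_positionally(write_csv(EXTERNAL_COLUMNS, rename_for_external(source_rows)))

print(naive[0]["A_WORKFLOW_HISTORY_KEY"])   # None
print(mapped[0]["A_WORKFLOW_HISTORY_KEY"])  # 9001
```

In the source-order export the date string cannot be converted to a number, so the key comes back empty, and the real key value is silently read into the wrong column. Writing the file in external table order removes both problems.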
OPTIONAL for Parquet Tables (ARCHIVE bucket):
Parquet format uses schema-based mapping (column names). Column order doesn't matter, but Smart Column Mapping provides consistency.
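Name-based resolution can be sketched in the same way. The snippet below is a simplified Python model, not a Parquet reader: read_by_name and the sample values are hypothetical. It also suggests, as an assumption rather than documented behaviour, why the A_ETL_LOAD_SET_FK → A_WORKFLOW_HISTORY_KEY rename would still matter for Parquet even though column order does not.

```python
def read_by_name(column_names, values, wanted):
    """Resolve each wanted column by name, regardless of its position in the file."""
    by_name = dict(zip(column_names, values))
    return {name: by_name.get(name) for name in wanted}


wanted = ["A_KEY", "A_WORKFLOW_HISTORY_KEY"]

# Same data, two different physical column orders.
file_a = read_by_name(
    ["A_KEY", "A_WORKFLOW_HISTORY_KEY", "ISSUE_DATE"], [1, 9001, "2024-01-31"], wanted
)
file_b = read_by_name(
    ["ISSUE_DATE", "A_WORKFLOW_HISTORY_KEY", "A_KEY"], ["2024-01-31", 9001, 1], wanted
)
print(file_a == file_b)  # True

# A file that still carries the source column name has no match for the target name.
legacy = read_by_name(["A_KEY", "A_ETL_LOAD_SET_FK"], [1, 9001], wanted)
print(legacy["A_WORKFLOW_HISTORY_KEY"])  # None
```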
Creation Script Example
CSV External Table (CRITICAL - Correct Column Order)
-- Example: ODS.CSDB_DEBT_DATA_ODS
-- IMPORTANT: A_WORKFLOW_HISTORY_KEY must be at position 2!
BEGIN
    ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
        pTableName         => 'CSDB_DEBT_DATA_ODS',
        pTemplateTableName => 'CT_ET_TEMPLATES.CSDB_DEBT_TEMPLATE',
        pPrefix            => 'ODS/CSDB/CSDB_DEBT',
        pBucketUri         => CT_MRDS.ENV_MANAGER.gvDataBucketUri,
        pFormat            => 'CSV'  -- Uses positional mapping!
    );
END;
/

-- Verify column order (A_WORKFLOW_HISTORY_KEY should be position 2)
SELECT column_id, column_name, data_type
  FROM all_tab_columns
 WHERE table_name = 'CSDB_DEBT_DATA_ODS'
   AND owner = 'ODS'
 ORDER BY column_id;
Parquet External Table (Optional Column Order)
-- Example: ODS.CSDB_DEBT_ARCHIVE
-- Column order flexible (schema-based mapping)
BEGIN
    ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
        pTableName         => 'CSDB_DEBT_ARCHIVE',
        pTemplateTableName => 'CT_ET_TEMPLATES.CSDB_DEBT_TEMPLATE',
        pPrefix            => 'ARCHIVE/CSDB/CSDB_DEBT',
        pBucketUri         => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri,
        pFormat            => 'PARQUET'  -- Uses schema-based mapping
    );
END;
/
Template Tables Required
All external tables require corresponding template tables in CT_ET_TEMPLATES schema:
- CT_ET_TEMPLATES.CSDB_DEBT_TEMPLATE
- CT_ET_TEMPLATES.CSDB_DEBT_DAILY_TEMPLATE
- CT_ET_TEMPLATES.CSDB_INSTR_RAT_FULL_TEMPLATE
- CT_ET_TEMPLATES.CSDB_INSTR_DESC_FULL_TEMPLATE
- CT_ET_TEMPLATES.CSDB_ISSUER_RAT_FULL_TEMPLATE
- CT_ET_TEMPLATES.CSDB_ISSUER_DESC_FULL_TEMPLATE
Note: Template tables must be created by ADMIN or CT_ET_TEMPLATES user (MRDS_LOADER cannot create them).
Verification Checklist
Before running MARS-835 exports:
- All 8 external tables exist in ODS schema
- CSV tables (DATA bucket) have A_WORKFLOW_HISTORY_KEY at position 2
- Template tables exist in CT_ET_TEMPLATES schema
- MRDS_LOADER has EXECUTE privilege on ODS.FILE_MANAGER_ODS
- ODS schema has access to CT_MRDS.ENV_MANAGER for logging
- DATA_EXPORTER v2.4.0 deployed with Smart Column Mapping feature
Testing Verification
After export, verify A_WORKFLOW_HISTORY_KEY is not NULL:
-- CSV tables (should be 100% populated)
SELECT 'CSDB_DEBT_DATA_ODS' AS TABLE_NAME,
       COUNT(*) AS TOTAL_ROWS,
       COUNT(A_WORKFLOW_HISTORY_KEY) AS NON_NULL_COUNT,
       ROUND(COUNT(A_WORKFLOW_HISTORY_KEY) * 100.0 / NULLIF(COUNT(*), 0), 2) AS SUCCESS_RATE_PCT
  FROM ODS.CSDB_DEBT_DATA_ODS;

SELECT 'CSDB_DEBT_DAILY_DATA_ODS' AS TABLE_NAME,
       COUNT(*) AS TOTAL_ROWS,
       COUNT(A_WORKFLOW_HISTORY_KEY) AS NON_NULL_COUNT,
       ROUND(COUNT(A_WORKFLOW_HISTORY_KEY) * 100.0 / NULLIF(COUNT(*), 0), 2) AS SUCCESS_RATE_PCT
  FROM ODS.CSDB_DEBT_DAILY_DATA_ODS;

-- Parquet tables (should also be 100% populated)
SELECT 'CSDB_DEBT_ARCHIVE' AS TABLE_NAME,
       COUNT(*) AS TOTAL_ROWS,
       COUNT(A_WORKFLOW_HISTORY_KEY) AS NON_NULL_COUNT,
       ROUND(COUNT(A_WORKFLOW_HISTORY_KEY) * 100.0 / NULLIF(COUNT(*), 0), 2) AS SUCCESS_RATE_PCT
  FROM ODS.CSDB_DEBT_ARCHIVE;
Expected Result: SUCCESS_RATE_PCT = 100.00 for all tables
Related Documentation
- DATA_EXPORTER v2.4.0 Smart Column Mapping Examples
- Oracle External Tables Column Order Issue
- MARS-835 README
Last Updated: 2026-01-09
Author: GitHub Copilot (MARS-835 Update)