# MARS-835: Required External Tables for Smart Column Mapping

## Overview

This document lists all external tables required for MARS-835 data exports using DATA_EXPORTER v2.4.0 with the Smart Column Mapping feature.

**Purpose**: Smart Column Mapping ensures CSV files are generated with columns in the EXACT order expected by external tables, preventing NULL values due to Oracle's positional CSV mapping.

---

## Required External Tables

### Group 1: DATA Bucket (CSV Format) - **CRITICAL**

#### 1. ODS.CSDB_DEBT_DATA_ODS
- **Source Table**: OU_CSDB.LEGACY_DEBT
- **Format**: CSV
- **Bucket**: DATA (mrds_data_dev/ODS/CSDB/CSDB_DEBT/)
- **Key Column Mapping**: A_ETL_LOAD_SET_FK → A_WORKFLOW_HISTORY_KEY (must land at position 2)
- **Critical**: Must use Smart Column Mapping to avoid NULL values in A_WORKFLOW_HISTORY_KEY

#### 2. ODS.CSDB_DEBT_DAILY_DATA_ODS
- **Source Table**: OU_CSDB.LEGACY_DEBT_DAILY
- **Format**: CSV
- **Bucket**: DATA (mrds_data_dev/ODS/CSDB/CSDB_DEBT_DAILY/)
- **Key Column Mapping**: A_ETL_LOAD_SET_FK → A_WORKFLOW_HISTORY_KEY (must land at position 2)
- **Critical**: Must use Smart Column Mapping to avoid NULL values in A_WORKFLOW_HISTORY_KEY

---

### Group 2: ARCHIVE Bucket (Parquet Format) - **RECOMMENDED**

#### 3. ODS.CSDB_DEBT_ARCHIVE
- **Source Table**: OU_CSDB.LEGACY_DEBT
- **Format**: Parquet with Hive partitioning
- **Bucket**: ARCHIVE (mrds_hist_dev/ARCHIVE/CSDB/CSDB_DEBT/)
- **Key Column Mapping**: A_ETL_LOAD_SET_FK → A_WORKFLOW_HISTORY_KEY
- **Note**: Parquet uses schema-based mapping (column order is less critical, but Smart Column Mapping ensures consistency)

#### 4. ODS.CSDB_DEBT_DAILY_ARCHIVE
- **Source Table**: OU_CSDB.LEGACY_DEBT_DAILY
- **Format**: Parquet with Hive partitioning
- **Bucket**: ARCHIVE (mrds_hist_dev/ARCHIVE/CSDB/CSDB_DEBT_DAILY/)
- **Key Column Mapping**: A_ETL_LOAD_SET_FK → A_WORKFLOW_HISTORY_KEY

#### 5. ODS.CSDB_INSTR_RAT_FULL_ARCHIVE
- **Source Table**: OU_CSDB.LEGACY_INSTR_RAT_FULL
- **Format**: Parquet with Hive partitioning
- **Bucket**: ARCHIVE (mrds_hist_dev/ARCHIVE/CSDB/CSDB_INSTR_RAT_FULL/)
- **Key Column Mapping**: A_ETL_LOAD_SET_FK → A_WORKFLOW_HISTORY_KEY

#### 6. ODS.CSDB_INSTR_DESC_FULL_ARCHIVE
- **Source Table**: OU_CSDB.LEGACY_INSTR_DESC_FULL
- **Format**: Parquet with Hive partitioning
- **Bucket**: ARCHIVE (mrds_hist_dev/ARCHIVE/CSDB/CSDB_INSTR_DESC_FULL/)
- **Key Column Mapping**: A_ETL_LOAD_SET_FK → A_WORKFLOW_HISTORY_KEY

#### 7. ODS.CSDB_ISSUER_RAT_FULL_ARCHIVE
- **Source Table**: OU_CSDB.LEGACY_ISSUER_RAT_FULL
- **Format**: Parquet with Hive partitioning
- **Bucket**: ARCHIVE (mrds_hist_dev/ARCHIVE/CSDB/CSDB_ISSUER_RAT_FULL/)
- **Key Column Mapping**: A_ETL_LOAD_SET_FK → A_WORKFLOW_HISTORY_KEY

#### 8. ODS.CSDB_ISSUER_DESC_FULL_ARCHIVE
- **Source Table**: OU_CSDB.LEGACY_ISSUER_DESC_FULL
- **Format**: Parquet with Hive partitioning
- **Bucket**: ARCHIVE (mrds_hist_dev/ARCHIVE/CSDB/CSDB_ISSUER_DESC_FULL/)
- **Key Column Mapping**: A_ETL_LOAD_SET_FK → A_WORKFLOW_HISTORY_KEY

---

## External Table Column Order Requirements

### **CRITICAL for CSV Tables** (DATA bucket):

All CSV external tables MUST have **A_WORKFLOW_HISTORY_KEY at position 2**:

```
Position 1: A_KEY (NUMBER)
Position 2: A_WORKFLOW_HISTORY_KEY (NUMBER)  ← MUST BE HERE!
Position 3+: Other columns in any order
```

**Reason**: Oracle external tables in CSV format use **positional mapping**: fields are matched to columns by position, and the header row is ignored. If the source table has A_ETL_LOAD_SET_FK at position 72 and the CSV is written in source order, the value stays at position 72, while the external table expects A_WORKFLOW_HISTORY_KEY at position 2. The external table then tries to read whatever is at position 2 (which might be a DATE column) as NUMBER → conversion fails → NULL value.

**Solution**: Smart Column Mapping (v2.4.0) generates CSV columns in EXTERNAL TABLE order, ensuring position 2 has the correct NUMBER value.

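The position-2 requirement can be checked for both CSV tables in one pass. The query below is a minimal sketch, assuming the current user can see the ODS external tables through `ALL_TAB_COLUMNS`; the `STATUS` labels are illustrative and not part of any MARS-835 tooling.

```sql
-- Sketch: report which column sits at position 2 in each CSV external table.
-- Owner and table names come from the Group 1 list in this document;
-- the STATUS text is a hypothetical label chosen for this example.
SELECT table_name,
       column_name,
       data_type,
       CASE
         WHEN column_name = 'A_WORKFLOW_HISTORY_KEY' AND data_type = 'NUMBER'
           THEN 'OK'
         ELSE 'WRONG COLUMN AT POSITION 2'
       END AS status
FROM   all_tab_columns
WHERE  owner = 'ODS'
AND    table_name IN ('CSDB_DEBT_DATA_ODS', 'CSDB_DEBT_DAILY_DATA_ODS')
AND    column_id = 2
ORDER  BY table_name;
```

A table that returns no row is either missing or not visible to the current user, so an empty result for one of the two names should be treated as a failure as well.
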
### **OPTIONAL for Parquet Tables** (ARCHIVE bucket):

Parquet format uses **schema-based mapping** (column names). Column order doesn't matter, but Smart Column Mapping provides consistency.

---

## Creation Script Example

### CSV External Table (CRITICAL - Correct Column Order)

```sql
-- Example: ODS.CSDB_DEBT_DATA_ODS
-- IMPORTANT: A_WORKFLOW_HISTORY_KEY must be at position 2!
BEGIN
    ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
        pTableName         => 'CSDB_DEBT_DATA_ODS',
        pTemplateTableName => 'CT_ET_TEMPLATES.CSDB_DEBT_TEMPLATE',
        pPrefix            => 'ODS/CSDB/CSDB_DEBT',
        pBucketUri         => CT_MRDS.ENV_MANAGER.gvDataBucketUri,
        pFormat            => 'CSV'  -- Uses positional mapping!
    );
END;
/

-- Verify column order (A_WORKFLOW_HISTORY_KEY should be position 2)
SELECT column_id, column_name, data_type
FROM   all_tab_columns
WHERE  table_name = 'CSDB_DEBT_DATA_ODS'
AND    owner = 'ODS'
ORDER  BY column_id;
```

### Parquet External Table (Optional Column Order)

```sql
-- Example: ODS.CSDB_DEBT_ARCHIVE
-- Column order flexible (schema-based mapping)
BEGIN
    ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
        pTableName         => 'CSDB_DEBT_ARCHIVE',
        pTemplateTableName => 'CT_ET_TEMPLATES.CSDB_DEBT_TEMPLATE',
        pPrefix            => 'ARCHIVE/CSDB/CSDB_DEBT',
        pBucketUri         => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri,
        pFormat            => 'PARQUET'  -- Uses schema-based mapping
    );
END;
/
```

---

## Template Tables Required

All external tables require corresponding template tables in the CT_ET_TEMPLATES schema:

- `CT_ET_TEMPLATES.CSDB_DEBT_TEMPLATE`
- `CT_ET_TEMPLATES.CSDB_DEBT_DAILY_TEMPLATE`
- `CT_ET_TEMPLATES.CSDB_INSTR_RAT_FULL_TEMPLATE`
- `CT_ET_TEMPLATES.CSDB_INSTR_DESC_FULL_TEMPLATE`
- `CT_ET_TEMPLATES.CSDB_ISSUER_RAT_FULL_TEMPLATE`
- `CT_ET_TEMPLATES.CSDB_ISSUER_DESC_FULL_TEMPLATE`

**Note**: Template tables must be created by ADMIN or the CT_ET_TEMPLATES user (MRDS_LOADER cannot create them).

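To confirm that the six template tables listed above are in place, a query along the following lines may help. It is a sketch under two assumptions: that the templates are ordinary tables visible in `ALL_TABLES`, and that the current user has been granted access to them. The `FOUND`/`MISSING` labels are illustrative.

```sql
-- Sketch: list each expected template table and whether the current user can see it.
-- SYS.ODCIVARCHAR2LIST is a built-in collection type; TABLE() exposes its
-- elements through the COLUMN_VALUE pseudo-column.
SELECT expected.column_value AS template_table,
       CASE WHEN t.table_name IS NULL THEN 'MISSING' ELSE 'FOUND' END AS status
FROM   TABLE(sys.odcivarchar2list(
           'CSDB_DEBT_TEMPLATE',
           'CSDB_DEBT_DAILY_TEMPLATE',
           'CSDB_INSTR_RAT_FULL_TEMPLATE',
           'CSDB_INSTR_DESC_FULL_TEMPLATE',
           'CSDB_ISSUER_RAT_FULL_TEMPLATE',
           'CSDB_ISSUER_DESC_FULL_TEMPLATE'
       )) expected
LEFT JOIN all_tables t
       ON t.owner = 'CT_ET_TEMPLATES'
      AND t.table_name = expected.column_value
ORDER  BY expected.column_value;
```

Because `ALL_TABLES` only shows objects the current user may access, a `MISSING` row can mean either that the table does not exist or that the required grant is absent; running the query as ADMIN removes that ambiguity.
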
---

## Verification Checklist

Before running MARS-835 exports:

- [ ] All 8 external tables exist in ODS schema
- [ ] CSV tables (DATA bucket) have A_WORKFLOW_HISTORY_KEY at position 2
- [ ] Template tables exist in CT_ET_TEMPLATES schema
- [ ] MRDS_LOADER has EXECUTE privilege on ODS.FILE_MANAGER_ODS
- [ ] ODS schema has access to CT_MRDS.ENV_MANAGER for logging
- [ ] DATA_EXPORTER v2.4.0 deployed with Smart Column Mapping feature

---

## Testing Verification

After export, verify A_WORKFLOW_HISTORY_KEY is not NULL:

```sql
-- CSV tables (should be 100% populated)
SELECT 'CSDB_DEBT_DATA_ODS' AS TABLE_NAME,
       COUNT(*) AS TOTAL_ROWS,
       COUNT(A_WORKFLOW_HISTORY_KEY) AS NON_NULL_COUNT,
       ROUND(COUNT(A_WORKFLOW_HISTORY_KEY) * 100.0 / NULLIF(COUNT(*), 0), 2) AS SUCCESS_RATE_PCT
FROM   ODS.CSDB_DEBT_DATA_ODS;

SELECT 'CSDB_DEBT_DAILY_DATA_ODS' AS TABLE_NAME,
       COUNT(*) AS TOTAL_ROWS,
       COUNT(A_WORKFLOW_HISTORY_KEY) AS NON_NULL_COUNT,
       ROUND(COUNT(A_WORKFLOW_HISTORY_KEY) * 100.0 / NULLIF(COUNT(*), 0), 2) AS SUCCESS_RATE_PCT
FROM   ODS.CSDB_DEBT_DAILY_DATA_ODS;

-- Parquet tables (should also be 100% populated)
SELECT 'CSDB_DEBT_ARCHIVE' AS TABLE_NAME,
       COUNT(*) AS TOTAL_ROWS,
       COUNT(A_WORKFLOW_HISTORY_KEY) AS NON_NULL_COUNT,
       ROUND(COUNT(A_WORKFLOW_HISTORY_KEY) * 100.0 / NULLIF(COUNT(*), 0), 2) AS SUCCESS_RATE_PCT
FROM   ODS.CSDB_DEBT_ARCHIVE;
```

**Expected Result**: SUCCESS_RATE_PCT = 100.00 for all tables

---

## Related Documentation

- [DATA_EXPORTER v2.4.0 Smart Column Mapping Examples](../MARS-835-PREHOOK/current_version/v2.3.0/DATA_EXPORTER_v2.4.0_Smart_Column_Mapping_Examples.sql)
- [Oracle External Tables Column Order Issue](../../confluence/additions/Oracle_External_Tables_Column_Order_Issue.md)
- [MARS-835 README](README.md)

---

**Last Updated**: 2026-01-09

**Author**: GitHub Copilot (MARS-835 Update)