352 lines
13 KiB
Markdown
352 lines
13 KiB
Markdown
# New Table Setup Guide for FILE PROCESSOR System
|
|
|
|
This document describes the process of setting up new tables for the FILE PROCESSOR system when creating tables from scratch (without existing data migration).
|
|
|
|
## Overview
|
|
|
|
The new table setup process involves creating a complete table structure for the FILE PROCESSOR framework, which includes:
|
|
- Creating template tables for external table definitions
|
|
- Setting up external tables for different storage locations (INBOX, ODS, ARCHIVE)
|
|
- Configuring file processing rules
|
|
|
|
**Package Architecture Note:** The system uses two main packages:
|
|
- **FILE_MANAGER**: Handles file processing, validation, and external table creation
|
|
- **DATA_EXPORTER**: Handles data export operations (CSV and Parquet formats)
|
|
|
|
Since this guide covers new table setup without data migration, it primarily uses FILE_MANAGER procedures. For data export operations, refer to the `DATA_EXPORTER` package documentation.
|
|
|
|
## When to Use This Guide
|
|
|
|
Use this guide when:
|
|
- Creating completely new tables
|
|
- No existing data needs to be migrated
|
|
- Starting fresh with FILE PROCESSOR framework
|
|
- Setting up new data sources or file types
|
|
|
|
## Step-by-Step Setup Process
|
|
|
|
**Important:** The CT_MRDS.FILE_MANAGER package uses `AUTHID CURRENT_USER` clause, which means objects will be created in the schema of the user executing the procedures. Since our goal is to create external tables in the ODS schema, the CREATE_EXTERNAL_TABLE procedure must be run as the ODS user. Other procedures (ADD_SOURCE, ADD_SOURCE_FILE_CONFIG, ADD_COLUMN_DATE_FORMAT) can be executed from any user context.
|
|
|
|
**Workaround:** If you cannot connect as the ODS user, you can use the `ODS.FILE_MANAGER_ODS` package instead of `CT_MRDS.FILE_MANAGER`. The `FILE_MANAGER_ODS` package is a wrapper for the FILE_MANAGER package that uses `AUTHID DEFINER` instead of `AUTHID CURRENT_USER`, which means it will always create objects in the ODS schema regardless of which user executes the procedures.
|
|
|
|
Example using the workaround:
|
|
```sql
|
|
-- Instead of: CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(...)
|
|
-- Use: ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(...)
|
|
```
|
|
|
|
### Step 1: Create Template Table
|
|
|
|
Create a template table in the `CT_ET_TEMPLATES` schema with the desired structure:
|
|
|
|
```sql
|
|
CREATE TABLE CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME} (
|
|
COLUMN1 VARCHAR2(100),
|
|
COLUMN2 NUMBER(10,2),
|
|
COLUMN3 DATE,
|
|
SNAPSHOT_DATE DATE,
|
|
-- Add all required columns with appropriate data types
|
|
CONSTRAINT PK_{SOURCE}_{TABLE_NAME} PRIMARY KEY (COLUMN1)
|
|
);
|
|
```
|
|
|
|
**Purpose:**
|
|
- The template table defines the structure for external tables
|
|
- Define all columns with appropriate data types and constraints
|
|
- Located in `CT_ET_TEMPLATES` schema for centralized template management
|
|
- Will be used as a blueprint for external table creation
|
|
|
|
### Step 2: Configure FILE_MANAGER System
|
|
|
|
Set up the file processing configuration using FILE_MANAGER procedures before creating external tables, as the `CREATE_EXTERNAL_TABLE` procedure uses data from configuration tables:
|
|
|
|
```sql
|
|
-- Add source system if not exists
|
|
CALL CT_MRDS.FILE_MANAGER.ADD_SOURCE(
|
|
pSourceKey => '{SOURCE}',
|
|
pSourceName => '{Source System Description}'
|
|
);
|
|
|
|
-- Configure file type for processing
|
|
CALL CT_MRDS.FILE_MANAGER.ADD_SOURCE_FILE_CONFIG(
|
|
pSourceKey => '{SOURCE}',
|
|
pSourceFileType => 'INPUT',
|
|
pSourceFileId => '{SOURCE_FILE_ID}',
|
|
pSourceFileDesc => '{Description of file type}',
|
|
pSourceFileNamePattern => '{file_pattern_*.csv}',
|
|
pTableId => '{TABLE_NAME}',
|
|
pTemplateTableName => 'CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}',
|
|
pContainerFileKey => NULL
|
|
);
|
|
|
|
-- Configure date formats if needed
|
|
CALL CT_MRDS.FILE_MANAGER.ADD_COLUMN_DATE_FORMAT(
|
|
pTemplateTableName => 'CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}',
|
|
pColumnName => 'SNAPSHOT_DATE',
|
|
pDateFormat => 'YYYY-MM-DD'
|
|
);
|
|
```
|
|
|
|
**Purpose:**
|
|
- Configures automatic file processing
|
|
- Defines file naming patterns and locations
|
|
- Sets up date format handling for specific columns
|
|
- Enables end-to-end file processing workflow
|
|
|
|
### Step 3: Create External Tables
|
|
|
|
Create external tables for different storage locations using the `FILE_MANAGER.CREATE_EXTERNAL_TABLE` procedure:
|
|
|
|
```sql
|
|
-- External table for INBOX (incoming files)
|
|
BEGIN
|
|
CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
|
|
pTableName => '{SOURCE}_{TABLE_NAME}_INBOX',
|
|
pTemplateTableName => 'CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}',
|
|
pPrefix => 'INBOX/{SOURCE}/{SOURCE_FILE_ID}/{TABLE_NAME}/',
|
|
pBucketUri => CT_MRDS.ENV_MANAGER.gvInboxBucketUri
|
|
);
|
|
END;
|
|
/
|
|
|
|
-- Alternative using DEFINER package (workaround)
|
|
-- BEGIN
|
|
-- ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
|
|
-- pTableName => '{SOURCE}_{TABLE_NAME}_INBOX',
|
|
-- pTemplateTableName => 'CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}',
|
|
-- pPrefix => 'INBOX/{SOURCE}/{SOURCE_FILE_ID}/{TABLE_NAME}/',
|
|
-- pBucketUri => CT_MRDS.ENV_MANAGER.gvInboxBucketUri
|
|
-- );
|
|
-- END;
|
|
-- /
|
|
|
|
-- External table for ODS (operational data store)
|
|
BEGIN
|
|
CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
|
|
pTableName => '{SOURCE}_{TABLE_NAME}_ODS',
|
|
pTemplateTableName => 'CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}',
|
|
pPrefix => 'ODS/{SOURCE}/{TABLE_NAME}/',
|
|
pBucketUri => CT_MRDS.ENV_MANAGER.gvDataBucketUri
|
|
);
|
|
END;
|
|
/
|
|
|
|
-- Alternative using DEFINER package (workaround)
|
|
-- BEGIN
|
|
-- ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
|
|
-- pTableName => '{SOURCE}_{TABLE_NAME}_ODS',
|
|
-- pTemplateTableName => 'CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}',
|
|
-- pPrefix => 'ODS/{SOURCE}/{TABLE_NAME}/',
|
|
-- pBucketUri => CT_MRDS.ENV_MANAGER.gvDataBucketUri
|
|
-- );
|
|
-- END;
|
|
-- /
|
|
|
|
-- External table for ARCHIVE (historical data)
|
|
BEGIN
|
|
CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
|
|
pTableName => '{SOURCE}_{TABLE_NAME}_ARCHIVE',
|
|
pTemplateTableName => 'CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}',
|
|
pPrefix => 'ARCHIVE/{SOURCE}/{TABLE_NAME}/',
|
|
pBucketUri => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri
|
|
);
|
|
END;
|
|
/
|
|
|
|
-- Alternative using DEFINER package (workaround)
|
|
-- BEGIN
|
|
-- ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
|
|
-- pTableName => '{SOURCE}_{TABLE_NAME}_ARCHIVE',
|
|
-- pTemplateTableName => 'CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}',
|
|
-- pPrefix => 'ARCHIVE/{SOURCE}/{TABLE_NAME}/',
|
|
-- pBucketUri => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri
|
|
-- );
|
|
-- END;
|
|
-- /
|
|
```
|
|
|
|
**Parameters:**
|
|
- `pTableName`: Name of the external table to create
|
|
- `pTemplateTableName`: Template table defining the structure
|
|
- `pPrefix`: Storage path prefix in Oracle Cloud Storage
|
|
- `pBucketUri`: URI of the target bucket (uses ENV_MANAGER global variables for different storage types)
|
|
|
|
**Storage Locations:**
|
|
- **INBOX**: For incoming files awaiting processing (uses `gvInboxBucketUri`)
|
|
- **ODS**: For processed files in operational data store (uses `gvDataBucketUri`)
|
|
- **ARCHIVE**: For historical/archived files (uses `gvArchiveBucketUri`)
|
|
|
|
## FILE_MANAGER Package Procedures Used
|
|
|
|
**Execution Context:** The `CREATE_EXTERNAL_TABLE` procedure must be executed as the **ODS user** due to the `AUTHID CURRENT_USER` clause in the CT_MRDS.FILE_MANAGER package. This ensures that external tables are created in the ODS schema. Other procedures (ADD_SOURCE, ADD_SOURCE_FILE_CONFIG, ADD_COLUMN_DATE_FORMAT) can be executed from any user context as they only insert configuration data.
|
|
|
|
**Alternative (Workaround):** You can use the `ODS.FILE_MANAGER_ODS` package instead, which uses `AUTHID DEFINER` and will create objects in the ODS schema regardless of the executing user.
|
|
|
|
### CREATE_EXTERNAL_TABLE
|
|
|
|
Creates external tables that can read data from Oracle Cloud Storage. This procedure has two overloaded versions:
|
|
|
|
**Main Version - Manual Configuration:**
|
|
```sql
|
|
PROCEDURE CREATE_EXTERNAL_TABLE (
|
|
pTableName IN VARCHAR2,
|
|
pTemplateTableName IN VARCHAR2,
|
|
pPrefix IN VARCHAR2,
|
|
pBucketUri IN VARCHAR2 DEFAULT ENV_MANAGER.gvInboxBucketUri,
|
|
pFileName IN VARCHAR2 DEFAULT NULL,
|
|
pDelimiter IN VARCHAR2 DEFAULT ','
|
|
);
|
|
```
|
|
|
|
**Overloaded Version - Automatic Configuration:**
|
|
```sql
|
|
PROCEDURE CREATE_EXTERNAL_TABLE (
|
|
pSourceFileReceivedKey IN NUMBER
|
|
);
|
|
```
|
|
|
|
**Purpose:**
|
|
- **Main version**: Creates external tables with manually specified parameters
|
|
- **Overloaded version**: Automatically creates external table for a registered file using its `A_SOURCE_FILE_RECEIVED_KEY`. This version retrieves all necessary parameters (table name, template, prefix, bucket URI) from the file's configuration record and delegates to the main procedure.
|
|
|
|
### Configuration Procedures
|
|
|
|
For detailed information on the FILE_MANAGER configuration procedures including ADD_SOURCE, ADD_SOURCE_FILE_CONFIG, and ADD_COLUMN_DATE_FORMAT, see the comprehensive [FILE_MANAGER Configuration Guide](FILE_MANAGER_Configuration_Guide.md).
|
|
|
|
## Best Practices
|
|
|
|
### 1. Naming Conventions
|
|
|
|
- **Template tables**: `CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}`
|
|
- **External tables**: `{SOURCE}_{TABLE_NAME}_{LOCATION}` (e.g., `_INBOX`, `_ODS`, `_ARCHIVE`)
|
|
|
|
### 2. Schema Organization
|
|
|
|
- **CT_ET_TEMPLATES**: Template table definitions
|
|
- **ODS**: External tables for processed data
|
|
|
|
### 3. Storage Structure
|
|
|
|
```
|
|
Oracle Cloud Storage Bucket
|
|
├── INBOX/
|
|
│ └── {SOURCE}/
|
|
│ └── {SOURCE_FILE_ID}/
|
|
│ └── {TABLE_NAME}/
|
|
├── ODS/
|
|
│ └── {SOURCE}/
|
|
│ └── {TABLE_NAME}/
|
|
└── ARCHIVE/
|
|
└── {SOURCE}/
|
|
└── {TABLE_NAME}/
|
|
```
|
|
|
|
### 4. Setup Checklist
|
|
|
|
- [ ] Create template table with proper structure
|
|
- [ ] Configure FILE_MANAGER system (ADD_SOURCE, ADD_SOURCE_FILE_CONFIG, ADD_COLUMN_DATE_FORMAT)
|
|
- [ ] Create external tables (INBOX, ODS, ARCHIVE)
|
|
- [ ] Test file processing workflow
|
|
|
|
## Troubleshooting
|
|
|
|
### Common Issues
|
|
|
|
1. **Execution Context Issues**
|
|
- **Problem:** External tables created in wrong schema
|
|
- **Solution 1:** Ensure CREATE_EXTERNAL_TABLE procedure is executed as ODS user
|
|
```sql
|
|
-- Check current user context
|
|
SELECT USER FROM DUAL;
|
|
-- Should return: ODS
|
|
```
|
|
- **Solution 2 (Workaround):** Use the DEFINER package that works from any user
|
|
```sql
|
|
-- Use ODS.FILE_MANAGER_ODS instead of CT_MRDS.FILE_MANAGER
|
|
BEGIN
|
|
ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(...);
|
|
END;
|
|
/
|
|
```
|
|
|
|
2. **Syntax Issues**
|
|
- **Problem:** Multi-line EXEC commands fail
|
|
- **Solution:** Use BEGIN...END blocks instead of EXEC for multi-line calls
|
|
```sql
|
|
-- ❌ Wrong - multi-line EXEC doesn't work:
|
|
-- EXEC CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
|
|
-- 'table',
|
|
-- 'template',
|
|
-- 'path'
|
|
-- );
|
|
|
|
-- ✅ Correct - use BEGIN...END block:
|
|
BEGIN
|
|
CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
|
|
pTableName => 'table',
|
|
pTemplateTableName => 'template',
|
|
pPrefix => 'path'
|
|
);
|
|
END;
|
|
/
|
|
```
|
|
|
|
2. **Template Table Structure Issues**
|
|
```sql
|
|
-- Verify template table structure
|
|
SELECT column_name, data_type, data_length, nullable
|
|
FROM user_tab_columns
|
|
WHERE table_name = '{SOURCE}_{TABLE_NAME}'
|
|
ORDER BY column_id;
|
|
```
|
|
|
|
3. **External Table Creation Failures**
|
|
- Verify bucket and folder paths exist
|
|
- Check credential configuration
|
|
- Ensure template table structure is correct
|
|
- Verify ENV_MANAGER variables are set
|
|
|
|
4. **File Processing Configuration Issues**
|
|
```sql
|
|
-- Check source configuration
|
|
SELECT * FROM CT_MRDS.A_SOURCE WHERE A_SOURCE_KEY = '{SOURCE}';
|
|
|
|
-- Check file configuration
|
|
SELECT * FROM CT_MRDS.A_SOURCE_FILE_CONFIG
|
|
WHERE SOURCE_FILE_ID = '{SOURCE_FILE_ID}';
|
|
```
|
|
|
|
### Verification Queries
|
|
|
|
```sql
|
|
-- Check template table exists
|
|
SELECT table_name FROM user_tables
|
|
WHERE table_name = '{SOURCE}_{TABLE_NAME}';
|
|
|
|
-- Verify external tables creation
|
|
SELECT table_name, table_type
|
|
FROM user_tables
|
|
WHERE table_name LIKE '{SOURCE}_{TABLE_NAME}%';
|
|
|
|
-- Check file processing configuration
|
|
SELECT sfc.SOURCE_FILE_ID, sfc.SOURCE_FILE_DESC, sfc.SOURCE_FILE_NAME_PATTERN
|
|
FROM CT_MRDS.A_SOURCE_FILE_CONFIG sfc
|
|
JOIN CT_MRDS.A_SOURCE s ON s.A_SOURCE_KEY = sfc.A_SOURCE_KEY
|
|
WHERE s.A_SOURCE_KEY = '{SOURCE}';
|
|
```
|
|
|
|
## Summary
|
|
|
|
This process successfully sets up new tables for the FILE PROCESSOR framework from scratch, enabling automated cloud-based file processing. The setup includes:
|
|
|
|
1. **Template Creation** - Defining table structures in CT_ET_TEMPLATES schema
|
|
2. **FILE_MANAGER Configuration** - Setting up source systems, file processing rules, and date formats
|
|
3. **External Tables Setup** - Creating INBOX, ODS, and ARCHIVE external tables
|
|
|
|
After completion, your system will be ready for automated file processing workflows where:
|
|
- Files uploaded to INBOX are automatically recognized and processed
|
|
- Data is moved to ODS for operational access
|
|
- Historical data is archived with proper partitioning
|
|
- External tables provide seamless access to cloud-stored data
|
|
|