# GET_PACKAGE_DOCUMENTATION Function Guide
## Overview
`GET_PACKAGE_DOCUMENTATION` is a standalone Oracle PL/SQL function that automatically generates comprehensive markdown documentation for Oracle packages. It extracts procedure and function metadata, along with embedded comments, to produce structured documentation.
## Function Details
### Purpose
The function parses Oracle package source code and generates formatted markdown documentation, including:
- Function and procedure signatures
- Parameter information
- Usage examples
- Return types
- Embedded comments with special annotations
### Syntax
```sql
GET_PACKAGE_DOCUMENTATION(package_name VARCHAR2, schema_name VARCHAR2) RETURN CLOB
```
### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| `package_name` | VARCHAR2 | Name of the Oracle package to document |
| `schema_name` | VARCHAR2 | Schema containing the package |
### Return Type
- **CLOB**: Returns formatted markdown documentation as a Character Large Object
## Usage Examples
### Basic Usage
```sql
SELECT CT_MRDS.GET_PACKAGE_DOCUMENTATION('FILE_MANAGER', 'CT_MRDS') FROM DUAL;
```
## Documentation Format
The function generates markdown with the following structure:
### Function/Procedure Entries
- **Header**: Function/Procedure name as H3 heading
- **Description**: Extracted from `@desc` comments
- **Return Type**: For functions only
- **Parameters Table**: Name, IN/OUT direction, and data type
- **Usage Example**: Code from `@example` comments
- **Example Result**: Output from `@ex_rslt` comments
### Example Output Format
````markdown
### Function FUNCTION_NAME
__Description:__ Function description from comments
__Return:__ RETURN_TYPE
__Parameters:__
|Name|IN/OUT|Data Type|
|----------|----------|----------|
|PARAM1 |IN| VARCHAR2|
|PARAM2 |OUT| NUMBER|
__Example usage:__
```sql
-- Example code
```
__Example result:__
```sql
-- Expected output
```
````
## Special Comment Annotations
The function recognizes these special comment patterns in package source:
| Annotation | Purpose | Example |
|------------|---------|---------|
| `@name` | Function/procedure name | `-- @name FUNCTION_NAME` |
| `@desc` | Description text | `-- @desc Returns formatted data` |
| `@example` | Usage example code | `-- @example SELECT func() FROM dual;` |
| `@ex_rslt` | Expected result | `-- @ex_rslt 42` |
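For illustration, here is a hypothetical package specification fragment carrying these annotations (the package, function, and values are invented for this example):
```sql
CREATE OR REPLACE PACKAGE CT_MRDS.SAMPLE_PKG AS
  -- @name GET_RATE
  -- @desc Returns the interest rate for a given currency code
  -- @example SELECT CT_MRDS.SAMPLE_PKG.GET_RATE('EUR') FROM dual;
  -- @ex_rslt 3.25
  FUNCTION GET_RATE(pCurrency IN VARCHAR2) RETURN NUMBER;
END SAMPLE_PKG;
/
```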
## Requirements
### Database Privileges
The function requires access to these Oracle system views:
- `ALL_SOURCE` - For package source code
- `ALL_PROCEDURES` - For procedure/function metadata
- `ALL_ARGUMENTS` - For parameter information
## Best Practices
### Documentation Standards
1. **Comment Placement**: Place special annotations directly above function/procedure declarations
2. **Example Quality**: Provide realistic, executable examples
3. **Description Clarity**: Write clear, concise descriptions
4. **Parameter Documentation**: Document all parameters with meaningful names
### Usage Recommendations
1. **Output Settings**: Always use appropriate SQL*Plus settings for CLOB output
2. **File Generation**: Redirect output to `.md` files for version control
3. **Regular Updates**: Regenerate documentation when package code changes
4. **Review Process**: Review generated documentation for accuracy
## Troubleshooting
### Common Issues
1. **Truncated Output**: Use proper LINESIZE and LONG settings
2. **Access Denied**: Ensure proper schema privileges
3. **Missing Content**: Verify special comment annotations in source
4. **Formatting Issues**: Check for special characters in comments
### SQL*Plus Settings
For complete output, always use:
```sql
SET PAGESIZE 0
SET LINESIZE 32000
SET LONG 1000000
```
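Combining these settings with `SPOOL` covers the file-generation recommendation above. A minimal sketch of an interactive SQL*Plus session; the output path is illustrative:
```sql
SET PAGESIZE 0
SET LINESIZE 32000
SET LONG 1000000
SET TRIMSPOOL ON
SET FEEDBACK OFF

SPOOL docs/FILE_MANAGER.md
SELECT CT_MRDS.GET_PACKAGE_DOCUMENTATION('FILE_MANAGER', 'CT_MRDS') FROM DUAL;
SPOOL OFF
```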
## Integration with Development Workflow
### Version Control
- Store generated documentation in repository
- Update documentation with each package change
- Use consistent naming conventions
### CI/CD Integration
The function can be integrated into automated documentation pipelines:
1. Package compilation
2. Documentation generation
3. File output to documentation directory
4. Git commit with package changes
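For unattended pipeline runs, the generation step can be wrapped in a batch-safe script. A minimal sketch, assuming the pipeline invokes SQL*Plus in silent mode (script name and paths are illustrative):
```sql
-- gen_docs.sql: invoked by the pipeline, e.g. sqlplus -s $DB_CONN @gen_docs.sql
WHENEVER SQLERROR EXIT FAILURE
SET PAGESIZE 0 LINESIZE 32000 LONG 1000000 TRIMSPOOL ON FEEDBACK OFF
SPOOL docs/FILE_MANAGER.md
SELECT CT_MRDS.GET_PACKAGE_DOCUMENTATION('FILE_MANAGER', 'CT_MRDS') FROM DUAL;
SPOOL OFF
EXIT
```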
## Conclusion
The `GET_PACKAGE_DOCUMENTATION` function provides an automated solution for maintaining up-to-date Oracle package documentation. By leveraging embedded comments and Oracle metadata, it ensures documentation stays synchronized with code changes while providing a consistent, readable format for developers and stakeholders.
The function successfully generates comprehensive documentation as demonstrated with the FILE_MANAGER package, producing 626 lines of detailed markdown including function signatures, parameters, examples, and descriptions.

# New Table Setup Guide for FILE PROCESSOR System
This document describes the process of setting up new tables for the FILE PROCESSOR system when creating tables from scratch (without existing data migration).
## Overview
The new table setup process involves creating a complete table structure for the FILE PROCESSOR framework, which includes:
- Creating template tables for external table definitions
- Setting up external tables for different storage locations (INBOX, ODS, ARCHIVE)
- Configuring file processing rules
**Package Architecture Note:** The system uses two main packages:
- **FILE_MANAGER**: Handles file processing, validation, and external table creation
- **DATA_EXPORTER**: Handles data export operations (CSV and Parquet formats)
Since this guide covers new table setup without data migration, it primarily uses FILE_MANAGER procedures. For data export operations, refer to the `DATA_EXPORTER` package documentation.
## When to Use This Guide
Use this guide when:
- Creating completely new tables
- No existing data needs to be migrated
- Starting fresh with FILE PROCESSOR framework
- Setting up new data sources or file types
## Step-by-Step Setup Process
**Important:** The CT_MRDS.FILE_MANAGER package uses `AUTHID CURRENT_USER` clause, which means objects will be created in the schema of the user executing the procedures. Since our goal is to create external tables in the ODS schema, the CREATE_EXTERNAL_TABLE procedure must be run as the ODS user. Other procedures (ADD_SOURCE, ADD_SOURCE_FILE_CONFIG, ADD_COLUMN_DATE_FORMAT) can be executed from any user context.
**Workaround:** If you cannot connect as the ODS user, you can use the `ODS.FILE_MANAGER_ODS` package instead of `CT_MRDS.FILE_MANAGER`. The `FILE_MANAGER_ODS` package is a wrapper for the FILE_MANAGER package that uses `AUTHID DEFINER` instead of `AUTHID CURRENT_USER`, which means it will always create objects in the ODS schema regardless of which user executes the procedures.
Example using the workaround:
```sql
-- Instead of: CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(...)
-- Use: ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(...)
```
### Step 1: Create Template Table
Create a template table in the `CT_ET_TEMPLATES` schema with the desired structure:
```sql
CREATE TABLE CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME} (
COLUMN1 VARCHAR2(100),
COLUMN2 NUMBER(10,2),
COLUMN3 DATE,
SNAPSHOT_DATE DATE,
-- Add all required columns with appropriate data types
CONSTRAINT PK_{SOURCE}_{TABLE_NAME} PRIMARY KEY (COLUMN1)
);
```
**Purpose:**
- The template table defines the structure for external tables
- Define all columns with appropriate data types and constraints
- Located in `CT_ET_TEMPLATES` schema for centralized template management
- Will be used as a blueprint for external table creation
### Step 2: Configure FILE_MANAGER System
Set up the file processing configuration using FILE_MANAGER procedures before creating external tables, as the `CREATE_EXTERNAL_TABLE` procedure uses data from configuration tables:
```sql
-- Add source system if not exists
CALL CT_MRDS.FILE_MANAGER.ADD_SOURCE(
pSourceKey => '{SOURCE}',
pSourceName => '{Source System Description}'
);
-- Configure file type for processing
CALL CT_MRDS.FILE_MANAGER.ADD_SOURCE_FILE_CONFIG(
pSourceKey => '{SOURCE}',
pSourceFileType => 'INPUT',
pSourceFileId => '{SOURCE_FILE_ID}',
pSourceFileDesc => '{Description of file type}',
pSourceFileNamePattern => '{file_pattern_*.csv}',
pTableId => '{TABLE_NAME}',
pTemplateTableName => 'CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}',
pContainerFileKey => NULL
);
-- Configure date formats if needed
CALL CT_MRDS.FILE_MANAGER.ADD_COLUMN_DATE_FORMAT(
pTemplateTableName => 'CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}',
pColumnName => 'SNAPSHOT_DATE',
pDateFormat => 'YYYY-MM-DD'
);
```
**Purpose:**
- Configures automatic file processing
- Defines file naming patterns and locations
- Sets up date format handling for specific columns
- Enables end-to-end file processing workflow
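To confirm the configuration was registered, the underlying configuration tables can be queried directly (the same tables used in the Troubleshooting section below):
```sql
-- Verify source registration
SELECT * FROM CT_MRDS.A_SOURCE WHERE A_SOURCE_KEY = '{SOURCE}';
-- Verify file configuration
SELECT * FROM CT_MRDS.A_SOURCE_FILE_CONFIG WHERE SOURCE_FILE_ID = '{SOURCE_FILE_ID}';
```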
### Step 3: Create External Tables
Create external tables for different storage locations using the `FILE_MANAGER.CREATE_EXTERNAL_TABLE` procedure:
```sql
-- External table for INBOX (incoming files)
BEGIN
CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
pTableName => '{SOURCE}_{TABLE_NAME}_INBOX',
pTemplateTableName => 'CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}',
pPrefix => 'INBOX/{SOURCE}/{SOURCE_FILE_ID}/{TABLE_NAME}/',
pBucketUri => CT_MRDS.ENV_MANAGER.gvInboxBucketUri
);
END;
/
-- Alternative using DEFINER package (workaround)
-- BEGIN
-- ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
-- pTableName => '{SOURCE}_{TABLE_NAME}_INBOX',
-- pTemplateTableName => 'CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}',
-- pPrefix => 'INBOX/{SOURCE}/{SOURCE_FILE_ID}/{TABLE_NAME}/',
-- pBucketUri => CT_MRDS.ENV_MANAGER.gvInboxBucketUri
-- );
-- END;
-- /
-- External table for ODS (operational data store)
BEGIN
CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
pTableName => '{SOURCE}_{TABLE_NAME}_ODS',
pTemplateTableName => 'CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}',
pPrefix => 'ODS/{SOURCE}/{TABLE_NAME}/',
pBucketUri => CT_MRDS.ENV_MANAGER.gvDataBucketUri
);
END;
/
-- Alternative using DEFINER package (workaround)
-- BEGIN
-- ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
-- pTableName => '{SOURCE}_{TABLE_NAME}_ODS',
-- pTemplateTableName => 'CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}',
-- pPrefix => 'ODS/{SOURCE}/{TABLE_NAME}/',
-- pBucketUri => CT_MRDS.ENV_MANAGER.gvDataBucketUri
-- );
-- END;
-- /
-- External table for ARCHIVE (historical data)
BEGIN
CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
pTableName => '{SOURCE}_{TABLE_NAME}_ARCHIVE',
pTemplateTableName => 'CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}',
pPrefix => 'ARCHIVE/{SOURCE}/{TABLE_NAME}/',
pBucketUri => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri
);
END;
/
-- Alternative using DEFINER package (workaround)
-- BEGIN
-- ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
-- pTableName => '{SOURCE}_{TABLE_NAME}_ARCHIVE',
-- pTemplateTableName => 'CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}',
-- pPrefix => 'ARCHIVE/{SOURCE}/{TABLE_NAME}/',
-- pBucketUri => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri
-- );
-- END;
-- /
```
**Parameters:**
- `pTableName`: Name of the external table to create
- `pTemplateTableName`: Template table defining the structure
- `pPrefix`: Storage path prefix in Oracle Cloud Storage
- `pBucketUri`: URI of the target bucket (uses ENV_MANAGER global variables for different storage types)
**Storage Locations:**
- **INBOX**: For incoming files awaiting processing (uses `gvInboxBucketUri`)
- **ODS**: For processed files in operational data store (uses `gvDataBucketUri`)
- **ARCHIVE**: For historical/archived files (uses `gvArchiveBucketUri`)
## FILE_MANAGER Package Procedures Used
**Execution Context:** The `CREATE_EXTERNAL_TABLE` procedure must be executed as the **ODS user** due to the `AUTHID CURRENT_USER` clause in the CT_MRDS.FILE_MANAGER package. This ensures that external tables are created in the ODS schema. Other procedures (ADD_SOURCE, ADD_SOURCE_FILE_CONFIG, ADD_COLUMN_DATE_FORMAT) can be executed from any user context as they only insert configuration data.
**Alternative (Workaround):** You can use the `ODS.FILE_MANAGER_ODS` package instead, which uses `AUTHID DEFINER` and will create objects in the ODS schema regardless of the executing user.
### CREATE_EXTERNAL_TABLE
Creates external tables that can read data from Oracle Cloud Storage. This procedure has two overloaded versions:
**Main Version - Manual Configuration:**
```sql
PROCEDURE CREATE_EXTERNAL_TABLE (
pTableName IN VARCHAR2,
pTemplateTableName IN VARCHAR2,
pPrefix IN VARCHAR2,
pBucketUri IN VARCHAR2 DEFAULT ENV_MANAGER.gvInboxBucketUri,
pFileName IN VARCHAR2 DEFAULT NULL,
pDelimiter IN VARCHAR2 DEFAULT ','
);
```
**Overloaded Version - Automatic Configuration:**
```sql
PROCEDURE CREATE_EXTERNAL_TABLE (
pSourceFileReceivedKey IN NUMBER
);
```
**Purpose:**
- **Main version**: Creates external tables with manually specified parameters
- **Overloaded version**: Automatically creates external table for a registered file using its `A_SOURCE_FILE_RECEIVED_KEY`. This version retrieves all necessary parameters (table name, template, prefix, bucket URI) from the file's configuration record and delegates to the main procedure.
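A sketch of the overloaded call; the key value is looked up first in `CT_MRDS.A_SOURCE_FILE_RECEIVED` (63 is just an example key):
```sql
-- Find the key for a registered file
SELECT A_SOURCE_FILE_RECEIVED_KEY
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
WHERE SOURCE_FILE_NAME = 'your_file.csv';

-- Create the external table from the file's configuration record
BEGIN
    CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(pSourceFileReceivedKey => 63);
END;
/
```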
### Configuration Procedures
For detailed information on the FILE_MANAGER configuration procedures including ADD_SOURCE, ADD_SOURCE_FILE_CONFIG, and ADD_COLUMN_DATE_FORMAT, see the comprehensive [FILE_MANAGER Configuration Guide](FILE_MANAGER_Configuration_Guide.md).
## Best Practices
### 1. Naming Conventions
- **Template tables**: `CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}`
- **External tables**: `{SOURCE}_{TABLE_NAME}_{LOCATION}` (e.g., `_INBOX`, `_ODS`, `_ARCHIVE`)
### 2. Schema Organization
- **CT_ET_TEMPLATES**: Template table definitions
- **ODS**: External tables for processed data
### 3. Storage Structure
```
Oracle Cloud Storage Bucket
├── INBOX/
│ └── {SOURCE}/
│ └── {SOURCE_FILE_ID}/
│ └── {TABLE_NAME}/
├── ODS/
│ └── {SOURCE}/
│ └── {TABLE_NAME}/
└── ARCHIVE/
└── {SOURCE}/
└── {TABLE_NAME}/
```
### 4. Setup Checklist
- [ ] Create template table with proper structure
- [ ] Configure FILE_MANAGER system (ADD_SOURCE, ADD_SOURCE_FILE_CONFIG, ADD_COLUMN_DATE_FORMAT)
- [ ] Create external tables (INBOX, ODS, ARCHIVE)
- [ ] Test file processing workflow (see the smoke-test sketch below)
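A minimal smoke test for the last checklist item is to run the umbrella procedure against a test file uploaded to the INBOX path, then check its status (see the PROCESS_SOURCE_FILE guide for the full workflow):
```sql
BEGIN
    CT_MRDS.FILE_MANAGER.PROCESS_SOURCE_FILE(
        pSourceFileReceivedName => 'INBOX/{SOURCE}/{SOURCE_FILE_ID}/{TABLE_NAME}/test_file.csv'
    );
END;
/

-- The file should reach READY_FOR_INGESTION
SELECT SOURCE_FILE_NAME, PROCESSING_STATUS
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
ORDER BY RECEPTION_DATE DESC
FETCH FIRST 5 ROWS ONLY;
```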
## Troubleshooting
### Common Issues
1. **Execution Context Issues**
- **Problem:** External tables created in wrong schema
- **Solution 1:** Ensure CREATE_EXTERNAL_TABLE procedure is executed as ODS user
```sql
-- Check current user context
SELECT USER FROM DUAL;
-- Should return: ODS
```
- **Solution 2 (Workaround):** Use the DEFINER package that works from any user
```sql
-- Use ODS.FILE_MANAGER_ODS instead of CT_MRDS.FILE_MANAGER
BEGIN
ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(...);
END;
/
```
2. **Syntax Issues**
- **Problem:** Multi-line EXEC commands fail
- **Solution:** Use BEGIN...END blocks instead of EXEC for multi-line calls
```sql
-- ❌ Wrong - multi-line EXEC doesn't work:
-- EXEC CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
-- 'table',
-- 'template',
-- 'path'
-- );
-- ✅ Correct - use BEGIN...END block:
BEGIN
CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
pTableName => 'table',
pTemplateTableName => 'template',
pPrefix => 'path'
);
END;
/
```
3. **Template Table Structure Issues**
```sql
-- Verify template table structure
SELECT column_name, data_type, data_length, nullable
FROM user_tab_columns
WHERE table_name = '{SOURCE}_{TABLE_NAME}'
ORDER BY column_id;
```
4. **External Table Creation Failures**
- Verify bucket and folder paths exist
- Check credential configuration
- Ensure template table structure is correct
- Verify ENV_MANAGER variables are set
5. **File Processing Configuration Issues**
```sql
-- Check source configuration
SELECT * FROM CT_MRDS.A_SOURCE WHERE A_SOURCE_KEY = '{SOURCE}';
-- Check file configuration
SELECT * FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE SOURCE_FILE_ID = '{SOURCE_FILE_ID}';
```
### Verification Queries
```sql
-- Check template table exists
SELECT table_name FROM user_tables
WHERE table_name = '{SOURCE}_{TABLE_NAME}';
-- Verify external tables creation
SELECT table_name, table_type
FROM user_tables
WHERE table_name LIKE '{SOURCE}_{TABLE_NAME}%';
-- Check file processing configuration
SELECT sfc.SOURCE_FILE_ID, sfc.SOURCE_FILE_DESC, sfc.SOURCE_FILE_NAME_PATTERN
FROM CT_MRDS.A_SOURCE_FILE_CONFIG sfc
JOIN CT_MRDS.A_SOURCE s ON s.A_SOURCE_KEY = sfc.A_SOURCE_KEY
WHERE s.A_SOURCE_KEY = '{SOURCE}';
```
## Summary
This process successfully sets up new tables for the FILE PROCESSOR framework from scratch, enabling automated cloud-based file processing. The setup includes:
1. **Template Creation** - Defining table structures in CT_ET_TEMPLATES schema
2. **FILE_MANAGER Configuration** - Setting up source systems, file processing rules, and date formats
3. **External Tables Setup** - Creating INBOX, ODS, and ARCHIVE external tables
After completion, your system will be ready for automated file processing workflows where:
- Files uploaded to INBOX are automatically recognized and processed
- Data is moved to ODS for operational access
- Historical data is archived with proper partitioning
- External tables provide seamless access to cloud-stored data

# PROCESS_SOURCE_FILE Procedure Guide
This document provides comprehensive documentation for the `FILE_MANAGER.PROCESS_SOURCE_FILE` procedure, which validates incoming files and prepares them for loading on the Airflow+DBT side through Oracle Cloud Infrastructure (OCI) file management operations.
## Overview
`PROCESS_SOURCE_FILE` is an umbrella procedure that validates incoming files and prepares them for downstream processing by Airflow+DBT pipelines. It orchestrates the complete workflow from file registration and validation to OCI storage preparation, ensuring files are properly validated and positioned for consumption by the Airflow+DBT data processing stack.
**Key Characteristics:**
- **File Validation Focus**: Comprehensive validation of incoming CSV files against template structures
- **Airflow+DBT Preparation**: Prepares validated files for loading and processing by Airflow+DBT pipelines
- **OCI File Management**: Handles file operations and movements within Oracle Cloud Infrastructure
- **Umbrella Procedure**: Coordinates multiple validation and file preparation sub-procedures in sequence
- **Automated Workflow**: Requires minimal manual intervention once configured
- **Error Resilient**: Comprehensive error handling and logging for validation and file operations
- **Status Tracking**: Updates file processing status throughout validation and preparation workflow
## Procedure Signatures
The procedure is available in two variants:
### Procedure Version
```sql
PROCEDURE PROCESS_SOURCE_FILE(pSourceFileReceivedName IN VARCHAR2);
```
**Purpose**: Execute processing workflow without return value
**Use Case**: Standard automated processing, fire-and-forget scenarios
### Function Version
```sql
FUNCTION PROCESS_SOURCE_FILE(pSourceFileReceivedName IN VARCHAR2) RETURN PLS_INTEGER;
```
**Purpose**: Execute processing workflow and return status code
**Use Case**: When you need to check processing result programmatically
## Parameters
### pSourceFileReceivedName
- **Type**: VARCHAR2
- **Required**: YES
- **Description**: Relative path to the file within the cloud storage bucket
- **Format**: `INBOX/{SOURCE}/{SOURCE_FILE_ID}/{TABLE_ID}/filename.csv`
**Examples:**
```sql
'INBOX/C2D/UC_DISSEM/A_UC_DISSEM_METADATA_LOADS/UC_NMA_DISSEM-277740.csv'
'INBOX/TOP/ALLOTMENT/AGGREGATED_ALLOTMENT/allotment_data_20241006.csv'
'INBOX/LM/RATES/INTEREST_RATES/rates_monthly_202410.csv'
```
## Processing Workflow
The procedure executes six main steps in sequence:
### Step 1: REGISTER_SOURCE_FILE_RECEIVED
**Purpose**: Register file in the system and extract metadata
**Actions:**
- Creates record in `CT_MRDS.A_SOURCE_FILE_RECEIVED` table
- Determines source configuration based on file path pattern
- Extracts file metadata (size, checksum, creation date)
- Assigns unique `A_SOURCE_FILE_RECEIVED_KEY`
- Sets initial status to 'RECEIVED'
### Step 2: CREATE_EXTERNAL_TABLE
**Purpose**: Create temporary external table for data access
**Actions:**
- Generates unique external table name
- Creates external table pointing to the CSV file
- Uses template table structure from `CT_ET_TEMPLATES`
- Configures appropriate column mappings and data types
### Step 3: VALIDATE_SOURCE_FILE_RECEIVED
**Purpose**: Perform comprehensive data validation
**Actions:**
- Validates CSV column count against template
- Checks data type compatibility
- Verifies required fields are populated
- Performs business rule validations
- Updates status to 'VALIDATED' on success
### Step 4: DROP_EXTERNAL_TABLE
**Purpose**: Clean up temporary external table
**Actions:**
- Drops the temporary external table created in Step 2
- Releases database resources
- Maintains clean schema state
### Step 5: MOVE_FILE
**Purpose**: Relocate file from INBOX to ODS location
**Actions:**
- Copies file from INBOX bucket to ODS bucket
- Preserves file metadata
- Deletes original file from INBOX after successful copy
### Step 6: SET_SOURCE_FILE_RECEIVED_STATUS
**Purpose**: Update final processing status
**Actions:**
- Sets `PROCESSING_STATUS` to 'READY_FOR_INGESTION'
- Records completion timestamp
- Indicates file is validated and ready for Airflow+DBT processing
## Return Values (Function Version)
| Value | Meaning | Description |
|-------|---------|-------------|
| `0` | Success | File processed successfully through all steps |
| `-20001` | Empty Parameters | Both fileUri and receivedKey parameters are NULL |
| `-20002` | No Config Match | No configuration matches the file pattern |
| `-20011` | Column Mismatch | CSV has different column count than template |
| `-20021` | Processing Error | General processing failure |
| Other negative | Various Errors | Specific error codes for different failure scenarios |
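A minimal sketch of using the function version and branching on the return code:
```sql
SET SERVEROUTPUT ON;
DECLARE
    vStatus PLS_INTEGER;
BEGIN
    vStatus := CT_MRDS.FILE_MANAGER.PROCESS_SOURCE_FILE(
        pSourceFileReceivedName => 'INBOX/C2D/UC_DISSEM/A_UC_DISSEM_METADATA_LOADS/data_file.csv'
    );
    IF vStatus = 0 THEN
        DBMS_OUTPUT.PUT_LINE('File processed successfully');
    ELSE
        DBMS_OUTPUT.PUT_LINE('Processing failed with code: ' || vStatus);
    END IF;
END;
/
```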
## Usage Examples
### Basic Processing
```sql
-- Simple processing (procedure version)
BEGIN
CT_MRDS.FILE_MANAGER.PROCESS_SOURCE_FILE(
pSourceFileReceivedName => 'INBOX/C2D/UC_DISSEM/A_UC_DISSEM_METADATA_LOADS/data_file.csv'
);
END;
/
```
## Prerequisites
Before using `PROCESS_SOURCE_FILE`, ensure proper system configuration is in place. For detailed setup instructions including source system registration, file type configuration, template table creation, and date format configuration, see the [FILE_MANAGER Configuration Guide](FILE_MANAGER_Configuration_Guide.md).
## Monitoring and Troubleshooting
### Monitoring File Processing Status
```sql
-- Check recent file processing activity
SELECT
SOURCE_FILE_NAME,
PROCESSING_STATUS,
RECEPTION_DATE,
EXTERNAL_TABLE_NAME
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
WHERE RECEPTION_DATE >= SYSDATE - 1 -- Last 24 hours
ORDER BY RECEPTION_DATE DESC;
```
### Processing Status Values
| Status | Description | Workflow Stage |
|--------|-------------|----------------|
| `RECEIVED` | File registered, processing starting | Initial registration |
| `VALIDATED` | File validation completed successfully | After successful validation |
| `READY_FOR_INGESTION` | File validated and prepared for Airflow+DBT processing | After successful validation and preparation |
| `INGESTED` | Data has been consumed/ingested by target system | After data consumption |
| `ARCHIVED` | Data exported to PARQUET format and file moved to archival storage | Final archival state using FILE_ARCHIVER |
| `VALIDATION_FAILED` | File validation failed | After failed validation |
### Detailed Processing Logs
```sql
-- View detailed processing logs
SELECT
LOG_TIMESTAMP,
PROCEDURE_NAME,
LOG_LEVEL,
LOG_MESSAGE,
PROCEDURE_PARAMETERS
FROM CT_MRDS.A_PROCESS_LOG
WHERE PROCEDURE_NAME IN ('PROCESS_SOURCE_FILE', 'REGISTER_SOURCE_FILE_RECEIVED',
'CREATE_EXTERNAL_TABLE', 'VALIDATE_SOURCE_FILE_RECEIVED')
AND LOG_TIMESTAMP >= SYSDATE - 1
ORDER BY LOG_TIMESTAMP DESC;
```
### Common Error Scenarios and Solutions
#### Error -20002: No Configuration Match
**Problem**: File path doesn't match any configured pattern
```sql
-- Check configured patterns
SELECT
s.A_SOURCE_KEY,
sfc.SOURCE_FILE_ID,
sfc.SOURCE_FILE_NAME_PATTERN,
sfc.TABLE_ID
FROM CT_MRDS.A_SOURCE_FILE_CONFIG sfc
JOIN CT_MRDS.A_SOURCE s ON s.A_SOURCE_KEY = sfc.A_SOURCE_KEY
ORDER BY s.A_SOURCE_KEY, sfc.SOURCE_FILE_ID;
```
**Solution**: Add missing configuration or correct file naming
#### Error -20011: Column Count Mismatch
**Problem**: CSV file has different number of columns than template table
```sql
-- Check template table structure
SELECT column_name, data_type, column_id
FROM user_tab_columns
WHERE table_name = 'YOUR_TEMPLATE_TABLE'
ORDER BY column_id;
-- Analyze validation errors
SELECT CT_MRDS.FILE_MANAGER.ANALYZE_VALIDATION_ERRORS(file_key) FROM DUAL;
```
**Solutions**:
1. Fix CSV file column count
2. Add missing columns to template table
3. Remove excess columns from CSV
#### File Not Found Errors
**Problem**: File doesn't exist in expected cloud storage location
```sql
-- List files in bucket location
SELECT object_name
FROM DBMS_CLOUD.LIST_OBJECTS(
    credential_name => 'DEF_CRED_ARN',
    location_uri    => 'https://your-bucket-uri/INBOX/C2D/UC_DISSEM/' -- folder prefix folded into the URI
)
WHERE ROWNUM <= 20;
```
**Solutions**:
1. Verify file upload to correct location
2. Check file naming matches expected pattern
3. Verify cloud storage credentials and permissions
## Enhanced Error Monitoring and Logging
### Error Log Monitoring
The FILE_MANAGER system provides comprehensive error logging for troubleshooting:
```sql
-- View recent processing errors
SELECT LOG_TIMESTAMP, LOG_LEVEL, LOG_MESSAGE, PROCEDURE_NAME
FROM CT_MRDS.A_PROCESS_LOG
WHERE LOG_LEVEL = 'ERROR'
AND LOG_TIMESTAMP >= SYSDATE - 1 -- Last 24 hours
ORDER BY LOG_TIMESTAMP DESC;
-- View validation-specific errors
SELECT LOG_TIMESTAMP, LOG_MESSAGE
FROM CT_MRDS.A_PROCESS_LOG
WHERE LOG_MESSAGE LIKE '%EXCESS COLUMNS%'
OR LOG_MESSAGE LIKE '%VALIDATION%'
ORDER BY LOG_TIMESTAMP DESC;
-- Analyze errors for specific file
SELECT sfl.SOURCE_FILE_NAME, pl.LOG_MESSAGE, pl.LOG_TIMESTAMP
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED sfl
JOIN CT_MRDS.A_PROCESS_LOG pl ON pl.LOG_MESSAGE LIKE '%' || sfl.SOURCE_FILE_NAME || '%'
WHERE sfl.SOURCE_FILE_NAME = 'your_file.csv'
AND pl.LOG_LEVEL = 'ERROR';
```
### File Validation and Error Handling
The FILE_MANAGER system includes comprehensive validation features for CSV files during processing:
#### Pre-Processing Validation
- **Column Count Verification**: Automatically checks if CSV files match template table structure
- **Error Prevention**: Validates files before creating external tables to prevent processing failures
- **Detailed Error Messages**: Provides specific guidance when validation fails
#### Common Validation Scenarios
**Scenario 1: Excess Columns (Error -20011)**
```
EXCESS COLUMNS DETECTED!
CSV file has 8 columns but template expects only 5
Excess columns: 3
```
**Solutions:**
1. Remove excess columns from CSV file
2. Add missing columns to template table:
```sql
ALTER TABLE CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}
ADD (NEW_COLUMN1 VARCHAR2(100), NEW_COLUMN2 NUMBER);
```
#### Error Analysis for File Validation
```sql
-- Find file key for analysis
SELECT A_SOURCE_FILE_RECEIVED_KEY
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
WHERE SOURCE_FILE_NAME = 'your_file.csv';
-- Analyze validation errors using wrapper function
SELECT CT_MRDS.FILE_MANAGER.ANALYZE_VALIDATION_ERRORS(file_key) FROM DUAL;
-- Example with specific key:
SELECT CT_MRDS.FILE_MANAGER.ANALYZE_VALIDATION_ERRORS(63) FROM DUAL;
```
#### Validation Error Monitoring
```sql
-- View recent validation errors
SELECT LOG_TIMESTAMP, LOG_MESSAGE
FROM CT_MRDS.A_PROCESS_LOG
WHERE LOG_LEVEL = 'ERROR'
AND (LOG_MESSAGE LIKE '%EXCESS COLUMNS%' OR LOG_MESSAGE LIKE '%VALIDATION%')
ORDER BY LOG_TIMESTAMP DESC;
```
### Common Error Patterns and Solutions
| Error Code | Pattern | Solution |
|------------|---------|----------|
| ORA-20011 | EXCESS COLUMNS DETECTED | Remove excess columns from CSV or add missing columns to template table |
| ORA-20002 | No match for source file | Configure file pattern in A_SOURCE_FILE_CONFIG |
| ORA-29913 | External table open error | Check bucket paths and file existence |
| ORA-01821 | Date format not recognized | Update date format in ADD_COLUMN_DATE_FORMAT |
### Proactive Monitoring Setup
Set up monitoring for critical error patterns:
```sql
-- Create monitoring view for critical errors
CREATE OR REPLACE VIEW V_CRITICAL_ERRORS AS
SELECT
LOG_TIMESTAMP,
PROCEDURE_NAME,
CASE
WHEN LOG_MESSAGE LIKE '%ORA-20011%' THEN 'COLUMN_MISMATCH'
WHEN LOG_MESSAGE LIKE '%ORA-20002%' THEN 'CONFIG_MISSING'
WHEN LOG_MESSAGE LIKE '%ORA-29913%' THEN 'FILE_ACCESS'
ELSE 'OTHER_ERROR'
END as ERROR_CATEGORY,
LOG_MESSAGE
FROM CT_MRDS.A_PROCESS_LOG
WHERE LOG_LEVEL = 'ERROR'
AND LOG_TIMESTAMP >= SYSDATE - 7; -- Last week
```
This enhanced monitoring helps identify and resolve issues quickly, ensuring smooth file processing operations.
## Best Practices
### File Naming Conventions
- Use consistent naming patterns that match `SOURCE_FILE_NAME_PATTERN`
- Avoid special characters that might cause parsing issues
## Related Procedures
The following procedures are called internally by `PROCESS_SOURCE_FILE`:
- **REGISTER_SOURCE_FILE_RECEIVED**: File registration and metadata extraction
- **CREATE_EXTERNAL_TABLE**: External table creation for data access
- **VALIDATE_SOURCE_FILE_RECEIVED**: Data validation and structure checking
- **DROP_EXTERNAL_TABLE**: Cleanup of temporary external tables
- **MOVE_FILE**: File relocation between buckets
- **SET_SOURCE_FILE_RECEIVED_STATUS**: Status management
For detailed information about individual procedures, refer to the package documentation.
## Summary
`PROCESS_SOURCE_FILE` is the cornerstone of the FILE PROCESSOR system, providing a complete automated workflow for validating files and preparing them for Airflow+DBT processing pipelines. Its umbrella architecture ensures consistent file validation and preparation while comprehensive error handling and logging provide visibility and reliability for enterprise file processing operations that feed into downstream Airflow+DBT data workflows.

# Table Setup Guide for FILE PROCESSOR System
This document describes the process of setting up tables for the FILE PROCESSOR system using the example of `C2D_A_UC_DISSEM_METADATA_LOADS` table setup.
## Overview
The table setup process involves migrating existing tables from operational schemas (like `OU_C2D`) to the FILE PROCESSOR framework, which includes:
- Creating template tables for external table definitions
- Setting up external tables for different storage locations (INBOX, ODS, ARCHIVE)
- Exporting existing data to cloud storage
- Creating legacy backup tables
- Setting up views to maintain compatibility
## Step-by-Step Setup Process
**Recommended Approach:** Use the `ODS.FILE_MANAGER_ODS` package for creating external tables. The `FILE_MANAGER_ODS` package is a wrapper for the FILE_MANAGER package that uses `AUTHID DEFINER` instead of `AUTHID CURRENT_USER`, which means it will always create objects in the ODS schema regardless of which user executes the procedures.
**Alternative Approach:** If you prefer using the original package, you can use `CT_MRDS.FILE_MANAGER`, but the CREATE_EXTERNAL_TABLE procedure must be run as the ODS user due to the `AUTHID CURRENT_USER` clause. Export procedures (EXPORT_TABLE_DATA_TO_CSV_BY_DATE, EXPORT_TABLE_DATA_BY_DATE) have been moved to the `CT_MRDS.DATA_EXPORTER` package.
Example usage:
```sql
-- Recommended: Use DEFINER package (works from any user context)
ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(...)
-- Alternative: Use CURRENT_USER package (requires ODS user context)
-- CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(...)
```
### Step 1: Create Template Table
Create a template table in the `CT_ET_TEMPLATES` schema based on the existing operational table structure:
```sql
CREATE TABLE CT_ET_TEMPLATES.C2D_A_UC_DISSEM_METADATA_LOADS
AS SELECT * FROM OU_C2D.A_UC_DISSEM_METADATA_LOADS WHERE 1=2;
```
**Purpose:**
- The template table defines the structure for external tables
- Uses `WHERE 1=2` to copy only the structure, not the data
- Located in `CT_ET_TEMPLATES` schema for centralized template management
### Step 2: Configure FILE_MANAGER System
Before creating external tables, configure the FILE_MANAGER system with source information and file processing rules, as the `CREATE_EXTERNAL_TABLE` procedure uses data from configuration tables. For detailed information on configuring the FILE_MANAGER package procedures including:
- **ADD_SOURCE**: Registering new source systems
- **ADD_SOURCE_FILE_CONFIG**: Configuring file processing rules and naming patterns (**MARS-1049**: now includes `pEncoding` parameter for CSV character set support)
- **ADD_COLUMN_DATE_FORMAT**: Setting up date format handling for specific columns
See the comprehensive [FILE_MANAGER Configuration Guide](FILE_MANAGER_Configuration_Guide.md).
**MARS-1049 Enhancement Note**: The `ADD_SOURCE_FILE_CONFIG` procedure now supports the `pEncoding` parameter to specify character encodings (e.g., UTF8, WE8MSWIN1252) for proper international character handling in CSV files.
This configuration enables automatic file processing workflows where files uploaded to the INBOX are automatically processed and moved to ODS, with historical data archived in partitioned PARQUET format.
### Step 3: Create External Tables
Create external tables for different storage locations using the `FILE_MANAGER.CREATE_EXTERNAL_TABLE` procedure:
```sql
-- External table for INBOX (incoming files)
BEGIN
ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
pTableName => 'C2D_A_UC_DISSEM_METADATA_LOADS_INBOX',
pTemplateTableName => 'CT_ET_TEMPLATES.C2D_A_UC_DISSEM_METADATA_LOADS',
pPrefix => 'INBOX/C2D/UC_DISSEM/A_UC_DISSEM_METADATA_LOADS',
pBucketUri => CT_MRDS.ENV_MANAGER.gvInboxBucketUri
);
END;
/
-- Alternative using CURRENT_USER package (requires ODS user context)
-- BEGIN
-- CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
-- pTableName => 'C2D_A_UC_DISSEM_METADATA_LOADS_INBOX',
-- pTemplateTableName => 'CT_ET_TEMPLATES.C2D_A_UC_DISSEM_METADATA_LOADS',
-- pPrefix => 'INBOX/C2D/UC_DISSEM/A_UC_DISSEM_METADATA_LOADS',
-- pBucketUri => CT_MRDS.ENV_MANAGER.gvInboxBucketUri
-- );
-- END;
-- /
-- External table for ODS (operational data store)
BEGIN
ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
pTableName => 'C2D_A_UC_DISSEM_METADATA_LOADS_ODS',
pTemplateTableName => 'CT_ET_TEMPLATES.C2D_A_UC_DISSEM_METADATA_LOADS',
pPrefix => 'ODS/C2D/A_UC_DISSEM_METADATA_LOADS',
pBucketUri => CT_MRDS.ENV_MANAGER.gvDataBucketUri
);
END;
/
-- Alternative using CURRENT_USER package (requires ODS user context)
-- BEGIN
-- CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
-- pTableName => 'C2D_A_UC_DISSEM_METADATA_LOADS_ODS',
-- pTemplateTableName => 'CT_ET_TEMPLATES.C2D_A_UC_DISSEM_METADATA_LOADS',
-- pPrefix => 'ODS/C2D/A_UC_DISSEM_METADATA_LOADS',
-- pBucketUri => CT_MRDS.ENV_MANAGER.gvDataBucketUri
-- );
-- END;
-- /
-- External table for ARCHIVE (historical data)
BEGIN
ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
pTableName => 'C2D_A_UC_DISSEM_METADATA_LOADS_ARCHIVE',
pTemplateTableName => 'CT_ET_TEMPLATES.C2D_A_UC_DISSEM_METADATA_LOADS',
pPrefix => 'ARCHIVE/C2D/A_UC_DISSEM_METADATA_LOADS',
pBucketUri => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri
);
END;
/
-- Alternative using CURRENT_USER package (requires ODS user context)
-- BEGIN
-- CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
-- pTableName => 'C2D_A_UC_DISSEM_METADATA_LOADS_ARCHIVE',
-- pTemplateTableName => 'CT_ET_TEMPLATES.C2D_A_UC_DISSEM_METADATA_LOADS',
-- pPrefix => 'ARCHIVE/C2D/A_UC_DISSEM_METADATA_LOADS',
-- pBucketUri => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri
-- );
-- END;
-- /
```
**Parameters:**
- `pTableName`: Name of the external table to create
- `pTemplateTableName`: Template table defining the structure
- `pPrefix`: Storage path prefix in Oracle Cloud Storage
- `pBucketUri`: URI of the target bucket (uses ENV_MANAGER global variables for different storage types)
**Storage Locations:**
- **INBOX**: For incoming files awaiting processing (uses `gvInboxBucketUri`)
- **ODS**: For processed files in operational data store (uses `gvDataBucketUri`)
- **ARCHIVE**: For historical/archived files (uses `gvArchiveBucketUri`)
### Step 4: Export Data to ODS Bucket (CSV Format)
Export existing operational data to ODS bucket in CSV format using the `DATA_EXPORTER` package:
```sql
-- Export all data to CSV format (using default pMinDate = 1900-01-01 and pMaxDate = SYSDATE)
BEGIN
CT_MRDS.DATA_EXPORTER.EXPORT_TABLE_DATA_TO_CSV_BY_DATE(
pSchemaName => 'OU_C2D', -- Source schema
pTableName => 'A_UC_DISSEM_METADATA_LOADS', -- Source table
pKeyColumnName => 'A_ETL_LOAD_SET_FK', -- Key column for partitioning
pBucketArea => 'ODS', -- Target bucket area (maps to mrds_data_poc / gvDataBucketUri)
pFolderName => 'ODS/C2D/A_UC_DISSEM_METADATA_LOADS' -- Target folder
);
END;
/
-- Export data within a specific date range to CSV
BEGIN
CT_MRDS.DATA_EXPORTER.EXPORT_TABLE_DATA_TO_CSV_BY_DATE(
pSchemaName => 'OU_C2D',
pTableName => 'A_UC_DISSEM_METADATA_LOADS',
pKeyColumnName => 'A_ETL_LOAD_SET_FK',
pBucketArea => 'ODS',
pFolderName => 'ODS/C2D/A_UC_DISSEM_METADATA_LOADS',
pMinDate => DATE '2024-01-01', -- Export data from 2024-01-01
pMaxDate => DATE '2024-12-31' -- Export data up to 2024-12-31
);
END;
/
-- Export only recent data to CSV (last 30 days)
BEGIN
CT_MRDS.DATA_EXPORTER.EXPORT_TABLE_DATA_TO_CSV_BY_DATE(
pSchemaName => 'OU_C2D',
pTableName => 'A_UC_DISSEM_METADATA_LOADS',
pKeyColumnName => 'A_ETL_LOAD_SET_FK',
pBucketArea => 'ODS',
pFolderName => 'ODS/C2D/A_UC_DISSEM_METADATA_LOADS',
pMinDate => SYSDATE - 30 -- Export data from last 30 days
);
END;
/
```
**Purpose:**
- Exports existing data to ODS bucket in CSV format for immediate operational use
- Creates CSV files partitioned by date (YYYYMM format)
- Files are ready for consumption by external tables and processing workflows
- The `ODS` bucket area maps to the `mrds_data_poc` bucket (`gvDataBucketUri`)
### Step 5: Export Historical Data to Archive Bucket (PARQUET Format)
Export existing operational data to cloud storage using partitioning by date with the `DATA_EXPORTER` package:
```sql
-- Export all data (using default pMinDate = 1900-01-01 and pMaxDate = SYSDATE)
BEGIN
CT_MRDS.DATA_EXPORTER.EXPORT_TABLE_DATA_BY_DATE(
pSchemaName => 'OU_C2D', -- Source schema
pTableName => 'A_UC_DISSEM_METADATA_LOADS', -- Source table
pKeyColumnName => 'A_ETL_LOAD_SET_FK', -- Key column for partitioning
pBucketArea => 'ARCHIVE', -- Target bucket area (maps to mrds_history_poc / gvArchiveBucketUri)
pFolderName => 'C2D/A_UC_DISSEM_METADATA_LOADS' -- Target folder
);
END;
/
-- Export data within a specific date range
BEGIN
CT_MRDS.DATA_EXPORTER.EXPORT_TABLE_DATA_BY_DATE(
pSchemaName => 'OU_C2D',
pTableName => 'A_UC_DISSEM_METADATA_LOADS',
pKeyColumnName => 'A_ETL_LOAD_SET_FK',
pBucketArea => 'ARCHIVE',
pFolderName => 'C2D/A_UC_DISSEM_METADATA_LOADS',
pMinDate => DATE '2024-01-01', -- Export data from 2024-01-01
pMaxDate => DATE '2024-12-31' -- Export data up to 2024-12-31
);
END;
/
-- Export only recent data (last 30 days)
BEGIN
CT_MRDS.DATA_EXPORTER.EXPORT_TABLE_DATA_BY_DATE(
pSchemaName => 'OU_C2D',
pTableName => 'A_UC_DISSEM_METADATA_LOADS',
pKeyColumnName => 'A_ETL_LOAD_SET_FK',
pBucketArea => 'ARCHIVE',
pFolderName => 'C2D/A_UC_DISSEM_METADATA_LOADS',
pMinDate => SYSDATE - 30 -- Export data from last 30 days
);
END;
/
```
**Purpose:**
- Exports existing data to cloud storage before switching to FILE_MANAGER
- Creates PARQUET files partitioned by date (YEAR_MONTH)
- Ensures data preservation during migration
### Step 6: Create Legacy Backup Table
Rename the original table to preserve existing data:
```sql
ALTER TABLE OU_C2D.A_UC_DISSEM_METADATA_LOADS
RENAME TO A_UC_DISSEM_METADATA_LOADS_LEGACY;
```
**Purpose:**
- Preserves original table as backup
- Allows rollback if needed
- Maintains data integrity during migration
### Step 7: Create Compatibility View
Create a view that points to the new ODS external table:
```sql
-- Grant access to the ODS external table
GRANT SELECT ON ods.C2D_A_UC_DISSEM_METADATA_LOADS_ODS TO OU_C2D;
-- Create view with original table name
CREATE OR REPLACE VIEW OU_C2D.A_UC_DISSEM_METADATA_LOADS AS
SELECT * FROM ods.C2D_A_UC_DISSEM_METADATA_LOADS_ODS;
```
**Purpose:**
- Maintains compatibility with existing applications
- Redirects queries to FILE PROCESSOR external tables
- Seamless transition for dependent systems
## Package Procedures Used
### FILE_MANAGER Package (External Tables)
**Recommended Approach:** Use the **ODS.FILE_MANAGER_ODS** package which uses `AUTHID DEFINER` and will create objects in the ODS schema regardless of the executing user. This is the most reliable approach for creating external tables.
**Alternative Approach:** You can use the `CT_MRDS.FILE_MANAGER` package, but the `CREATE_EXTERNAL_TABLE` procedure must be executed as the **ODS user** due to the `AUTHID CURRENT_USER` clause.
### DATA_EXPORTER Package (Data Export)
Data export procedures have been moved to the `CT_MRDS.DATA_EXPORTER` package, which provides enhanced functionality and better separation of concerns.
### CREATE_EXTERNAL_TABLE
Creates external tables that can read data from Oracle Cloud Storage.
**Signature:**
```sql
PROCEDURE CREATE_EXTERNAL_TABLE (
pTableName IN VARCHAR2,
pTemplateTableName IN VARCHAR2,
pPrefix IN VARCHAR2,
pBucketUri IN VARCHAR2 DEFAULT ENV_MANAGER.gvInboxBucketUri,
pFileName IN VARCHAR2 DEFAULT NULL,
pDelimiter IN VARCHAR2 DEFAULT ','
);
```
### DATA_EXPORTER.EXPORT_TABLE_DATA_TO_CSV_BY_DATE
Exports table data to cloud storage in CSV format with date-based partitioning.
**Signature:**
```sql
PROCEDURE EXPORT_TABLE_DATA_TO_CSV_BY_DATE (
pSchemaName IN VARCHAR2,
pTableName IN VARCHAR2,
pKeyColumnName IN VARCHAR2,
pBucketArea IN VARCHAR2, -- Updated: now uses pBucketArea instead of pBucketName
pFolderName IN VARCHAR2,
pFileName IN VARCHAR2 DEFAULT NULL,
pColumnList IN VARCHAR2 DEFAULT NULL, -- New: allows specifying custom columns
pMinDate IN DATE DEFAULT DATE '1900-01-01',
pMaxDate IN DATE DEFAULT SYSDATE,
pCredentialName IN VARCHAR2 DEFAULT ENV_MANAGER.gvCredentialName
);
```
**Parameters:**
- `pSchemaName`: Schema containing the source table
- `pTableName`: Name of the table to export
- `pKeyColumnName`: Column name used for date-based filtering (typically LOAD_START or a similar date column)
- `pBucketArea`: Bucket area identifier ('INBOX', 'ODS', 'DATA', 'ARCHIVE'); automatically maps to the correct bucket URI
- `pFolderName`: Folder path within the bucket
- `pFileName`: Custom filename prefix (optional). If NULL, uses the table name as prefix
- `pColumnList`: Optional comma-separated list of columns to export (exports all columns if NULL)
- `pMinDate`: Minimum date for filtering records (defaults to DATE '1900-01-01'). Only records where the key column is >= pMinDate are exported
- `pMaxDate`: Maximum date for filtering records (defaults to SYSDATE). Only records where the key column is <= pMaxDate are exported
- `pCredentialName`: OCI credential name (defaults to environment configuration)
**Purpose:**
- Creates separate CSV files partitioned by year and month (YYYYMM format)
- Files are immediately ready for consumption by external tables
- Ideal for operational data that needs to be processed quickly
- File naming pattern: `{pFileName}_YYYYMM.csv` or `{TABLE_NAME}_YYYYMM.csv` (if pFileName is NULL)
### DATA_EXPORTER.EXPORT_TABLE_DATA_BY_DATE
Exports table data to cloud storage in PARQUET format with date-based partitioning.
**Signature:**
```sql
PROCEDURE EXPORT_TABLE_DATA_BY_DATE (
pSchemaName IN VARCHAR2,
pTableName IN VARCHAR2,
pKeyColumnName IN VARCHAR2,
pBucketArea IN VARCHAR2, -- Updated: now uses pBucketArea instead of pBucketName
pFolderName IN VARCHAR2,
pColumnList IN VARCHAR2 DEFAULT NULL, -- New: allows specifying custom columns
pMinDate IN DATE DEFAULT DATE '1900-01-01',
pMaxDate IN DATE DEFAULT SYSDATE,
pCredentialName IN VARCHAR2 DEFAULT ENV_MANAGER.gvCredentialName
);
```
**Parameters:**
- `pSchemaName`: Schema containing the source table
- `pTableName`: Name of the table to export
- `pKeyColumnName`: Column name used for date-based filtering (typically LOAD_START or similar date column)
- `pBucketArea`: Bucket area identifier ('INBOX', 'ODS', 'DATA', 'ARCHIVE') - automatically maps to correct bucket URI
- `pFolderName`: Folder path within the bucket
- `pColumnList`: Optional comma-separated list of columns to export (uses T.* if NULL)
- `pMinDate`: Minimum date for filtering records (defaults to DATE '1900-01-01'). Only records where the key column is >= pMinDate will be exported
- `pMaxDate`: Maximum date for filtering records (defaults to SYSDATE). Only records where the key column is <= pMaxDate will be exported
- `pCredentialName`: OCI credentials (defaults to environment configuration)
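A sketch using the newer `pBucketArea` and `pColumnList` parameters; the column subset is hypothetical:
```sql
BEGIN
    CT_MRDS.DATA_EXPORTER.EXPORT_TABLE_DATA_BY_DATE(
        pSchemaName    => 'OU_C2D',
        pTableName     => 'A_UC_DISSEM_METADATA_LOADS',
        pKeyColumnName => 'A_ETL_LOAD_SET_FK',
        pBucketArea    => 'ARCHIVE',
        pFolderName    => 'C2D/A_UC_DISSEM_METADATA_LOADS',
        pColumnList    => 'A_ETL_LOAD_SET_FK, LOAD_STATUS', -- hypothetical column subset
        pMinDate       => DATE '2024-01-01'
    );
END;
/
```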
## Best Practices
### 1. Naming Conventions
- **Template tables**: `CT_ET_TEMPLATES.{SOURCE}_{TABLE_NAME}`
- **External tables**: `{SOURCE}_{TABLE_NAME}_{LOCATION}` (e.g., `_INBOX`, `_ODS`, `_ARCHIVE`)
- **Legacy tables**: `{ORIGINAL_NAME}_LEGACY`
### 2. Schema Organization
- **CT_ET_TEMPLATES**: Template table definitions
- **ODS**: External tables for processed data
- **OU_{SOURCE}**: Compatibility views and legacy tables
### 3. Storage Structure
```
Oracle Cloud Storage Buckets Structure:
INBOX Bucket:
├── INBOX/
│ └── {SOURCE}/
│ └── {SOURCE_FILE_ID}/
│ └── {TABLE_NAME}/
DATA Bucket:
├── ODS/
│ └── {SOURCE}/
│ └── {TABLE_NAME}/
ARCHIVE Bucket:
└── ARCHIVE/
└── {SOURCE}/
└── {TABLE_NAME}/
└── PARTITION_YEAR=*/
└── PARTITION_MONTH=*/
└── *.parquet
```
### 4. Migration Checklist
- [ ] Create template table
- [ ] Configure FILE_MANAGER system (ADD_SOURCE, ADD_SOURCE_FILE_CONFIG, ADD_COLUMN_DATE_FORMAT)
- [ ] Create external tables (INBOX, ODS, ARCHIVE)
- [ ] Export data to ODS bucket (CSV format) (optional, depending on data boundaries)
- [ ] Export historical data to archive bucket (PARQUET format)
- [ ] Rename original table to legacy
- [ ] Create compatibility view
- [ ] Test data access through view
- [ ] Verify error handling
## Troubleshooting
### Common Issues
1. **Execution Context Issues**
- **Problem:** External tables created in wrong schema
- **Solution 1 (Recommended):** Use the DEFINER package that works from any user context
```sql
-- Use ODS.FILE_MANAGER_ODS instead of CT_MRDS.FILE_MANAGER
BEGIN
ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(...);
END;
/
```
- **Solution 2 (Alternative):** Ensure CREATE_EXTERNAL_TABLE procedure is executed as ODS user
```sql
-- Check current user context
SELECT USER FROM DUAL;
-- Should return: ODS
```
2. **Permission Errors**
```sql
-- Grant necessary permissions
GRANT SELECT ON CT_ET_TEMPLATES.{TEMPLATE_TABLE} TO CT_MRDS;
GRANT SELECT ON ods.{EXTERNAL_TABLE} TO {TARGET_SCHEMA};
```
3. **External Table Creation Failures**
- Verify bucket and folder paths exist
- Check credential configuration
- Ensure template table structure is correct
4. **Data Export Issues**
- Verify source table has data
- Check key column exists and has appropriate data type
- Ensure sufficient storage space in target bucket
5. **File Validation Failures (ORA-20011)**
- **Problem:** CSV file contains more columns than template table allows
- **Error Message:** "EXCESS COLUMNS DETECTED! CSV file has X columns but template expects only Y"
- **Solutions:**
- Remove excess columns from CSV file before processing
- Add missing columns to template table
- Use `ANALYZE_VALIDATION_ERRORS()` function for detailed analysis
- **Prevention:** Ensure CSV structure matches template table before upload
6. **External Table Query Failures (ORA-29913/KUP-05002)**
- **Problem:** "ORA-29913: error while processing ODCIEXTTABLEOPEN routine" with "KUP-05002: There are no matching files for any file specification in the LOCATION clause"
- **Cause:** Oracle cannot find files in the specified external table location
- **Common Scenarios:**
- Empty bucket/folder - no files uploaded yet
- Incorrect file path in LOCATION clause
- Wrong file name pattern (e.g., looking for *.csv but files are *.txt)
- Case sensitivity issues in file names
- **Solutions:**
```sql
-- 1. Check external table location
SELECT table_name, location
FROM user_external_locations
WHERE table_name = 'YOUR_EXTERNAL_TABLE';
-- 2. List files in bucket to verify they exist
SELECT object_name FROM DBMS_CLOUD.LIST_OBJECTS(
    credential_name => 'YOUR_CREDENTIAL',
    location_uri    => 'YOUR_BUCKET_URI/YOUR_FOLDER_PREFIX' -- folder prefix folded into the URI
);
-- 3. Test with a simple file upload
-- Upload a test CSV file to the exact location specified in external table
```
- **Prevention:** Always verify file location matches external table LOCATION clause exactly
### Verification Queries
```sql
-- Check template table structure
SELECT column_name, data_type, data_length
FROM user_tab_columns
WHERE table_name = '{TEMPLATE_TABLE_NAME}'
ORDER BY column_id;
-- Verify external table creation
SELECT table_name, table_type
FROM user_tables
WHERE table_name LIKE '%{SOURCE}%{TABLE}%';
-- Test view access
SELECT COUNT(*) FROM OU_{SOURCE}.{TABLE_NAME};
-- Check exported files
SELECT object_name
FROM DBMS_CLOUD.LIST_OBJECTS('{CREDENTIAL_NAME}', '{BUCKET_URI}/{FOLDER_PREFIX}');
```
## Example: Complete Setup Script
Here's a complete example based on the C2D UC_DISSEM metadata loads table:
```sql
-- Step 1: Create template table
CREATE TABLE CT_ET_TEMPLATES.C2D_A_UC_DISSEM_METADATA_LOADS
AS SELECT * FROM OU_C2D.A_UC_DISSEM_METADATA_LOADS WHERE 1=2;
-- Step 2: Configure FILE_MANAGER system
CALL CT_MRDS.FILE_MANAGER.ADD_SOURCE(
pSourceKey => 'C2D',
pSourceName => 'Central Bank Data System'
);
CALL CT_MRDS.FILE_MANAGER.ADD_SOURCE_FILE_CONFIG(
pSourceKey => 'C2D',
pSourceFileType => 'INPUT',
pSourceFileId => 'UC_DISSEM',
pSourceFileDesc => 'UC Dissemination Metadata Loads',
pSourceFileNamePattern => 'UC_NMA_DISSEM-*.csv',
pTableId => 'A_UC_DISSEM_METADATA_LOADS',
pTemplateTableName => 'CT_ET_TEMPLATES.C2D_A_UC_DISSEM_METADATA_LOADS',
pContainerFileKey => NULL
);
CALL CT_MRDS.FILE_MANAGER.ADD_COLUMN_DATE_FORMAT(
pTemplateTableName => 'CT_ET_TEMPLATES.C2D_A_UC_DISSEM_METADATA_LOADS',
pColumnName => 'A_ETL_LOAD_SET_FK',
pDateFormat => 'YYYY-MM-DD'
);
-- Step 3: Create external tables
BEGIN
ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
pTableName => 'C2D_A_UC_DISSEM_METADATA_LOADS_INBOX',
pTemplateTableName => 'CT_ET_TEMPLATES.C2D_A_UC_DISSEM_METADATA_LOADS',
pPrefix => 'INBOX/C2D/UC_DISSEM/A_UC_DISSEM_METADATA_LOADS',
pBucketUri => CT_MRDS.ENV_MANAGER.gvInboxBucketUri
);
END;
/
BEGIN
ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
pTableName => 'C2D_A_UC_DISSEM_METADATA_LOADS_ODS',
pTemplateTableName => 'CT_ET_TEMPLATES.C2D_A_UC_DISSEM_METADATA_LOADS',
pPrefix => 'ODS/C2D/A_UC_DISSEM_METADATA_LOADS',
pBucketUri => CT_MRDS.ENV_MANAGER.gvDataBucketUri
);
END;
/
BEGIN
ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
pTableName => 'C2D_A_UC_DISSEM_METADATA_LOADS_ARCHIVE',
pTemplateTableName => 'CT_ET_TEMPLATES.C2D_A_UC_DISSEM_METADATA_LOADS',
pPrefix => 'ARCHIVE/C2D/A_UC_DISSEM_METADATA_LOADS',
pBucketUri => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri
);
END;
/
-- Alternative using CURRENT_USER package (requires connection as ODS user):
-- BEGIN
-- CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
-- pTableName => 'C2D_A_UC_DISSEM_METADATA_LOADS_INBOX',
-- pTemplateTableName => 'CT_ET_TEMPLATES.C2D_A_UC_DISSEM_METADATA_LOADS',
-- pPrefix => 'INBOX/C2D/UC_DISSEM/A_UC_DISSEM_METADATA_LOADS',
-- pBucketUri => CT_MRDS.ENV_MANAGER.gvInboxBucketUri
-- );
-- END;
-- /
-- BEGIN
-- CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
-- pTableName => 'C2D_A_UC_DISSEM_METADATA_LOADS_ODS',
-- pTemplateTableName => 'CT_ET_TEMPLATES.C2D_A_UC_DISSEM_METADATA_LOADS',
-- pPrefix => 'ODS/C2D/A_UC_DISSEM_METADATA_LOADS',
-- pBucketUri => CT_MRDS.ENV_MANAGER.gvDataBucketUri
-- );
-- END;
-- /
-- BEGIN
-- CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
-- pTableName => 'C2D_A_UC_DISSEM_METADATA_LOADS_ARCHIVE',
-- pTemplateTableName => 'CT_ET_TEMPLATES.C2D_A_UC_DISSEM_METADATA_LOADS',
-- pPrefix => 'ARCHIVE/C2D/A_UC_DISSEM_METADATA_LOADS',
-- pBucketUri => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri
-- );
-- END;
-- /
-- Step 4: Export data to ODS bucket (CSV format)
BEGIN
CT_MRDS.DATA_EXPORTER.EXPORT_TABLE_DATA_TO_CSV_BY_DATE(
pSchemaName => 'OU_C2D',
pTableName => 'A_UC_DISSEM_METADATA_LOADS',
pKeyColumnName => 'A_ETL_LOAD_SET_FK',
pBucketArea => 'ODS', -- Updated: uses bucket area instead of bucket name
pFolderName => 'C2D/UC_DISSEM/A_UC_DISSEM_METADATA_LOADS'
-- pMinDate defaults to DATE '1900-01-01' (exports all historical data)
-- pMaxDate defaults to SYSDATE (exports all data up to current date)
);
END;
/
-- Step 5: Export historical data to archive bucket (PARQUET format)
BEGIN
CT_MRDS.DATA_EXPORTER.EXPORT_TABLE_DATA_BY_DATE(
pSchemaName => 'OU_C2D',
pTableName => 'A_UC_DISSEM_METADATA_LOADS',
pKeyColumnName => 'A_ETL_LOAD_SET_FK',
pBucketArea => 'ARCHIVE', -- Updated: uses bucket area instead of bucket name
pFolderName => 'C2D/A_UC_DISSEM_METADATA_LOADS'
-- pMinDate defaults to DATE '1900-01-01' (exports all historical data)
-- pMaxDate defaults to SYSDATE (exports all data up to current date)
);
END;
/
-- Step 6: Create legacy backup
ALTER TABLE OU_C2D.A_UC_DISSEM_METADATA_LOADS RENAME TO A_UC_DISSEM_METADATA_LOADS_LEGACY;
-- Step 7: Create compatibility view
GRANT SELECT ON ods.C2D_A_UC_DISSEM_METADATA_LOADS_ODS TO OU_C2D;
CREATE OR REPLACE VIEW OU_C2D.A_UC_DISSEM_METADATA_LOADS AS
SELECT * FROM ods.C2D_A_UC_DISSEM_METADATA_LOADS_ODS;
-- Verify setup
SHOW ERRORS;
SELECT COUNT(*) FROM OU_C2D.A_UC_DISSEM_METADATA_LOADS;
```
## ENV_MANAGER Global Variables
The system uses `ENV_MANAGER` package global variables to manage different bucket URIs:
- `gvInboxBucketUri`: URI for incoming files bucket
- `gvDataBucketUri`: URI for operational data store bucket
- `gvArchiveBucketUri`: URI for historical/archive data bucket
This ensures consistent bucket configuration across all external table creation calls.
### Checking ENV_MANAGER Variable Values
To verify the current values of ENV_MANAGER global variables, use this script:
```sql
SET SERVEROUTPUT ON;
DECLARE
v_value1 VARCHAR2(4000);
v_value2 VARCHAR2(4000);
v_value3 VARCHAR2(4000);
BEGIN
v_value1 := CT_MRDS.ENV_MANAGER.gvInboxBucketUri;
v_value2 := CT_MRDS.ENV_MANAGER.gvDataBucketUri;
v_value3 := CT_MRDS.ENV_MANAGER.gvArchiveBucketUri;
DBMS_OUTPUT.PUT_LINE('------>>>> gvInboxBucketUri: ' || v_value1);
DBMS_OUTPUT.PUT_LINE('------>>>> gvDataBucketUri: ' || v_value2);
DBMS_OUTPUT.PUT_LINE('------>>>> gvArchiveBucketUri: ' || v_value3);
END;
/
```
This script helps troubleshoot external table creation issues by verifying that bucket URIs are properly configured.
## Summary
This process successfully migrates traditional Oracle tables to the FILE PROCESSOR framework, enabling cloud-based file processing while maintaining application compatibility. The migration includes:
1. **Template Creation** - Defining table structures in CT_ET_TEMPLATES schema
2. **FILE_MANAGER Configuration** - Setting up source systems, file processing rules, and date formats
3. **External Tables Setup** - Creating INBOX, ODS, and ARCHIVE external tables
4. **Data Export (CSV)** - Moving existing data to ODS bucket in CSV format for immediate processing
5. **Data Export (PARQUET)** - Moving historical data to archive bucket in PARQUET format for long-term storage
6. **Legacy Preservation** - Backing up original tables and creating compatibility views
After completion, your system will be ready for automated file processing with cloud-based storage and external table access patterns.

# Simple Package Version Tracking
## Overview
Simple Oracle package version tracking system with a **clear package list** at the beginning of the script.
> **📖 For comprehensive deployment workflow, see:** [Package Deployment Guide](Package_Deployment_Guide.md)
> This document covers only the tracking script usage. For full deployment process, version update guidelines, and troubleshooting, refer to the main deployment guide.
## Available Scripts
1. **track_package_versions.sql** - Tracks package versions in ENV_MANAGER system
2. **verify_packages_version.sql** - Verifies all tracked packages for code changes
## How to Use
### Step 1: Edit Package List
Open `track_package_versions.sql` and edit the package list section:
```sql
-- ===================================================================
-- PACKAGE LIST - Edit this array to specify packages to track
-- ===================================================================
vPackageList t_string_array := t_string_array(
'CT_MRDS.FILE_MANAGER',
'ODS.FILE_MANAGER_ODS'
-- Add more packages here:
-- ,'CT_MRDS.DATA_EXPORTER'
-- ,'MRDS_LOADER.CLOUD_WRAPPER'
);
-- ===================================================================
```
### Step 2: Use in Your Script
```sql
-- install_mars1049.sql
PROMPT Installing packages...
@@new_version/FILE_MANAGER.pkg
@@new_version/FILE_MANAGER.pkb
PROMPT Tracking versions...
@@track_package_versions.sql
PROMPT Verifying all packages...
@@verify_packages_version.sql
```
## Script Details
### track_package_versions.sql
- **Purpose**: Register package versions in A_PACKAGE_VERSION_TRACKING
- **When**: After installing/rolling back packages
- **Output**: Simple list of package versions
### verify_packages_version.sql
- **Purpose**: Verify all tracked packages for code changes
- **When**: At the end of install/rollback scripts
- **Output**: Detailed status for all tracked packages (OK/WARNING)
## Example Output
```
========================================
Package Version Tracking
========================================
Packages tracked: 2 of 2
CT_MRDS.FILE_MANAGER = 3.2.1
ODS.FILE_MANAGER_ODS = 2.1.0
========================================
```
## Configuration Examples
### MARS-1049 (FILE_MANAGER System)
```sql
vPackageList t_string_array := t_string_array(
'CT_MRDS.FILE_MANAGER',
'ODS.FILE_MANAGER_ODS'
);
```
### MARS-1011 (WORKFLOW_MANAGER)
```sql
vPackageList t_string_array := t_string_array(
'CT_MRDS.WORKFLOW_MANAGER'
);
```
### System with Multiple Packages
```sql
vPackageList t_string_array := t_string_array(
'CT_MRDS.FILE_MANAGER',
'CT_MRDS.DATA_EXPORTER',
'CT_MRDS.FILE_ARCHIVER',
'ODS.FILE_MANAGER_ODS',
'MRDS_LOADER.CLOUD_WRAPPER'
);
```
## Requirements
Each package in the list must have:
```sql
FUNCTION GET_VERSION RETURN VARCHAR2;
```
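A minimal sketch of what such a function can look like inside a package. The package name and version string below are hypothetical; only the `GET_VERSION` signature is required.
```sql
-- Hypothetical package, shown only to illustrate the required GET_VERSION function.
CREATE OR REPLACE PACKAGE MY_SCHEMA.MY_PACKAGE AS
  FUNCTION GET_VERSION RETURN VARCHAR2;
END MY_PACKAGE;
/
CREATE OR REPLACE PACKAGE BODY MY_SCHEMA.MY_PACKAGE AS
  -- Bump this constant with every code change (MAJOR.MINOR.PATCH is an assumed convention).
  gcVersion CONSTANT VARCHAR2(20) := '1.0.0';

  FUNCTION GET_VERSION RETURN VARCHAR2 IS
  BEGIN
    RETURN gcVersion;
  END GET_VERSION;
END MY_PACKAGE;
/
```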
## Benefits
- **Clear List** - see immediately which packages are tracked
- **Easy Editing** - add/remove package lines
- **Zero Configuration** - aside from the package list
- **Universal** - copy script to any MARS issue
- **Simple Output** - just package names and versions

View File

@@ -0,0 +1,420 @@
# AGGREGATED_ALLOTMENT External Tables Setup
This document walks through a complete example of creating the three external tables for the `AGGREGATED_ALLOTMENT` table following the FILE PROCESSOR System pattern.
## Source Table Overview
**Source table:** `OU_TOP.AGGREGATED_ALLOTMENT`
**Key partitioning column:** `ALLOTMENT_DATE` (DATE)
**Number of columns:** 43
**Main date columns:** ALLOTMENT_DATE, VALUE_DATE, MATURITY_DATE
## Step-by-Step Process
### Step 1: Create the template table (if it does not exist)
```sql
-- Check whether the template already exists
SELECT table_name FROM all_tables
WHERE owner = 'CT_ET_TEMPLATES' AND table_name = 'TOP_AGGREGATED_ALLOTMENT';
-- If it does not exist, create the template table
CREATE TABLE CT_ET_TEMPLATES.TOP_AGGREGATED_ALLOTMENT
AS SELECT * FROM OU_TOP.AGGREGATED_ALLOTMENT WHERE 1=2;
-- Verify the template structure
SELECT column_name, data_type, data_length, nullable
FROM user_tab_columns
WHERE table_name = 'TOP_AGGREGATED_ALLOTMENT'
ORDER BY column_id;
```
**Purpose:**
- The template defines the structure for the external tables
- `WHERE 1=2` copies only the structure, without any data
- It lives in the `CT_ET_TEMPLATES` schema for central management
### Step 2: Create the three external tables
**Recommended method:** Use the `ODS.FILE_MANAGER_ODS` package (works from any user context)
```sql
-- External table for INBOX (incoming files)
BEGIN
  ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
    pTableName         => 'TOP_AGGREGATED_ALLOTMENT_INBOX',
    pTemplateTableName => 'CT_ET_TEMPLATES.TOP_AGGREGATED_ALLOTMENT',
    pPrefix            => 'INBOX/TOP/AGGREGATED_ALLOTMENT',
    pBucketUri         => CT_MRDS.ENV_MANAGER.gvInboxBucketUri
  );
END;
/
-- External table for ODS (operational data)
BEGIN
  ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
    pTableName         => 'TOP_AGGREGATED_ALLOTMENT_ODS',
    pTemplateTableName => 'CT_ET_TEMPLATES.TOP_AGGREGATED_ALLOTMENT',
    pPrefix            => 'ODS/TOP/AGGREGATED_ALLOTMENT',
    pBucketUri         => CT_MRDS.ENV_MANAGER.gvDataBucketUri
  );
END;
/
-- External table for ARCHIVE (historical data)
BEGIN
  ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
    pTableName         => 'TOP_AGGREGATED_ALLOTMENT_ARCHIVE',
    pTemplateTableName => 'CT_ET_TEMPLATES.TOP_AGGREGATED_ALLOTMENT',
    pPrefix            => 'TOP/AGGREGATED_ALLOTMENT',
    pBucketUri         => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri
  );
END;
/
```
**Alternative method:** Use the `CT_MRDS.FILE_MANAGER` package (requires the ODS user context)
```sql
-- NOTE: These commands must be executed as the ODS user
-- because the FILE_MANAGER package uses AUTHID CURRENT_USER
-- External table for INBOX (incoming files)
BEGIN
  CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
    pTableName         => 'TOP_AGGREGATED_ALLOTMENT_INBOX',
    pTemplateTableName => 'CT_ET_TEMPLATES.TOP_AGGREGATED_ALLOTMENT',
    pPrefix            => 'INBOX/TOP/AGGREGATED_ALLOTMENT',
    pBucketUri         => CT_MRDS.ENV_MANAGER.gvInboxBucketUri
  );
END;
/
-- External table for ODS (operational data)
BEGIN
  CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
    pTableName         => 'TOP_AGGREGATED_ALLOTMENT_ODS',
    pTemplateTableName => 'CT_ET_TEMPLATES.TOP_AGGREGATED_ALLOTMENT',
    pPrefix            => 'ODS/TOP/AGGREGATED_ALLOTMENT',
    pBucketUri         => CT_MRDS.ENV_MANAGER.gvDataBucketUri
  );
END;
/
-- External table for ARCHIVE (historical data)
BEGIN
  CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
    pTableName         => 'TOP_AGGREGATED_ALLOTMENT_ARCHIVE',
    pTemplateTableName => 'CT_ET_TEMPLATES.TOP_AGGREGATED_ALLOTMENT',
    pPrefix            => 'TOP/AGGREGATED_ALLOTMENT',
    pBucketUri         => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri
  );
END;
/
```
### Step 3: Export existing data
```sql
-- Export all historical data
BEGIN
  CT_MRDS.FILE_MANAGER.EXPORT_TABLE_DATA_BY_DATE(
    pSchemaName    => 'OU_TOP',                  -- Source schema
    pTableName     => 'AGGREGATED_ALLOTMENT',    -- Source table
    pKeyColumnName => 'ALLOTMENT_DATE',          -- Partitioning column
    pBucketName    => 'mrds_history_poc',        -- Target bucket
    pFolderName    => 'TOP/AGGREGATED_ALLOTMENT' -- Target folder
    -- pMinDate defaults to DATE '1900-01-01' (exports all historical data)
    -- pMaxDate defaults to SYSDATE (exports data up to the current date)
  );
END;
/
-- Export data from a specific date range
BEGIN
  CT_MRDS.FILE_MANAGER.EXPORT_TABLE_DATA_BY_DATE(
    pSchemaName    => 'OU_TOP',
    pTableName     => 'AGGREGATED_ALLOTMENT',
    pKeyColumnName => 'ALLOTMENT_DATE',
    pBucketName    => 'mrds_history_poc',
    pFolderName    => 'TOP/AGGREGATED_ALLOTMENT',
    pMinDate       => DATE '2024-01-01',  -- From 2024-01-01
    pMaxDate       => DATE '2024-12-31'   -- To 2024-12-31
  );
END;
/
-- Export only the most recent data (last 90 days)
BEGIN
  CT_MRDS.FILE_MANAGER.EXPORT_TABLE_DATA_BY_DATE(
    pSchemaName    => 'OU_TOP',
    pTableName     => 'AGGREGATED_ALLOTMENT',
    pKeyColumnName => 'ALLOTMENT_DATE',
    pBucketName    => 'mrds_history_poc',
    pFolderName    => 'TOP/AGGREGATED_ALLOTMENT',
    pMinDate       => SYSDATE - 90  -- Last 90 days
  );
END;
/
```
### Step 4: Create a safety backup (legacy table)
```sql
-- Rename the original table to legacy
ALTER TABLE OU_TOP.AGGREGATED_ALLOTMENT
RENAME TO AGGREGATED_ALLOTMENT_LEGACY;
-- Verify
SELECT table_name FROM user_tables
WHERE table_name LIKE '%AGGREGATED_ALLOTMENT%';
```
### Step 5: Create the compatibility view
```sql
-- Grant access to the ODS external table
GRANT SELECT ON ODS.TOP_AGGREGATED_ALLOTMENT_ODS TO OU_TOP;
-- Create the view under the original table name
CREATE OR REPLACE VIEW OU_TOP.AGGREGATED_ALLOTMENT AS
SELECT * FROM ODS.TOP_AGGREGATED_ALLOTMENT_ODS;
-- Verify access
SELECT COUNT(*) FROM OU_TOP.AGGREGATED_ALLOTMENT;
```
## Procedure Parameters
### CREATE_EXTERNAL_TABLE
```sql
PROCEDURE CREATE_EXTERNAL_TABLE (
    pTableName         IN VARCHAR2, -- Name of the external table to create
    pTemplateTableName IN VARCHAR2, -- Template table defining the structure
    pPrefix            IN VARCHAR2, -- Path in Oracle Cloud Storage
    pBucketUri         IN VARCHAR2, -- Bucket URI (defaults to ENV_MANAGER.gvInboxBucketUri)
    pFileName          IN VARCHAR2, -- File name (optional)
    pDelimiter         IN VARCHAR2  -- Delimiter (defaults to ',')
);
```
### EXPORT_TABLE_DATA_BY_DATE
```sql
PROCEDURE EXPORT_TABLE_DATA_BY_DATE (
    pSchemaName     IN VARCHAR2, -- Schema containing the source table
    pTableName      IN VARCHAR2, -- Table to export
    pKeyColumnName  IN VARCHAR2, -- Column used for date filtering
    pBucketName     IN VARCHAR2, -- Oracle Cloud Storage bucket name
    pFolderName     IN VARCHAR2, -- Folder path inside the bucket
    pMinDate        IN DATE,     -- Minimum date (defaults to DATE '1900-01-01')
    pMaxDate        IN DATE,     -- Maximum date (defaults to SYSDATE)
    pNamespace      IN VARCHAR2, -- OCI namespace (defaults to ENV_MANAGER.gvNameSpace)
    pRegion         IN VARCHAR2, -- OCI region (defaults to ENV_MANAGER.gvRegion)
    pCredentialName IN VARCHAR2  -- OCI credentials (defaults to ENV_MANAGER.gvCredentialName)
);
```
## Cloud Storage Layout
```
Oracle Cloud Storage Bucket
├── INBOX/
│   └── TOP/
│       └── AGGREGATED_ALLOTMENT/
│           └── CSV_or_PARQUET_files
├── ODS/
│   └── TOP/
│       └── AGGREGATED_ALLOTMENT/
│           └── CSV_or_PARQUET_files
└── TOP/
    └── AGGREGATED_ALLOTMENT/
        ├── 2024_01/
        ├── 2024_02/
        └── ...
            └── monthly_partitioned_PARQUET_files
```
## Verification and Testing
### Checking the created external tables
```sql
-- List the created external tables
-- (USER_TABLES has no TABLE_TYPE column; USER_EXTERNAL_TABLES lists them directly)
SELECT table_name, type_name
FROM user_external_tables
WHERE table_name LIKE '%AGGREGATED_ALLOTMENT%'
ORDER BY table_name;
-- Check the external locations
SELECT table_name, location
FROM user_external_locations
WHERE table_name LIKE '%AGGREGATED_ALLOTMENT%'
ORDER BY table_name;
```
### Testing data access
```sql
-- Test the INBOX external table (may be empty)
SELECT COUNT(*) FROM ODS.TOP_AGGREGATED_ALLOTMENT_INBOX;
-- Test the ODS external table (may be empty until data is loaded)
SELECT COUNT(*) FROM ODS.TOP_AGGREGATED_ALLOTMENT_ODS;
-- Test the ARCHIVE external table (should contain the exported data)
SELECT COUNT(*) FROM ODS.TOP_AGGREGATED_ALLOTMENT_ARCHIVE;
-- Test the compatibility view
SELECT COUNT(*) FROM OU_TOP.AGGREGATED_ALLOTMENT;
```
### Checking the exported files
```sql
-- List the files in the archive bucket
-- (DBMS_CLOUD.LIST_OBJECTS has no prefix parameter, so the path is appended to the
--  bucket URI; its result columns are OBJECT_NAME, BYTES, CREATED, LAST_MODIFIED)
SELECT object_name, bytes, created
FROM DBMS_CLOUD.LIST_OBJECTS(
    credential_name => CT_MRDS.ENV_MANAGER.gvCredentialName,
    location_uri    => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri || '/TOP/AGGREGATED_ALLOTMENT/'
)
ORDER BY created DESC;
```
## Naming Conventions
- **Template table:** `CT_ET_TEMPLATES.TOP_AGGREGATED_ALLOTMENT`
- **External tables:**
  - `TOP_AGGREGATED_ALLOTMENT_INBOX` (incoming files)
  - `TOP_AGGREGATED_ALLOTMENT_ODS` (operational data)
  - `TOP_AGGREGATED_ALLOTMENT_ARCHIVE` (historical data)
- **Legacy table:** `AGGREGATED_ALLOTMENT_LEGACY`
- **Compatibility view:** `AGGREGATED_ALLOTMENT` (the original name)
## Troubleshooting
### Problem: External table cannot find its files (ORA-29913)
```sql
-- Check the external table location
SELECT table_name, location
FROM user_external_locations
WHERE table_name = 'TOP_AGGREGATED_ALLOTMENT_ARCHIVE';
-- Check whether the files exist in the bucket
-- (DBMS_CLOUD.LIST_OBJECTS has no prefix parameter; append the path to the URI)
SELECT object_name
FROM DBMS_CLOUD.LIST_OBJECTS(
    credential_name => CT_MRDS.ENV_MANAGER.gvCredentialName,
    location_uri    => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri || '/TOP/AGGREGATED_ALLOTMENT/'
);
```
### Problem: Missing privileges
```sql
-- Grant SELECT on the external tables
GRANT SELECT ON ODS.TOP_AGGREGATED_ALLOTMENT_INBOX TO OU_TOP;
GRANT SELECT ON ODS.TOP_AGGREGATED_ALLOTMENT_ODS TO OU_TOP;
GRANT SELECT ON ODS.TOP_AGGREGATED_ALLOTMENT_ARCHIVE TO OU_TOP;
-- Check the grants
SELECT grantee, privilege, table_name
FROM user_tab_privs
WHERE table_name LIKE '%AGGREGATED_ALLOTMENT%';
```
## Complete Setup Script
```sql
-- ============================================================================
-- AGGREGATED_ALLOTMENT External Tables Setup - Complete script
-- ============================================================================
-- Step 1: Check and create the template table
SELECT 'Checking template table...' AS status FROM dual;
SELECT table_name FROM all_tables
WHERE owner = 'CT_ET_TEMPLATES' AND table_name = 'TOP_AGGREGATED_ALLOTMENT';
-- If the template does not exist, create it:
-- CREATE TABLE CT_ET_TEMPLATES.TOP_AGGREGATED_ALLOTMENT
--   AS SELECT * FROM OU_TOP.AGGREGATED_ALLOTMENT WHERE 1=2;
-- Step 2: Create the external tables (RECOMMENDED METHOD)
SELECT 'Creating external tables...' AS status FROM dual;
BEGIN
  ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
    pTableName         => 'TOP_AGGREGATED_ALLOTMENT_INBOX',
    pTemplateTableName => 'CT_ET_TEMPLATES.TOP_AGGREGATED_ALLOTMENT',
    pPrefix            => 'INBOX/TOP/AGGREGATED_ALLOTMENT',
    pBucketUri         => CT_MRDS.ENV_MANAGER.gvInboxBucketUri
  );
  DBMS_OUTPUT.PUT_LINE('✓ Created INBOX external table');
END;
/
BEGIN
  ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
    pTableName         => 'TOP_AGGREGATED_ALLOTMENT_ODS',
    pTemplateTableName => 'CT_ET_TEMPLATES.TOP_AGGREGATED_ALLOTMENT',
    pPrefix            => 'ODS/TOP/AGGREGATED_ALLOTMENT',
    pBucketUri         => CT_MRDS.ENV_MANAGER.gvDataBucketUri
  );
  DBMS_OUTPUT.PUT_LINE('✓ Created ODS external table');
END;
/
BEGIN
  ODS.FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
    pTableName         => 'TOP_AGGREGATED_ALLOTMENT_ARCHIVE',
    pTemplateTableName => 'CT_ET_TEMPLATES.TOP_AGGREGATED_ALLOTMENT',
    pPrefix            => 'TOP/AGGREGATED_ALLOTMENT',
    pBucketUri         => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri
  );
  DBMS_OUTPUT.PUT_LINE('✓ Created ARCHIVE external table');
END;
/
-- Step 3: Export existing data
SELECT 'Exporting data...' AS status FROM dual;
BEGIN
  CT_MRDS.FILE_MANAGER.EXPORT_TABLE_DATA_BY_DATE(
    pSchemaName    => 'OU_TOP',
    pTableName     => 'AGGREGATED_ALLOTMENT',
    pKeyColumnName => 'ALLOTMENT_DATE',
    pBucketName    => 'mrds_history_poc',
    pFolderName    => 'TOP/AGGREGATED_ALLOTMENT'
  );
  DBMS_OUTPUT.PUT_LINE('✓ Export completed');
END;
/
-- Step 4: Back up the original table
SELECT 'Creating safety backup...' AS status FROM dual;
-- ALTER TABLE OU_TOP.AGGREGATED_ALLOTMENT RENAME TO AGGREGATED_ALLOTMENT_LEGACY;
-- Step 5: Create the compatibility view
SELECT 'Creating compatibility view...' AS status FROM dual;
GRANT SELECT ON ODS.TOP_AGGREGATED_ALLOTMENT_ODS TO OU_TOP;
-- CREATE OR REPLACE VIEW OU_TOP.AGGREGATED_ALLOTMENT AS
--   SELECT * FROM ODS.TOP_AGGREGATED_ALLOTMENT_ODS;
-- Verify
SELECT 'Verifying setup...' AS status FROM dual;
SELECT table_name, type_name
FROM user_external_tables
WHERE table_name LIKE '%AGGREGATED_ALLOTMENT%'
ORDER BY table_name;
SELECT 'Setup completed successfully!' AS status FROM dual;
```
This complete example shows every step needed to migrate the `AGGREGATED_ALLOTMENT` table to the FILE PROCESSOR system, with three external tables covering the different storage tiers.

View File

@@ -0,0 +1,221 @@
# AGGREGATED_ALLOTMENT External Tables - Troubleshooting
## 🔍 Problem Diagnosis
### The problem encountered:
```sql
SELECT COUNT(*) FROM ODS.TOP_AGGREGATED_ALLOTMENT_ODS;
-- ORA-29913: error while processing ODCIEXTTABLEOPEN routine
-- ORA-20401: Authorization failed
```
### Root causes:
#### 1. **Missing privileges on the external tables**
- The external tables were created in the **ODS** schema
- Access is attempted from the **CT_MRDS** schema
- No SELECT privileges on ODS.TOP_AGGREGATED_ALLOTMENT_*
#### 2. **Incorrect ARCHIVE table configuration**
- **Prefix used:** `ARCHIVE/TOP/AGGREGATED_ALLOTMENT`
- **Actual files:** `TOP/AGGREGATED_ALLOTMENT/YEAR=2025/MONTH=08/*.parquet`
- **Problem:** PARQUET files with Hive partitioning require special configuration (see the sketch after this section)
#### 3. **File verification** ✅
```bash
# ODS bucket - the CSV files exist
oci os object list --bucket-name data --prefix "ODS/TOP/AGGREGATED_ALLOTMENT/"
# ✅ 14 CSV files
# ARCHIVE bucket - PARQUET files with Hive partitioning
oci os object list --bucket-name history --prefix "ARCHIVE/TOP/AGGREGATED_ALLOTMENT/"
# ❌ No files at this location
oci os object list --bucket-name history --prefix "TOP/AGGREGATED_ALLOTMENT/"
# ✅ 8 PARQUET files under the YEAR=2025/MONTH=08/ structure
```
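For reference, Hive-partitioned PARQUET can also be mapped directly with `DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE`. The sketch below is an illustration only, not what FILE_MANAGER does internally: the table name is hypothetical, the column list is abbreviated, and it assumes YEAR/MONTH are the partition keys. The recommended fix remains recreating the table through FILE_MANAGER_ODS with the correct prefix, as shown in the solutions below.
```sql
-- Illustrative sketch only: hypothetical table name, abbreviated column list.
BEGIN
  DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE(
    table_name      => 'TOP_AGG_ALLOTMENT_ARCHIVE_PT',
    credential_name => CT_MRDS.ENV_MANAGER.gvCredentialName,
    file_uri_list   => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri || '/TOP/AGGREGATED_ALLOTMENT/*.parquet',
    column_list     => 'ALLOTMENT_DATE DATE, VALUE_DATE DATE, MATURITY_DATE DATE, YEAR VARCHAR2(4), MONTH VARCHAR2(2)',
    format          => '{"type":"parquet", "partition_columns":[{"name":"YEAR","type":"varchar2(4)"},{"name":"MONTH","type":"varchar2(2)"}]}'
  );
END;
/
```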
## 🔧 Solutions
### Solution 1: Grant privileges
```sql
-- As the ODS user or an ADMIN
GRANT SELECT ON ODS.TOP_AGGREGATED_ALLOTMENT_ODS TO CT_MRDS;
GRANT SELECT ON ODS.TOP_AGGREGATED_ALLOTMENT_INBOX TO CT_MRDS;
GRANT SELECT ON ODS.TOP_AGGREGATED_ALLOTMENT_ARCHIVE TO CT_MRDS;
-- Test from CT_MRDS
SELECT COUNT(*) FROM ODS.TOP_AGGREGATED_ALLOTMENT_ODS;
```
### Solution 2: Fix the ARCHIVE table configuration
```sql
-- As the ODS user
DROP TABLE TOP_AGGREGATED_ALLOTMENT_ARCHIVE;
-- Recreate it with the correct prefix
BEGIN
  FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
    pTableName         => 'TOP_AGGREGATED_ALLOTMENT_ARCHIVE',
    pTemplateTableName => 'CT_ET_TEMPLATES.TOP_AGGREGATED_ALLOTMENT',
    pPrefix            => 'TOP/AGGREGATED_ALLOTMENT',  -- Without the "ARCHIVE/" prefix
    pBucketUri         => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri
  );
END;
/
```
### Solution 3: Alternative approach - create in CT_MRDS
```sql
-- As the CT_MRDS user - create the external tables in your own schema
BEGIN
  CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
    pTableName         => 'TOP_AGGREGATED_ALLOTMENT_ODS_CT',
    pTemplateTableName => 'CT_ET_TEMPLATES.TOP_AGGREGATED_ALLOTMENT',
    pPrefix            => 'ODS/TOP/AGGREGATED_ALLOTMENT',
    pBucketUri         => CT_MRDS.ENV_MANAGER.gvDataBucketUri
  );
END;
/
BEGIN
  CT_MRDS.FILE_MANAGER.CREATE_EXTERNAL_TABLE(
    pTableName         => 'TOP_AGGREGATED_ALLOTMENT_ARCHIVE_CT',
    pTemplateTableName => 'CT_ET_TEMPLATES.TOP_AGGREGATED_ALLOTMENT',
    pPrefix            => 'TOP/AGGREGATED_ALLOTMENT',
    pBucketUri         => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri
  );
END;
/
```
## 🧪 Testing the Solutions
### Test 1: Verify the external locations
```sql
SELECT table_name, location
FROM all_external_locations
WHERE table_name LIKE '%TOP_AGGREGATED_ALLOTMENT%'
AND owner IN ('ODS', 'CT_MRDS')
ORDER BY owner, table_name;
```
### Test 2: Try to access the data
```sql
-- Test ODS (CSV files)
SELECT COUNT(*) FROM ODS.TOP_AGGREGATED_ALLOTMENT_ODS;
-- Inspect the first 5 records
SELECT * FROM ODS.TOP_AGGREGATED_ALLOTMENT_ODS WHERE ROWNUM <= 5;
```
### Test 3: Verify ARCHIVE (PARQUET files)
```sql
-- Test ARCHIVE (partitioned PARQUET files)
SELECT COUNT(*) FROM ODS.TOP_AGGREGATED_ALLOTMENT_ARCHIVE;
-- Check the partitions
SELECT DISTINCT year, month
FROM ODS.TOP_AGGREGATED_ALLOTMENT_ARCHIVE
ORDER BY year, month;
```
## 📊 File Structure in the Buckets
### Data bucket (ODS) - CSV format
```
data/
└── ODS/
    └── TOP/
        └── AGGREGATED_ALLOTMENT/
            ├── AGGREGATED_ALLOTMENT_202508_1_*.csv
            ├── AGGREGATED_ALLOTMENT_202508_2_*.csv
            ├── AGGREGATED_ALLOTMENT_202509_1_*.csv
            └── ... (14 CSV files)
```
### History bucket (ARCHIVE) - PARQUET format with Hive partitioning
```
history/
└── TOP/
    └── AGGREGATED_ALLOTMENT/
        └── YEAR=2025/
            ├── MONTH=08/
            │   ├── 202508_1_*.parquet
            │   ├── 202508_2_*.parquet
            │   ├── 202508_3_*.parquet
            │   └── 202508_4_*.parquet
            └── MONTH=09/
                ├── 202509_1_*.parquet
                ├── 202509_2_*.parquet
                ├── 202509_3_*.parquet
                └── 202509_4_*.parquet
```
## ⚠️ FILE_MANAGER Notes
### Automatic format detection
The `FILE_MANAGER.CREATE_EXTERNAL_TABLE` procedure detects the file format automatically from `pBucketUri` (a simplified sketch of this logic follows the list):
- **If** `pBucketUri` matches `gvArchiveBucketUri` → PARQUET format with Hive partitioning
- **If** `pBucketUri` matches `gvDataBucketUri` or `gvInboxBucketUri` → CSV format
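The selection can be pictured roughly like this. This is a simplified sketch of the behavior described above, not the actual package body; the format strings are illustrative.
```sql
-- Simplified sketch of the format selection logic (not the actual package body).
SET SERVEROUTPUT ON;
DECLARE
  pBucketUri VARCHAR2(4000) := CT_MRDS.ENV_MANAGER.gvArchiveBucketUri;
  vFormat    VARCHAR2(4000);
BEGIN
  IF pBucketUri = CT_MRDS.ENV_MANAGER.gvArchiveBucketUri THEN
    -- Archive bucket: PARQUET with Hive-style partitioning
    vFormat := '{"type":"parquet"}';
  ELSE
    -- Inbox/data buckets: delimited CSV with a header row
    vFormat := '{"type":"csv", "delimiter":",", "skipheaders":"1"}';
  END IF;
  DBMS_OUTPUT.PUT_LINE('Selected format: ' || vFormat);
END;
/
```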
### Environment configuration
```sql
-- Check the ENV_MANAGER configuration
SET SERVEROUTPUT ON;
DECLARE
  v_inbox   VARCHAR2(4000);
  v_data    VARCHAR2(4000);
  v_history VARCHAR2(4000);
BEGIN
  v_inbox   := CT_MRDS.ENV_MANAGER.gvInboxBucketUri;
  v_data    := CT_MRDS.ENV_MANAGER.gvDataBucketUri;
  v_history := CT_MRDS.ENV_MANAGER.gvArchiveBucketUri;
  DBMS_OUTPUT.PUT_LINE('Inbox: ' || v_inbox);
  DBMS_OUTPUT.PUT_LINE('Data: ' || v_data);
  DBMS_OUTPUT.PUT_LINE('History: ' || v_history);
END;
/
```
## 🎯 Recommended Actions
### Step 1: Grant privileges (QUICKEST)
```sql
-- As ADMIN or a user with GRANT privileges
GRANT SELECT ON ODS.TOP_AGGREGATED_ALLOTMENT_ODS TO CT_MRDS;
GRANT SELECT ON ODS.TOP_AGGREGATED_ALLOTMENT_INBOX TO CT_MRDS;
```
### Step 2: Rebuild the ARCHIVE table
```sql
-- As the ODS user
DROP TABLE TOP_AGGREGATED_ALLOTMENT_ARCHIVE;
BEGIN
  FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE(
    pTableName         => 'TOP_AGGREGATED_ALLOTMENT_ARCHIVE',
    pTemplateTableName => 'CT_ET_TEMPLATES.TOP_AGGREGATED_ALLOTMENT',
    pPrefix            => 'TOP/AGGREGATED_ALLOTMENT',
    pBucketUri         => CT_MRDS.ENV_MANAGER.gvArchiveBucketUri
  );
END;
/
-- Grant privileges
GRANT SELECT ON TOP_AGGREGATED_ALLOTMENT_ARCHIVE TO CT_MRDS;
```
### Step 3: Final test
```sql
-- From the CT_MRDS schema
SELECT 'ODS' as source, COUNT(*) as row_count FROM ODS.TOP_AGGREGATED_ALLOTMENT_ODS
UNION ALL
SELECT 'ARCHIVE' as source, COUNT(*) as row_count FROM ODS.TOP_AGGREGATED_ALLOTMENT_ARCHIVE;
```
The external tables should now work correctly! 🎉

View File

@@ -0,0 +1,173 @@
# Oracle External Tables Tolerance Guide
## Tolerance Mechanisms in Brief
Oracle External Tables are surprisingly tolerant of differences between a CSV file's structure and the table definition. In enterprise systems this can lead to unexpected behavior.
## 🎯 Key Mechanisms
### 1. **Columns Are Mapped by Name (Not by Position)**
```csv
-- Table: ID, NAME, STATUS
-- CSV 1: ID,NAME,STATUS ✅ Standard order
-- CSV 2: STATUS,NAME,ID ✅ Different order - works!
-- CSV 3: NAME,ID,STATUS ✅ Also works!
```
**Takeaway**: Oracle maps by header names, not by column order.
### 2. **Missing Columns = NULL**
```csv
-- Table: ID, NAME, STATUS, AMOUNT
-- CSV: ID,NAME ⚠️ STATUS and AMOUNT = NULL!
```
**Danger**: A file with an incomplete structure is still loaded, just with NULLs.
### 3. **Extra Columns Are Ignored**
```csv
-- Table: ID, NAME, STATUS
-- CSV: ID,NAME,STATUS,EXTRA1,EXTRA2,BONUS ✅ Surplus columns ignored
```
**Takeaway**: Oracle does not object to extra data.
### 4. **Data Type Validation Is Enforced**
```csv
-- Table: ID(NUMBER), NAME(VARCHAR2), CREATED_DATE(DATE)
-- CSV: "ABC","John","not-a-date" ❌ Type conversion errors = FAILURE
```
**Takeaway**: Here Oracle is strict!
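One way to surface such conversion errors explicitly is `DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE`, which scans the source files against the column definitions and reports rejected rows. The table name below is illustrative.
```sql
-- Validate an external table's files against its column definitions.
-- Conversion failures are reported instead of silently surfacing at query time.
BEGIN
  DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE(
    table_name    => 'TOP_AGGREGATED_ALLOTMENT_ODS',  -- illustrative
    rowcount      => 100,    -- validate at most 100 rows
    stop_on_error => FALSE   -- keep scanning past conversion errors
  );
END;
/
```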
## 📋 Practical Examples
### Demo File 1: Perfect Match
**demo_perfect_match.csv**
```csv
ID,NAME,DESCRIPTION,CREATED_DATE,STATUS,AMOUNT
1,"Product A","Description A","2024-01-15","ACTIVE",100.50
2,"Product B","Description B","2024-01-16","INACTIVE",200.75
```
**Result**: ✅ 100% success
### Demo File 2: Missing Columns
**demo_missing_columns.csv**
```csv
ID,WRONG_COLUMN,INVALID_STRUCTURE
1,"Some Data","More Data"
2,"Other Data","Different Data"
```
**Result**: ⚠️ Accepted, with NULLs in the missing columns!
### Demo File 3: Extra Columns
**demo_extra_columns.csv**
```csv
ID,NAME,DESCRIPTION,CREATED_DATE,STATUS,AMOUNT,EXTRA1,EXTRA2,BONUS_FIELD
1,"Product A","Description A","2024-01-15","ACTIVE",100.50,"Extra","More","Bonus"
```
**Result**: ✅ Success - the extra columns are ignored
### Demo File 4: Different Order
**demo_different_order.csv**
```csv
STATUS,AMOUNT,ID,NAME,DESCRIPTION,CREATED_DATE
"ACTIVE",100.50,1,"Product A","Description A","2024-01-15"
```
**Result**: ✅ Success - mapping by names
### Demo File 5: Wrong Data Types
**demo_data_type_errors.csv**
```csv
ID,NAME,DESCRIPTION,CREATED_DATE,STATUS,AMOUNT
"NOT_NUMBER","Product A","Description A","INVALID_DATE","ACTIVE","NOT_AMOUNT"
```
**Result**: ❌ Failure - type conversion errors
## 🛡️ Security and Business Implications
### Problems
1. **Incomplete data accepted as valid**
2. **No business-rule validation**
3. **Extra information silently discarded**
4. **Unexpected NULLs in critical fields**
### Solutions
1. **Application-level validation** - do not rely on External Tables alone
2. **NOT NULL constraints** on critical fields
3. **A custom VALIDATE_SOURCE_FILE_RECEIVED procedure** (a header pre-check sketch follows this list)
4. **Close monitoring of file statuses**
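A header pre-check is one building block such a procedure could use: fetch only the first bytes of the incoming file and compare the header row against the expected column list. This sketch is hypothetical; the expected header and object path are illustrative, not part of FILE_MANAGER.
```sql
-- Hypothetical sketch: compare a CSV header row against an expected column list.
DECLARE
  vExpected CONSTANT VARCHAR2(4000) := 'ID,NAME,STATUS,AMOUNT';  -- illustrative
  vRaw      BLOB;
  vHeader   VARCHAR2(4000);
BEGIN
  -- Fetch only the first bytes of the file (enough to hold the header row).
  vRaw := DBMS_CLOUD.GET_OBJECT(
            credential_name => CT_MRDS.ENV_MANAGER.gvCredentialName,
            object_uri      => CT_MRDS.ENV_MANAGER.gvInboxBucketUri || '/INBOX/demo.csv',  -- illustrative path
            startoffset     => 0,
            endoffset       => 4000);
  vHeader := UTL_RAW.CAST_TO_VARCHAR2(DBMS_LOB.SUBSTR(vRaw, 2000, 1));
  -- Keep only the first line, tolerating CRLF line endings.
  vHeader := RTRIM(SUBSTR(vHeader, 1, INSTR(vHeader || CHR(10), CHR(10)) - 1), CHR(13));
  IF UPPER(TRIM(vHeader)) <> vExpected THEN
    RAISE_APPLICATION_ERROR(-20001, 'Unexpected CSV header: ' || vHeader);
  END IF;
END;
/
```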
## 🔧 Implementation in Our System
### Why Fixing VALIDATION_FAILED Was Critical
Without correct exception handling in `VALIDATE_SOURCE_FILE_RECEIVED`:
- Files with an incomplete structure passed as "READY_FOR_INGESTION"
- Validation problems were not tracked
- Bad data reached the production system
### Our Fix
```sql
-- BEFORE: Incorrect exception propagation
WHEN OTHERS THEN
    RAISE; -- ❌ Wrong!
-- AFTER: Correct behavior
WHEN OTHERS THEN
    COMMIT; -- Persist the VALIDATION_FAILED status
    RAISE ENV_MANAGER.ERR_FILE_VALIDATION_FAILED; -- ✅ Correct!
```
### Extended Constraint Validation
```sql
-- Added 'VALIDATION_FAILED' to the allowed statuses
PROCESSING_STATUS IN ('RECEIVED', 'VALIDATED', 'READY_FOR_INGESTION',
                      'INGESTED', 'ARCHIVED', 'VALIDATION_FAILED')
```
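Applying such a change typically means replacing the existing CHECK constraint. A sketch, where the table and constraint names are assumptions:
```sql
-- Hypothetical sketch: the table and constraint names are assumptions.
ALTER TABLE A_SOURCE_FILES DROP CONSTRAINT CHK_SF_PROCESSING_STATUS;
ALTER TABLE A_SOURCE_FILES ADD CONSTRAINT CHK_SF_PROCESSING_STATUS
  CHECK (PROCESSING_STATUS IN ('RECEIVED', 'VALIDATED', 'READY_FOR_INGESTION',
                               'INGESTED', 'ARCHIVED', 'VALIDATION_FAILED'));
```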
## 📊 Testing
### Full Demonstration Test
Run: `test_EXTERNAL_TABLE_COMPLETE_DEMO.sql`
This test shows:
- 6 different tolerance scenarios
- The exact result of each case
- Practical conclusions for an enterprise system
### Expected Results
```
✅ Perfect Match → READY_FOR_INGESTION
⚠️ Missing Columns → READY_FOR_INGESTION (with NULLs!)
✅ Extra Columns → READY_FOR_INGESTION
✅ Different Order → READY_FOR_INGESTION
⚠️ Completely Wrong → READY_FOR_INGESTION (everything NULL!)
❌ Data Type Errors → VALIDATION_FAILED
```
## 💡 Key Takeaways
1. **Oracle External Tables are not strict about structure**
2. **Data type validation works; business validation does not**
3. **Additional quality-control mechanisms are necessary**
4. **Monitoring file statuses is absolutely critical**
5. **Our VALIDATION_FAILED problems were a symptom of this broader tolerance issue**
## 🚨 Production Recommendations
### Mandatory
- [ ] Always implement business validation in the application
- [ ] Monitor all file statuses
- [ ] Test with various CSV file structures
- [ ] Document all allowed formats
### Optional but Recommended
- [ ] Add CHECK constraints on key fields
- [ ] Implement pre-processing header validation
- [ ] Maintain a catalog of acceptable file structures
- [ ] Add alerts for unexpected NULLs
---
*This document is the result of discovering and fixing a critical bug in the `PROCESS_SOURCE_FILE` procedure and of an in-depth investigation of the tolerance mechanisms of Oracle External Tables.*

View File

@@ -0,0 +1,256 @@
# Resource Principal Configuration for Oracle Cloud Database
## Overview
Resource Principal is an authentication mechanism in Oracle Cloud Infrastructure (OCI) that allows compute instances (including Oracle Database) to securely access OCI services without storing API keys or passwords.
## 1. Prerequisites
- An Oracle Cloud Database running on OCI
- Administrator privileges in the OCI Console
- Database access as ADMIN or as a user with DBMS_CLOUD privileges
## 2. Configuration in the OCI Console
### 2.1 Create a Dynamic Group
1. Log in to the **OCI Console**
2. Go to **Identity & Security** → **Dynamic Groups**
3. Click **Create Dynamic Group**
4. Fill in the form:
```
Name: database-resource-principal-dg
Description: Dynamic group for database instances using Resource Principal
Matching Rules:
ANY {instance.compartment.id = 'ocid1.compartment.oc1..aaaaaaaa...'}
# Alternatively, for a specific instance:
ANY {instance.id = 'ocid1.instance.oc1..aaaaaaaa...'}
# Or for all instances in the tenancy:
ANY {instance.compartment.id = tenancy.id}
```
5. Click **Create**
### 2.2 Finding the compartment or instance OCID
**For a compartment:**
- Identity & Security → Compartments → [Select compartment] → OCID
**For a database instance:**
- Oracle Database → [Select database] → DB Connection → OCID
### 2.3 Create an IAM Policy
1. Go to **Identity & Security** → **Policies**
2. Click **Create Policy**
3. Fill in the form:
```
Name: database-resource-principal-policy
Description: Policy allowing database instances to access Object Storage
Policy Statements:
Allow dynamic-group database-resource-principal-dg to manage objects in compartment ManagedCompartmentForPaaS
Allow dynamic-group database-resource-principal-dg to manage buckets in compartment ManagedCompartmentForPaaS
Allow dynamic-group database-resource-principal-dg to read objectstorage-namespaces in tenancy
Allow dynamic-group database-resource-principal-dg to read autonomous-database-family in compartment ManagedCompartmentForPaaS
# Optionally, for broader access:
Allow dynamic-group database-resource-principal-dg to use cloud-shell in tenancy
```
4. Click **Create**
## 3. Verifying the OCI Configuration
### 3.1 Check the Dynamic Group
1. Open the Dynamic Group you created
2. Check whether **Matching Instances** shows your database instance
3. If the instance is missing, review the matching rules
### 3.2 Connection test (optional)
If you have SSH access to the compute instance:
```bash
# Test access to the metadata service
curl -H "Authorization: Bearer Oracle" \
http://169.254.169.254/opc/v2/identity/cert.pem
# If this returns an X.509 certificate, Resource Principal is available
```
## 4. Configuration in Oracle Database
### 4.1 Run the configuration script
Run the previously created `configure_resource_principal.sql` script:
```sql
-- Connect as ADMIN
@configure_resource_principal.sql
```
### 4.2 Manual configuration (alternative)
If the automated script does not work:
```sql
-- Connect as ADMIN
CONNECT ADMIN/[password]@[service_name]
-- Check whether OCI$RESOURCE_PRINCIPAL is available
SELECT credential_name FROM user_credentials
WHERE credential_name = 'OCI$RESOURCE_PRINCIPAL';
-- If it does not exist, create your own:
BEGIN
  DBMS_CLOUD.CREATE_CREDENTIAL(
    credential_name => 'OCI_RESOURCE_PRINCIPAL',
    username => '', -- Empty for Resource Principal
    password => ''  -- Empty for Resource Principal
  );
END;
/
```
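Note that on Autonomous Database the documented way to make the `OCI$RESOURCE_PRINCIPAL` credential available is `DBMS_CLOUD_ADMIN.ENABLE_RESOURCE_PRINCIPAL`; the `ODS` schema name below is taken from this guide, but check which schemas apply in your environment.
```sql
-- On Autonomous Database, enable the OCI$RESOURCE_PRINCIPAL credential as ADMIN:
EXEC DBMS_CLOUD_ADMIN.ENABLE_RESOURCE_PRINCIPAL();
-- Optionally expose it to an application schema:
EXEC DBMS_CLOUD_ADMIN.ENABLE_RESOURCE_PRINCIPAL(username => 'ODS');
```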
### 4.3 Test the configuration
```sql
-- Test listing objects in Object Storage
SELECT object_name
FROM DBMS_CLOUD.LIST_OBJECTS(
    credential_name => 'OCI$RESOURCE_PRINCIPAL', -- or 'OCI_RESOURCE_PRINCIPAL'
    location_uri => 'https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/frtgjxu7zl7c/'
)
WHERE ROWNUM <= 5;
```
## 5. Configuration for Application Schemas
### 5.1 Grant privileges
```sql
-- As ADMIN, grant privileges to the application schemas
GRANT EXECUTE ON DBMS_CLOUD TO ODS;
GRANT EXECUTE ON DBMS_CLOUD TO CT_MRDS;
-- If you use your own credential, grant access to it:
-- GRANT READ ON CREDENTIAL OCI_RESOURCE_PRINCIPAL TO ODS;
```
### 5.2 Test from an application schema
```sql
-- Connect as ODS
CONNECT ODS/[password]@[service_name]
-- Test call
EXEC FILE_MANAGER_ODS.CREATE_EXTERNAL_TABLE('TEST_TABLE', 'https://...', 'OCI$RESOURCE_PRINCIPAL');
```
## 6. Updating the Application Code
### 6.1 Update FILE_MANAGER
Update the code to use Resource Principal:
```sql
-- In the CREATE_EXTERNAL_TABLE procedure
-- Change from:
v_credential_name := 'OCI_RESOURCE_PRINCIPAL';
-- To:
v_credential_name := 'OCI$RESOURCE_PRINCIPAL'; -- If available
```
### 6.2 Configure ENV_MANAGER
Add the configuration to the A_FILE_MANAGER_CONFIG table:
```sql
INSERT INTO A_FILE_MANAGER_CONFIG (
    CONFIG_KEY,
    CONFIG_VALUE,
    DESCRIPTION
) VALUES (
    'DEFAULT_CREDENTIAL_NAME',
    'OCI$RESOURCE_PRINCIPAL',
    'Default OCI credential for Resource Principal authentication'
);
```
## 7. Troubleshooting
### 7.1 Verify Resource Principal access from the database
```sql
-- In SQL Developer Web or SQLcl:
SELECT
    DBMS_CLOUD.GET_OBJECT(
        credential_name => 'OCI$RESOURCE_PRINCIPAL',
        object_uri => 'https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/frtgjxu7zl7c/b/data/o/test.txt'
    ) as content
FROM dual;
```
### 7.2 Common errors
| Error | Cause | Solution |
|------|-----------|-------------|
| `ORA-20407: Invalid Credentials` | The Dynamic Group does not include the instance | Check the matching rules in the Dynamic Group |
| `ORA-20421: Object not found` | Missing IAM privileges | Check the Policy for the Dynamic Group |
| `OCI$RESOURCE_PRINCIPAL not found` | Resource Principal not configured | Use your own credential or reconfigure |
### 7.3 Verify privileges
```sql
-- Check which credentials are available
SELECT credential_name, username
FROM user_credentials
ORDER BY credential_name;
-- Check DBMS_CLOUD privileges
SELECT grantee, privilege
FROM user_tab_privs
WHERE table_name = 'DBMS_CLOUD';
```
## 8. Security Recommendations
1. **Least privilege**: Grant only the necessary permissions in the IAM Policy
2. **Monitoring**: Monitor Resource Principal usage in OCI Audit
3. **Rotation**: Resource Principal rotates its certificates automatically
4. **Scope**: Limit the Dynamic Group to specific instances, not a whole compartment
## 9. Next Steps
After configuring Resource Principal:
1. Test all FILE_MANAGER operations
2. Update the application documentation
3. Set up monitoring of Object Storage usage
4. Consider rolling it out to other schemas
## 10. References
- [Oracle Cloud Infrastructure Resource Principal Documentation](https://docs.oracle.com/en-us/iaas/Content/Identity/Tasks/callingservicesfrominstances.htm)
- [DBMS_CLOUD Package Documentation](https://docs.oracle.com/en/cloud/paas/autonomous-database/adbsa/dbms-cloud-package.html)
- [Dynamic Groups Configuration](https://docs.oracle.com/en-us/iaas/Content/Identity/Tasks/managingdynamicgroups.htm)