FILE_ARCHIVER Configuration Guide
This document describes the archival strategies available in the FILE_ARCHIVER package for managing data lifecycle across OCI buckets (INBOX → ODS → ARCHIVE).
Overview
The FILE_ARCHIVER package provides flexible archival strategies that accommodate different data retention policies across source systems. It manages the movement of processed data from operational storage (ODS bucket) to long-term archival storage (ARCHIVE bucket) based on configurable strategies.
Key Features
- Three Archival Strategies: THRESHOLD_BASED, MINIMUM_AGE_MONTHS (with 0=current month only), HYBRID
- Flexible Configuration: Per-table archival strategy configuration via A_SOURCE_FILE_CONFIG
- Validation: Automatic validation of strategy-specific configuration requirements
- OCI Integration: Works seamlessly with DBMS_CLOUD operations via cloud_wrapper
Package Information
- Schema: CT_MRDS
- Package: FILE_ARCHIVER
- Current Version: 3.3.0
- Dependencies: ENV_MANAGER, FILE_MANAGER, cloud_wrapper, A_SOURCE_FILE_CONFIG, A_SOURCE_FILE_RECEIVED, A_WORKFLOW_HISTORY
Critical Prerequisites
⚠️ IMPORTANT: FILE_ARCHIVER requires data to be registered in CT_MRDS.A_SOURCE_FILE_RECEIVED table. This table is automatically populated when files are processed through the modern Airflow + DBT workflow via FILE_MANAGER.PROCESS_SOURCE_FILE.
For legacy data migrated from Informatica + WLA system:
- Legacy data exported using DATA_EXPORTER does NOT automatically create A_SOURCE_FILE_RECEIVED records
- Without these records, FILE_ARCHIVER CANNOT archive the data
- See System Migration Guide for workaround strategies
Recommendation for legacy data: Export directly to ARCHIVE bucket using DATA_EXPORTER.EXPORT_TABLE_DATA_BY_DATE with pBucketArea => 'ARCHIVE' to bypass this requirement
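To find configurations that may be affected by this prerequisite, a query along the following lines can help. It is a sketch that reuses only the tables and join columns shown elsewhere in this guide. It assumes that a configuration with no A_SOURCE_FILE_RECEIVED rows holds legacy-only data, which will not be true for tables that are simply new and have not received files yet.

```sql
-- Configurations that have no registered files at all.
-- Assumption: zero registered rows hints at legacy-only data.
SELECT
    SFC.SOURCE_FILE_ID,
    SFC.TABLE_ID,
    COUNT(SFR.A_SOURCE_FILE_RECEIVED_KEY) AS REGISTERED_FILES
FROM CT_MRDS.A_SOURCE_FILE_CONFIG SFC
LEFT JOIN CT_MRDS.A_SOURCE_FILE_RECEIVED SFR
    ON SFC.A_SOURCE_FILE_CONFIG_KEY = SFR.A_SOURCE_FILE_CONFIG_KEY
WHERE SFC.SOURCE_FILE_TYPE = 'INPUT'
GROUP BY SFC.SOURCE_FILE_ID, SFC.TABLE_ID
HAVING COUNT(SFR.A_SOURCE_FILE_RECEIVED_KEY) = 0
ORDER BY SFC.SOURCE_FILE_ID, SFC.TABLE_ID;
```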
Archival Strategies
Strategy Overview
| Strategy | WHERE Clause Logic | Configuration Required | Primary Use Case |
|---|---|---|---|
| THRESHOLD_BASED | Days since workflow start > threshold | DAYS_FOR_ARCHIVE_THRESHOLD | Simple time-based archival |
| MINIMUM_AGE_MONTHS | Archive data older than X months (0=current month only) | MINIMUM_AGE_MONTHS (≥0) | All sources - flexible retention (0 for LM, 6 for CSDB) |
| HYBRID | Combines month boundary + minimum age | MINIMUM_AGE_MONTHS | Advanced retention scenarios |
1. THRESHOLD_BASED (Default)
Archives data based on number of days since workflow start.
WHERE Clause:
extract(day from (systimestamp - workflow_start)) > DAYS_FOR_ARCHIVE_THRESHOLD
Configuration:
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'THRESHOLD_BASED',
DAYS_FOR_ARCHIVE_THRESHOLD = 30,
MINIMUM_AGE_MONTHS = NULL
WHERE SOURCE_FILE_TYPE = 'INPUT'
AND SOURCE_FILE_ID = 'C2D_DATA'
AND TABLE_ID = 'C2D_TABLE';
Use Case: Simple time-based archival.
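The day arithmetic in the WHERE clause can be tried on DUAL without touching any package table. The sketch below uses a fixed run timestamp instead of SYSTIMESTAMP so that the result is reproducible. The sample timestamps, the params and sample_runs names, and the ELIGIBLE/RETAINED labels are illustrative and not part of FILE_ARCHIVER.

```sql
-- Evaluate the THRESHOLD_BASED predicate for three hypothetical workflow starts
WITH params AS (
    SELECT TIMESTAMP '2026-02-15 12:00:00' AS run_ts,   -- stands in for SYSTIMESTAMP
           30 AS days_threshold                          -- DAYS_FOR_ARCHIVE_THRESHOLD
    FROM DUAL
),
sample_runs AS (
    SELECT TIMESTAMP '2026-01-01 08:00:00' AS workflow_start FROM DUAL
    UNION ALL
    SELECT TIMESTAMP '2026-01-15 18:00:00' FROM DUAL
    UNION ALL
    SELECT TIMESTAMP '2026-02-10 08:00:00' FROM DUAL
)
SELECT
    s.workflow_start,
    EXTRACT(DAY FROM (p.run_ts - s.workflow_start)) AS whole_days,
    CASE
        WHEN EXTRACT(DAY FROM (p.run_ts - s.workflow_start)) > p.days_threshold
        THEN 'ELIGIBLE'
        ELSE 'RETAINED'
    END AS archival_state
FROM sample_runs s
CROSS JOIN params p
ORDER BY s.workflow_start;
-- whole_days: 45 (ELIGIBLE), 30 (RETAINED), 5 (RETAINED)
```

The middle row is 30 days and 18 hours old. EXTRACT(DAY ...) returns only the whole-day component of the interval, so with a threshold of 30 a file becomes eligible once it is at least 31 full days old.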
2. MINIMUM_AGE_MONTHS
Archives data older than specified number of months. Special case: MINIMUM_AGE_MONTHS = 0 archives all data before current month.
WHERE Clause:
workflow_start < ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -MINIMUM_AGE_MONTHS)
-- When MINIMUM_AGE_MONTHS = 0: workflow_start < TRUNC(SYSDATE, 'MM')
Configuration Examples:
-- LM: Keep only current month data (MINIMUM_AGE_MONTHS = 0)
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
MINIMUM_AGE_MONTHS = 0
WHERE SOURCE_FILE_TYPE = 'INPUT'
AND SOURCE_FILE_ID = 'DistributeStandingFacilities'
AND TABLE_ID = 'LM_STANDING_FACILITIES';
-- CSDB: Retain 6 months of data (MINIMUM_AGE_MONTHS = 6)
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
MINIMUM_AGE_MONTHS = 6
WHERE SOURCE_FILE_TYPE = 'INPUT'
AND SOURCE_FILE_ID = 'CSDB'
AND TABLE_ID IN ('CSDB_DEBT', 'CSDB_DEBT_DAILY');
Use Cases:
- MINIMUM_AGE_MONTHS = 0: LM dissemination feeds requiring current month only (daily/intraday updates)
- MINIMUM_AGE_MONTHS = 6: CSDB securities/ratings data requiring 6-month retention
- MINIMUM_AGE_MONTHS = N: Regulatory compliance with specific N-month retention periods
Behavior Examples:
- With MINIMUM_AGE_MONTHS = 0:
  - January data: Archived on February 1st
  - February data: Remains in ODS bucket during February
  - March 1st: February data archived, March data active
- With MINIMUM_AGE_MONTHS = 6:
  - February 2026: Archives data from July 2025 and earlier
  - March 2026: Archives data from August 2025 and earlier
  - Keeps current month + 6 previous months (7 months total) in ODS bucket
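The cutoff dates behind these examples can be reproduced on DUAL. This is a minimal sketch with a fixed run date in place of SYSDATE. The params and settings names are illustrative only.

```sql
-- Compute the archive cutoff for MINIMUM_AGE_MONTHS = 0 and 6 on a fixed run date
WITH params AS (
    SELECT DATE '2026-02-15' AS run_date FROM DUAL   -- stands in for SYSDATE
),
settings AS (
    SELECT 0 AS min_age_months FROM DUAL
    UNION ALL
    SELECT 6 FROM DUAL
)
SELECT
    s.min_age_months,
    ADD_MONTHS(TRUNC(p.run_date, 'MM'), -s.min_age_months) AS archive_cutoff
FROM settings s
CROSS JOIN params p
ORDER BY s.min_age_months;
-- min_age_months = 0 -> cutoff 2026-02-01 (everything before February 2026)
-- min_age_months = 6 -> cutoff 2025-08-01 (July 2025 and earlier)
```

Rows with workflow_start strictly before archive_cutoff match the strategy's WHERE clause.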
3. HYBRID
Combines month boundary check with minimum age threshold - archives data from previous months AND older than minimum age.
WHERE Clause:
TRUNC(workflow_start, 'MM') < TRUNC(SYSDATE, 'MM')
AND workflow_start < ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -MINIMUM_AGE_MONTHS)
Configuration:
-- Advanced: Current month + 3 months minimum
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'HYBRID',
MINIMUM_AGE_MONTHS = 3
WHERE SOURCE_FILE_TYPE = 'INPUT'
AND SOURCE_FILE_ID = 'SPECIAL_SOURCE'
AND TABLE_ID = 'SPECIAL_TABLE';
Use Case: Advanced scenarios requiring both current month retention AND minimum age threshold.
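The combined predicate can be checked against sample dates on DUAL. The sketch below assumes a fixed run date and three hypothetical workflow_start values. The ARCHIVE/KEEP labels are illustrative.

```sql
-- Evaluate both HYBRID predicates for hypothetical dates with MINIMUM_AGE_MONTHS = 3
WITH params AS (
    SELECT DATE '2026-02-15' AS run_date,   -- stands in for SYSDATE
           3 AS min_age_months
    FROM DUAL
),
samples AS (
    SELECT DATE '2025-10-31' AS workflow_start FROM DUAL
    UNION ALL
    SELECT DATE '2025-11-01' FROM DUAL
    UNION ALL
    SELECT DATE '2026-02-03' FROM DUAL
)
SELECT
    s.workflow_start,
    CASE
        WHEN TRUNC(s.workflow_start, 'MM') < TRUNC(p.run_date, 'MM')
         AND s.workflow_start < ADD_MONTHS(TRUNC(p.run_date, 'MM'), -p.min_age_months)
        THEN 'ARCHIVE'
        ELSE 'KEEP'
    END AS hybrid_decision
FROM samples s
CROSS JOIN params p
ORDER BY s.workflow_start;
-- 2025-10-31 -> ARCHIVE, 2025-11-01 -> KEEP, 2026-02-03 -> KEEP
```

With the two predicates as written and a non-negative MINIMUM_AGE_MONTHS, the cutoff never falls after the first day of the current month, so the age predicate is the binding one. The cutoff itself (2025-11-01 here) is excluded because the comparison is strict.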
Archival Triggering Logic
Strategy-Specific Execution Behavior
The FILE_ARCHIVER package uses different triggering logic depending on the configured archival strategy:
MINIMUM_AGE_MONTHS Strategy (Threshold-Independent)
Behavior: Archives data immediately when the age criterion is met, without checking archival thresholds.
-- Executed when MINIMUM_AGE_MONTHS strategy is configured
IF vSourceFileConfig.ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS' THEN
vArchivalTriggeredBy := 'AGE_BASED';
-- Proceeds with archival regardless of FILES_COUNT, ROWS_COUNT, or BYTES_SUM
END IF;
Why: This strategy is designed for strict retention policies where data must be archived based on age alone (e.g., regulatory compliance requiring current month only).
THRESHOLD_BASED and HYBRID Strategies (Threshold-Dependent)
Behavior: Archives data only when at least one of the following thresholds is exceeded:
- FILES_COUNT_OVER_ARCHIVE_THRESHOLD - Number of files eligible for archival
- ROWS_COUNT_OVER_ARCHIVE_THRESHOLD - Number of rows eligible for archival
- BYTES_SUM_OVER_ARCHIVE_THRESHOLD - Total size in bytes eligible for archival
-- Executed for THRESHOLD_BASED and HYBRID strategies
IF vTableStat.OVER_ARCH_THRESOLD_FILE_COUNT >= vSourceFileConfig.FILES_COUNT_OVER_ARCHIVE_THRESHOLD THEN
vArchivalTriggeredBy := 'FILES_COUNT';
ELSIF vTableStat.OVER_ARCH_THRESOLD_ROW_COUNT >= vSourceFileConfig.ROWS_COUNT_OVER_ARCHIVE_THRESHOLD THEN
vArchivalTriggeredBy := 'ROWS_COUNT';
ELSIF vTableStat.OVER_ARCH_THRESOLD_SIZE >= vSourceFileConfig.BYTES_SUM_OVER_ARCHIVE_THRESHOLD THEN
vArchivalTriggeredBy := 'BYTES_SUM';
END IF;
Why: These strategies provide performance optimization by avoiding unnecessary archival operations when data volume is small.
Configuration Example:
-- Set archival thresholds for THRESHOLD_BASED strategy
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET FILES_COUNT_OVER_ARCHIVE_THRESHOLD = 10, -- Archive when 10+ files eligible
ROWS_COUNT_OVER_ARCHIVE_THRESHOLD = 100000, -- Archive when 100k+ rows eligible
BYTES_SUM_OVER_ARCHIVE_THRESHOLD = 104857600 -- Archive when 100MB+ eligible
WHERE ARCHIVAL_STRATEGY = 'THRESHOLD_BASED'
AND TABLE_ID = 'YOUR_TABLE';
Important: For MINIMUM_AGE_MONTHS strategy, these threshold values are ignored - archival proceeds based on age alone.
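The decision order described in this section can be mirrored in a single CASE expression. The sketch below runs on DUAL with made-up statistics. The scenario rows and the NOT_TRIGGERED label are assumptions for illustration, while the other labels and the >= comparisons follow the excerpts above.

```sql
-- Which trigger fires for three hypothetical configurations
WITH scenario AS (
    SELECT 'THRESHOLD_BASED' AS archival_strategy,
           4 AS eligible_files, 250000 AS eligible_rows, 52428800 AS eligible_bytes,
           10 AS files_threshold, 100000 AS rows_threshold, 104857600 AS bytes_threshold
    FROM DUAL
    UNION ALL
    SELECT 'HYBRID', 2, 500, 1024, 10, 100000, 104857600 FROM DUAL
    UNION ALL
    SELECT 'MINIMUM_AGE_MONTHS', 2, 500, 1024, 10, 100000, 104857600 FROM DUAL
)
SELECT
    archival_strategy,
    CASE
        WHEN archival_strategy = 'MINIMUM_AGE_MONTHS' THEN 'AGE_BASED'    -- thresholds ignored
        WHEN eligible_files >= files_threshold        THEN 'FILES_COUNT'
        WHEN eligible_rows  >= rows_threshold         THEN 'ROWS_COUNT'
        WHEN eligible_bytes >= bytes_threshold        THEN 'BYTES_SUM'
        ELSE 'NOT_TRIGGERED'                                              -- hypothetical label
    END AS archival_triggered_by
FROM scenario;
-- THRESHOLD_BASED -> ROWS_COUNT, HYBRID -> NOT_TRIGGERED, MINIMUM_AGE_MONTHS -> AGE_BASED
```

Because the excerpt compares with >=, a statistic that exactly equals its threshold also triggers archival.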
Configuration Validation
Validation Trigger
Trigger: TRG_BI_A_SRC_FILE_CFG_ARCH_VAL
Automatically validates archival configuration on INSERT/UPDATE to A_SOURCE_FILE_CONFIG:
Validation Rules:
- MINIMUM_AGE_MONTHS: Requires MINIMUM_AGE_MONTHS IS NOT NULL AND MINIMUM_AGE_MONTHS >= 0
  - Error: "Strategy MINIMUM_AGE_MONTHS requires MINIMUM_AGE_MONTHS to be set (≥0)"
- HYBRID: Requires MINIMUM_AGE_MONTHS IS NOT NULL
  - Error: "Strategy HYBRID requires MINIMUM_AGE_MONTHS to be set"
Example Validation Error:
-- This will fail validation
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
MINIMUM_AGE_MONTHS = NULL -- ERROR: Required for this strategy
WHERE ...;
-- Error: ORA-20001: Strategy MINIMUM_AGE_MONTHS requires MINIMUM_AGE_MONTHS to be set
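The same two rules can also be expressed as a read-only audit query, for example to look for rows that were loaded before the trigger existed or while it was disabled. This is a sketch that restates the rules listed above. It is not the trigger's source code.

```sql
-- Rows that would fail the validation rules if they were updated today
SELECT
    SOURCE_FILE_ID,
    TABLE_ID,
    ARCHIVAL_STRATEGY,
    MINIMUM_AGE_MONTHS
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE (ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS'
       AND (MINIMUM_AGE_MONTHS IS NULL OR MINIMUM_AGE_MONTHS < 0))
   OR (ARCHIVAL_STRATEGY = 'HYBRID'
       AND MINIMUM_AGE_MONTHS IS NULL)
ORDER BY SOURCE_FILE_ID, TABLE_ID;
```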
Archival Control Configuration
ARCHIVE_ENABLED Column
Controls whether archival is enabled for specific table configuration.
Column: A_SOURCE_FILE_CONFIG.ARCHIVE_ENABLED (VARCHAR2(1), DEFAULT 'Y')
Values:
- 'Y' (default) - Table is eligible for archival processing
- 'N' - Table is excluded from archival (batch operations skip this config)
Use Cases:
- Disable archival for specific tables without removing configuration
- Temporarily suspend archival during data migration or troubleshooting
- Selective archival in batch operations
Configuration Example:
-- Disable archival for specific table
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVE_ENABLED = 'N'
WHERE SOURCE_FILE_TYPE = 'INPUT'
AND SOURCE_FILE_ID = 'CSDB'
AND TABLE_ID = 'CSDB_DEBT';
COMMIT;
-- Re-enable archival
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVE_ENABLED = 'Y'
WHERE SOURCE_FILE_TYPE = 'INPUT'
AND SOURCE_FILE_ID = 'CSDB'
AND TABLE_ID = 'CSDB_DEBT';
COMMIT;
-- Check archival status
SELECT
SOURCE_FILE_ID,
TABLE_ID,
ARCHIVE_ENABLED,
ARCHIVAL_STRATEGY
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE SOURCE_FILE_TYPE = 'INPUT'
ORDER BY SOURCE_FILE_ID, TABLE_ID;
KEEP_IN_TRASH Column
Controls TRASH folder retention policy for archived files.
Column: A_SOURCE_FILE_CONFIG.KEEP_IN_TRASH (VARCHAR2(1), DEFAULT 'Y')
Values:
- 'Y' (default) - CSV files kept in TRASH folder after archival (status: ARCHIVED_AND_TRASHED)
- 'N' - CSV files deleted from TRASH folder after archival (status: ARCHIVED_AND_PURGED)
Benefits of TRASH Retention (KEEP_IN_TRASH = 'Y'):
- Safety net for rollback if archival issues are discovered
- Supports compliance and audit requirements
- Enables file restoration via the RESTORE_FILE_FROM_TRASH procedure
Benefits of TRASH Cleanup (KEEP_IN_TRASH = 'N'):
- Reduces storage costs in DATA bucket
- Simplifies bucket management
- Appropriate for non-critical or test data
Configuration Example:
-- Production: Keep files in TRASH (recommended)
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET KEEP_IN_TRASH = 'Y'
WHERE SOURCE_FILE_TYPE = 'INPUT'
AND SOURCE_FILE_ID = 'LM'
AND TABLE_ID LIKE 'LM_%';
COMMIT;
-- Test environment: Cleanup TRASH to save storage
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET KEEP_IN_TRASH = 'N'
WHERE SOURCE_FILE_TYPE = 'INPUT'
AND SOURCE_FILE_ID = 'TEST_SOURCE';
COMMIT;
-- Bulk configuration by source
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET KEEP_IN_TRASH = 'Y'
WHERE SOURCE_FILE_TYPE = 'INPUT'
AND SOURCE_FILE_ID IN ('CSDB', 'C2D', 'LM');
COMMIT;
Data Lifecycle Workflow
Status Tracking in A_SOURCE_FILE_RECEIVED
The FILE_ARCHIVER tracks file lifecycle through the PROCESSING_STATUS column in CT_MRDS.A_SOURCE_FILE_RECEIVED table:
Status Progression:
INGESTED → ARCHIVED_AND_TRASHED → ARCHIVED_AND_PURGED (optional)
                    ↓
           INGESTED (via RESTORE_FILE_FROM_TRASH)
Status Descriptions:
- INGESTED: File successfully processed through Airflow+DBT, residing in ODS bucket
- ARCHIVED_AND_TRASHED: File archived to Parquet in ARCHIVE bucket, CSV retained in TRASH folder (DATA bucket)
- ARCHIVED_AND_PURGED: File archived to Parquet, CSV deleted from TRASH folder (when KEEP_IN_TRASH='N')
Associated Columns Updated During Archival:
UPDATE CT_MRDS.A_SOURCE_FILE_RECEIVED
SET PROCESSING_STATUS = 'ARCHIVED_AND_TRASHED', -- Status change
ARCH_PATH = 'archive_directory_prefix/', -- Directory with Parquet files
PARTITION_YEAR = 2026, -- Year partition value
PARTITION_MONTH = 02 -- Month partition value
WHERE SOURCE_FILE_NAME = 'file.csv';
ARCH_PATH Column: Contains the directory prefix (URI) where archived Parquet files are located in the ARCHIVE bucket. Since DBMS_CLOUD.EXPORT_DATA may create multiple Parquet files with parallel execution, the system stores the directory location rather than individual filenames.
Example ARCH_PATH:
https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/namespace/b/archive/o/ARCHIVE/LM/STANDING_FACILITIES/PARTITION_YEAR=2026/PARTITION_MONTH=02/
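The Hive-style tail of such a path can be illustrated with plain string concatenation on DUAL. The sketch assumes the partition values come from a date that belongs to the archived data. The guide does not state which date the package uses, and the fixed date below is hypothetical.

```sql
-- Build a Hive-style partition prefix from a sample date (layout illustration only)
SELECT
    'ARCHIVE/LM/STANDING_FACILITIES/'
    || 'PARTITION_YEAR='   || TO_CHAR(DATE '2026-02-15', 'YYYY')
    || '/PARTITION_MONTH=' || TO_CHAR(DATE '2026-02-15', 'MM')
    || '/' AS partition_prefix
FROM DUAL;
-- ARCHIVE/LM/STANDING_FACILITIES/PARTITION_YEAR=2026/PARTITION_MONTH=02/
```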
Standard File Processing Flow
┌─────────────────────────────────────────────────────────────┐
│ FILE PROCESSING LIFECYCLE │
└─────────────────────────────────────────────────────────────┘
1. INBOX Bucket (Validation)
├─ File arrives from source system
├─ FILE_MANAGER.PROCESS_SOURCE_FILE validates structure
├─ Status: RECEIVED → VALIDATED → READY_FOR_INGESTION
└─ FILE_MANAGER.MOVE_FILE relocates to ODS bucket
2. ODS Bucket (Operational Data)
├─ Active data processing (Airflow + DBT)
├─ External tables read data from bucket
├─ Status: INGESTED
├─ FILE_ARCHIVER.ARCHIVE_TABLE_DATA archives based on strategy
└─ CSV files moved to TRASH subfolder (ODS → TRASH/)
2.1 TRASH Subfolder (DATA Bucket - File Retention)
├─ Located in DATA bucket (e.g., TRASH/LM/TABLE_NAME)
├─ Stores CSV files after archival to Parquet
├─ Status: ARCHIVED_AND_TRASHED (default, controlled by KEEP_IN_TRASH config)
├─ Enables rollback if archival issues occur
└─ Optional cleanup: ARCHIVED_AND_PURGED (when KEEP_IN_TRASH = 'N')
3. ARCHIVE Bucket (Long-term Storage)
├─ Historical data in Parquet format
├─ Hive-style partitioning: PARTITION_YEAR=/PARTITION_MONTH=
├─ Status: ARCHIVED_AND_TRASHED or ARCHIVED_AND_PURGED
└─ Optimized for big data analytics (Spark, Hive)
Key Procedures:
- ARCHIVE_TABLE_DATA(pSourceFileConfigKey) - Main archival procedure using strategy-specific WHERE clause
  - TRASH folder retention controlled by KEEP_IN_TRASH column in A_SOURCE_FILE_CONFIG
- ARCHIVE_ALL(pSourceFileConfigKey, pSourceKey, pArchiveAll) - Batch archival with 3-level granularity and error handling
  - Level 3 (Highest Priority): Single configuration via pSourceFileConfigKey
  - Level 2 (Medium Priority): All configurations for source via pSourceKey
  - Level 1 (Lowest Priority): All configurations system-wide via pArchiveAll
  - Error Handling: Continues processing other tables on individual failures
  - Filtering: Respects ARCHIVE_ENABLED='Y' (skips disabled configurations)
  - Individual TRASH Policy: Each table's KEEP_IN_TRASH setting applied independently
  - Summary Reporting: Returns counts of Archived/Skipped/Failed tables
- GET_ARCHIVAL_WHERE_CLAUSE - Returns WHERE clause based on configured strategy
- GATHER_TABLE_STAT - Calculates archival statistics using strategy logic
- GATHER_TABLE_STAT_ALL(pSourceFileConfigKey, pSourceKey, pGatherAll) - Batch statistics with 3-level granularity
- RESTORE_FILE_FROM_TRASH(pSourceFileConfigKey, pSourceKey, pRestoreAll) - Restore archived files from TRASH
- PURGE_TRASH_FOLDER(pSourceFileConfigKey, pSourceKey, pPurgeAll) - Purge TRASH folder with 3-level granularity
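One way to read the three priority levels is that the most specific non-NULL argument wins when several are supplied. The anonymous block below sketches that reading only. The variable names, data types, and messages are assumptions, and the package's actual resolution logic may differ.

```sql
-- Sketch of 3-level scope resolution (interpretation, not package code)
DECLARE
    -- Hypothetical argument values for illustration
    vSourceFileConfigKey NUMBER       := NULL;
    vSourceKey           VARCHAR2(30) := 'LM';
    vArchiveAll          BOOLEAN      := TRUE;
    vScope               VARCHAR2(60);
BEGIN
    IF vSourceFileConfigKey IS NOT NULL THEN
        vScope := 'Level 3: single configuration';
    ELSIF vSourceKey IS NOT NULL THEN
        vScope := 'Level 2: all configurations for source ' || vSourceKey;
    ELSIF vArchiveAll THEN
        vScope := 'Level 1: all configurations system-wide';
    ELSE
        vScope := 'No scope selected';
    END IF;
    DBMS_OUTPUT.PUT_LINE(vScope);
END;
/
```

With the values above the block prints the Level 2 message, because a source key is present and takes precedence over the system-wide flag under this reading.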
Archival Execution:
-- Single table archival (TRASH retention controlled by KEEP_IN_TRASH config)
BEGIN
CT_MRDS.FILE_ARCHIVER.ARCHIVE_TABLE_DATA(
pSourceFileConfigKey => vSourceFileConfigKey
);
END;
/
-- Batch archival: All tables for specific source
BEGIN
CT_MRDS.FILE_ARCHIVER.ARCHIVE_ALL(
pSourceFileConfigKey => NULL,
pSourceKey => 'LM', -- Archive all LM tables
pArchiveAll => FALSE
);
END;
/
-- Batch archival: All tables system-wide
BEGIN
CT_MRDS.FILE_ARCHIVER.ARCHIVE_ALL(
pSourceFileConfigKey => NULL,
pSourceKey => NULL,
pArchiveAll => TRUE -- Archive all configured tables
);
END;
/
Strategy-Based Filtering:
- Package retrieves ARCHIVAL_STRATEGY from A_SOURCE_FILE_CONFIG
- GET_ARCHIVAL_WHERE_CLAUSE generates appropriate WHERE clause
- Only tables with ARCHIVE_ENABLED = 'Y' are processed
- Data matching criteria moved from ODS to ARCHIVE bucket
- CSV files moved to TRASH subfolder in DATA bucket (ODS/ → TRASH/)
- Parquet format with Hive-style partitioning applied to ARCHIVE bucket
- TRASH retention controlled by KEEP_IN_TRASH column in A_SOURCE_FILE_CONFIG
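The single-table call needs a configuration key. A minimal sketch for resolving it from the business identifiers used throughout this guide is given here. It assumes that the combination of SOURCE_FILE_TYPE, SOURCE_FILE_ID, and TABLE_ID identifies exactly one row. If it does not, the SELECT INTO raises NO_DATA_FOUND or TOO_MANY_ROWS.

```sql
-- Look up the configuration key, then archive that one table
DECLARE
    vSourceFileConfigKey CT_MRDS.A_SOURCE_FILE_CONFIG.A_SOURCE_FILE_CONFIG_KEY%TYPE;
BEGIN
    SELECT A_SOURCE_FILE_CONFIG_KEY
    INTO vSourceFileConfigKey
    FROM CT_MRDS.A_SOURCE_FILE_CONFIG
    WHERE SOURCE_FILE_TYPE = 'INPUT'
      AND SOURCE_FILE_ID = 'CSDB'
      AND TABLE_ID = 'CSDB_DEBT';

    CT_MRDS.FILE_ARCHIVER.ARCHIVE_TABLE_DATA(
        pSourceFileConfigKey => vSourceFileConfigKey
    );
END;
/
```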
Automatic Rollback Mechanism
FILE_ARCHIVER implements automatic rollback to ensure data integrity if archival process fails:
Process Flow:
1. Export to ARCHIVE: Data exported to Parquet format in ARCHIVE bucket
2. Status Update: A_SOURCE_FILE_RECEIVED records updated to 'ARCHIVED_AND_TRASHED'
3. Move to TRASH: CSV files moved from ODS to TRASH folder (DATA bucket)
4. Optional Cleanup: If KEEP_IN_TRASH='N', files deleted from TRASH
Automatic Rollback Trigger: If any error occurs during step 3 (Move to TRASH), the system:
- Reverts all files: Moves successfully processed files from TRASH back to ODS
- Rolls back status: Resets A_SOURCE_FILE_RECEIVED status to 'INGESTED'
- Logs error: Records detailed error information in A_PROCESS_LOG
- Raises exception: Propagates error to calling process
Rollback Logic (from code):
-- If MOVE_FILE_TO_TRASH fails for any file
ELSIF vProcessControlStatus = 'MOVE_FILE_TO_TRASH_FAILURE' THEN
FOR f in (files already moved to TRASH) LOOP
-- Move file back from TRASH to ODS
DBMS_CLOUD.MOVE_OBJECT(
source_object_uri => 'TRASH/.../filename',
target_object_uri => 'ODS/.../filename'
);
-- Revert status back to INGESTED
UPDATE A_SOURCE_FILE_RECEIVED
SET PROCESSING_STATUS = 'INGESTED'
WHERE source_file_name = f.filename;
END LOOP;
END IF;
Why This Matters: Ensures all-or-nothing archival - either all files for a YEAR_MONTH partition are successfully archived, or none are (maintains data consistency).
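Since the exception is propagated after the rollback, a caller can catch it, report it, and re-raise so that the failure stays visible to the scheduler. The block below is a sketch of that calling pattern. The key 341 is the example configuration key used elsewhere in this guide, and the message text is illustrative.

```sql
-- Caller-side handling of a failed (and already rolled back) archival run
BEGIN
    CT_MRDS.FILE_ARCHIVER.ARCHIVE_TABLE_DATA(
        pSourceFileConfigKey => 341
    );
EXCEPTION
    WHEN OTHERS THEN
        -- Report the propagated error, then re-raise it unchanged
        DBMS_OUTPUT.PUT_LINE('Archival failed: ' || SQLERRM);
        RAISE;
END;
/
```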
TRASH Management Procedures
RESTORE_FILE_FROM_TRASH
Restores files from TRASH folder back to ODS with 3-level granularity:
Level 3 (Highest Priority) - Single File Restore:
-- Restore specific file by A_SOURCE_FILE_RECEIVED_KEY
BEGIN
CT_MRDS.FILE_ARCHIVER.RESTORE_FILE_FROM_TRASH(
pSourceFileReceivedKey => 12345
);
END;
/
Level 2 (Medium Priority) - Configuration-Based Restore:
-- Restore all files for specific table configuration
BEGIN
CT_MRDS.FILE_ARCHIVER.RESTORE_FILE_FROM_TRASH(
pSourceFileConfigKey => 341
);
END;
/
Level 1 (Lowest Priority) - Global Restore:
-- Restore ALL files with ARCHIVED_AND_TRASHED status system-wide
-- (anonymous block: BOOLEAN arguments cannot be passed from SQL in most Oracle releases)
BEGIN
CT_MRDS.FILE_ARCHIVER.RESTORE_FILE_FROM_TRASH(
pRestoreAll => TRUE
);
END;
/
Restore Operations:
- Moves files: TRASH folder → ODS folder (using DBMS_CLOUD.MOVE_OBJECT)
- Updates status: ARCHIVED_AND_TRASHED → INGESTED
- Clears metadata: Sets ARCH_PATH, PARTITION_YEAR, PARTITION_MONTH to NULL
- Returns files to active processing: Makes data available for Airflow+DBT pipeline
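Before a configuration-level restore, it can be useful to preview which files would be affected. The query below is a sketch that assumes the restore selects files by ARCHIVED_AND_TRASHED status for the given configuration key, as described in the operations above. The key 341 is the example value from this section.

```sql
-- Files that a configuration-level restore would move back to ODS
SELECT
    A_SOURCE_FILE_RECEIVED_KEY,
    SOURCE_FILE_NAME,
    ARCH_PATH,
    PARTITION_YEAR,
    PARTITION_MONTH
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
WHERE A_SOURCE_FILE_CONFIG_KEY = 341
  AND PROCESSING_STATUS = 'ARCHIVED_AND_TRASHED'
ORDER BY PARTITION_YEAR, PARTITION_MONTH, SOURCE_FILE_NAME;
```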
PURGE_TRASH_FOLDER
Permanently deletes files from TRASH with 3-level granularity:
Level 3 (Highest Priority) - Single File Purge:
-- Delete specific file from TRASH
BEGIN
CT_MRDS.FILE_ARCHIVER.PURGE_TRASH_FOLDER(
pSourceFileReceivedKey => 12345
);
END;
/
Level 2 (Medium Priority) - Configuration-Based Purge:
-- Delete all TRASH files for specific table configuration
BEGIN
CT_MRDS.FILE_ARCHIVER.PURGE_TRASH_FOLDER(
pSourceFileConfigKey => 341
);
END;
/
Level 1 (Lowest Priority) - Global Purge:
-- Delete ALL files with ARCHIVED_AND_TRASHED status system-wide
-- (anonymous block: BOOLEAN arguments cannot be passed from SQL in most Oracle releases)
BEGIN
CT_MRDS.FILE_ARCHIVER.PURGE_TRASH_FOLDER(
pPurgeAll => TRUE
);
END;
/
Purge Operations:
- Deletes files: Permanently removes from TRASH folder (using DBMS_CLOUD.DELETE_OBJECT)
- Updates status: ARCHIVED_AND_TRASHED → ARCHIVED_AND_PURGED
- Warning: Irreversible operation - files cannot be restored after purge
- Use case: Storage optimization, compliance with data retention policies
Important: Purge is not automatic - must be explicitly called. This provides additional safety layer for data retention.
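Because a purge cannot be undone, it may be worth sizing it first. The sketch below summarises what a configuration-level purge would remove, grouped by partition. It treats UPDATED_AT as the time of archival, which is the same approximation the monitoring queries in this guide use, and 341 is the example configuration key.

```sql
-- Volume that a configuration-level purge would delete from TRASH
SELECT
    PARTITION_YEAR,
    PARTITION_MONTH,
    COUNT(*)             AS FILES_TO_PURGE,
    SUM(FILE_SIZE_BYTES) AS BYTES_TO_FREE,
    MIN(UPDATED_AT)      AS OLDEST_ARCHIVED_AT
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
WHERE A_SOURCE_FILE_CONFIG_KEY = 341
  AND PROCESSING_STATUS = 'ARCHIVED_AND_TRASHED'
GROUP BY PARTITION_YEAR, PARTITION_MONTH
ORDER BY PARTITION_YEAR, PARTITION_MONTH;
```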
Configuration Examples
Example 1: Configure LM Standing Facilities (Current Month Only)
-- Keep only current month data in ODS bucket (MINIMUM_AGE_MONTHS = 0)
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
MINIMUM_AGE_MONTHS = 0 -- 0 = archives all data before current month
WHERE SOURCE_FILE_TYPE = 'INPUT'
AND SOURCE_FILE_ID = 'DistributeStandingFacilities'
AND TABLE_ID = 'LM_STANDING_FACILITIES';
COMMIT;
-- Verify configuration
SELECT
SOURCE_FILE_ID,
TABLE_ID,
ARCHIVAL_STRATEGY,
MINIMUM_AGE_MONTHS
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE SOURCE_FILE_ID = 'DistributeStandingFacilities';
Example 2: Configure CSDB Debt (MINIMUM_AGE_MONTHS)
-- Retain 6 months of data in ODS bucket
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
MINIMUM_AGE_MONTHS = 6
WHERE SOURCE_FILE_TYPE = 'INPUT'
AND SOURCE_FILE_ID = 'CSDB'
AND TABLE_ID = 'CSDB_DEBT';
COMMIT;
-- Verify configuration
SELECT
SOURCE_FILE_ID,
TABLE_ID,
ARCHIVAL_STRATEGY,
MINIMUM_AGE_MONTHS
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE TABLE_ID = 'CSDB_DEBT';
Example 3: Bulk Configuration for LM Source
-- Configure all 19 LM tables with MINIMUM_AGE_MONTHS = 0 (current month only)
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
MINIMUM_AGE_MONTHS = 0 -- 0 = keep only current month
WHERE SOURCE_FILE_TYPE = 'INPUT'
AND SOURCE_FILE_ID IN (
'DistributeStandingFacilities',
'DistributeTTS',
'DistributeAdHocAdjustments',
'DistributeBalanceSheet',
'DistributeCSMAdjustments',
'DistributeCurrentAccounts',
'DistributeForecast',
'DistributeQREAdjustments'
);
COMMIT;
-- Verify bulk configuration
SELECT
SOURCE_FILE_ID,
COUNT(*) AS TABLE_COUNT,
MAX(ARCHIVAL_STRATEGY) AS STRATEGY,
MAX(MINIMUM_AGE_MONTHS) AS MIN_AGE
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE SOURCE_FILE_ID LIKE 'Distribute%'
GROUP BY SOURCE_FILE_ID
ORDER BY SOURCE_FILE_ID;
Example 4: View Current Archival Configuration
-- All configured tables with their archival strategies
SELECT
A_SOURCE_KEY,
SOURCE_FILE_ID,
TABLE_ID,
ARCHIVAL_STRATEGY,
MINIMUM_AGE_MONTHS,
DAYS_FOR_ARCHIVE_THRESHOLD
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE SOURCE_FILE_TYPE = 'INPUT'
ORDER BY A_SOURCE_KEY, SOURCE_FILE_ID, TABLE_ID;
-- Summary by strategy
SELECT
ARCHIVAL_STRATEGY,
COUNT(*) AS TABLE_COUNT,
MIN(MINIMUM_AGE_MONTHS) AS MIN_AGE_MIN,
MAX(MINIMUM_AGE_MONTHS) AS MIN_AGE_MAX
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE SOURCE_FILE_TYPE = 'INPUT'
GROUP BY ARCHIVAL_STRATEGY
ORDER BY ARCHIVAL_STRATEGY;
Example 5: Configure Archival Control Settings
-- Complete configuration with all archival settings
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
MINIMUM_AGE_MONTHS = 6,
ARCHIVE_ENABLED = 'Y', -- Enable archival
KEEP_IN_TRASH = 'Y' -- Keep files in TRASH for safety
WHERE SOURCE_FILE_TYPE = 'INPUT'
AND SOURCE_FILE_ID = 'CSDB'
AND TABLE_ID = 'CSDB_DEBT';
COMMIT;
-- Disable archival temporarily for troubleshooting
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVE_ENABLED = 'N' -- Batch operations will skip this table
WHERE TABLE_ID = 'CSDB_DEBT';
COMMIT;
-- Configure TRASH cleanup for test environment
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET KEEP_IN_TRASH = 'N' -- Delete files from TRASH after archival
WHERE SOURCE_FILE_TYPE = 'INPUT'
AND SOURCE_FILE_ID = 'TEST_SOURCE';
COMMIT;
-- View complete configuration
SELECT
SOURCE_FILE_ID,
TABLE_ID,
ARCHIVAL_STRATEGY,
MINIMUM_AGE_MONTHS,
ARCHIVE_ENABLED,
KEEP_IN_TRASH
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE SOURCE_FILE_TYPE = 'INPUT'
ORDER BY SOURCE_FILE_ID, TABLE_ID;
-- Summary by archival status
SELECT
ARCHIVE_ENABLED,
KEEP_IN_TRASH,
COUNT(*) AS TABLE_COUNT
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE SOURCE_FILE_TYPE = 'INPUT'
GROUP BY ARCHIVE_ENABLED, KEEP_IN_TRASH
ORDER BY ARCHIVE_ENABLED DESC, KEEP_IN_TRASH DESC;
Release 01 Configuration
Configured Tables (MARS-828)
The following 25 Release 01 tables were configured with archival strategies:
LM Tables (19 total) - MINIMUM_AGE_MONTHS = 0 (current month only):
- LM_STANDING_FACILITIES
- LM_STANDING_FACILITIES_HEADER
- LM_TTS_HEADER
- LM_TTS_ITEM
- LM_ADHOC_ADJUSTMENTS_HEADER
- LM_ADHOC_ADJUSTMENTS_ITEM
- LM_ADHOC_ADJUSTMENTS_ITEM_HEADER
- LM_BALANCESHEET_HEADER
- LM_BALANCESHEET_ITEM
- LM_CSM_ADJUSTMENTS_HEADER
- LM_CSM_ADJUSTMENTS_ITEM
- LM_CSM_ADJUSTMENTS_ITEM_HEADER
- LM_CURRENT_ACCOUNTS_HEADER
- LM_CURRENT_ACCOUNTS_ITEM
- LM_FORECAST_HEADER
- LM_FORECAST_ITEM
- LM_QRE_ADJUSTMENTS_HEADER
- LM_QRE_ADJUSTMENTS_ITEM
- LM_QRE_ADJUSTMENTS_ITEM_HEADER
CSDB Tables (6 total):
MINIMUM_AGE_MONTHS = 6 (6-month retention):
- CSDB_DEBT
- CSDB_DEBT_DAILY
MINIMUM_AGE_MONTHS = 0 (current month only):
- CSDB_INSTR_RAT_FULL
- CSDB_INSTR_DESC_FULL
- CSDB_ISSUER_RAT_FULL
- CSDB_ISSUER_DESC_FULL
Verification Query:
-- Check Release 01 configuration
SELECT
CASE
WHEN TABLE_ID LIKE 'LM_%' THEN 'LM'
WHEN TABLE_ID LIKE 'CSDB_%' THEN 'CSDB'
END AS SOURCE_GROUP,
ARCHIVAL_STRATEGY,
MINIMUM_AGE_MONTHS,
COUNT(*) AS TABLE_COUNT
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE SOURCE_FILE_TYPE = 'INPUT'
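AND TABLE_ID IN (
-- 25 Release 01 tables: 19 LM
'LM_STANDING_FACILITIES', 'LM_STANDING_FACILITIES_HEADER',
'LM_TTS_HEADER', 'LM_TTS_ITEM',
'LM_ADHOC_ADJUSTMENTS_HEADER', 'LM_ADHOC_ADJUSTMENTS_ITEM', 'LM_ADHOC_ADJUSTMENTS_ITEM_HEADER',
'LM_BALANCESHEET_HEADER', 'LM_BALANCESHEET_ITEM',
'LM_CSM_ADJUSTMENTS_HEADER', 'LM_CSM_ADJUSTMENTS_ITEM', 'LM_CSM_ADJUSTMENTS_ITEM_HEADER',
'LM_CURRENT_ACCOUNTS_HEADER', 'LM_CURRENT_ACCOUNTS_ITEM',
'LM_FORECAST_HEADER', 'LM_FORECAST_ITEM',
'LM_QRE_ADJUSTMENTS_HEADER', 'LM_QRE_ADJUSTMENTS_ITEM', 'LM_QRE_ADJUSTMENTS_ITEM_HEADER',
-- 6 CSDB
'CSDB_DEBT', 'CSDB_DEBT_DAILY',
'CSDB_INSTR_RAT_FULL', 'CSDB_INSTR_DESC_FULL',
'CSDB_ISSUER_RAT_FULL', 'CSDB_ISSUER_DESC_FULL'
)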
GROUP BY
CASE
WHEN TABLE_ID LIKE 'LM_%' THEN 'LM'
WHEN TABLE_ID LIKE 'CSDB_%' THEN 'CSDB'
END,
ARCHIVAL_STRATEGY,
MINIMUM_AGE_MONTHS
ORDER BY SOURCE_GROUP, ARCHIVAL_STRATEGY;
Troubleshooting
Common Issues
Issue 1: Validation Error on Configuration Update
Error:
ORA-20001: Strategy MINIMUM_AGE_MONTHS requires MINIMUM_AGE_MONTHS to be set
Cause: Trigger validation failed - strategy requires MINIMUM_AGE_MONTHS but value is NULL
Solution:
-- Provide required MINIMUM_AGE_MONTHS value
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
MINIMUM_AGE_MONTHS = 6 -- Required for this strategy
WHERE ...;
Issue 2: Archival Not Triggering Despite Configuration
Scenario A: MINIMUM_AGE_MONTHS strategy not archiving
-- Check files that should be archived
SELECT
SFR.A_SOURCE_FILE_RECEIVED_KEY,
SFR.SOURCE_FILE_NAME,
SFR.PROCESSING_STATUS,
LH.LOAD_START,
TRUNC(MONTHS_BETWEEN(SYSDATE, LH.LOAD_START)) AS MONTHS_AGE,
SFC.MINIMUM_AGE_MONTHS AS THRESHOLD
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED SFR
JOIN CT_ODS.A_LOAD_HISTORY LH ON SFR.A_WORKFLOW_HISTORY_KEY = LH.A_WORKFLOW_HISTORY_KEY
JOIN CT_MRDS.A_SOURCE_FILE_CONFIG SFC ON SFR.A_SOURCE_FILE_CONFIG_KEY = SFC.A_SOURCE_FILE_CONFIG_KEY
WHERE SFC.ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS'
AND SFR.PROCESSING_STATUS = 'INGESTED'
AND SFC.ARCHIVE_ENABLED = 'Y'
ORDER BY LH.LOAD_START;
-- Note: MINIMUM_AGE_MONTHS archives immediately (threshold-independent)
-- If files not archived, check ARCHIVE_ENABLED='Y' and run ARCHIVE_TABLE_DATA
Scenario B: THRESHOLD_BASED or HYBRID strategy not archiving
-- Check if threshold reached for specific configuration
SELECT
SFC.SOURCE_FILE_ID,
SFC.TABLE_ID,
SFC.ARCHIVAL_STRATEGY,
SFC.FILES_COUNT_OVER_ARCHIVE_THRESHOLD AS FILE_THRESHOLD,
SFC.ROWS_COUNT_OVER_ARCHIVE_THRESHOLD AS ROW_THRESHOLD,
SFC.BYTES_SUM_OVER_ARCHIVE_THRESHOLD AS BYTE_THRESHOLD,
COUNT(SFR.A_SOURCE_FILE_RECEIVED_KEY) AS CURRENT_FILES,
SUM(SFR.TOTAL_RECORDS) AS CURRENT_ROWS,
SUM(SFR.FILE_SIZE_BYTES) AS CURRENT_BYTES
FROM CT_MRDS.A_SOURCE_FILE_CONFIG SFC
LEFT JOIN CT_MRDS.A_SOURCE_FILE_RECEIVED SFR
ON SFC.A_SOURCE_FILE_CONFIG_KEY = SFR.A_SOURCE_FILE_CONFIG_KEY
AND SFR.PROCESSING_STATUS = 'INGESTED'
WHERE SFC.ARCHIVAL_STRATEGY IN ('THRESHOLD_BASED', 'HYBRID')
AND SFC.ARCHIVE_ENABLED = 'Y'
AND SFC.A_SOURCE_FILE_CONFIG_KEY = :yourConfigKey
GROUP BY
SFC.SOURCE_FILE_ID, SFC.TABLE_ID, SFC.ARCHIVAL_STRATEGY,
SFC.FILES_COUNT_OVER_ARCHIVE_THRESHOLD,
SFC.ROWS_COUNT_OVER_ARCHIVE_THRESHOLD,
SFC.BYTES_SUM_OVER_ARCHIVE_THRESHOLD;
-- Expected: At least ONE threshold (FILE/ROW/BYTE) must be exceeded
-- If no threshold exceeded, archival will NOT trigger (threshold-dependent behavior)
Issue 3: ARCH_PATH Contains Directory Not Filename
Symptoms: A_SOURCE_FILE_RECEIVED.ARCH_PATH shows folder path instead of specific file
Explanation: This is expected behavior:
-- Example ARCH_PATH value
SELECT ARCH_PATH
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
WHERE PROCESSING_STATUS = 'ARCHIVED_AND_TRASHED'
AND ROWNUM = 1;
-- Result (example):
-- https://objectstorage.../ARCHIVE/LM/STANDING_FACILITIES/PARTITION_YEAR=2026/PARTITION_MONTH=02/
-- Reason: DBMS_CLOUD.EXPORT_DATA with parallel execution creates multiple Parquet files:
-- - STANDING_FACILITIES_part_00001.parquet
-- - STANDING_FACILITIES_part_00002.parquet
-- - ...
-- System stores directory prefix to track ALL generated files
To List Actual Parquet Files:
-- Use DBMS_CLOUD.LIST_OBJECTS with ARCH_PATH as prefix
SELECT object_name, bytes, created
FROM TABLE(DBMS_CLOUD.LIST_OBJECTS(
credential_name => 'OCI$RESOURCE_PRINCIPAL',
location_uri => 'https://objectstorage.../b/archive/o/'
))
WHERE object_name LIKE 'ARCHIVE/LM/STANDING_FACILITIES/PARTITION_YEAR=2026/PARTITION_MONTH=02/%';
Issue 4: Files Remain in TRASH Folder
Symptoms: Files not deleted from TRASH after archival
Cause: Configuration has KEEP_IN_TRASH='Y' (retain files in TRASH)
Verification:
-- Check TRASH policy for configuration
SELECT
SOURCE_FILE_ID,
TABLE_ID,
KEEP_IN_TRASH,
CASE KEEP_IN_TRASH
WHEN 'Y' THEN 'Files RETAINED in TRASH (manual purge required)'
WHEN 'N' THEN 'Files DELETED immediately after archival'
END AS TRASH_BEHAVIOR
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE TABLE_ID = 'YOUR_TABLE';
Solutions:
-- Option A: Change configuration to auto-delete (permanent change)
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET KEEP_IN_TRASH = 'N' -- Auto-delete from TRASH after archival
WHERE TABLE_ID = 'YOUR_TABLE';
COMMIT;
-- Option B: Manually purge TRASH for specific table (one-time action)
BEGIN
CT_MRDS.FILE_ARCHIVER.PURGE_TRASH_FOLDER(
pSourceFileConfigKey => :yourConfigKey
);
END;
/
-- Option C: Purge all TRASH system-wide (use with caution)
BEGIN
CT_MRDS.FILE_ARCHIVER.PURGE_TRASH_FOLDER(
pPurgeAll => TRUE
);
END;
/
Issue 5: Automatic Rollback Occurred
Symptoms: Files unexpectedly back in INGESTED status, archival process reported failure
Cause: Error during "Move to TRASH" step triggered automatic rollback
Investigation:
-- Check process logs for rollback events
SELECT
PROCESS_LOG_KEY,
LOG_LEVEL,
LOG_MESSAGE,
PARAMETERS,
LOG_TIMESTAMP
FROM CT_MRDS.A_PROCESS_LOG
WHERE PROCESS_NAME = 'ARCHIVE_TABLE_DATA'
AND (LOG_MESSAGE LIKE '%rollback%' OR LOG_MESSAGE LIKE '%MOVE_FILE_TO_TRASH_FAILURE%') -- parentheses keep the PROCESS_NAME filter applied to both patterns
ORDER BY LOG_TIMESTAMP DESC
FETCH FIRST 10 ROWS ONLY;
-- Check files that were rolled back
SELECT
A_SOURCE_FILE_RECEIVED_KEY,
SOURCE_FILE_NAME,
PROCESSING_STATUS, -- Should be INGESTED after rollback
ARCH_PATH, -- Should be NULL after rollback
PARTITION_YEAR, -- Should be NULL after rollback
PARTITION_MONTH -- Should be NULL after rollback
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
WHERE A_SOURCE_FILE_CONFIG_KEY = :yourConfigKey
AND UPDATED_AT > SYSDATE - 1 -- Last 24 hours
ORDER BY UPDATED_AT DESC;
Resolution:
- Investigate root cause: Check error messages in A_PROCESS_LOG
- Fix underlying issue: OCI permissions, bucket access, wrong credentials, etc.
- Re-run archival: Call ARCHIVE_TABLE_DATA again after fix
Issue 6: Archival Not Working as Expected
Symptoms: Data not being archived according to strategy
Diagnostic Steps:
-- 1. Check configuration
SELECT
SOURCE_FILE_ID,
TABLE_ID,
ARCHIVAL_STRATEGY,
MINIMUM_AGE_MONTHS,
DAYS_FOR_ARCHIVE_THRESHOLD
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE TABLE_ID = 'YOUR_TABLE';
-- 2. Check package version
SELECT CT_MRDS.FILE_ARCHIVER.GET_VERSION() FROM DUAL;
-- Expected: 3.0.0 or higher
-- 3. Check process logs
SELECT
PROCESS_LOG_KEY,
PROCESS_NAME,
LOG_MESSAGE,
LOG_LEVEL,
LOG_TIMESTAMP
FROM CT_MRDS.A_PROCESS_LOG
WHERE PROCESS_NAME LIKE '%ARCHIVE%'
ORDER BY LOG_TIMESTAMP DESC
FETCH FIRST 20 ROWS ONLY;
-- 4. Test WHERE clause generation
DECLARE
vConfig CT_MRDS.A_SOURCE_FILE_CONFIG%ROWTYPE;
vWhereClause VARCHAR2(4000);
BEGIN
SELECT * INTO vConfig
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE TABLE_ID = 'YOUR_TABLE'
AND ROWNUM = 1;
vWhereClause := CT_MRDS.FILE_ARCHIVER.GET_ARCHIVAL_WHERE_CLAUSE(vConfig);
DBMS_OUTPUT.PUT_LINE('WHERE Clause: ' || vWhereClause);
END;
/
Issue 7: Package Compilation Errors After Upgrade
Symptoms: FILE_ARCHIVER package shows INVALID status
Solution:
-- Check compilation errors
SELECT * FROM USER_ERRORS
WHERE NAME = 'FILE_ARCHIVER'
AND TYPE IN ('PACKAGE', 'PACKAGE BODY')
ORDER BY SEQUENCE;
-- Recompile package
ALTER PACKAGE CT_MRDS.FILE_ARCHIVER COMPILE SPECIFICATION;
ALTER PACKAGE CT_MRDS.FILE_ARCHIVER COMPILE BODY;
-- Verify status
SELECT object_name, object_type, status
FROM user_objects
WHERE object_name = 'FILE_ARCHIVER';
Diagnostic Queries for Monitoring
Query 1: Status Distribution Across All Files
-- Overall file status distribution
SELECT
PROCESSING_STATUS,
COUNT(*) AS FILE_COUNT,
ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (), 2) AS PERCENTAGE,
MIN(CREATED_AT) AS OLDEST_FILE,
MAX(CREATED_AT) AS NEWEST_FILE
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
GROUP BY PROCESSING_STATUS
ORDER BY FILE_COUNT DESC;
Query 2: Files in TRASH (Archived but Not Purged)
-- Files currently in TRASH folder (status ARCHIVED_AND_TRASHED)
SELECT
SFR.A_SOURCE_FILE_RECEIVED_KEY,
SFC.SOURCE_FILE_ID,
SFC.TABLE_ID,
SFR.SOURCE_FILE_NAME,
SFR.ARCH_PATH,
SFR.PARTITION_YEAR,
SFR.PARTITION_MONTH,
SFR.FILE_SIZE_BYTES,
SFR.UPDATED_AT AS ARCHIVED_AT,
TRUNC(SYSDATE - SFR.UPDATED_AT) AS DAYS_IN_TRASH,
SFC.KEEP_IN_TRASH AS TRASH_POLICY
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED SFR
JOIN CT_MRDS.A_SOURCE_FILE_CONFIG SFC ON SFR.A_SOURCE_FILE_CONFIG_KEY = SFC.A_SOURCE_FILE_CONFIG_KEY
WHERE SFR.PROCESSING_STATUS = 'ARCHIVED_AND_TRASHED'
ORDER BY SFR.UPDATED_AT DESC;
Query 3: Archival Activity by Configuration
-- Archival statistics per table configuration
SELECT
SFC.SOURCE_FILE_ID,
SFC.TABLE_ID,
SFC.ARCHIVAL_STRATEGY,
SFC.ARCHIVE_ENABLED,
SFC.KEEP_IN_TRASH,
COUNT(CASE WHEN SFR.PROCESSING_STATUS = 'INGESTED' THEN 1 END) AS PENDING_ARCHIVE,
COUNT(CASE WHEN SFR.PROCESSING_STATUS = 'ARCHIVED_AND_TRASHED' THEN 1 END) AS IN_TRASH,
COUNT(CASE WHEN SFR.PROCESSING_STATUS = 'ARCHIVED_AND_PURGED' THEN 1 END) AS PURGED,
-- Oracle does not support the FILTER clause on aggregates; use a CASE expression instead
MAX(CASE WHEN SFR.PROCESSING_STATUS LIKE 'ARCHIVED%' THEN SFR.UPDATED_AT END) AS LAST_ARCHIVAL
FROM CT_MRDS.A_SOURCE_FILE_CONFIG SFC
LEFT JOIN CT_MRDS.A_SOURCE_FILE_RECEIVED SFR ON SFC.A_SOURCE_FILE_CONFIG_KEY = SFR.A_SOURCE_FILE_CONFIG_KEY
WHERE SFC.SOURCE_FILE_TYPE = 'INPUT'
GROUP BY
SFC.SOURCE_FILE_ID, SFC.TABLE_ID, SFC.ARCHIVAL_STRATEGY,
SFC.ARCHIVE_ENABLED, SFC.KEEP_IN_TRASH
ORDER BY SFC.SOURCE_FILE_ID, SFC.TABLE_ID;
Query 4: Files Eligible for Archival (MINIMUM_AGE_MONTHS)
-- Files that should be archived based on MINIMUM_AGE_MONTHS strategy
SELECT
SFC.SOURCE_FILE_ID,
SFC.TABLE_ID,
SFC.MINIMUM_AGE_MONTHS AS AGE_THRESHOLD,
COUNT(*) AS ELIGIBLE_FILES,
SUM(SFR.FILE_SIZE_BYTES) AS TOTAL_SIZE_BYTES,
SUM(SFR.TOTAL_RECORDS) AS TOTAL_ROWS,
MIN(LH.LOAD_START) AS OLDEST_FILE,
MAX(LH.LOAD_START) AS NEWEST_ELIGIBLE
FROM CT_MRDS.A_SOURCE_FILE_CONFIG SFC
JOIN CT_MRDS.A_SOURCE_FILE_RECEIVED SFR ON SFC.A_SOURCE_FILE_CONFIG_KEY = SFR.A_SOURCE_FILE_CONFIG_KEY
JOIN CT_ODS.A_LOAD_HISTORY LH ON SFR.A_WORKFLOW_HISTORY_KEY = LH.A_WORKFLOW_HISTORY_KEY
WHERE SFC.ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS'
AND SFC.ARCHIVE_ENABLED = 'Y'
AND SFR.PROCESSING_STATUS = 'INGESTED'
AND LH.LOAD_START < ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -SFC.MINIMUM_AGE_MONTHS)
GROUP BY SFC.SOURCE_FILE_ID, SFC.TABLE_ID, SFC.MINIMUM_AGE_MONTHS
ORDER BY ELIGIBLE_FILES DESC;
Query 5: Archival Performance Metrics
-- Recent archival operations with timing
SELECT
PROCESS_LOG_KEY,
SUBSTR(PARAMETERS, 1, 100) AS CONFIG_INFO,
LOG_TIMESTAMP AS START_TIME,
LEAD(LOG_TIMESTAMP) OVER (PARTITION BY SUBSTR(PARAMETERS, 1, 100) ORDER BY LOG_TIMESTAMP) AS END_TIME,
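-- CAST to DATE so the subtraction yields a number of days whether LOG_TIMESTAMP is a DATE or a TIMESTAMP
ROUND((CAST(LEAD(LOG_TIMESTAMP) OVER (PARTITION BY SUBSTR(PARAMETERS, 1, 100) ORDER BY LOG_TIMESTAMP) AS DATE)
- CAST(LOG_TIMESTAMP AS DATE)) * 24 * 60, 2) AS DURATION_MINUTES,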
CASE
WHEN LOG_LEVEL = 'ERROR' THEN 'FAILED'
WHEN LOG_MESSAGE LIKE '%Archival completed%' THEN 'SUCCESS'
ELSE 'IN_PROGRESS'
END AS STATUS
FROM CT_MRDS.A_PROCESS_LOG
WHERE PROCESS_NAME = 'ARCHIVE_TABLE_DATA'
AND LOG_TIMESTAMP > SYSDATE - 7 -- Last 7 days
ORDER BY LOG_TIMESTAMP DESC;
Query 6: TRASH Storage Usage
-- Estimate TRASH folder storage usage
SELECT
SFC.SOURCE_FILE_ID,
COUNT(*) AS FILES_IN_TRASH,
ROUND(SUM(SFR.FILE_SIZE_BYTES) / 1024 / 1024 / 1024, 2) AS SIZE_GB,
MIN(SFR.UPDATED_AT) AS OLDEST_IN_TRASH,
MAX(SFR.UPDATED_AT) AS NEWEST_IN_TRASH,
SFC.KEEP_IN_TRASH AS POLICY
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED SFR
JOIN CT_MRDS.A_SOURCE_FILE_CONFIG SFC ON SFR.A_SOURCE_FILE_CONFIG_KEY = SFC.A_SOURCE_FILE_CONFIG_KEY
WHERE SFR.PROCESSING_STATUS = 'ARCHIVED_AND_TRASHED'
GROUP BY SFC.SOURCE_FILE_ID, SFC.KEEP_IN_TRASH
ORDER BY SIZE_GB DESC;
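Building on Query 2 and Query 6, the following sketch lists configurations whose files have stayed in TRASH longer than a chosen number of days, which can help decide when a purge is worthwhile. The 30-day cutoff is a hypothetical value chosen for illustration, not a package default, and UPDATED_AT is assumed to reflect the time the file was moved to TRASH (as in Query 2).

```sql
-- Files retained in TRASH for more than 30 days (illustrative cutoff)
SELECT
    SFC.SOURCE_FILE_ID,
    SFC.TABLE_ID,
    COUNT(*)                                          AS FILES_OVER_30_DAYS,
    ROUND(SUM(SFR.FILE_SIZE_BYTES) / 1024 / 1024, 2)  AS SIZE_MB,
    MIN(SFR.UPDATED_AT)                               AS OLDEST_IN_TRASH
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED SFR
JOIN CT_MRDS.A_SOURCE_FILE_CONFIG SFC
    ON SFR.A_SOURCE_FILE_CONFIG_KEY = SFC.A_SOURCE_FILE_CONFIG_KEY
WHERE SFR.PROCESSING_STATUS = 'ARCHIVED_AND_TRASHED'
  AND SFR.UPDATED_AT < SYSDATE - 30
GROUP BY SFC.SOURCE_FILE_ID, SFC.TABLE_ID
ORDER BY SIZE_MB DESC;
```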
Version History
v3.3.0 (Current - 2026-02-11)
- BREAKING CHANGE: Removed pKeepInTrash parameter from ARCHIVE_TABLE_DATA
- Added ARCHIVE_ENABLED column to A_SOURCE_FILE_CONFIG for selective archiving control
- Added KEEP_IN_TRASH column to A_SOURCE_FILE_CONFIG (replaces pKeepInTrash parameter)
- Added batch procedures with 3-level granularity (config/source/all):
  - ARCHIVE_ALL - Batch archival procedure
  - GATHER_TABLE_STAT_ALL - Batch statistics procedure
  - RESTORE_FILE_FROM_TRASH - Restore files from TRASH folder
  - PURGE_TRASH_FOLDER - Purge TRASH folder files
- TRASH retention now configuration-based instead of parameter-based
- Enhanced flexibility for archival orchestration and monitoring
v3.2.1 (2026-02-10)
- Fixed critical bug: Status update ARCHIVED → ARCHIVED_AND_TRASHED when moving files to TRASH folder
- Ensures proper status tracking for files retained in TRASH
v3.2.0 (2026-02-06)
- Added pKeepInTrash parameter (DEFAULT TRUE) to ARCHIVE_TABLE_DATA
- TRASH folder retention control for safety and compliance
- Files kept in TRASH subfolder by default for rollback capability
v3.1.0 (2026-02-05)
- BREAKING CHANGE: Removed CURRENT_MONTH_ONLY strategy (replaced by MINIMUM_AGE_MONTHS = 0)
- Mathematical equivalence: CURRENT_MONTH_ONLY ≡ MINIMUM_AGE_MONTHS = 0
- Updated trigger validation to allow MINIMUM_AGE_MONTHS >= 0 (previously >= 1)
- Simplified architecture from 4 strategies to 3
- Enhanced error handling
- All 25 Release 01 tables migrated to MINIMUM_AGE_MONTHS (23 with value 0, 2 with value 6)
v3.0.0 (MARS-828 - 2026-02-04)
- Added ARCHIVAL_STRATEGY configuration column
- Implemented four archival strategies (later reduced to three in v3.1.0):
  - THRESHOLD_BASED (backward compatible)
  - CURRENT_MONTH_ONLY (deprecated in v3.1.0, use MINIMUM_AGE_MONTHS = 0)
  - MINIMUM_AGE_MONTHS
  - HYBRID
- Added GET_ARCHIVAL_WHERE_CLAUSE function
- Created validation trigger TRG_BI_A_SRC_FILE_CFG_ARCH_VAL
- Configured 25 Release 01 tables with appropriate strategies
v2.0.0 (Legacy)
- Initial FILE_ARCHIVER package
- THRESHOLD_BASED archival only
- Fixed DAYS_FOR_ARCHIVE_THRESHOLD configuration
Related Documentation
- FILE_MANAGER Configuration Guide - File processing and validation
- Package Deployment Guide - Package deployment standards
- Universal Package Tracking System - Version tracking
- MARS-828 README - Detailed implementation notes
Dependencies
Required Packages
- CT_MRDS.ENV_MANAGER v3.x - Error handling, logging, version tracking
- CT_MRDS.FILE_MANAGER v3.x - Bucket URI resolution, file processing
- MRDS_LOADER.cloud_wrapper - DBMS_CLOUD operations wrapper
Database Objects
- Table: CT_MRDS.A_SOURCE_FILE_CONFIG - Configuration storage
- Table: CT_MRDS.A_SOURCE_FILE_RECEIVED - File processing tracking
- Table: CT_MRDS.A_WORKFLOW_HISTORY - Workflow execution tracking (Airflow + DBT)
- Trigger: TRG_BI_A_SRC_FILE_CFG_ARCH_VAL - Configuration validation
- Credential: DEF_CRED_ARN - OCI bucket access
OCI Buckets
- INBOX: Incoming file validation ('INBOX/{SOURCE}/{SOURCE_FILE_ID}/{TABLE_NAME}/')
- ODS/DATA: Operational data processing ('ODS/{SOURCE}/{TABLE_NAME}/')
- TRASH: File retention subfolder in DATA bucket ('TRASH/{SOURCE}/{TABLE_NAME}/') - CSV files after archival
- ARCHIVE: Historical data storage ('ARCHIVE/{SOURCE}/{TABLE_NAME}/PARTITION_YEAR=/PARTITION_MONTH=/')
Note: TRASH is NOT a separate bucket - it's a subfolder within the DATA bucket for file retention and rollback capability.
Best Practices
Strategy Selection Guidelines
- Use MINIMUM_AGE_MONTHS when:
  - MINIMUM_AGE_MONTHS = 0: Current month only retention
    - Data updated frequently (daily/intraday)
    - Historical data access is rare
    - ODS bucket space is limited
    - Example: LM dissemination feeds
  - MINIMUM_AGE_MONTHS = N (N > 0): Multi-month retention
    - Regulatory compliance requires specific retention period
    - Analytical workloads need N-month access
    - Data updates are infrequent
    - Example: CSDB securities data (MINIMUM_AGE_MONTHS = 6)
- Use THRESHOLD_BASED when:
  - Simple time-based archival is sufficient
- Use HYBRID when:
  - Complex retention requirements
  - Combining month boundary check with minimum age threshold
  - Advanced scenarios not covered by other strategies
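To make the effect of different MINIMUM_AGE_MONTHS values concrete, the cutoff can be computed for a fixed reference date. This is a worked example only: it reuses the date expression from Query 4 in this guide and assumes the package applies the same month-boundary logic. It runs against DUAL and needs no package objects.

```sql
-- Worked example: archival cutoff for a fixed reference date (2026-02-11).
-- Data loaded before CUTOFF_DATE would be eligible for archival.
SELECT
    MIN_AGE_MONTHS,
    ADD_MONTHS(TRUNC(DATE '2026-02-11', 'MM'), -MIN_AGE_MONTHS) AS CUTOFF_DATE
FROM (
    SELECT 0 AS MIN_AGE_MONTHS FROM DUAL
    UNION ALL
    SELECT 6 FROM DUAL
)
ORDER BY MIN_AGE_MONTHS;
-- MIN_AGE_MONTHS = 0 -> 2026-02-01 (only the current month stays in ODS)
-- MIN_AGE_MONTHS = 6 -> 2025-08-01 (the current month plus the previous six stay in ODS)
```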
Configuration Best Practices
- Test Configuration Changes:
  -- Test on single table first
  UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
  SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
      MINIMUM_AGE_MONTHS = 0 -- 0 = current month only
  WHERE SOURCE_FILE_ID = 'TEST_FILE'
    AND TABLE_ID = 'TEST_TABLE';
  -- Monitor archival behavior
  -- Expand to other tables after validation
- Verify Before Bulk Updates:
  -- Preview changes with SELECT
  SELECT
      SOURCE_FILE_ID,
      TABLE_ID,
      'MINIMUM_AGE_MONTHS' AS NEW_STRATEGY,
      0 AS NEW_MIN_AGE, -- 0 = current month only
      ARCHIVAL_STRATEGY AS OLD_STRATEGY,
      MINIMUM_AGE_MONTHS AS OLD_MIN_AGE
  FROM CT_MRDS.A_SOURCE_FILE_CONFIG
  WHERE SOURCE_FILE_ID LIKE 'Distribute%';
  -- Then execute UPDATE
- Document Configuration Decisions:
  - Record why specific strategy was chosen
  - Note business requirements driving retention policy
  - Track configuration changes in version control
- Monitor Archival Performance:
  -- Check archival execution logs
  SELECT PROCESS_NAME, LOG_MESSAGE, LOG_TIMESTAMP
  FROM CT_MRDS.A_PROCESS_LOG
  WHERE PROCESS_NAME LIKE '%ARCHIVE%'
    AND LOG_TIMESTAMP > SYSDATE - 7
  ORDER BY LOG_TIMESTAMP DESC;
- Regular Configuration Reviews:
  - Verify strategies still match business requirements
  - Check for tables without archival configuration
  - Optimize MINIMUM_AGE_MONTHS based on actual usage patterns
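A configuration review can start from a query that flags rows with no strategy or with a strategy whose parameter is missing. The sketch below makes two assumptions: that a NULL ARCHIVAL_STRATEGY means the table has not been configured, and that MINIMUM_AGE_MONTHS is the parameter required by the MINIMUM_AGE_MONTHS strategy. HYBRID is left out because its required parameters are not spelled out in this section. The validation trigger should normally prevent the second and third cases, so any hits there most likely predate the trigger.

```sql
-- INPUT configurations with a missing strategy or a missing strategy parameter
SELECT
    SOURCE_FILE_ID,
    TABLE_ID,
    ARCHIVAL_STRATEGY,
    MINIMUM_AGE_MONTHS,
    DAYS_FOR_ARCHIVE_THRESHOLD,
    ARCHIVE_ENABLED
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE SOURCE_FILE_TYPE = 'INPUT'
  AND (   ARCHIVAL_STRATEGY IS NULL
       OR (ARCHIVAL_STRATEGY = 'THRESHOLD_BASED'    AND DAYS_FOR_ARCHIVE_THRESHOLD IS NULL)
       OR (ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS' AND MINIMUM_AGE_MONTHS IS NULL))
ORDER BY SOURCE_FILE_ID, TABLE_ID;
```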
TRASH Folder Retention Best Practices
- Default Behavior (KEEP_IN_TRASH = 'Y' - Recommended):
  - Keeps CSV files in TRASH folder after archival
  - Provides safety net for rollback if archival issues occur
  - Supports compliance and audit requirements
  - Status: ARCHIVED_AND_TRASHED
  - Use for: Production environments, regulatory compliance, critical data
  - Configuration:
    UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
    SET KEEP_IN_TRASH = 'Y'
    WHERE SOURCE_FILE_TYPE = 'INPUT'
      AND TABLE_ID = 'YOUR_TABLE';
- TRASH Cleanup (KEEP_IN_TRASH = 'N'):
  - Deletes CSV files from TRASH folder after successful archival
  - Reduces storage costs in DATA bucket
  - Status: ARCHIVED_AND_PURGED
  - Use for: Non-critical data, storage optimization, test environments
  - Configuration:
    UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
    SET KEEP_IN_TRASH = 'N'
    WHERE SOURCE_FILE_TYPE = 'INPUT'
      AND TABLE_ID = 'YOUR_TABLE';
- Monitoring TRASH Folder:
  -- Check files in TRASH retention
  SELECT
      SOURCE_FILE_NAME,
      PROCESSING_STATUS,
      ARCH_FILE_NAME,
      PARTITION_YEAR,
      PARTITION_MONTH
  FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
  WHERE PROCESSING_STATUS IN ('ARCHIVED_AND_TRASHED', 'ARCHIVED_AND_PURGED')
    AND RECEPTION_DATE > SYSDATE - 30
  ORDER BY PROCESSING_STATUS, RECEPTION_DATE DESC;
- TRASH Folder Structure:
  DATA Bucket:
  ├── ODS/LM/STANDING_FACILITIES/file.csv      -- Active operational data
  └── TRASH/LM/STANDING_FACILITIES/file.csv    -- Retained after archival
  ARCHIVE Bucket:
  └── ARCHIVE/LM/STANDING_FACILITIES/
      └── PARTITION_YEAR=2026/
          └── PARTITION_MONTH=02/
              └── *.parquet                    -- Archived data
Author
Created by: Grzegorz Michalski
Date: 2026-02-11
Schema: CT_MRDS
Package: FILE_ARCHIVER
Version: 3.3.0