# FILE_ARCHIVER Configuration Guide

This document describes the archival strategies available in the FILE_ARCHIVER package for managing data lifecycle across OCI buckets (INBOX → ODS → ARCHIVE).

## Overview

The FILE_ARCHIVER package provides flexible archival strategies that accommodate different data retention policies across source systems. It manages the movement of processed data from operational storage (ODS bucket) to long-term archival storage (ARCHIVE bucket) based on configurable strategies.

### Key Features

- **Three Archival Strategies**: THRESHOLD_BASED, MINIMUM_AGE_MONTHS (with 0=current month only), HYBRID
- **Flexible Configuration**: Per-table archival strategy configuration via A_SOURCE_FILE_CONFIG
- **Backward Compatible**: Default THRESHOLD_BASED strategy maintains existing behavior
- **Validation**: Automatic validation of strategy-specific configuration requirements
- **OCI Integration**: Works with DBMS_CLOUD operations via cloud_wrapper

### Package Information

- **Schema**: CT_MRDS
- **Package**: FILE_ARCHIVER
- **Current Version**: 3.2.0
- **Dependencies**: ENV_MANAGER, FILE_MANAGER, cloud_wrapper, A_SOURCE_FILE_CONFIG, A_SOURCE_FILE_RECEIVED, A_WORKFLOW_HISTORY

### Critical Prerequisites

⚠️ **IMPORTANT**: FILE_ARCHIVER requires data to be registered in the `CT_MRDS.A_SOURCE_FILE_RECEIVED` table. This table is automatically populated when files are processed through the modern Airflow + DBT workflow via `FILE_MANAGER.PROCESS_SOURCE_FILE`.
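A quick way to confirm that files are actually registered is to summarize the tracking table by status. The query below is a minimal sketch: it relies only on the `PROCESSING_STATUS` and `RECEPTION_DATE` columns that appear in the monitoring query later in this guide, and it does not filter per table because the linking key column is not documented here.

```sql
-- Summarize registered files by processing status.
-- An empty result means nothing is registered, so FILE_ARCHIVER has nothing to archive.
SELECT PROCESSING_STATUS,
       COUNT(*)            AS FILE_COUNT,
       MAX(RECEPTION_DATE) AS LATEST_RECEPTION
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
GROUP BY PROCESSING_STATUS
ORDER BY PROCESSING_STATUS;
```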
**For legacy data migrated from the Informatica + WLA system:**

- Legacy data exported using `DATA_EXPORTER` does NOT automatically create `A_SOURCE_FILE_RECEIVED` records
- Without these records, FILE_ARCHIVER **CANNOT** archive the data
- See [System Migration Guide](System_Migration_Informatica_to_Airflow_DBT.md) for workaround strategies

**Recommendation for legacy data**: Export directly to the ARCHIVE bucket using `DATA_EXPORTER.EXPORT_TABLE_DATA_BY_DATE` with `pBucketArea => 'ARCHIVE'` to bypass this requirement.

## Archival Strategies

### Strategy Overview

| Strategy | WHERE Clause Logic | Configuration Required | Primary Use Case |
|----------|-------------------|----------------------|------------------|
| `THRESHOLD_BASED` | Days since workflow start > threshold | DAYS_FOR_ARCHIVE_THRESHOLD | Legacy compatibility, simple time-based archival |
| `MINIMUM_AGE_MONTHS` | Archive data older than X months (0=current month only) | MINIMUM_AGE_MONTHS (≥0) | All sources - flexible retention (0 for LM, 6 for CSDB) |
| `HYBRID` | Combines month boundary + minimum age | MINIMUM_AGE_MONTHS | Advanced retention scenarios |

### 1. THRESHOLD_BASED (Default)

Archives data based on the number of days since workflow start.

**WHERE Clause**:

```sql
extract(day from (systimestamp - workflow_start)) > DAYS_FOR_ARCHIVE_THRESHOLD
```

**Configuration**:

```sql
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'THRESHOLD_BASED',
    DAYS_FOR_ARCHIVE_THRESHOLD = 30,
    MINIMUM_AGE_MONTHS = NULL
WHERE SOURCE_FILE_TYPE = 'INPUT'
  AND SOURCE_FILE_ID = 'C2D_DATA'
  AND TABLE_ID = 'C2D_TABLE';
```

**Use Case**: Simple time-based archival, backward compatible with FILE_ARCHIVER v2.0.0 behavior.

### 2. MINIMUM_AGE_MONTHS

Archives data older than the specified number of months.

**Special case**: MINIMUM_AGE_MONTHS = 0 archives all data before the current month (replaces the deprecated CURRENT_MONTH_ONLY strategy).
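To make the month arithmetic concrete, the sketch below evaluates the cutoff expression used by this strategy's WHERE clause for MINIMUM_AGE_MONTHS values of 0 and 6. The literal `DATE '2026-02-15'` is an assumption standing in for `SYSDATE` so the result is reproducible, and the column aliases are illustrative only.

```sql
-- DATE '2026-02-15' is a fixed stand-in for SYSDATE (assumption for reproducibility).
SELECT TO_CHAR(ADD_MONTHS(TRUNC(DATE '2026-02-15', 'MM'), -0), 'YYYY-MM-DD') AS CUTOFF_MIN_AGE_0,
       TO_CHAR(ADD_MONTHS(TRUNC(DATE '2026-02-15', 'MM'), -6), 'YYYY-MM-DD') AS CUTOFF_MIN_AGE_6
FROM DUAL;
-- CUTOFF_MIN_AGE_0 = 2026-02-01: rows with workflow_start before February 2026 qualify
-- CUTOFF_MIN_AGE_6 = 2025-08-01: rows with workflow_start in July 2025 or earlier qualify
```

Rows qualify for archival when `workflow_start` is strictly earlier than the cutoff, so a row stamped exactly at midnight on the cutoff date stays in the ODS bucket.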
**WHERE Clause**:

```sql
workflow_start < ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -MINIMUM_AGE_MONTHS)
-- When MINIMUM_AGE_MONTHS = 0: workflow_start < TRUNC(SYSDATE, 'MM')
```

**Configuration Examples**:

```sql
-- LM: Keep only current month data (MINIMUM_AGE_MONTHS = 0)
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
    MINIMUM_AGE_MONTHS = 0
WHERE SOURCE_FILE_TYPE = 'INPUT'
  AND SOURCE_FILE_ID = 'DistributeStandingFacilities'
  AND TABLE_ID = 'LM_STANDING_FACILITIES';

-- CSDB: Retain 6 months of data (MINIMUM_AGE_MONTHS = 6)
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
    MINIMUM_AGE_MONTHS = 6
WHERE SOURCE_FILE_TYPE = 'INPUT'
  AND SOURCE_FILE_ID = 'CSDB'
  AND TABLE_ID IN ('CSDB_DEBT', 'CSDB_DEBT_DAILY');
```

**Use Cases**:

- **MINIMUM_AGE_MONTHS = 0**: LM dissemination feeds requiring current month only (daily/intraday updates)
- **MINIMUM_AGE_MONTHS = 6**: CSDB securities/ratings data requiring 6-month retention
- **MINIMUM_AGE_MONTHS = N**: Regulatory compliance with specific N-month retention periods

**Behavior Examples**:

- **With MINIMUM_AGE_MONTHS = 0**:
  - January data: Eligible for archival from February 1st
  - February data: Remains in ODS bucket during February
  - March 1st: February data becomes eligible for archival, March data active
- **With MINIMUM_AGE_MONTHS = 6**:
  - February 2026: Archives data from July 2025 and earlier
  - March 2026: Archives data from August 2025 and earlier
  - Keeps current month + 6 previous months (7 months total) in ODS bucket

### 3. HYBRID

Combines month boundary check with minimum age threshold - archives data from previous months AND older than minimum age.
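As a worked illustration, the sketch below evaluates the two HYBRID conditions (the month-boundary check and the minimum-age check from the WHERE clause that follows) against a few sample dates. The sample rows, the fixed date `DATE '2026-02-15'` standing in for `SYSDATE`, and the value 3 standing in for MINIMUM_AGE_MONTHS are all assumptions chosen for illustration.

```sql
-- Hypothetical sample rows; DATE '2026-02-15' replaces SYSDATE and 3 replaces MINIMUM_AGE_MONTHS.
WITH sample_rows AS (
    SELECT DATE '2026-02-10' AS workflow_start FROM DUAL UNION ALL
    SELECT DATE '2026-01-20' FROM DUAL UNION ALL
    SELECT DATE '2025-11-01' FROM DUAL UNION ALL
    SELECT DATE '2025-10-31' FROM DUAL
)
SELECT TO_CHAR(workflow_start, 'YYYY-MM-DD') AS WORKFLOW_DAY,
       -- Condition 1: row belongs to a month before the current month
       CASE WHEN TRUNC(workflow_start, 'MM') < TRUNC(DATE '2026-02-15', 'MM')
            THEN 'Y' ELSE 'N' END AS BEFORE_CURRENT_MONTH,
       -- Condition 2: row is older than the minimum-age cutoff (2025-11-01 here)
       CASE WHEN workflow_start < ADD_MONTHS(TRUNC(DATE '2026-02-15', 'MM'), -3)
            THEN 'Y' ELSE 'N' END AS OLDER_THAN_MIN_AGE
FROM sample_rows
ORDER BY workflow_start DESC;
-- 2026-02-10: N, N   (current month, kept)
-- 2026-01-20: Y, N   (previous month but too recent, kept)
-- 2025-11-01: Y, N   (exactly on the cutoff, kept)
-- 2025-10-31: Y, Y   (both conditions hold, archived)
```

In this sample every row that passes the age check also passes the month-boundary check, which is what the date arithmetic implies for any non-negative MINIMUM_AGE_MONTHS value.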
**WHERE Clause**:

```sql
TRUNC(workflow_start, 'MM') < TRUNC(SYSDATE, 'MM')
AND workflow_start < ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -MINIMUM_AGE_MONTHS)
```

**Configuration**:

```sql
-- Advanced: Current month + 3 months minimum
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'HYBRID',
    MINIMUM_AGE_MONTHS = 3
WHERE SOURCE_FILE_TYPE = 'INPUT'
  AND SOURCE_FILE_ID = 'SPECIAL_SOURCE'
  AND TABLE_ID = 'SPECIAL_TABLE';
```

**Use Case**: Advanced scenarios requiring both current month retention AND minimum age threshold.

## Configuration Validation

### Validation Trigger

**Trigger**: `TRG_BI_A_SRC_FILE_CFG_ARCH_VAL`

Automatically validates archival configuration on INSERT/UPDATE to A_SOURCE_FILE_CONFIG.

**Validation Rules**:

1. **MINIMUM_AGE_MONTHS**: Requires `MINIMUM_AGE_MONTHS IS NOT NULL AND MINIMUM_AGE_MONTHS >= 0`
   - Error: "Strategy MINIMUM_AGE_MONTHS requires MINIMUM_AGE_MONTHS to be set (≥0)"
2. **HYBRID**: Requires `MINIMUM_AGE_MONTHS IS NOT NULL`
   - Error: "Strategy HYBRID requires MINIMUM_AGE_MONTHS to be set"

**Example Validation Error**:

```sql
-- This will fail validation
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
    MINIMUM_AGE_MONTHS = NULL  -- ERROR: Required for this strategy
WHERE ...;
-- Error: ORA-20001: Strategy MINIMUM_AGE_MONTHS requires MINIMUM_AGE_MONTHS to be set
```

## Data Lifecycle Workflow

### Standard File Processing Flow

```
┌─────────────────────────────────────────────────────────────┐
│                  FILE PROCESSING LIFECYCLE                  │
└─────────────────────────────────────────────────────────────┘

1. INBOX Bucket (Validation)
   ├─ File arrives from source system
   ├─ FILE_MANAGER.PROCESS_SOURCE_FILE validates structure
   ├─ Status: RECEIVED → VALIDATED → READY_FOR_INGESTION
   └─ FILE_MANAGER.MOVE_FILE relocates to ODS bucket

2. ODS Bucket (Operational Data)
   ├─ Active data processing (Airflow + DBT)
   ├─ External tables read data from bucket
   ├─ Status: INGESTED
   ├─ FILE_ARCHIVER.ARCHIVE_TABLE_DATA archives based on strategy
   └─ CSV files moved to TRASH subfolder (ODS → TRASH/)

2.1 TRASH Subfolder (DATA Bucket - File Retention)
   ├─ Located in DATA bucket (e.g., TRASH/LM/TABLE_NAME)
   ├─ Stores CSV files after archival to Parquet
   ├─ Status: ARCHIVED_AND_TRASHED (default retention)
   ├─ Enables rollback if archival issues occur
   └─ Optional cleanup: ARCHIVED_AND_PURGED (pKeepInTrash=FALSE)

3. ARCHIVE Bucket (Long-term Storage)
   ├─ Historical data in Parquet format
   ├─ Hive-style partitioning: PARTITION_YEAR=/PARTITION_MONTH=
   ├─ Status: ARCHIVED_AND_TRASHED or ARCHIVED_AND_PURGED
   └─ Optimized for big data analytics (Spark, Hive)
```

**Key Procedures**:

- `ARCHIVE_TABLE_DATA(pSourceFileConfigKey, pKeepInTrash)` - Main archival procedure using strategy-specific WHERE clause
  - `pKeepInTrash` (BOOLEAN, DEFAULT TRUE) - Controls TRASH folder retention
    - TRUE: Files kept in TRASH folder for safety and rollback capability (default)
    - FALSE: Files deleted from TRASH folder after successful archival
- `GET_ARCHIVAL_WHERE_CLAUSE` - Returns WHERE clause based on configured strategy
- `GATHER_TABLE_STAT` - Calculates archival statistics using strategy logic

**Archival Execution**:

```sql
-- vSourceFileConfigKey must be declared and populated by the caller before these calls.

-- Default behavior: Keep files in TRASH folder (ARCHIVED_AND_TRASHED status)
BEGIN
    CT_MRDS.FILE_ARCHIVER.ARCHIVE_TABLE_DATA(
        pSourceFileConfigKey => vSourceFileConfigKey,
        pKeepInTrash         => TRUE  -- DEFAULT value
    );
END;
/

-- Optional: Delete files from TRASH after archival (ARCHIVED_AND_PURGED status)
BEGIN
    CT_MRDS.FILE_ARCHIVER.ARCHIVE_TABLE_DATA(
        pSourceFileConfigKey => vSourceFileConfigKey,
        pKeepInTrash         => FALSE  -- Cleanup TRASH folder
    );
END;
/
```

**Strategy-Based Filtering**:

- The package retrieves ARCHIVAL_STRATEGY from A_SOURCE_FILE_CONFIG
- GET_ARCHIVAL_WHERE_CLAUSE generates the appropriate WHERE clause
- Data matching the criteria is moved from the ODS to the ARCHIVE bucket
- CSV files are moved to the TRASH subfolder in the DATA bucket (ODS/ → TRASH/)
- Parquet format with Hive-style partitioning is applied in the ARCHIVE bucket
- TRASH retention is controlled by the pKeepInTrash parameter

## Configuration Examples

### Example 1: Configure LM Standing Facilities (Current Month Only)

```sql
-- Keep only current month data in ODS bucket (MINIMUM_AGE_MONTHS = 0)
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
    MINIMUM_AGE_MONTHS = 0  -- 0 = archives all data before current month
WHERE SOURCE_FILE_TYPE = 'INPUT'
  AND SOURCE_FILE_ID = 'DistributeStandingFacilities'
  AND TABLE_ID = 'LM_STANDING_FACILITIES';

COMMIT;

-- Verify configuration
SELECT SOURCE_FILE_ID, TABLE_ID, ARCHIVAL_STRATEGY, MINIMUM_AGE_MONTHS
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE SOURCE_FILE_ID = 'DistributeStandingFacilities';
```

### Example 2: Configure CSDB Debt (MINIMUM_AGE_MONTHS)

```sql
-- Retain 6 months of data in ODS bucket
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
    MINIMUM_AGE_MONTHS = 6
WHERE SOURCE_FILE_TYPE = 'INPUT'
  AND SOURCE_FILE_ID = 'CSDB'
  AND TABLE_ID = 'CSDB_DEBT';

COMMIT;

-- Verify configuration
SELECT SOURCE_FILE_ID, TABLE_ID, ARCHIVAL_STRATEGY, MINIMUM_AGE_MONTHS
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE TABLE_ID = 'CSDB_DEBT';
```

### Example 3: Bulk Configuration for LM Source

```sql
-- Configure all 19 LM tables with MINIMUM_AGE_MONTHS = 0 (current month only)
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
    MINIMUM_AGE_MONTHS = 0  -- 0 = keep only current month
WHERE SOURCE_FILE_TYPE = 'INPUT'
  AND SOURCE_FILE_ID IN (
      'DistributeStandingFacilities',
      'DistributeTTS',
      'DistributeAdHocAdjustments',
      'DistributeBalanceSheet',
      'DistributeCSMAdjustments',
      'DistributeCurrentAccounts',
      'DistributeForecast',
      'DistributeQREAdjustments'
  );

COMMIT;

-- Verify bulk configuration
SELECT SOURCE_FILE_ID,
       COUNT(*) AS TABLE_COUNT,
       MAX(ARCHIVAL_STRATEGY) AS STRATEGY,
       MAX(MINIMUM_AGE_MONTHS) AS MIN_AGE
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE SOURCE_FILE_ID LIKE 'Distribute%'
GROUP BY SOURCE_FILE_ID
ORDER BY SOURCE_FILE_ID;
```

### Example 4: View Current Archival Configuration

```sql
-- All configured tables with their archival strategies
SELECT A_SOURCE_KEY,
       SOURCE_FILE_ID,
       TABLE_ID,
       ARCHIVAL_STRATEGY,
       MINIMUM_AGE_MONTHS,
       DAYS_FOR_ARCHIVE_THRESHOLD
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE SOURCE_FILE_TYPE = 'INPUT'
ORDER BY A_SOURCE_KEY, SOURCE_FILE_ID, TABLE_ID;

-- Summary by strategy
SELECT ARCHIVAL_STRATEGY,
       COUNT(*) AS TABLE_COUNT,
       MIN(MINIMUM_AGE_MONTHS) AS MIN_AGE_MIN,
       MAX(MINIMUM_AGE_MONTHS) AS MIN_AGE_MAX
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE SOURCE_FILE_TYPE = 'INPUT'
GROUP BY ARCHIVAL_STRATEGY
ORDER BY ARCHIVAL_STRATEGY;
```

## Release 01 Configuration

### Configured Tables (MARS-828)

The following 25 Release 01 tables were configured with archival strategies:

**LM Tables (19 total) - MINIMUM_AGE_MONTHS = 0 (current month only)**:

- LM_STANDING_FACILITIES
- LM_STANDING_FACILITIES_HEADER
- LM_TTS_HEADER
- LM_TTS_ITEM
- LM_ADHOC_ADJUSTMENTS_HEADER
- LM_ADHOC_ADJUSTMENTS_ITEM
- LM_ADHOC_ADJUSTMENTS_ITEM_HEADER
- LM_BALANCESHEET_HEADER
- LM_BALANCESHEET_ITEM
- LM_CSM_ADJUSTMENTS_HEADER
- LM_CSM_ADJUSTMENTS_ITEM
- LM_CSM_ADJUSTMENTS_ITEM_HEADER
- LM_CURRENT_ACCOUNTS_HEADER
- LM_CURRENT_ACCOUNTS_ITEM
- LM_FORECAST_HEADER
- LM_FORECAST_ITEM
- LM_QRE_ADJUSTMENTS_HEADER
- LM_QRE_ADJUSTMENTS_ITEM
- LM_QRE_ADJUSTMENTS_ITEM_HEADER

**CSDB Tables (6 total)**:

*MINIMUM_AGE_MONTHS = 6 (6-month retention)*:

- CSDB_DEBT
- CSDB_DEBT_DAILY

*MINIMUM_AGE_MONTHS = 0 (current month only)*:

- CSDB_INSTR_RAT_FULL
- CSDB_INSTR_DESC_FULL
- CSDB_ISSUER_RAT_FULL
- CSDB_ISSUER_DESC_FULL

**Verification Query**:

```sql
-- Check Release 01 configuration
SELECT CASE
           WHEN TABLE_ID LIKE 'LM_%' THEN 'LM'
           WHEN TABLE_ID LIKE 'CSDB_%' THEN 'CSDB'
       END AS SOURCE_GROUP,
       ARCHIVAL_STRATEGY,
       MINIMUM_AGE_MONTHS,
       COUNT(*) AS TABLE_COUNT
FROM
    CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE SOURCE_FILE_TYPE = 'INPUT'
  AND TABLE_ID IN (
      -- 25 Release 01 tables
      'LM_STANDING_FACILITIES',
      'LM_STANDING_FACILITIES_HEADER',
      'LM_TTS_HEADER',
      'LM_TTS_ITEM',
      -- ... other tables
  )
GROUP BY
    CASE
        WHEN TABLE_ID LIKE 'LM_%' THEN 'LM'
        WHEN TABLE_ID LIKE 'CSDB_%' THEN 'CSDB'
    END,
    ARCHIVAL_STRATEGY,
    MINIMUM_AGE_MONTHS
ORDER BY SOURCE_GROUP, ARCHIVAL_STRATEGY;
```

## Troubleshooting

### Common Issues

#### Issue 1: Validation Error on Configuration Update

**Error**:

```
ORA-20001: Strategy MINIMUM_AGE_MONTHS requires MINIMUM_AGE_MONTHS to be set
```

**Cause**: Trigger validation failed - strategy requires MINIMUM_AGE_MONTHS but value is NULL

**Solution**:

```sql
-- Provide required MINIMUM_AGE_MONTHS value
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
    MINIMUM_AGE_MONTHS = 6  -- Required for this strategy
WHERE ...;
```

#### Issue 2: Archival Not Working as Expected

**Symptoms**: Data not being archived according to strategy

**Diagnostic Steps**:

```sql
-- 1. Check configuration
SELECT SOURCE_FILE_ID,
       TABLE_ID,
       ARCHIVAL_STRATEGY,
       MINIMUM_AGE_MONTHS,
       DAYS_FOR_ARCHIVE_THRESHOLD
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE TABLE_ID = 'YOUR_TABLE';

-- 2. Check package version
SELECT CT_MRDS.FILE_ARCHIVER.GET_VERSION() FROM DUAL;
-- Expected: 3.0.0 or higher

-- 3. Check process logs
SELECT PROCESS_LOG_KEY,
       PROCESS_NAME,
       LOG_MESSAGE,
       LOG_LEVEL,
       LOG_TIMESTAMP
FROM CT_MRDS.A_PROCESS_LOG
WHERE PROCESS_NAME LIKE '%ARCHIVE%'
ORDER BY LOG_TIMESTAMP DESC
FETCH FIRST 20 ROWS ONLY;

-- 4. Test WHERE clause generation
-- (enable SET SERVEROUTPUT ON in SQL*Plus/SQLcl to see the output)
DECLARE
    vConfig      CT_MRDS.A_SOURCE_FILE_CONFIG%ROWTYPE;
    vWhereClause VARCHAR2(4000);
BEGIN
    SELECT * INTO vConfig
    FROM CT_MRDS.A_SOURCE_FILE_CONFIG
    WHERE TABLE_ID = 'YOUR_TABLE'
      AND ROWNUM = 1;

    vWhereClause := CT_MRDS.FILE_ARCHIVER.GET_ARCHIVAL_WHERE_CLAUSE(vConfig);
    DBMS_OUTPUT.PUT_LINE('WHERE Clause: ' || vWhereClause);
END;
/
```

#### Issue 3: Package Compilation Errors After Upgrade

**Symptoms**: FILE_ARCHIVER package shows INVALID status

**Solution**:

```sql
-- USER_ERRORS and USER_OBJECTS show only the connected schema's objects, so run this as CT_MRDS.
-- From another schema, query ALL_ERRORS / ALL_OBJECTS with OWNER = 'CT_MRDS' instead.

-- Check compilation errors
SELECT *
FROM USER_ERRORS
WHERE NAME = 'FILE_ARCHIVER'
  AND TYPE IN ('PACKAGE', 'PACKAGE BODY')
ORDER BY SEQUENCE;

-- Recompile package
ALTER PACKAGE CT_MRDS.FILE_ARCHIVER COMPILE SPECIFICATION;
ALTER PACKAGE CT_MRDS.FILE_ARCHIVER COMPILE BODY;

-- Verify status
SELECT object_name, object_type, status
FROM user_objects
WHERE object_name = 'FILE_ARCHIVER';
```

## Version History

### v3.1.0 (Current - 2026-02-05)

- **BREAKING CHANGE**: Removed CURRENT_MONTH_ONLY strategy (replaced by MINIMUM_AGE_MONTHS = 0)
- Mathematical equivalence: CURRENT_MONTH_ONLY ≡ MINIMUM_AGE_MONTHS = 0
- Updated trigger validation to allow MINIMUM_AGE_MONTHS >= 0 (previously >= 1)
- Simplified architecture from 4 strategies to 3
- Enhanced error handling
- All 25 Release 01 tables migrated to MINIMUM_AGE_MONTHS (23 with value 0, 2 with value 6)

### v3.0.0 (MARS-828 - 2026-02-04)

- Added ARCHIVAL_STRATEGY configuration column
- Implemented four archival strategies (later reduced to three in v3.1.0):
  - THRESHOLD_BASED (backward compatible)
  - CURRENT_MONTH_ONLY (deprecated in v3.1.0, use MINIMUM_AGE_MONTHS = 0)
  - MINIMUM_AGE_MONTHS
  - HYBRID
- Added GET_ARCHIVAL_WHERE_CLAUSE function
- Created validation trigger TRG_BI_A_SRC_FILE_CFG_ARCH_VAL
- Configured 25 Release 01 tables with appropriate strategies

### v2.0.0 (Legacy)

- Initial FILE_ARCHIVER package
- THRESHOLD_BASED archival only
- Fixed DAYS_FOR_ARCHIVE_THRESHOLD configuration

## Related Documentation

- [FILE_MANAGER
Configuration Guide](FILE_MANAGER_Configuration_Guide.md) - File processing and validation
- [Package Deployment Guide](Package_Deployment_Guide.md) - Package deployment standards
- [Universal Package Tracking System](Universal_Package_Tracking_System.md) - Version tracking
- [MARS-828 README](../MARS_Packages/REL01_ADDITIONS/MARS-828/README.md) - Detailed implementation notes

## Dependencies

### Required Packages

- **CT_MRDS.ENV_MANAGER** v3.x - Error handling, logging, version tracking
- **CT_MRDS.FILE_MANAGER** v3.x - Bucket URI resolution, file processing
- **MRDS_LOADER.cloud_wrapper** - DBMS_CLOUD operations wrapper

### Database Objects

- **Table**: CT_MRDS.A_SOURCE_FILE_CONFIG - Configuration storage
- **Table**: CT_MRDS.A_SOURCE_FILE_RECEIVED - File processing tracking
- **Table**: CT_MRDS.A_WORKFLOW_HISTORY - Workflow execution tracking (Airflow + DBT)
- **Trigger**: TRG_BI_A_SRC_FILE_CFG_ARCH_VAL - Configuration validation
- **Credential**: DEF_CRED_ARN - OCI bucket access

### OCI Buckets

- **INBOX**: Incoming file validation (`'INBOX/{SOURCE}/{SOURCE_FILE_ID}/{TABLE_NAME}/'`)
- **ODS/DATA**: Operational data processing (`'ODS/{SOURCE}/{TABLE_NAME}/'`)
- **TRASH**: File retention subfolder in DATA bucket (`'TRASH/{SOURCE}/{TABLE_NAME}/'`) - CSV files after archival
- **ARCHIVE**: Historical data storage (`'ARCHIVE/{SOURCE}/{TABLE_NAME}/PARTITION_YEAR=/PARTITION_MONTH=/'`)

**Note**: TRASH is NOT a separate bucket - it's a subfolder within the DATA bucket for file retention and rollback capability.

## Best Practices

### Strategy Selection Guidelines

1. **Use MINIMUM_AGE_MONTHS when**:
   - **MINIMUM_AGE_MONTHS = 0**: Current month only retention
     - Data updated frequently (daily/intraday)
     - Historical data access is rare
     - ODS bucket space is limited
     - Example: LM dissemination feeds
   - **MINIMUM_AGE_MONTHS = N (N > 0)**: Multi-month retention
     - Regulatory compliance requires specific retention period
     - Analytical workloads need N-month access
     - Data updates are infrequent
     - Example: CSDB securities data (MINIMUM_AGE_MONTHS = 6)

2. **Use THRESHOLD_BASED when**:
   - Maintaining backward compatibility with legacy behavior
   - Simple time-based archival is sufficient
   - Migration from FILE_ARCHIVER v2.0.0

3. **Use HYBRID when**:
   - Complex retention requirements
   - Combining month boundary check with minimum age threshold
   - Advanced scenarios not covered by other strategies

### Configuration Best Practices

1. **Test Configuration Changes**:

   ```sql
   -- Test on single table first
   UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
   SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
       MINIMUM_AGE_MONTHS = 0  -- 0 = current month only
   WHERE SOURCE_FILE_ID = 'TEST_FILE'
     AND TABLE_ID = 'TEST_TABLE';

   -- Monitor archival behavior
   -- Expand to other tables after validation
   ```

2. **Verify Before Bulk Updates**:

   ```sql
   -- Preview changes with SELECT
   SELECT SOURCE_FILE_ID,
          TABLE_ID,
          'MINIMUM_AGE_MONTHS' AS NEW_STRATEGY,
          0 AS NEW_MIN_AGE,  -- 0 = current month only
          ARCHIVAL_STRATEGY AS OLD_STRATEGY,
          MINIMUM_AGE_MONTHS AS OLD_MIN_AGE
   FROM CT_MRDS.A_SOURCE_FILE_CONFIG
   WHERE SOURCE_FILE_ID LIKE 'Distribute%';

   -- Then execute UPDATE
   ```

3. **Document Configuration Decisions**:
   - Record why specific strategy was chosen
   - Note business requirements driving retention policy
   - Track configuration changes in version control

4. **Monitor Archival Performance**:

   ```sql
   -- Check archival execution logs
   SELECT PROCESS_NAME, LOG_MESSAGE, LOG_TIMESTAMP
   FROM CT_MRDS.A_PROCESS_LOG
   WHERE PROCESS_NAME LIKE '%ARCHIVE%'
     AND LOG_TIMESTAMP > SYSDATE - 7
   ORDER BY LOG_TIMESTAMP DESC;
   ```

5. **Regular Configuration Reviews**:
   - Verify strategies still match business requirements
   - Check for tables without archival configuration
   - Optimize MINIMUM_AGE_MONTHS based on actual usage patterns

### TRASH Folder Retention Best Practices

1. **Default Behavior (pKeepInTrash = TRUE - Recommended)**:
   - Keeps CSV files in TRASH folder after archival
   - Provides safety net for rollback if archival issues occur
   - Supports compliance and audit requirements
   - Status: ARCHIVED_AND_TRASHED
   - Use for: Production environments, regulatory compliance, critical data

2. **TRASH Cleanup (pKeepInTrash = FALSE)**:
   - Deletes CSV files from TRASH folder after successful archival
   - Reduces storage costs in DATA bucket
   - Status: ARCHIVED_AND_PURGED
   - Use for: Non-critical data, storage optimization, test environments

3. **Monitoring TRASH Folder**:

   ```sql
   -- Check files in TRASH retention
   SELECT SOURCE_FILE_NAME,
          PROCESSING_STATUS,
          ARCH_FILE_NAME,
          PARTITION_YEAR,
          PARTITION_MONTH
   FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
   WHERE PROCESSING_STATUS IN ('ARCHIVED_AND_TRASHED', 'ARCHIVED_AND_PURGED')
     AND RECEPTION_DATE > SYSDATE - 30
   ORDER BY PROCESSING_STATUS, RECEPTION_DATE DESC;
   ```

4. **TRASH Folder Structure**:

   ```
   DATA Bucket:
   ├── ODS/LM/STANDING_FACILITIES/file.csv      -- Active operational data
   └── TRASH/LM/STANDING_FACILITIES/file.csv    -- Retained after archival

   ARCHIVE Bucket:
   └── ARCHIVE/LM/STANDING_FACILITIES/
       └── PARTITION_YEAR=2026/
           └── PARTITION_MONTH=02/
               └── *.parquet                    -- Archived data
   ```

## Author

- Created by: Grzegorz Michalski
- Date: 2026-02-06
- Schema: CT_MRDS
- Package: FILE_ARCHIVER
- Version: 3.2.0