fix(README): Correct typo in active tasks and update installation commands for MARS-1057
@@ -14,7 +14,7 @@ REL01_ADDITIONS
|
|||||||
MARS-826
|
MARS-826
|
||||||
|
|
||||||
-- Currently working on:
|
-- Currently working on:
|
||||||
MARS-828
|
MARS-828s
|
||||||
|
|
||||||
-- Below: awaiting deployment
|
-- Below: awaiting deployment
|
||||||
REL03
|
REL03
|
||||||
@@ -69,8 +69,8 @@ sql "ADMIN/Cloudpass#34@ggmichalski_high" "@rollback_mars835_prehook.sql"
|
|||||||
|
|
||||||
|
|
||||||
cd .\MARS_Packages\REL03\MARS-1057
|
cd .\MARS_Packages\REL03\MARS-1057
|
||||||
sql "ADMIN/Cloudpass#34@ggmichalski_high" "@install_mars1057.sql"
|
echo 'yes' | sql "ADMIN/Cloudpass#34@ggmichalski_high" "@install_mars1057.sql"
|
||||||
sql "ADMIN/Cloudpass#34@ggmichalski_high" "@rollback_mars1057.sql"
|
echo 'yes' | sql "ADMIN/Cloudpass#34@ggmichalski_high" "@rollback_mars1057.sql"
|
||||||
7z a -pMojeSuperHaslo#123 -mhe=on M1057_arch.7z MARS-1057
|
7z a -pMojeSuperHaslo#123 -mhe=on M1057_arch.7z MARS-1057
|
||||||
|
|
||||||
|
|
||||||
|
|||||||
@@ -10,7 +10,6 @@ The FILE_ARCHIVER package provides flexible archival strategies that accommodate
|
|||||||
|
|
||||||
- **Three Archival Strategies**: THRESHOLD_BASED, MINIMUM_AGE_MONTHS (with 0=current month only), HYBRID
|
- **Three Archival Strategies**: THRESHOLD_BASED, MINIMUM_AGE_MONTHS (with 0=current month only), HYBRID
|
||||||
- **Flexible Configuration**: Per-table archival strategy configuration via A_SOURCE_FILE_CONFIG
|
- **Flexible Configuration**: Per-table archival strategy configuration via A_SOURCE_FILE_CONFIG
|
||||||
- **Backward Compatible**: Default THRESHOLD_BASED strategy maintains existing behavior
|
|
||||||
- **Validation**: Automatic validation of strategy-specific configuration requirements
|
- **Validation**: Automatic validation of strategy-specific configuration requirements
|
||||||
- **OCI Integration**: Works seamlessly with DBMS_CLOUD operations via cloud_wrapper
|
- **OCI Integration**: Works seamlessly with DBMS_CLOUD operations via cloud_wrapper
|
||||||
|
|
||||||
@@ -18,7 +17,7 @@ The FILE_ARCHIVER package provides flexible archival strategies that accommodate
|
|||||||
|
|
||||||
- **Schema**: CT_MRDS
|
- **Schema**: CT_MRDS
|
||||||
- **Package**: FILE_ARCHIVER
|
- **Package**: FILE_ARCHIVER
|
||||||
- **Current Version**: 3.2.0
|
- **Current Version**: 3.3.0
|
||||||
- **Dependencies**: ENV_MANAGER, FILE_MANAGER, cloud_wrapper, A_SOURCE_FILE_CONFIG, A_SOURCE_FILE_RECEIVED, A_WORKFLOW_HISTORY
|
- **Dependencies**: ENV_MANAGER, FILE_MANAGER, cloud_wrapper, A_SOURCE_FILE_CONFIG, A_SOURCE_FILE_RECEIVED, A_WORKFLOW_HISTORY
|
||||||
|
|
||||||
### Critical Prerequisites
|
### Critical Prerequisites
|
||||||
@@ -38,7 +37,7 @@ The FILE_ARCHIVER package provides flexible archival strategies that accommodate
|
|||||||
|
|
||||||
| Strategy | WHERE Clause Logic | Configuration Required | Primary Use Case |
|
| Strategy | WHERE Clause Logic | Configuration Required | Primary Use Case |
|
||||||
|----------|-------------------|----------------------|------------------|
|
|----------|-------------------|----------------------|------------------|
|
||||||
| `THRESHOLD_BASED` | Days since workflow start > threshold | DAYS_FOR_ARCHIVE_THRESHOLD | Legacy compatibility, simple time-based archival |
|
| `THRESHOLD_BASED` | Days since workflow start > threshold | DAYS_FOR_ARCHIVE_THRESHOLD | Simple time-based archival |
|
||||||
| `MINIMUM_AGE_MONTHS` | Archive data older than X months (0=current month only) | MINIMUM_AGE_MONTHS (≥0) | All sources - flexible retention (0 for LM, 6 for CSDB) |
|
| `MINIMUM_AGE_MONTHS` | Archive data older than X months (0=current month only) | MINIMUM_AGE_MONTHS (≥0) | All sources - flexible retention (0 for LM, 6 for CSDB) |
|
||||||
| `HYBRID` | Combines month boundary + minimum age | MINIMUM_AGE_MONTHS | Advanced retention scenarios |
|
| `HYBRID` | Combines month boundary + minimum age | MINIMUM_AGE_MONTHS | Advanced retention scenarios |
|
||||||
|
|
||||||
@@ -62,11 +61,11 @@ WHERE SOURCE_FILE_TYPE = 'INPUT'
|
|||||||
AND TABLE_ID = 'C2D_TABLE';
|
AND TABLE_ID = 'C2D_TABLE';
|
||||||
```
|
```
|
||||||
|
|
||||||
**Use Case**: Simple time-based archival, backward compatible with FILE_ARCHIVER v2.0.0 behavior.
|
**Use Case**: Simple time-based archival.
|
||||||
|
|
||||||
### 2. MINIMUM_AGE_MONTHS
|
### 2. MINIMUM_AGE_MONTHS
|
||||||
|
|
||||||
Archives data older than specified number of months. **Special case**: MINIMUM_AGE_MONTHS = 0 archives all data before current month (replaces deprecated CURRENT_MONTH_ONLY strategy).
|
Archives data older than specified number of months. **Special case**: MINIMUM_AGE_MONTHS = 0 archives all data before current month.
|
||||||
|
|
||||||
**WHERE Clause**:
|
**WHERE Clause**:
|
||||||
```sql
|
```sql
|
||||||
@@ -132,6 +131,60 @@ WHERE SOURCE_FILE_TYPE = 'INPUT'
|
|||||||
|
|
||||||
**Use Case**: Advanced scenarios requiring both current month retention AND minimum age threshold.
|
**Use Case**: Advanced scenarios requiring both current month retention AND minimum age threshold.
|
||||||
|
|
||||||
|
## Archival Triggering Logic
|
||||||
|
|
||||||
|
### Strategy-Specific Execution Behavior
|
||||||
|
|
||||||
|
The FILE_ARCHIVER package uses **different triggering logic** depending on the configured archival strategy:
|
||||||
|
|
||||||
|
#### MINIMUM_AGE_MONTHS Strategy (Threshold-Independent)
|
||||||
|
|
||||||
|
**Behavior**: Archives data **immediately** when the age criterion is met, **without checking** archival thresholds.
|
||||||
|
|
||||||
|
```sql
|
||||||
|
-- Executed when MINIMUM_AGE_MONTHS strategy is configured
|
||||||
|
IF vSourceFileConfig.ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS' THEN
|
||||||
|
vArchivalTriggeredBy := 'AGE_BASED';
|
||||||
|
-- Proceeds with archival regardless of FILES_COUNT, ROWS_COUNT, or BYTES_SUM
|
||||||
|
END IF;
|
||||||
|
```
|
||||||
|
|
||||||
|
**Why**: This strategy is designed for **strict retention policies** where data **must** be archived based on age alone (e.g., regulatory compliance requiring current month only).
|
||||||
|
|
||||||
|
#### THRESHOLD_BASED and HYBRID Strategies (Threshold-Dependent)
|
||||||
|
|
||||||
|
**Behavior**: Archives data **only when** at least one of the following thresholds is met or exceeded (the checks use `>=`):
|
||||||
|
|
||||||
|
1. **FILES_COUNT_OVER_ARCHIVE_THRESHOLD** - Number of files eligible for archival
|
||||||
|
2. **ROWS_COUNT_OVER_ARCHIVE_THRESHOLD** - Number of rows eligible for archival
|
||||||
|
3. **BYTES_SUM_OVER_ARCHIVE_THRESHOLD** - Total size in bytes eligible for archival
|
||||||
|
|
||||||
|
```sql
|
||||||
|
-- Executed for THRESHOLD_BASED and HYBRID strategies
|
||||||
|
IF vTableStat.OVER_ARCH_THRESOLD_FILE_COUNT >= vSourceFileConfig.FILES_COUNT_OVER_ARCHIVE_THRESHOLD THEN
|
||||||
|
vArchivalTriggeredBy := 'FILES_COUNT';
|
||||||
|
ELSIF vTableStat.OVER_ARCH_THRESOLD_ROW_COUNT >= vSourceFileConfig.ROWS_COUNT_OVER_ARCHIVE_THRESHOLD THEN
|
||||||
|
vArchivalTriggeredBy := 'ROWS_COUNT';
|
||||||
|
ELSIF vTableStat.OVER_ARCH_THRESOLD_SIZE >= vSourceFileConfig.BYTES_SUM_OVER_ARCHIVE_THRESHOLD THEN
|
||||||
|
vArchivalTriggeredBy := 'BYTES_SUM';
|
||||||
|
END IF;
|
||||||
|
```
|
||||||
|
|
||||||
|
**Why**: These strategies provide **performance optimization** by avoiding unnecessary archival operations when data volume is small.
|
||||||
|
|
||||||
|
**Configuration Example**:
|
||||||
|
```sql
|
||||||
|
-- Set archival thresholds for THRESHOLD_BASED strategy
|
||||||
|
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
|
||||||
|
SET FILES_COUNT_OVER_ARCHIVE_THRESHOLD = 10, -- Archive when 10+ files eligible
|
||||||
|
ROWS_COUNT_OVER_ARCHIVE_THRESHOLD = 100000, -- Archive when 100k+ rows eligible
|
||||||
|
BYTES_SUM_OVER_ARCHIVE_THRESHOLD = 104857600 -- Archive when 100MB+ eligible
|
||||||
|
WHERE ARCHIVAL_STRATEGY = 'THRESHOLD_BASED'
|
||||||
|
AND TABLE_ID = 'YOUR_TABLE';
|
||||||
|
```
|
||||||
|
|
||||||
|
**Important**: For **MINIMUM_AGE_MONTHS** strategy, these threshold values are **ignored** - archival proceeds based on age alone.
|
||||||
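The two branches above can be read as a single decision. The following is a minimal, self-contained sketch of that decision, not the package source: the local variables stand in for `vSourceFileConfig` and `vTableStat`, and all sample values are hypothetical. The comparisons use `>=`, as in the snippet above, so a value equal to the threshold also triggers archival.

```sql
-- Illustrative model only; run SET SERVEROUTPUT ON first to see the output.
DECLARE
    vStrategy       VARCHAR2(30) := 'HYBRID';   -- hypothetical sample configuration
    vFilesEligible  NUMBER := 4;                -- hypothetical statistics
    vRowsEligible   NUMBER := 250000;
    vBytesEligible  NUMBER := 52428800;
    vFilesThreshold NUMBER := 10;               -- same values as the configuration example
    vRowsThreshold  NUMBER := 100000;
    vBytesThreshold NUMBER := 104857600;
    vTriggeredBy    VARCHAR2(30);
BEGIN
    IF vStrategy = 'MINIMUM_AGE_MONTHS' THEN
        vTriggeredBy := 'AGE_BASED';            -- thresholds are not consulted
    ELSIF vFilesEligible >= vFilesThreshold THEN
        vTriggeredBy := 'FILES_COUNT';
    ELSIF vRowsEligible >= vRowsThreshold THEN
        vTriggeredBy := 'ROWS_COUNT';
    ELSIF vBytesEligible >= vBytesThreshold THEN
        vTriggeredBy := 'BYTES_SUM';
    END IF;

    DBMS_OUTPUT.PUT_LINE('Triggered by: ' || NVL(vTriggeredBy, 'nothing (archival skipped)'));
END;
/
```

With these sample values the HYBRID configuration is below the file threshold but above the row threshold, so the block reports `ROWS_COUNT`. Changing `vStrategy` to `'MINIMUM_AGE_MONTHS'` reports `AGE_BASED` regardless of the counts.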
|
|
||||||
## Configuration Validation
|
## Configuration Validation
|
||||||
|
|
||||||
### Validation Trigger
|
### Validation Trigger
|
||||||
@@ -158,8 +211,132 @@ WHERE ...;
|
|||||||
-- Error: ORA-20001: Strategy MINIMUM_AGE_MONTHS requires MINIMUM_AGE_MONTHS to be set
|
-- Error: ORA-20001: Strategy MINIMUM_AGE_MONTHS requires MINIMUM_AGE_MONTHS to be set
|
||||||
```
|
```
|
||||||
|
|
||||||
|
## Archival Control Configuration
|
||||||
|
|
||||||
|
### ARCHIVE_ENABLED Column
|
||||||
|
|
||||||
|
Controls whether archival is enabled for a specific table configuration.
|
||||||
|
|
||||||
|
**Column**: `A_SOURCE_FILE_CONFIG.ARCHIVE_ENABLED` (VARCHAR2(1), DEFAULT 'Y')
|
||||||
|
|
||||||
|
**Values**:
|
||||||
|
- `'Y'` (default) - Table is eligible for archival processing
|
||||||
|
- `'N'` - Table is excluded from archival (batch operations skip this config)
|
||||||
|
|
||||||
|
**Use Cases**:
|
||||||
|
- Disable archival for specific tables without removing configuration
|
||||||
|
- Temporarily suspend archival during data migration or troubleshooting
|
||||||
|
- Selective archival in batch operations
|
||||||
|
|
||||||
|
**Configuration Example**:
|
||||||
|
```sql
|
||||||
|
-- Disable archival for specific table
|
||||||
|
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
|
||||||
|
SET ARCHIVE_ENABLED = 'N'
|
||||||
|
WHERE SOURCE_FILE_TYPE = 'INPUT'
|
||||||
|
AND SOURCE_FILE_ID = 'CSDB'
|
||||||
|
AND TABLE_ID = 'CSDB_DEBT';
|
||||||
|
COMMIT;
|
||||||
|
|
||||||
|
-- Re-enable archival
|
||||||
|
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
|
||||||
|
SET ARCHIVE_ENABLED = 'Y'
|
||||||
|
WHERE SOURCE_FILE_TYPE = 'INPUT'
|
||||||
|
AND SOURCE_FILE_ID = 'CSDB'
|
||||||
|
AND TABLE_ID = 'CSDB_DEBT';
|
||||||
|
COMMIT;
|
||||||
|
|
||||||
|
-- Check archival status
|
||||||
|
SELECT
|
||||||
|
SOURCE_FILE_ID,
|
||||||
|
TABLE_ID,
|
||||||
|
ARCHIVE_ENABLED,
|
||||||
|
ARCHIVAL_STRATEGY
|
||||||
|
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
|
||||||
|
WHERE SOURCE_FILE_TYPE = 'INPUT'
|
||||||
|
ORDER BY SOURCE_FILE_ID, TABLE_ID;
|
||||||
|
```
|
||||||
|
|
||||||
|
### KEEP_IN_TRASH Column
|
||||||
|
|
||||||
|
Controls TRASH folder retention policy for archived files.
|
||||||
|
|
||||||
|
**Column**: `A_SOURCE_FILE_CONFIG.KEEP_IN_TRASH` (VARCHAR2(1), DEFAULT 'Y')
|
||||||
|
|
||||||
|
**Values**:
|
||||||
|
- `'Y'` (default) - CSV files kept in TRASH folder after archival (status: ARCHIVED_AND_TRASHED)
|
||||||
|
- `'N'` - CSV files deleted from TRASH folder after archival (status: ARCHIVED_AND_PURGED)
|
||||||
|
|
||||||
|
**Benefits of TRASH Retention (`KEEP_IN_TRASH = 'Y'`)**:
|
||||||
|
- Safety net for rollback if archival issues are discovered
|
||||||
|
- Supports compliance and audit requirements
|
||||||
|
- Enables file restoration via `RESTORE_FILE_FROM_TRASH` procedure
|
||||||
|
|
||||||
|
**Benefits of TRASH Cleanup (`KEEP_IN_TRASH = 'N'`)**:
|
||||||
|
- Reduces storage costs in DATA bucket
|
||||||
|
- Simplifies bucket management
|
||||||
|
- Appropriate for non-critical or test data
|
||||||
|
|
||||||
|
**Configuration Example**:
|
||||||
|
```sql
|
||||||
|
-- Production: Keep files in TRASH (recommended)
|
||||||
|
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
|
||||||
|
SET KEEP_IN_TRASH = 'Y'
|
||||||
|
WHERE SOURCE_FILE_TYPE = 'INPUT'
|
||||||
|
AND SOURCE_FILE_ID = 'LM'
|
||||||
|
AND TABLE_ID LIKE 'LM_%';
|
||||||
|
COMMIT;
|
||||||
|
|
||||||
|
-- Test environment: Cleanup TRASH to save storage
|
||||||
|
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
|
||||||
|
SET KEEP_IN_TRASH = 'N'
|
||||||
|
WHERE SOURCE_FILE_TYPE = 'INPUT'
|
||||||
|
AND SOURCE_FILE_ID = 'TEST_SOURCE';
|
||||||
|
COMMIT;
|
||||||
|
|
||||||
|
-- Bulk configuration by source
|
||||||
|
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
|
||||||
|
SET KEEP_IN_TRASH = 'Y'
|
||||||
|
WHERE SOURCE_FILE_TYPE = 'INPUT'
|
||||||
|
AND SOURCE_FILE_ID IN ('CSDB', 'C2D', 'LM');
|
||||||
|
COMMIT;
|
||||||
|
```
|
||||||
|
|
||||||
## Data Lifecycle Workflow
|
## Data Lifecycle Workflow
|
||||||
|
|
||||||
|
### Status Tracking in A_SOURCE_FILE_RECEIVED
|
||||||
|
|
||||||
|
The FILE_ARCHIVER tracks file lifecycle through the `PROCESSING_STATUS` column in `CT_MRDS.A_SOURCE_FILE_RECEIVED` table:
|
||||||
|
|
||||||
|
**Status Progression**:
|
||||||
|
```
|
||||||
|
INGESTED → ARCHIVED_AND_TRASHED → ARCHIVED_AND_PURGED (optional)
|
||||||
|
                    ↓
|
||||||
|
           INGESTED (via RESTORE_FILE_FROM_TRASH)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Status Descriptions**:
|
||||||
|
- **INGESTED**: File successfully processed through Airflow+DBT, residing in ODS bucket
|
||||||
|
- **ARCHIVED_AND_TRASHED**: File archived to Parquet in ARCHIVE bucket, CSV retained in TRASH folder (DATA bucket)
|
||||||
|
- **ARCHIVED_AND_PURGED**: File archived to Parquet, CSV deleted from TRASH folder (when KEEP_IN_TRASH='N')
|
||||||
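To see how files are currently distributed across these statuses, a simple aggregate over the tracking table is usually enough. This sketch uses only the table and column named above. The filter is limited to the three statuses described in this section, on the assumption that other status values may exist in your environment.

```sql
-- Count files per lifecycle status
SELECT
    PROCESSING_STATUS,
    COUNT(*) AS FILE_COUNT
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
WHERE PROCESSING_STATUS IN ('INGESTED', 'ARCHIVED_AND_TRASHED', 'ARCHIVED_AND_PURGED')
GROUP BY PROCESSING_STATUS
ORDER BY PROCESSING_STATUS;
```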
|
|
||||||
|
**Associated Columns Updated During Archival**:
|
||||||
|
```sql
|
||||||
|
UPDATE CT_MRDS.A_SOURCE_FILE_RECEIVED
|
||||||
|
SET PROCESSING_STATUS = 'ARCHIVED_AND_TRASHED', -- Status change
|
||||||
|
ARCH_PATH = 'archive_directory_prefix/', -- Directory with Parquet files
|
||||||
|
PARTITION_YEAR = 2026, -- Year partition value
|
||||||
|
PARTITION_MONTH = 02 -- Month partition value
|
||||||
|
WHERE SOURCE_FILE_NAME = 'file.csv';
|
||||||
|
```
|
||||||
|
|
||||||
|
**ARCH_PATH Column**: Contains the **directory prefix** (URI) where archived Parquet files are located in the ARCHIVE bucket. Since `DBMS_CLOUD.EXPORT_DATA` may create multiple Parquet files with parallel execution, the system stores the directory location rather than individual filenames.
|
||||||
|
|
||||||
|
**Example ARCH_PATH**:
|
||||||
|
```
|
||||||
|
https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/namespace/b/archive/o/ARCHIVE/LM/STANDING_FACILITIES/PARTITION_YEAR=2026/PARTITION_MONTH=02/
|
||||||
|
```
|
||||||
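Because ARCH_PATH holds a full URI, it can be useful to split off the object-name prefix, for example to filter object names when listing the bucket. The query below is a sketch that assumes the prefix is everything after the first `/o/` segment of the URI, which holds for the example value above. It runs against `dual` and touches no application tables.

```sql
-- Derive the object-name prefix from an ARCH_PATH value (sample value from this document)
SELECT SUBSTR(arch_path, INSTR(arch_path, '/o/') + 3) AS object_prefix
FROM (
    SELECT 'https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/namespace/b/archive/o/ARCHIVE/LM/STANDING_FACILITIES/PARTITION_YEAR=2026/PARTITION_MONTH=02/' AS arch_path
    FROM dual
);
```

For the sample value this returns `ARCHIVE/LM/STANDING_FACILITIES/PARTITION_YEAR=2026/PARTITION_MONTH=02/`.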
|
|
||||||
### Standard File Processing Flow
|
### Standard File Processing Flow
|
||||||
|
|
||||||
```
|
```
|
||||||
@@ -183,9 +360,9 @@ WHERE ...;
|
|||||||
2.1 TRASH Subfolder (DATA Bucket - File Retention)
|
2.1 TRASH Subfolder (DATA Bucket - File Retention)
|
||||||
├─ Located in DATA bucket (e.g., TRASH/LM/TABLE_NAME)
|
├─ Located in DATA bucket (e.g., TRASH/LM/TABLE_NAME)
|
||||||
├─ Stores CSV files after archival to Parquet
|
├─ Stores CSV files after archival to Parquet
|
||||||
├─ Status: ARCHIVED_AND_TRASHED (default retention)
|
├─ Status: ARCHIVED_AND_TRASHED (default, controlled by KEEP_IN_TRASH config)
|
||||||
├─ Enables rollback if archival issues occur
|
├─ Enables rollback if archival issues occur
|
||||||
└─ Optional cleanup: ARCHIVED_AND_PURGED (pKeepInTrash=FALSE)
|
└─ Optional cleanup: ARCHIVED_AND_PURGED (when KEEP_IN_TRASH = 'N')
|
||||||
|
|
||||||
3. ARCHIVE Bucket (Long-term Storage)
|
3. ARCHIVE Bucket (Long-term Storage)
|
||||||
├─ Historical data in Parquet format
|
├─ Historical data in Parquet format
|
||||||
@@ -194,29 +371,48 @@ WHERE ...;
|
|||||||
└─ Optimized for big data analytics (Spark, Hive)
|
└─ Optimized for big data analytics (Spark, Hive)
|
||||||
|
|
||||||
**Key Procedures**:
|
**Key Procedures**:
|
||||||
- `ARCHIVE_TABLE_DATA(pSourceFileConfigKey, pKeepInTrash)` - Main archival procedure using strategy-specific WHERE clause
|
- `ARCHIVE_TABLE_DATA(pSourceFileConfigKey)` - Main archival procedure using strategy-specific WHERE clause
|
||||||
- `pKeepInTrash` (BOOLEAN, DEFAULT TRUE) - Controls TRASH folder retention
|
- TRASH folder retention controlled by `KEEP_IN_TRASH` column in A_SOURCE_FILE_CONFIG
|
||||||
- TRUE: Files kept in TRASH folder for safety and rollback capability (default)
|
- `ARCHIVE_ALL(pSourceFileConfigKey, pSourceKey, pArchiveAll)` - Batch archival with 3-level granularity and error handling
|
||||||
- FALSE: Files deleted from TRASH folder after successful archival
|
- **Level 3 (Highest Priority)**: Single configuration via `pSourceFileConfigKey`
|
||||||
|
- **Level 2 (Medium Priority)**: All configurations for source via `pSourceKey`
|
||||||
|
- **Level 1 (Lowest Priority)**: All configurations system-wide via `pArchiveAll`
|
||||||
|
- **Error Handling**: Continues processing other tables on individual failures
|
||||||
|
- **Filtering**: Respects `ARCHIVE_ENABLED='Y'` (skips disabled configurations)
|
||||||
|
- **Individual TRASH Policy**: Each table's `KEEP_IN_TRASH` setting applied independently
|
||||||
|
- **Summary Reporting**: Returns counts of Archived/Skipped/Failed tables
|
||||||
- `GET_ARCHIVAL_WHERE_CLAUSE` - Returns WHERE clause based on configured strategy
|
- `GET_ARCHIVAL_WHERE_CLAUSE` - Returns WHERE clause based on configured strategy
|
||||||
- `GATHER_TABLE_STAT` - Calculates archival statistics using strategy logic
|
- `GATHER_TABLE_STAT` - Calculates archival statistics using strategy logic
|
||||||
|
- `GATHER_TABLE_STAT_ALL(pSourceFileConfigKey, pSourceKey, pGatherAll)` - Batch statistics with 3-level granularity
|
||||||
|
- `RESTORE_FILE_FROM_TRASH(pSourceFileConfigKey, pSourceKey, pRestoreAll)` - Restore archived files from TRASH
|
||||||
|
- `PURGE_TRASH_FOLDER(pSourceFileConfigKey, pSourceKey, pPurgeAll)` - Purge TRASH folder with 3-level granularity
|
||||||
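One way to read the three priority levels is that the most specific non-NULL parameter decides the scope. The block below illustrates that reading with local variables named after the documented parameters. It is an assumption about how the levels combine, not code taken from the package, and the sample values are hypothetical.

```sql
-- Illustrative model of 3-level scope resolution; run SET SERVEROUTPUT ON first.
DECLARE
    pSourceFileConfigKey NUMBER       := NULL;   -- Level 3 (highest priority)
    pSourceKey           VARCHAR2(30) := 'LM';   -- Level 2
    pArchiveAll          BOOLEAN      := TRUE;   -- Level 1 (lowest priority)
    vScope               VARCHAR2(100);
BEGIN
    IF pSourceFileConfigKey IS NOT NULL THEN
        vScope := 'single configuration ' || pSourceFileConfigKey;
    ELSIF pSourceKey IS NOT NULL THEN
        vScope := 'all configurations for source ' || pSourceKey;
    ELSIF pArchiveAll THEN
        vScope := 'all configurations system-wide';
    ELSE
        vScope := 'nothing selected';
    END IF;

    DBMS_OUTPUT.PUT_LINE('Scope: ' || vScope);
END;
/
```

Here both `pSourceKey` and `pArchiveAll` are set, and the more specific source-level scope wins under this reading.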
|
|
||||||
**Archival Execution**:
|
**Archival Execution**:
|
||||||
```sql
|
```sql
|
||||||
-- Default behavior: Keep files in TRASH folder (ARCHIVED_AND_TRASHED status)
|
-- Single table archival (TRASH retention controlled by KEEP_IN_TRASH config)
|
||||||
BEGIN
|
BEGIN
|
||||||
CT_MRDS.FILE_ARCHIVER.ARCHIVE_TABLE_DATA(
|
CT_MRDS.FILE_ARCHIVER.ARCHIVE_TABLE_DATA(
|
||||||
pSourceFileConfigKey => vSourceFileConfigKey,
|
pSourceFileConfigKey => vSourceFileConfigKey
|
||||||
pKeepInTrash => TRUE -- DEFAULT value
|
|
||||||
);
|
);
|
||||||
END;
|
END;
|
||||||
/
|
/
|
||||||
|
|
||||||
-- Optional: Delete files from TRASH after archival (ARCHIVED_AND_PURGED status)
|
-- Batch archival: All tables for specific source
|
||||||
BEGIN
|
BEGIN
|
||||||
CT_MRDS.FILE_ARCHIVER.ARCHIVE_TABLE_DATA(
|
CT_MRDS.FILE_ARCHIVER.ARCHIVE_ALL(
|
||||||
pSourceFileConfigKey => vSourceFileConfigKey,
|
pSourceFileConfigKey => NULL,
|
||||||
pKeepInTrash => FALSE -- Cleanup TRASH folder
|
pSourceKey => 'LM', -- Archive all LM tables
|
||||||
|
pArchiveAll => FALSE
|
||||||
|
);
|
||||||
|
END;
|
||||||
|
/
|
||||||
|
|
||||||
|
-- Batch archival: All tables system-wide
|
||||||
|
BEGIN
|
||||||
|
CT_MRDS.FILE_ARCHIVER.ARCHIVE_ALL(
|
||||||
|
pSourceFileConfigKey => NULL,
|
||||||
|
pSourceKey => NULL,
|
||||||
|
pArchiveAll => TRUE -- Archive all configured tables
|
||||||
);
|
);
|
||||||
END;
|
END;
|
||||||
/
|
/
|
||||||
@@ -225,10 +421,121 @@ END;
|
|||||||
**Strategy-Based Filtering**:
|
**Strategy-Based Filtering**:
|
||||||
- Package retrieves ARCHIVAL_STRATEGY from A_SOURCE_FILE_CONFIG
|
- Package retrieves ARCHIVAL_STRATEGY from A_SOURCE_FILE_CONFIG
|
||||||
- GET_ARCHIVAL_WHERE_CLAUSE generates appropriate WHERE clause
|
- GET_ARCHIVAL_WHERE_CLAUSE generates appropriate WHERE clause
|
||||||
|
- Only tables with ARCHIVE_ENABLED = 'Y' are processed
|
||||||
- Data matching criteria moved from ODS to ARCHIVE bucket
|
- Data matching criteria moved from ODS to ARCHIVE bucket
|
||||||
- CSV files moved to TRASH subfolder in DATA bucket (ODS/ → TRASH/)
|
- CSV files moved to TRASH subfolder in DATA bucket (ODS/ → TRASH/)
|
||||||
- Parquet format with Hive-style partitioning applied to ARCHIVE bucket
|
- Parquet format with Hive-style partitioning applied to ARCHIVE bucket
|
||||||
- TRASH retention controlled by pKeepInTrash parameter
|
- TRASH retention controlled by KEEP_IN_TRASH column in A_SOURCE_FILE_CONFIG
|
||||||
|
|
||||||
|
### Automatic Rollback Mechanism
|
||||||
|
|
||||||
|
FILE_ARCHIVER implements **automatic rollback** to ensure data integrity if the archival process fails:
|
||||||
|
|
||||||
|
**Process Flow**:
|
||||||
|
1. **Export to ARCHIVE**: Data exported to Parquet format in ARCHIVE bucket
|
||||||
|
2. **Status Update**: A_SOURCE_FILE_RECEIVED records updated to 'ARCHIVED_AND_TRASHED'
|
||||||
|
3. **Move to TRASH**: CSV files moved from ODS to TRASH folder (DATA bucket)
|
||||||
|
4. **Optional Cleanup**: If KEEP_IN_TRASH='N', files deleted from TRASH
|
||||||
|
|
||||||
|
**Automatic Rollback Trigger**:
|
||||||
|
If **any error occurs** during step 3 (Move to TRASH), the system:
|
||||||
|
- **Reverts all files**: Moves successfully processed files from TRASH back to ODS
|
||||||
|
- **Rolls back status**: Resets A_SOURCE_FILE_RECEIVED status to 'INGESTED'
|
||||||
|
- **Logs error**: Records detailed error information in A_PROCESS_LOG
|
||||||
|
- **Raises exception**: Propagates error to calling process
|
||||||
|
|
||||||
|
**Rollback Logic (from code)**:
|
||||||
|
```sql
|
||||||
|
-- If MOVE_FILE_TO_TRASH fails for any file
|
||||||
|
ELSIF vProcessControlStatus = 'MOVE_FILE_TO_TRASH_FAILURE' THEN
|
||||||
|
FOR f in (files already moved to TRASH) LOOP
|
||||||
|
-- Move file back from TRASH to ODS
|
||||||
|
DBMS_CLOUD.MOVE_OBJECT(
|
||||||
|
source_object_uri => 'TRASH/.../filename',
|
||||||
|
target_object_uri => 'ODS/.../filename'
|
||||||
|
);
|
||||||
|
|
||||||
|
-- Revert status back to INGESTED
|
||||||
|
UPDATE A_SOURCE_FILE_RECEIVED
|
||||||
|
SET PROCESSING_STATUS = 'INGESTED'
|
||||||
|
WHERE source_file_name = f.filename;
|
||||||
|
END LOOP;
|
||||||
|
END IF;
|
||||||
|
```
|
||||||
|
|
||||||
|
**Why This Matters**: Ensures **all-or-nothing** archival - either all files for a YEAR_MONTH partition are successfully archived, or **none** are (maintains data consistency).
|
||||||
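The all-or-nothing behavior described above is a compensation pattern: remember what was moved, and undo those moves in reverse order if a later move fails. The block below is a self-contained illustration of that pattern only. The file names and the simulated failure are hypothetical, and the undo step just prints what the real rollback would move back to ODS.

```sql
-- Illustrative compensation pattern; run SET SERVEROUTPUT ON first.
DECLARE
    TYPE tNames IS TABLE OF VARCHAR2(100);
    vFiles tNames := tNames('a.csv', 'b.csv', 'c.csv');  -- hypothetical file names
    vMoved tNames := tNames();                           -- files moved so far
BEGIN
    FOR i IN 1 .. vFiles.COUNT LOOP
        -- Simulate a failure while moving the third file
        IF vFiles(i) = 'c.csv' THEN
            RAISE_APPLICATION_ERROR(-20999, 'simulated move failure for ' || vFiles(i));
        END IF;
        vMoved.EXTEND;
        vMoved(vMoved.COUNT) := vFiles(i);
    END LOOP;
EXCEPTION
    WHEN OTHERS THEN
        -- Undo the successful moves in reverse order, then propagate the error
        FOR i IN REVERSE 1 .. vMoved.COUNT LOOP
            DBMS_OUTPUT.PUT_LINE('undo move: ' || vMoved(i));
        END LOOP;
        RAISE;
END;
/
```

The block ends with the simulated ORA-20999 error after printing the undo steps for `b.csv` and `a.csv`, which mirrors the documented behavior of reverting files and then raising the exception to the caller.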
|
|
||||||
|
### TRASH Management Procedures
|
||||||
|
|
||||||
|
#### RESTORE_FILE_FROM_TRASH
|
||||||
|
|
||||||
|
Restores files from TRASH folder back to ODS with **3-level granularity**:
|
||||||
|
|
||||||
|
**Level 3 (Highest Priority)** - Single File Restore:
|
||||||
|
```sql
|
||||||
|
-- Restore specific file by A_SOURCE_FILE_RECEIVED_KEY
|
||||||
|
CALL FILE_ARCHIVER.RESTORE_FILE_FROM_TRASH(
|
||||||
|
pSourceFileReceivedKey => 12345
|
||||||
|
);
|
||||||
|
```
|
||||||
|
|
||||||
|
**Level 2 (Medium Priority)** - Configuration-Based Restore:
|
||||||
|
```sql
|
||||||
|
-- Restore all files for specific table configuration
|
||||||
|
CALL FILE_ARCHIVER.RESTORE_FILE_FROM_TRASH(
|
||||||
|
pSourceFileConfigKey => 341
|
||||||
|
);
|
||||||
|
```
|
||||||
|
|
||||||
|
**Level 1 (Lowest Priority)** - Global Restore:
|
||||||
|
```sql
|
||||||
|
-- Restore ALL files with ARCHIVED_AND_TRASHED status system-wide
|
||||||
|
CALL FILE_ARCHIVER.RESTORE_FILE_FROM_TRASH(
|
||||||
|
pRestoreAll => TRUE
|
||||||
|
);
|
||||||
|
```
|
||||||
|
|
||||||
|
**Restore Operations**:
|
||||||
|
- **Moves files**: TRASH folder → ODS folder (using DBMS_CLOUD.MOVE_OBJECT)
|
||||||
|
- **Updates status**: ARCHIVED_AND_TRASHED → INGESTED
|
||||||
|
- **Clears metadata**: Sets ARCH_PATH, PARTITION_YEAR, PARTITION_MONTH to NULL
|
||||||
|
- **Returns files to active processing**: Makes data available for Airflow+DBT pipeline
|
||||||
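After a restore, a quick consistency check can confirm that the archival metadata was cleared as described. This sketch relies only on columns mentioned in this document and should return no rows if every restored file was fully reset.

```sql
-- Files back in INGESTED status that still carry archival metadata
SELECT
    A_SOURCE_FILE_RECEIVED_KEY,
    SOURCE_FILE_NAME,
    ARCH_PATH,
    PARTITION_YEAR,
    PARTITION_MONTH
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
WHERE PROCESSING_STATUS = 'INGESTED'
  AND (ARCH_PATH IS NOT NULL
       OR PARTITION_YEAR IS NOT NULL
       OR PARTITION_MONTH IS NOT NULL);
```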
|
|
||||||
|
#### PURGE_TRASH_FOLDER
|
||||||
|
|
||||||
|
Permanently deletes files from TRASH with **3-level granularity**:
|
||||||
|
|
||||||
|
**Level 3 (Highest Priority)** - Single File Purge:
|
||||||
|
```sql
|
||||||
|
-- Delete specific file from TRASH
|
||||||
|
CALL FILE_ARCHIVER.PURGE_TRASH_FOLDER(
|
||||||
|
pSourceFileReceivedKey => 12345
|
||||||
|
);
|
||||||
|
```
|
||||||
|
|
||||||
|
**Level 2 (Medium Priority)** - Configuration-Based Purge:
|
||||||
|
```sql
|
||||||
|
-- Delete all TRASH files for specific table configuration
|
||||||
|
CALL FILE_ARCHIVER.PURGE_TRASH_FOLDER(
|
||||||
|
pSourceFileConfigKey => 341
|
||||||
|
);
|
||||||
|
```
|
||||||
|
|
||||||
|
**Level 1 (Lowest Priority)** - Global Purge:
|
||||||
|
```sql
|
||||||
|
-- Delete ALL files with ARCHIVED_AND_TRASHED status system-wide
|
||||||
|
CALL FILE_ARCHIVER.PURGE_TRASH_FOLDER(
|
||||||
|
pPurgeAll => TRUE
|
||||||
|
);
|
||||||
|
```
|
||||||
|
|
||||||
|
**Purge Operations**:
|
||||||
|
- **Deletes files**: Permanently removes from TRASH folder (using DBMS_CLOUD.DELETE_OBJECT)
|
||||||
|
- **Updates status**: ARCHIVED_AND_TRASHED → ARCHIVED_AND_PURGED
|
||||||
|
- **Warning**: **Irreversible operation** - files cannot be restored after purge
|
||||||
|
- **Use case**: Storage optimization, compliance with data retention policies
|
||||||
|
|
||||||
|
**Important**: Purge is **not automatic** - must be explicitly called. This provides additional safety layer for data retention.
|
||||||
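Since a purge cannot be undone, it may be worth checking what it would affect first. The query below is a sketch that summarizes files still in `ARCHIVED_AND_TRASHED` status per configuration, using the `FILE_SIZE_BYTES` column that the troubleshooting queries in this document also reference.

```sql
-- Inventory of files a purge could remove, grouped by configuration
SELECT
    A_SOURCE_FILE_CONFIG_KEY,
    COUNT(*)             AS FILES_IN_TRASH,
    SUM(FILE_SIZE_BYTES) AS BYTES_IN_TRASH
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
WHERE PROCESSING_STATUS = 'ARCHIVED_AND_TRASHED'
GROUP BY A_SOURCE_FILE_CONFIG_KEY
ORDER BY BYTES_IN_TRASH DESC;
```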
|
|
||||||
## Configuration Examples
|
## Configuration Examples
|
||||||
|
|
||||||
@@ -335,6 +642,56 @@ GROUP BY ARCHIVAL_STRATEGY
|
|||||||
ORDER BY ARCHIVAL_STRATEGY;
|
ORDER BY ARCHIVAL_STRATEGY;
|
||||||
```
|
```
|
||||||
|
|
||||||
|
### Example 5: Configure Archival Control Settings
|
||||||
|
|
||||||
|
```sql
|
||||||
|
-- Complete configuration with all archival settings
|
||||||
|
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
|
||||||
|
SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
|
||||||
|
MINIMUM_AGE_MONTHS = 6,
|
||||||
|
ARCHIVE_ENABLED = 'Y', -- Enable archival
|
||||||
|
KEEP_IN_TRASH = 'Y' -- Keep files in TRASH for safety
|
||||||
|
WHERE SOURCE_FILE_TYPE = 'INPUT'
|
||||||
|
AND SOURCE_FILE_ID = 'CSDB'
|
||||||
|
AND TABLE_ID = 'CSDB_DEBT';
|
||||||
|
COMMIT;
|
||||||
|
|
||||||
|
-- Disable archival temporarily for troubleshooting
|
||||||
|
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
|
||||||
|
SET ARCHIVE_ENABLED = 'N' -- Batch operations will skip this table
|
||||||
|
WHERE TABLE_ID = 'CSDB_DEBT';
|
||||||
|
COMMIT;
|
||||||
|
|
||||||
|
-- Configure TRASH cleanup for test environment
|
||||||
|
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
|
||||||
|
SET KEEP_IN_TRASH = 'N' -- Delete files from TRASH after archival
|
||||||
|
WHERE SOURCE_FILE_TYPE = 'INPUT'
|
||||||
|
AND SOURCE_FILE_ID = 'TEST_SOURCE';
|
||||||
|
COMMIT;
|
||||||
|
|
||||||
|
-- View complete configuration
|
||||||
|
SELECT
|
||||||
|
SOURCE_FILE_ID,
|
||||||
|
TABLE_ID,
|
||||||
|
ARCHIVAL_STRATEGY,
|
||||||
|
MINIMUM_AGE_MONTHS,
|
||||||
|
ARCHIVE_ENABLED,
|
||||||
|
KEEP_IN_TRASH
|
||||||
|
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
|
||||||
|
WHERE SOURCE_FILE_TYPE = 'INPUT'
|
||||||
|
ORDER BY SOURCE_FILE_ID, TABLE_ID;
|
||||||
|
|
||||||
|
-- Summary by archival status
|
||||||
|
SELECT
|
||||||
|
ARCHIVE_ENABLED,
|
||||||
|
KEEP_IN_TRASH,
|
||||||
|
COUNT(*) AS TABLE_COUNT
|
||||||
|
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
|
||||||
|
WHERE SOURCE_FILE_TYPE = 'INPUT'
|
||||||
|
GROUP BY ARCHIVE_ENABLED, KEEP_IN_TRASH
|
||||||
|
ORDER BY ARCHIVE_ENABLED DESC, KEEP_IN_TRASH DESC;
|
||||||
|
```
|
||||||
|
|
||||||
## Release 01 Configuration
|
## Release 01 Configuration
|
||||||
|
|
||||||
### Configured Tables (MARS-828)
|
### Configured Tables (MARS-828)
|
||||||
@@ -425,7 +782,180 @@ SET ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS',
|
|||||||
WHERE ...;
|
WHERE ...;
|
||||||
```
|
```
|
||||||
|
|
||||||
#### Issue 2: Archival Not Working as Expected
|
#### Issue 2: Archival Not Triggering Despite Configuration
|
||||||
|
|
||||||
|
**Scenario A**: **MINIMUM_AGE_MONTHS** strategy not archiving
|
||||||
|
```sql
|
||||||
|
-- Check files that should be archived
|
||||||
|
SELECT
|
||||||
|
SFR.A_SOURCE_FILE_RECEIVED_KEY,
|
||||||
|
SFR.SOURCE_FILE_NAME,
|
||||||
|
SFR.PROCESSING_STATUS,
|
||||||
|
LH.LOAD_START,
|
||||||
|
TRUNC(MONTHS_BETWEEN(SYSDATE, LH.LOAD_START)) AS MONTHS_AGE,
|
||||||
|
SFC.MINIMUM_AGE_MONTHS AS THRESHOLD
|
||||||
|
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED SFR
|
||||||
|
JOIN CT_ODS.A_LOAD_HISTORY LH ON SFR.A_WORKFLOW_HISTORY_KEY = LH.A_WORKFLOW_HISTORY_KEY
|
||||||
|
JOIN CT_MRDS.A_SOURCE_FILE_CONFIG SFC ON SFR.A_SOURCE_FILE_CONFIG_KEY = SFC.A_SOURCE_FILE_CONFIG_KEY
|
||||||
|
WHERE SFC.ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS'
|
||||||
|
AND SFR.PROCESSING_STATUS = 'INGESTED'
|
||||||
|
AND SFC.ARCHIVE_ENABLED = 'Y'
|
||||||
|
ORDER BY LH.LOAD_START;
|
||||||
|
|
||||||
|
-- Note: MINIMUM_AGE_MONTHS archives immediately (threshold-independent)
|
||||||
|
-- If files not archived, check ARCHIVE_ENABLED='Y' and run ARCHIVE_TABLE_DATA
|
||||||
|
```
|
||||||
|
|
||||||
|
**Scenario B**: **THRESHOLD_BASED** or **HYBRID** strategy not archiving
|
||||||
|
```sql
|
||||||
|
-- Check if threshold reached for specific configuration
|
||||||
|
SELECT
|
||||||
|
SFC.SOURCE_FILE_ID,
|
||||||
|
SFC.TABLE_ID,
|
||||||
|
SFC.ARCHIVAL_STRATEGY,
|
||||||
|
SFC.FILES_COUNT_OVER_ARCHIVE_THRESHOLD AS FILE_THRESHOLD,
|
||||||
|
SFC.ROWS_COUNT_OVER_ARCHIVE_THRESHOLD AS ROW_THRESHOLD,
|
||||||
|
SFC.BYTES_SUM_OVER_ARCHIVE_THRESHOLD AS BYTE_THRESHOLD,
|
||||||
|
COUNT(SFR.A_SOURCE_FILE_RECEIVED_KEY) AS CURRENT_FILES,
|
||||||
|
SUM(SFR.TOTAL_RECORDS) AS CURRENT_ROWS,
|
||||||
|
SUM(SFR.FILE_SIZE_BYTES) AS CURRENT_BYTES
|
||||||
|
FROM CT_MRDS.A_SOURCE_FILE_CONFIG SFC
|
||||||
|
LEFT JOIN CT_MRDS.A_SOURCE_FILE_RECEIVED SFR
|
||||||
|
ON SFC.A_SOURCE_FILE_CONFIG_KEY = SFR.A_SOURCE_FILE_CONFIG_KEY
|
||||||
|
AND SFR.PROCESSING_STATUS = 'INGESTED'
|
||||||
|
WHERE SFC.ARCHIVAL_STRATEGY IN ('THRESHOLD_BASED', 'HYBRID')
|
||||||
|
AND SFC.ARCHIVE_ENABLED = 'Y'
|
||||||
|
AND SFC.A_SOURCE_FILE_CONFIG_KEY = :yourConfigKey
|
||||||
|
GROUP BY
|
||||||
|
SFC.SOURCE_FILE_ID, SFC.TABLE_ID, SFC.ARCHIVAL_STRATEGY,
|
||||||
|
SFC.FILES_COUNT_OVER_ARCHIVE_THRESHOLD,
|
||||||
|
SFC.ROWS_COUNT_OVER_ARCHIVE_THRESHOLD,
|
||||||
|
SFC.BYTES_SUM_OVER_ARCHIVE_THRESHOLD;
|
||||||
|
|
||||||
|
-- Expected: At least ONE threshold (FILE/ROW/BYTE) must be met or exceeded
|
||||||
|
-- If no threshold exceeded, archival will NOT trigger (threshold-dependent behavior)
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Issue 3: ARCH_PATH Contains Directory Not Filename
|
||||||
|
|
||||||
|
**Symptoms**: A_SOURCE_FILE_RECEIVED.ARCH_PATH shows a folder path instead of a specific file
|
||||||
|
|
||||||
|
**Explanation**: This is **expected behavior**:
|
||||||
|
```sql
|
||||||
|
-- Example ARCH_PATH value
|
||||||
|
SELECT ARCH_PATH
|
||||||
|
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
|
||||||
|
WHERE PROCESSING_STATUS = 'ARCHIVED_AND_TRASHED'
|
||||||
|
AND ROWNUM = 1;
|
||||||
|
|
||||||
|
-- Result (example):
|
||||||
|
-- https://objectstorage.../ARCHIVE/LM/STANDING_FACILITIES/PARTITION_YEAR=2026/PARTITION_MONTH=02/
|
||||||
|
|
||||||
|
-- Reason: DBMS_CLOUD.EXPORT_DATA with parallel execution creates multiple Parquet files:
|
||||||
|
-- - STANDING_FACILITIES_part_00001.parquet
|
||||||
|
-- - STANDING_FACILITIES_part_00002.parquet
|
||||||
|
-- - ...
|
||||||
|
-- System stores directory prefix to track ALL generated files
|
||||||
|
```
|
||||||
|
|
||||||
|
**To List Actual Parquet Files**:
|
||||||
|
```sql
|
||||||
|
-- Use DBMS_CLOUD.LIST_OBJECTS with ARCH_PATH as prefix
|
||||||
|
SELECT object_name, bytes, created
|
||||||
|
FROM TABLE(DBMS_CLOUD.LIST_OBJECTS(
|
||||||
|
credential_name => 'OCI$RESOURCE_PRINCIPAL',
|
||||||
|
location_uri => 'https://objectstorage.../b/archive/o/'
|
||||||
|
))
|
||||||
|
WHERE object_name LIKE 'ARCHIVE/LM/STANDING_FACILITIES/PARTITION_YEAR=2026/PARTITION_MONTH=02/%';
|
||||||
|
```

#### Issue 4: Files Remain in TRASH Folder

**Symptoms**: Files not deleted from TRASH after archival

**Cause**: Configuration has `KEEP_IN_TRASH='Y'` (retain files in TRASH)

**Verification**:
```sql
-- Check TRASH policy for configuration
SELECT
    SOURCE_FILE_ID,
    TABLE_ID,
    KEEP_IN_TRASH,
    CASE KEEP_IN_TRASH
        WHEN 'Y' THEN 'Files RETAINED in TRASH (manual purge required)'
        WHEN 'N' THEN 'Files DELETED immediately after archival'
    END AS TRASH_BEHAVIOR
FROM CT_MRDS.A_SOURCE_FILE_CONFIG
WHERE TABLE_ID = 'YOUR_TABLE';
```

**Solutions**:
```sql
-- Option A: Change configuration to auto-delete (permanent change)
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET KEEP_IN_TRASH = 'N' -- Auto-delete from TRASH after archival
WHERE TABLE_ID = 'YOUR_TABLE';
COMMIT;

-- Option B: Manually purge TRASH for specific table (one-time action)
BEGIN
    CT_MRDS.FILE_ARCHIVER.PURGE_TRASH_FOLDER(
        pSourceFileConfigKey => :yourConfigKey
    );
END;
/

-- Option C: Purge all TRASH system-wide (use with caution)
BEGIN
    CT_MRDS.FILE_ARCHIVER.PURGE_TRASH_FOLDER(
        pPurgeAll => TRUE
    );
END;
/
```
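
Before running Option B or Option C, it can help to preview which files are currently tracked as sitting in TRASH. The sketch below assumes that PURGE_TRASH_FOLDER acts on files in ARCHIVED_AND_TRASHED status (the status this document uses for files retained in TRASH); `:yourConfigKey` is the same bind variable as in Option B.

```sql
-- Preview the files tracked in TRASH for one configuration before purging.
-- Assumption: these are the rows PURGE_TRASH_FOLDER would act on.
SELECT
    A_SOURCE_FILE_RECEIVED_KEY,
    SOURCE_FILE_NAME,
    UPDATED_AT
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
WHERE A_SOURCE_FILE_CONFIG_KEY = :yourConfigKey
  AND PROCESSING_STATUS = 'ARCHIVED_AND_TRASHED'
ORDER BY UPDATED_AT;
```

Running the same query again after the purge should return no rows if the purge covered every file for that configuration.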

#### Issue 5: Automatic Rollback Occurred

**Symptoms**: Files unexpectedly back in INGESTED status, archival process reported failure

**Cause**: Error during "Move to TRASH" step triggered automatic rollback

**Investigation**:
```sql
-- Check process logs for rollback events
SELECT
    PROCESS_LOG_KEY,
    LOG_LEVEL,
    LOG_MESSAGE,
    PARAMETERS,
    LOG_TIMESTAMP
FROM CT_MRDS.A_PROCESS_LOG
WHERE PROCESS_NAME = 'ARCHIVE_TABLE_DATA'
  -- Parentheses keep both LIKE conditions scoped to ARCHIVE_TABLE_DATA
  AND (LOG_MESSAGE LIKE '%rollback%' OR LOG_MESSAGE LIKE '%MOVE_FILE_TO_TRASH_FAILURE%')
ORDER BY LOG_TIMESTAMP DESC
FETCH FIRST 10 ROWS ONLY;

-- Check files that were rolled back
SELECT
    A_SOURCE_FILE_RECEIVED_KEY,
    SOURCE_FILE_NAME,
    PROCESSING_STATUS, -- Should be INGESTED after rollback
    ARCH_PATH,         -- Should be NULL after rollback
    PARTITION_YEAR,    -- Should be NULL after rollback
    PARTITION_MONTH    -- Should be NULL after rollback
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
WHERE A_SOURCE_FILE_CONFIG_KEY = :yourConfigKey
  AND UPDATED_AT > SYSDATE - 1 -- Last 24 hours
ORDER BY UPDATED_AT DESC;
```

**Resolution**:
1. **Investigate root cause**: Check error messages in A_PROCESS_LOG
2. **Fix underlying issue**: OCI permissions, bucket access, wrong credentials, etc.
3. **Re-run archival**: Call ARCHIVE_TABLE_DATA again after fix
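
Once the archival has been re-run, a status count for the affected configuration gives a quick confirmation of the outcome. This is a sketch built only from the columns used in the investigation queries; which archived status appears depends on the configured KEEP_IN_TRASH policy.

```sql
-- Status breakdown for one configuration after re-running the archival.
SELECT
    PROCESSING_STATUS,
    COUNT(*)        AS FILE_COUNT,
    MAX(UPDATED_AT) AS LAST_UPDATE
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
WHERE A_SOURCE_FILE_CONFIG_KEY = :yourConfigKey
GROUP BY PROCESSING_STATUS
ORDER BY PROCESSING_STATUS;
```

Files still in INGESTED after the re-run were either not eligible under the configured strategy or were rolled back again; the A_PROCESS_LOG query from the investigation step can tell the two cases apart.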

#### Issue 6: Archival Not Working as Expected

**Symptoms**: Data not being archived according to strategy

@@ -495,9 +1025,156 @@ FROM user_objects
WHERE object_name = 'FILE_ARCHIVER';
```

### Diagnostic Queries for Monitoring

#### Query 1: Status Distribution Across All Files

```sql
-- Overall file status distribution
SELECT
    PROCESSING_STATUS,
    COUNT(*) AS FILE_COUNT,
    ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (), 2) AS PERCENTAGE,
    MIN(CREATED_AT) AS OLDEST_FILE,
    MAX(CREATED_AT) AS NEWEST_FILE
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED
GROUP BY PROCESSING_STATUS
ORDER BY FILE_COUNT DESC;
```

#### Query 2: Files in TRASH (Archived but Not Purged)

```sql
-- Files currently in TRASH folder (status ARCHIVED_AND_TRASHED)
SELECT
    SFR.A_SOURCE_FILE_RECEIVED_KEY,
    SFC.SOURCE_FILE_ID,
    SFC.TABLE_ID,
    SFR.SOURCE_FILE_NAME,
    SFR.ARCH_PATH,
    SFR.PARTITION_YEAR,
    SFR.PARTITION_MONTH,
    SFR.FILE_SIZE_BYTES,
    SFR.UPDATED_AT AS ARCHIVED_AT,
    TRUNC(SYSDATE - SFR.UPDATED_AT) AS DAYS_IN_TRASH,
    SFC.KEEP_IN_TRASH AS TRASH_POLICY
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED SFR
JOIN CT_MRDS.A_SOURCE_FILE_CONFIG SFC ON SFR.A_SOURCE_FILE_CONFIG_KEY = SFC.A_SOURCE_FILE_CONFIG_KEY
WHERE SFR.PROCESSING_STATUS = 'ARCHIVED_AND_TRASHED'
ORDER BY SFR.UPDATED_AT DESC;
```

#### Query 3: Archival Activity by Configuration

```sql
-- Archival statistics per table configuration
SELECT
    SFC.SOURCE_FILE_ID,
    SFC.TABLE_ID,
    SFC.ARCHIVAL_STRATEGY,
    SFC.ARCHIVE_ENABLED,
    SFC.KEEP_IN_TRASH,
    COUNT(CASE WHEN SFR.PROCESSING_STATUS = 'INGESTED' THEN 1 END) AS PENDING_ARCHIVE,
    COUNT(CASE WHEN SFR.PROCESSING_STATUS = 'ARCHIVED_AND_TRASHED' THEN 1 END) AS IN_TRASH,
    COUNT(CASE WHEN SFR.PROCESSING_STATUS = 'ARCHIVED_AND_PURGED' THEN 1 END) AS PURGED,
    -- Oracle has no FILTER clause on aggregates; a CASE expression gives the same result
    MAX(CASE WHEN SFR.PROCESSING_STATUS LIKE 'ARCHIVED%' THEN SFR.UPDATED_AT END) AS LAST_ARCHIVAL
FROM CT_MRDS.A_SOURCE_FILE_CONFIG SFC
LEFT JOIN CT_MRDS.A_SOURCE_FILE_RECEIVED SFR ON SFC.A_SOURCE_FILE_CONFIG_KEY = SFR.A_SOURCE_FILE_CONFIG_KEY
WHERE SFC.SOURCE_FILE_TYPE = 'INPUT'
GROUP BY
    SFC.SOURCE_FILE_ID, SFC.TABLE_ID, SFC.ARCHIVAL_STRATEGY,
    SFC.ARCHIVE_ENABLED, SFC.KEEP_IN_TRASH
ORDER BY SFC.SOURCE_FILE_ID, SFC.TABLE_ID;
```

#### Query 4: Files Eligible for Archival (MINIMUM_AGE_MONTHS)

```sql
-- Files that should be archived based on MINIMUM_AGE_MONTHS strategy
SELECT
    SFC.SOURCE_FILE_ID,
    SFC.TABLE_ID,
    SFC.MINIMUM_AGE_MONTHS AS AGE_THRESHOLD,
    COUNT(*) AS ELIGIBLE_FILES,
    SUM(SFR.FILE_SIZE_BYTES) AS TOTAL_SIZE_BYTES,
    SUM(SFR.TOTAL_RECORDS) AS TOTAL_ROWS,
    MIN(LH.LOAD_START) AS OLDEST_FILE,
    MAX(LH.LOAD_START) AS NEWEST_ELIGIBLE
FROM CT_MRDS.A_SOURCE_FILE_CONFIG SFC
JOIN CT_MRDS.A_SOURCE_FILE_RECEIVED SFR ON SFC.A_SOURCE_FILE_CONFIG_KEY = SFR.A_SOURCE_FILE_CONFIG_KEY
JOIN CT_ODS.A_LOAD_HISTORY LH ON SFR.A_WORKFLOW_HISTORY_KEY = LH.A_WORKFLOW_HISTORY_KEY
WHERE SFC.ARCHIVAL_STRATEGY = 'MINIMUM_AGE_MONTHS'
  AND SFC.ARCHIVE_ENABLED = 'Y'
  AND SFR.PROCESSING_STATUS = 'INGESTED'
  AND LH.LOAD_START < ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -SFC.MINIMUM_AGE_MONTHS)
GROUP BY SFC.SOURCE_FILE_ID, SFC.TABLE_ID, SFC.MINIMUM_AGE_MONTHS
ORDER BY ELIGIBLE_FILES DESC;
```
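
The eligibility cutoff in Query 4 is the first day of the current month shifted back by MINIMUM_AGE_MONTHS. The sketch below evaluates the same expression against a fixed reference date (2026-02-11, chosen only for illustration) instead of SYSDATE, for the values 0, 1 and 6; the inline view and its AGE_MONTHS column are illustrative and not part of the schema.

```sql
-- Same cutoff expression as Query 4, with a fixed date in place of SYSDATE.
SELECT
    v.AGE_MONTHS AS MINIMUM_AGE_MONTHS,
    ADD_MONTHS(TRUNC(DATE '2026-02-11', 'MM'), -v.AGE_MONTHS) AS ARCHIVE_CUTOFF
FROM (
    SELECT 0 AS AGE_MONTHS FROM DUAL UNION ALL
    SELECT 1 FROM DUAL UNION ALL
    SELECT 6 FROM DUAL
) v
ORDER BY v.AGE_MONTHS;
-- Cutoffs: 0 -> 2026-02-01, 1 -> 2026-01-01, 6 -> 2025-08-01
```

Files loaded before the cutoff are eligible, so with a value of 0 everything loaded before the start of the current month qualifies and only the current month is kept.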

#### Query 5: Archival Performance Metrics

```sql
-- Recent archival operations with timing
SELECT
    PROCESS_LOG_KEY,
    SUBSTR(PARAMETERS, 1, 100) AS CONFIG_INFO,
    LOG_TIMESTAMP AS START_TIME,
    LEAD(LOG_TIMESTAMP) OVER (PARTITION BY SUBSTR(PARAMETERS, 1, 100) ORDER BY LOG_TIMESTAMP) AS END_TIME,
    ROUND((LEAD(LOG_TIMESTAMP) OVER (PARTITION BY SUBSTR(PARAMETERS, 1, 100) ORDER BY LOG_TIMESTAMP)
        - LOG_TIMESTAMP) * 24 * 60, 2) AS DURATION_MINUTES,
    CASE
        WHEN LOG_LEVEL = 'ERROR' THEN 'FAILED'
        WHEN LOG_MESSAGE LIKE '%Archival completed%' THEN 'SUCCESS'
        ELSE 'IN_PROGRESS'
    END AS STATUS
FROM CT_MRDS.A_PROCESS_LOG
WHERE PROCESS_NAME = 'ARCHIVE_TABLE_DATA'
  AND LOG_TIMESTAMP > SYSDATE - 7 -- Last 7 days
ORDER BY LOG_TIMESTAMP DESC;
```
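
The duration arithmetic in Query 5 multiplies a day fraction by 24 * 60, which works when LOG_TIMESTAMP is a DATE. If the column is a TIMESTAMP (its type is not stated here, so this is an assumption), subtracting two values yields an INTERVAL DAY TO SECOND, and ROUND cannot be applied to an interval. One possible workaround is to cast both operands to DATE before subtracting, as in this stand-alone sketch with literal timestamps:

```sql
-- Stand-alone illustration; in Query 5 the two operands would be
-- the LEAD(LOG_TIMESTAMP) OVER (...) expression and LOG_TIMESTAMP.
SELECT
    ROUND(
        (CAST(TIMESTAMP '2026-02-11 10:30:00' AS DATE)
         - CAST(TIMESTAMP '2026-02-11 10:00:00' AS DATE)) * 24 * 60,
        2
    ) AS DURATION_MINUTES
FROM DUAL;
-- DURATION_MINUTES = 30
```

Casting to DATE drops fractional seconds, which is usually acceptable for durations reported in minutes.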

#### Query 6: TRASH Storage Usage

```sql
-- Estimate TRASH folder storage usage
SELECT
    SFC.SOURCE_FILE_ID,
    COUNT(*) AS FILES_IN_TRASH,
    ROUND(SUM(SFR.FILE_SIZE_BYTES) / 1024 / 1024 / 1024, 2) AS SIZE_GB,
    MIN(SFR.UPDATED_AT) AS OLDEST_IN_TRASH,
    MAX(SFR.UPDATED_AT) AS NEWEST_IN_TRASH,
    SFC.KEEP_IN_TRASH AS POLICY
FROM CT_MRDS.A_SOURCE_FILE_RECEIVED SFR
JOIN CT_MRDS.A_SOURCE_FILE_CONFIG SFC ON SFR.A_SOURCE_FILE_CONFIG_KEY = SFC.A_SOURCE_FILE_CONFIG_KEY
WHERE SFR.PROCESSING_STATUS = 'ARCHIVED_AND_TRASHED'
GROUP BY SFC.SOURCE_FILE_ID, SFC.KEEP_IN_TRASH
ORDER BY SIZE_GB DESC;
```

## Version History

### v3.1.0 (Current - 2026-02-05)
|
### v3.3.0 (Current - 2026-02-11)
- **BREAKING CHANGE**: Removed `pKeepInTrash` parameter from ARCHIVE_TABLE_DATA
- Added `ARCHIVE_ENABLED` column to A_SOURCE_FILE_CONFIG for selective archiving control
- Added `KEEP_IN_TRASH` column to A_SOURCE_FILE_CONFIG (replaces pKeepInTrash parameter)
- Added batch procedures with 3-level granularity (config/source/all):
- ARCHIVE_ALL - Batch archival procedure
- GATHER_TABLE_STAT_ALL - Batch statistics procedure
- RESTORE_FILE_FROM_TRASH - Restore files from TRASH folder
- PURGE_TRASH_FOLDER - Purge TRASH folder files
- TRASH retention now configuration-based instead of parameter-based
- Enhanced flexibility for archival orchestration and monitoring

### v3.2.1 (2026-02-10)
- Fixed critical bug: Status update ARCHIVED → ARCHIVED_AND_TRASHED when moving files to TRASH folder
- Ensures proper status tracking for files retained in TRASH

### v3.2.0 (2026-02-06)
- Added `pKeepInTrash` parameter (DEFAULT TRUE) to ARCHIVE_TABLE_DATA
- TRASH folder retention control for safety and compliance
- Files kept in TRASH subfolder by default for rollback capability

### v3.1.0 (2026-02-05)
- **BREAKING CHANGE**: Removed CURRENT_MONTH_ONLY strategy (replaced by MINIMUM_AGE_MONTHS = 0)
- Mathematical equivalence: CURRENT_MONTH_ONLY ≡ MINIMUM_AGE_MONTHS = 0
- Updated trigger validation to allow MINIMUM_AGE_MONTHS >= 0 (previously >= 1)
@@ -567,9 +1244,7 @@ WHERE object_name = 'FILE_ARCHIVER';
- Example: CSDB securities data (MINIMUM_AGE_MONTHS = 6)

2. **Use THRESHOLD_BASED when**:
- Maintaining backward compatibility with legacy behavior
- Simple time-based archival is sufficient
- Migration from FILE_ARCHIVER v2.0.0

3. **Use HYBRID when**:
- Complex retention requirements
@@ -632,18 +1307,30 @@ WHERE object_name = 'FILE_ARCHIVER';

### TRASH Folder Retention Best Practices

1. **Default Behavior (pKeepInTrash = TRUE - Recommended)**:
|
1. **Default Behavior (KEEP_IN_TRASH = 'Y' - Recommended)**:
- Keeps CSV files in TRASH folder after archival
- Provides safety net for rollback if archival issues occur
- Supports compliance and audit requirements
- Status: ARCHIVED_AND_TRASHED
- Use for: Production environments, regulatory compliance, critical data
- Configuration:
```sql
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET KEEP_IN_TRASH = 'Y'
WHERE SOURCE_FILE_TYPE = 'INPUT' AND TABLE_ID = 'YOUR_TABLE';
```

2. **TRASH Cleanup (pKeepInTrash = FALSE)**:
|
2. **TRASH Cleanup (KEEP_IN_TRASH = 'N')**:
- Deletes CSV files from TRASH folder after successful archival
- Reduces storage costs in DATA bucket
- Status: ARCHIVED_AND_PURGED
- Use for: Non-critical data, storage optimization, test environments
- Configuration:
```sql
UPDATE CT_MRDS.A_SOURCE_FILE_CONFIG
SET KEEP_IN_TRASH = 'N'
WHERE SOURCE_FILE_TYPE = 'INPUT' AND TABLE_ID = 'YOUR_TABLE';
```

3. **Monitoring TRASH Folder**:
```sql
@@ -676,7 +1363,7 @@ WHERE object_name = 'FILE_ARCHIVER';
## Author

Created by: Grzegorz Michalski
Date: 2026-02-06
|
Date: 2026-02-11
Schema: CT_MRDS
Package: FILE_ARCHIVER
Version: 3.2.0
|
Version: 3.3.0