Skip to content

Conversation

@idelcano
Copy link
Contributor

@idelcano idelcano commented May 22, 2025

This script removes documents that should have been deleted by DHIS2 but weren't.

to use
python3 file_garbage_remover.py -i instancename

Introduced file_garbage_remover_tomcat.py:
A robust script designed specifically for production (Tomcat) environments. This script safely handles orphaned document resources by moving them to a temporary directory, creating an audit trail in the database, and then cleaning up original entries. It provides test (--test) and force (--force) modes for secure operations.

Updated Documentation:
Clearly documented both scripts (file_garbage_remover_tomcat.py and the previously existing Docker test script) to distinguish their usage, operation, and precautions.

This improvement allows safer file handling and audit tracking in production, complementing the simpler Docker-based script intended for testing purposes.

I have create a wrapper in dev to test it in dry run mode:

/home/tomcatuser/bin/file_garbage_remover/run_file_garbage_remover.sh test

@idelcano idelcano requested a review from cgbautista May 23, 2025 07:09
Copy link
Contributor

@cgbautista cgbautista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good from my side. Suggested some changes, but would work as is.

@idelcano idelcano requested a review from cgbautista July 7, 2025 17:13
@idelcano
Copy link
Contributor Author

idelcano commented Jul 7, 2025

Hi @cgbautista
Please review the first comment and the updated README.md. If you have any questions, feel free to ask.
This script has already been executed in the preprod-indiv and dev environments as of this comment (we could think that the functional testing is done).


Test Mode (dry-run):

```./file_garbage_remover_tomcat.py --test --config /path/to/config.json```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test mode also requires the db password to connect and retrieve the list of fileResources, so the code example should contain the export line as the --force example

def ensure_audit_table_exists(cursor, dry_run=False):
if dry_run:
log("[DRY RUN] Would execute:")
log(" CREATE TABLE IF NOT EXISTS fileresourcesaudit AS TABLE fileresource WITH NO DATA;")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would include this SQL query (and the same query in line 109) the same way as the other 3 queries in lines 12-30


db_password = os.environ.get("DB_PASSWORD_FILE_G")
if not db_password:
log("❌ Missing required environment variable: DB_PASSWORD")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error message informs that the user did not inform an environmental variable that is not the same that the code checks.
If the password is expected to be as plain text in the env., it should not be called FILE

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the name was really bad
I used it to avoid introducing another DB_PASSWORD environment variable (definitely improvable) and was thinking in FILE_GARBAGE.py, but it doesn't make sense. I've renamed it to DB_PASSWORD_FG.

@idelcano
Copy link
Contributor Author

Thanks @cgbautista I did a new commits to fix this minus issues

Copy link
Contributor

@cgbautista cgbautista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested in local enviroment both on docker and tomcat instances.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants