CodeQL guided high risk (potential) file handle leaks DBio.c #375

dlmiles · 2025-02-18T10:19:36Z

Should the cd_fd field be managed by a new flock_close(int fd) function, where the application can hand the open FD back to flock.c to manage when the real close should occur ? Which in turn, will receive the other active FD when it is also closed (or maybe there is also a flock_fclose(FILE *fp) to manage stdio use cases). The point is flock.c needs visibility of open and closes, then it can manage the locking requirements with multiple users in same process.
Then when all uses of a file are close (active_use_count_of_this_fd==0) then flock.c will manage closing all FDs (causing file-locking release) ?

Maybe if the sequence of events can be written in a issue comment, highlight the potential data-race-data-corruption concern (the need for a file lock in the first place) in the order of events. I can better understand how the existing code meets requirements.

In this one function there are multiple fp = fopen(expandname) then fclose(fp) that occur, without being locking-aware. Then at the end of the function, there is this lock-aware section, my patch touches. The call to fclose(fp) will destroy any existing locks that may have been obtained before, on that file. The file looks to be appended/write/rewind/truncate etc... and then closed(), if there were to happen more than once the 2nd use of this function the same file, would invalidate the 1st use's locks that were laid down. (due to the close(fp) syscall getting called from fclose use above)

In short you can't mix locking-aware and locking-unaware file open/access to the same file. Either all access to a particular file should be locked/locking-aware, or all accesses to a particular file never obey any locking rules (even when another process is successfully placing locks because it is locking-aware, in this scenario we did not participate in their locking scheme, as it is a co-operative scheme).

FWIW there are other locking types and schemes that exist, linux at least supports, that maybe more suited to the design goals trying to be met.

DBio.c: CodeQL File{MayNot,Never}BeClosed.ql file-handle resource leaks

Guided by CodeQL static code analyser.

FileMayNotBeClosed.ql
FileMayNeverBeClosed.ql

Technically the FILE_LOCKS feature leaks the file handle, but maybe this isn't in a perfectly controlled way (with assurance that at some correct point in the program future, all the fd's are eventually closed)

Guided by CodeQL static code analyser. FileMayNotBeClosed.ql FileMayNeverBeClosed.ql Technically the FILE_LOCKS feature leaks the file handle, but maybe this isn't in a perfectly controlled way (with assurance that at some correct point in the program future, all the fd's are eventually closed)

RTimothyEdwards · 2025-02-18T14:44:41Z

Per my comment on the previous PR:

"FWIW there are other locking types and schemes that exist, linux at least supports, that maybe more suited to the design goals trying to be met."

---Yes, could not agree more.

dlmiles · 2025-02-20T08:03:09Z

The flock use-cases needs to be understood in respect of the comments #374 (comment) and the flock.c implementation.

Merge Status: merge on hold, pending rework expected (review this status in 4 weeks)
Quality: not ready or complete
Risk: high-impact
Level of Testing: not started (needs targeted testing)

RTimothyEdwards · 2025-02-23T02:14:14Z

I have to recall conversations from around 1992, but I think the problem with having a lock file was that lock files could get left behind if the program crashed, but a file lock handled by the system would be released if the program crashed. The file locking mechanism implemented was the only one (that I am aware of) that was available at the time. Please enumerate the options.

dlmiles · 2025-02-25T11:58:16Z

Yes that maybe considered an issue, but the policy at a slightly later time (~1995) was to create the lockfile and write into it the fprintf(lockfile, "%5d\n", getpid()) value of the owner (space padded somewhat, but usually the creation uses O_EXCL which does most of the work, so I never really understood why it need to be padded, presumably to overwrite a shorter looking PID under overwrite non-O_CREATE circumstances).

At the time another process that found the lockfile existed, would read it, then could send a kill(0, pid_from_lockfile) to see if this returned an error code (ESRCH) or was successful (0). This is an indicator the process is still alive. There is the concern of eventual PID number recycling, in those days maybe it was a 16bit range for PID.

This was the basic design that allowed stale lockfile to be spotted and removed, automatically in the 99.99% of cases, due to the owning PID not being alive.

Then to enhance this, use of fcntl(F_SETLK) like now, but on the lockfile itself. This does not suffer from the potential PID get recycled concern (for the reasons you state) and a long running process is currently using that PID by the time the stale lockfile test was done.

Today the PID number space has expanded somewhat, so less likely to see recycled PID.
Today there is security capability and containerization that may limit the visibility of the PIDs a process can see, including the ability to use this method to see if a process is alive (hidden due to security information leakage).
But the fcntl() should still work.

The benefit of only locking the the *.lockfile with fcntl() is that the process is free to open/close and use multiple file handles on the main file, knowing that the whole time it is still exclusively locked (by other systems that participate in the same locking scheme)

Back in 1992 I'm sure lockfiles on NFS systems might have also been an issue but these days even that is not a problem.

Visible lockfiles I find are very useful for a human to know, correct/workaround make a decision about.

How to handle things when one is found, is a UI matter with appropriate dialogs (popup modals with options to user), or diagnostic command suggestion in the error output such as when in batch mode on how to correct it:

ERROR: Unable to acquire lock file for /path/to/problem.mag
#####: To override this you must run manually remove the file:
#####   rm -f /absolute/path/to/problem.mag.lockfile

In modern times some system will create a symlink like: ln -s username@hostname.PID .#target_file.mag this make the lockfile's existence less visible (dot prefix hidden file), the target it points to does not exist, it is used to attach the string content into the readlink() part of the symlink, it identifies the username, the system-name (helpful for NFS shares) and the PID of that system that currently is the owner of the lock.

This way I guess it is possible to check when you kill(0, pid_from_filename) you first check you are running this on the correct hostname. But also provides additional information to the user on shared filesystems.

dlmiles · 2025-06-27T14:57:10Z

Moved to draft to better indicate current status.

dlmiles mentioned this pull request Feb 20, 2025

CodeQL guided medium risk (potential) file handle leaks path.c/flock.c #374

Closed

dlmiles marked this pull request as draft June 27, 2025 14:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

CodeQL guided high risk (potential) file handle leaks DBio.c #375

CodeQL guided high risk (potential) file handle leaks DBio.c #375

Uh oh!

dlmiles commented Feb 18, 2025 •

edited

Loading

Uh oh!

RTimothyEdwards commented Feb 18, 2025

Uh oh!

dlmiles commented Feb 20, 2025

Uh oh!

RTimothyEdwards commented Feb 23, 2025

Uh oh!

dlmiles commented Feb 25, 2025 •

edited

Loading

Uh oh!

dlmiles commented Jun 27, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

CodeQL guided high risk (potential) file handle leaks DBio.c #375

Are you sure you want to change the base?

CodeQL guided high risk (potential) file handle leaks DBio.c #375

Uh oh!

Conversation

dlmiles commented Feb 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

RTimothyEdwards commented Feb 18, 2025

Uh oh!

dlmiles commented Feb 20, 2025

Uh oh!

RTimothyEdwards commented Feb 23, 2025

Uh oh!

dlmiles commented Feb 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dlmiles commented Jun 27, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

dlmiles commented Feb 18, 2025 •

edited

Loading

dlmiles commented Feb 25, 2025 •

edited

Loading