Skip to content

Conversation

@Nateowami
Copy link
Collaborator

@Nateowami Nateowami commented Jan 9, 2026

This change is Reviewable

@Nateowami Nateowami added testing not required e2e Run e2e tests for this pull request labels Jan 9, 2026
@Nateowami Nateowami force-pushed the fix/dbl-resource-extraction-logic branch from 1945cc0 to 2409928 Compare January 9, 2026 17:42
@Nateowami Nateowami changed the title Fix/dbl resource extraction logic Fix DBL resource extraction logic Jan 9, 2026
@codecov
Copy link

codecov bot commented Jan 9, 2026

Codecov Report

❌ Patch coverage is 45.00000% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.80%. Comparing base (9ccc759) to head (db1374d).
⚠️ Report is 1 commits behind head on master.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
...c/SIL.XForge.Scripture/Services/ParatextService.cs 11.11% 6 Missing and 2 partials ⚠️
...rge.Scripture/Services/SFInstallableDblResource.cs 70.00% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3628      +/-   ##
==========================================
- Coverage   82.82%   82.80%   -0.02%     
==========================================
  Files         609      610       +1     
  Lines       37426    37440      +14     
  Branches     6159     6158       -1     
==========================================
+ Hits        30998    31003       +5     
+ Misses       5494     5487       -7     
- Partials      934      950      +16     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Nateowami Nateowami force-pushed the fix/dbl-resource-extraction-logic branch from 2409928 to 285e41f Compare January 9, 2026 17:59
@marksvc marksvc self-assigned this Jan 9, 2026
Copy link
Collaborator

@marksvc marksvc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your work on this. I am sorry to point out additional things :)

(Legend: ✏️ are blocking comments)

public sealed class SFInstallableDblResourceTests
{
[Test]
public void ExtractAllAsync_WithDirectoryTraversalEntry_ThrowsInvalidOperation()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✏️ Since we are writing some security code here, can we also show that an exception is thrown if a file

  • is called . or
  • is called .. or
  • is called /foo or
  • is called foo/../bar.txt (which is not starting with ../), or
  • is a symlink.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirm that I can make a zip file with symlinks. The following shows commands that do this, and the ls shows that the two symlinks that were extracted. I have not tried extracting with SharpZipLib to see if it would even make a symlink though.

zip --symlinks out.zip var passwd
mkdir ext && cp out.zip ext/ && cd ext && unzip out.zip && ls

-rw-rw-r-- 1  311 2026-01-09 11:37 out.zip
lrwxrwxrwx 1    4 2026-01-09 11:37 var -> /var
lrwxrwxrwx 1   11 2026-01-09 11:37 passwd -> /etc/passwd

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's wrong with /foo? What's wrong with /foo/../bar.txt?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't tested symlinks, but my first guess would be that they are skipped due to entry.IsFile being false. A test for that would probably be good.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested a zip file with a symlink on Linux and Windows, and the SharpZipLib library we use just extracts the file with the contents set to the target for the symlink.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's wrong with /foo? What's wrong with /foo/../bar.txt?

/foo is not good because if a file being processed asks to be written to /etc/passwd, we don't want to honour that.

foo/../bar.txt is not good because foo/whatever/../../../../../../../../../../../../etc/passwd can resolve to /etc/passwd.

This is all quite implementation dependent. But they are some directory traversal problems that can be possible.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marksvc My point was that the logic fully resolves a path and then determines if it's in the target directory, so weird paths aren't going to be a problem (at least in most cases). I wasn't sure how it combines /foo with the root path, but a quick check shows this is how it handles it:

> Path.Join("/path/to/dir","/path/in/zip")
"/path/to/dir/path/in/zip"

OK, but what if the first path ends with a slash:

> Path.Join("/path/to/dir/","/path/in/zip")
"/path/to/dir//path/in/zip"

Now we have double slashes, but they do get stripped out when GetFullPath is called:

> Path.GetFullPath(Path.Join("/path/to/dir/","/path/in/zip"))
"/path/to/dir/path/in/zip"

Adding .. to the file path also isn't going to cause any problems:

> Path.GetFullPath(Path.Join("/path/to/dir","/foo/../bar.txt"))
"/path/to/dir/bar.txt"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, maybe there are some different thoughts about the purpose of the unit tests here. If we add the different problem cases to the tests, then it will show that the implementation handles the different cases (now, and next time the implementation is modified).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested a zip file with a symlink on Linux and Windows, and the SharpZipLib library we use just extracts the file with the contents set to the target for the symlink.

I had not understood just what you meant. So I did some testing today. I found that using SharpZipLib was not creating symlinks when it unzipped a zip file that did contain symlinks.

cd $(mktemp -d) && 
ln -s /etc/passwd && 
ln -s /var && 
zip --symlinks out.zip var passwd && 
mkdir output && 
mv out.zip output && 
cd output && 
TestTool unzip out.zip 
-rw-rw-r-- 1    311 2026-01-12 11:42 out.zip
-rw-rw-r-- 1      4 2026-01-12 11:42 var
-rw-rw-r-- 1     11 2026-01-12 11:42 passwd

$ file var passwd
var:    ASCII text, with no line terminators
passwd: ASCII text, with no line terminators
$ hd var
00000000  2f 76 61 72                                       |/var|
00000004
$ hd passwd
00000000  2f 65 74 63 2f 70 61 73  73 77 64                 |/etc/passwd|
0000000b

I am satisfied that we will not be creating symlinks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I've added checking for symlinks and a unit test for this.

Copy link
Collaborator

@pmachapman pmachapman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pmachapman reviewed 2 files and all commit messages, and made 2 comments.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @marksvc and @Nateowami).


src/SIL.XForge.Scripture/Services/SFInstallableDblResource.cs line 438 at r3 (raw file):

            {
                continue; // Do not overwrite existing files
            }

This will break any updates to DBL resources - files will not be overwritten with new data from the DBL.

Code quote:

            if (_fileSystemService.FileExists(entryFullPath))
            {
                continue; // Do not overwrite existing files
            }

src/SIL.XForge.Scripture/Services/SFInstallableDblResource.cs line 451 at r3 (raw file):

            await zipStream.CopyToAsync(output);
        }
    }

SharpZipLib has a class that can do this checks (and more, such as making sure the file isn't writing to a device on Windows) called WindowsNameTransform (see https://github.com/icsharpcode/SharpZipLib/blob/master/src/ICSharpCode.SharpZipLib/Zip/WindowsNameTransform.cs). NB: Despite the name, the class is not just for Windows.

    internal async Task ExtractAllAsync(ZipFile zip, string path)
    {
        WindowsNameTransform extractNameTransform = new WindowsNameTransform(path);
        foreach (ZipEntry entry in zip)
        {
            if (!entry.IsFile)
                continue; // Skip directories

            string entryName = extractNameTransform.TransformFile(entry.Name);
            string entryPath = Path.Join(path, entryName);

            if (_fileSystemService.FileExists(entryPath))
                continue; // Don't overwrite

            // Ensure directories in the ZIP entry are created
            _fileSystemService.CreateDirectory(Path.GetDirectoryName(entryPath));

            // Extract the file
            await using Stream zipStream = zip.GetInputStream(entry);
            await using Stream output = _fileSystemService.CreateFile(entryPath);
            await zipStream.CopyToAsync(output);
        }
    }

Your test will then just need to be:

     [Test]
    public void ExtractAllAsync_WithDirectoryTraversalEntry_ThrowsInvalidOperation()
    {
        TestEnvironment env = new TestEnvironment();
        using MemoryStream zipStream = TestEnvironment.CreateZipStream("../evil.txt", "malicious");
        using ZipFile zip = new ZipFile(zipStream);
        zip.IsStreamOwner = false;

        InvalidNameException exception = Assert.ThrowsAsync<InvalidNameException>(async () =>
            await env.Resource.ExtractAllAsync(zip, env.DestinationRoot)
        );

        Assert.That(exception.Message, Does.Contain("Parent traversal in paths is not allowed"));
        env.FileSystem.DidNotReceive().CreateFile(Arg.Any<string>());
        env.FileSystem.DidNotReceive().CreateDirectory(Arg.Any<string>());
    }

Code quote:

    internal async Task ExtractAllAsync(ZipFile zip, string path)
    {
        string destinationRoot = Path.GetFullPath(path);
        string rootWithSeparator = destinationRoot.EndsWith(Path.DirectorySeparatorChar)
            ? destinationRoot
            : destinationRoot + Path.DirectorySeparatorChar;
        foreach (ZipEntry entry in zip)
        {
            if (!entry.IsFile || string.IsNullOrEmpty(entry.Name))
            {
                continue; // Skip directories and entries without a valid name
            }

            string entryFullPath = Path.GetFullPath(Path.Join(destinationRoot, entry.Name));

            if (!entryFullPath.StartsWith(rootWithSeparator, StringComparison.Ordinal))
            {
                throw new InvalidOperationException("ZIP entry resolves outside the extraction directory.");
            }

            if (_fileSystemService.FileExists(entryFullPath))
            {
                continue; // Do not overwrite existing files
            }

            string entryDirectory = Path.GetDirectoryName(entryFullPath);
            if (!string.IsNullOrEmpty(entryDirectory))
            {
                _fileSystemService.CreateDirectory(entryDirectory); // Ensure destination directory exists
            }

            // Extract the file while keeping the write target within the allowed directory
            await using Stream zipStream = zip.GetInputStream(entry);
            await using Stream output = _fileSystemService.CreateFile(entryFullPath);
            await zipStream.CopyToAsync(output);
        }
    }

Copy link
Collaborator

@pmachapman pmachapman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pmachapman made 1 comment.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @marksvc and @Nateowami).


test/SIL.XForge.Scripture.Tests/Services/SFInstallableDblResourceTests.cs line 22 at r3 (raw file):

{
    [Test]
    public void ExtractAllAsync_WithDirectoryTraversalEntry_ThrowsInvalidOperation()

Do you think we should have a test that tests a successful extraction with a zip file that kind of looks like a DBL resource (i.e. maybe an sfm file and a .dbl directory with a id subdirectory and a file inside that), and a test that upgrades a DBL resource (i.e. overwrites existing files)?

Code quote:

 public void ExtractAllAsync_WithDirectoryTraversalEntry_ThrowsInvalidOperation()

@pmachapman pmachapman force-pushed the fix/dbl-resource-extraction-logic branch from e591cb1 to 3779d4e Compare January 11, 2026 23:46

await env.Resource.ExtractAllAsync(zip, env.DestinationRoot);

env.FileSystem.Received().CreateDirectory(Path.Combine(env.DestinationRoot, ".dbl"));

Check notice

Code scanning / CodeQL

Call to System.IO.Path.Combine Note test

Call to 'System.IO.Path.Combine'.

Copilot Autofix

AI 3 days ago

To fix this, replace the use of Path.Combine with Path.Join when simply appending path segments and you do not want later absolute segments to override earlier ones. Path.Join concatenates segments with the appropriate directory separator without the “absolute path wins” semantics.

In this file, the best minimal fix is to update the expectations in ExtractAllAsync_Success:

  • On line 33, change Path.Combine(env.DestinationRoot, ".dbl") to Path.Join(env.DestinationRoot, ".dbl").
  • On line 34, change Path.Combine(env.DestinationRoot, ".dbl", "dbl_id_here") to Path.Join(env.DestinationRoot, ".dbl", "dbl_id_here").

No additional imports or helpers are needed because System.IO is already imported and Path.Join is available there. This preserves existing functionality while aligning with the safer API.

Suggested changeset 1
test/SIL.XForge.Scripture.Tests/Services/SFInstallableDblResourceTests.cs

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/test/SIL.XForge.Scripture.Tests/Services/SFInstallableDblResourceTests.cs b/test/SIL.XForge.Scripture.Tests/Services/SFInstallableDblResourceTests.cs
--- a/test/SIL.XForge.Scripture.Tests/Services/SFInstallableDblResourceTests.cs
+++ b/test/SIL.XForge.Scripture.Tests/Services/SFInstallableDblResourceTests.cs
@@ -30,8 +30,8 @@
 
         await env.Resource.ExtractAllAsync(zip, env.DestinationRoot);
 
-        env.FileSystem.Received().CreateDirectory(Path.Combine(env.DestinationRoot, ".dbl"));
-        env.FileSystem.Received().CreateFile(Path.Combine(env.DestinationRoot, ".dbl", "dbl_id_here"));
+        env.FileSystem.Received().CreateDirectory(Path.Join(env.DestinationRoot, ".dbl"));
+        env.FileSystem.Received().CreateFile(Path.Join(env.DestinationRoot, ".dbl", "dbl_id_here"));
     }
 
     [Test]
EOF
@@ -30,8 +30,8 @@

await env.Resource.ExtractAllAsync(zip, env.DestinationRoot);

env.FileSystem.Received().CreateDirectory(Path.Combine(env.DestinationRoot, ".dbl"));
env.FileSystem.Received().CreateFile(Path.Combine(env.DestinationRoot, ".dbl", "dbl_id_here"));
env.FileSystem.Received().CreateDirectory(Path.Join(env.DestinationRoot, ".dbl"));
env.FileSystem.Received().CreateFile(Path.Join(env.DestinationRoot, ".dbl", "dbl_id_here"));
}

[Test]
Copilot is powered by AI and may make mistakes. Always verify output.
Unable to commit as this autofix suggestion is now outdated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

await env.Resource.ExtractAllAsync(zip, env.DestinationRoot);

env.FileSystem.Received().CreateDirectory(Path.Combine(env.DestinationRoot, ".dbl"));
env.FileSystem.Received().CreateFile(Path.Combine(env.DestinationRoot, ".dbl", "dbl_id_here"));

Check notice

Code scanning / CodeQL

Call to System.IO.Path.Combine Note test

Call to 'System.IO.Path.Combine'.

Copilot Autofix

AI 4 days ago

In general, to avoid the issue where Path.Combine drops earlier segments when a later segment is absolute, use Path.Join instead. Path.Join performs string concatenation with path separators but does not reinterpret a later absolute path as replacing previous segments, which makes it safer in cases where any component might unexpectedly be absolute.

For this specific test file, the best fix is to update the single problematic call on line 34 (and optionally the adjacent call on line 33 for consistency) to use Path.Join instead of Path.Combine. Both methods are in System.IO, which is already imported at the top of the file, so no new imports are required. The behavior for the provided relative segments remains the same, and the test continues to assert that CreateDirectory and CreateFile are called with the correct combined paths, but without incurring the static analysis warning. Concretely:

  • In ExtractAllAsync_Success, change:
    • Path.Combine(env.DestinationRoot, ".dbl")Path.Join(env.DestinationRoot, ".dbl")
    • Path.Combine(env.DestinationRoot, ".dbl", "dbl_id_here")Path.Join(env.DestinationRoot, ".dbl", "dbl_id_here")
      All other code remains unchanged.
Suggested changeset 1
test/SIL.XForge.Scripture.Tests/Services/SFInstallableDblResourceTests.cs

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/test/SIL.XForge.Scripture.Tests/Services/SFInstallableDblResourceTests.cs b/test/SIL.XForge.Scripture.Tests/Services/SFInstallableDblResourceTests.cs
--- a/test/SIL.XForge.Scripture.Tests/Services/SFInstallableDblResourceTests.cs
+++ b/test/SIL.XForge.Scripture.Tests/Services/SFInstallableDblResourceTests.cs
@@ -30,8 +30,8 @@
 
         await env.Resource.ExtractAllAsync(zip, env.DestinationRoot);
 
-        env.FileSystem.Received().CreateDirectory(Path.Combine(env.DestinationRoot, ".dbl"));
-        env.FileSystem.Received().CreateFile(Path.Combine(env.DestinationRoot, ".dbl", "dbl_id_here"));
+        env.FileSystem.Received().CreateDirectory(Path.Join(env.DestinationRoot, ".dbl"));
+        env.FileSystem.Received().CreateFile(Path.Join(env.DestinationRoot, ".dbl", "dbl_id_here"));
     }
 
     [Test]
EOF
@@ -30,8 +30,8 @@

await env.Resource.ExtractAllAsync(zip, env.DestinationRoot);

env.FileSystem.Received().CreateDirectory(Path.Combine(env.DestinationRoot, ".dbl"));
env.FileSystem.Received().CreateFile(Path.Combine(env.DestinationRoot, ".dbl", "dbl_id_here"));
env.FileSystem.Received().CreateDirectory(Path.Join(env.DestinationRoot, ".dbl"));
env.FileSystem.Received().CreateFile(Path.Join(env.DestinationRoot, ".dbl", "dbl_id_here"));
}

[Test]
Copilot is powered by AI and may make mistakes. Always verify output.
Unable to commit as this autofix suggestion is now outdated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Collaborator

@pmachapman pmachapman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pmachapman made 1 comment, resolved 4 discussions, and dismissed @github-advanced-security[bot] from 2 discussions.
Reviewable status: 0 of 2 files reviewed, 4 unresolved discussions (waiting on @marksvc and @Nateowami).

public sealed class SFInstallableDblResourceTests
{
[Test]
public void ExtractAllAsync_WithDirectoryTraversalEntry_ThrowsInvalidOperation()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested a zip file with a symlink on Linux and Windows, and the SharpZipLib library we use just extracts the file with the contents set to the target for the symlink.

Copy link
Collaborator Author

@Nateowami Nateowami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Nateowami reviewed 2 files and made 3 comments.
Reviewable status: all files reviewed (commit messages unreviewed), 6 unresolved discussions (waiting on @marksvc and @pmachapman).


src/SIL.XForge.Scripture/Services/SFInstallableDblResource.cs line 438 at r3 (raw file):

Previously, pmachapman (Peter Chapman) wrote…

This will break any updates to DBL resources - files will not be overwritten with new data from the DBL.

Thanks for catching this. I think the fact that there is no cleaning up of files that aren't in the ZIP file made me think this method was used in such a way that updates were not necessary (for example, removing the old directory first, then doing a clean extract).


test/SIL.XForge.Scripture.Tests/Services/SFInstallableDblResourceTests.cs line 22 at r3 (raw file):

Previously, pmachapman (Peter Chapman) wrote…

Do you think we should have a test that tests a successful extraction with a zip file that kind of looks like a DBL resource (i.e. maybe an sfm file and a .dbl directory with a id subdirectory and a file inside that), and a test that upgrades a DBL resource (i.e. overwrites existing files)?

I think you've added it?

public sealed class SFInstallableDblResourceTests
{
[Test]
public void ExtractAllAsync_WithDirectoryTraversalEntry_ThrowsInvalidOperation()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

public sealed class SFInstallableDblResourceTests
{
[Test]
public void ExtractAllAsync_WithDirectoryTraversalEntry_ThrowsInvalidOperation()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's wrong with /foo? What's wrong with /foo/../bar.txt?

/foo is not good because if a file being processed asks to be written to /etc/passwd, we don't want to honour that.

foo/../bar.txt is not good because foo/whatever/../../../../../../../../../../../../etc/passwd can resolve to /etc/passwd.

This is all quite implementation dependent. But they are some directory traversal problems that can be possible.

public sealed class SFInstallableDblResourceTests
{
[Test]
public void ExtractAllAsync_WithDirectoryTraversalEntry_ThrowsInvalidOperation()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, maybe there are some different thoughts about the purpose of the unit tests here. If we add the different problem cases to the tests, then it will show that the implementation handles the different cases (now, and next time the implementation is modified).

public sealed class SFInstallableDblResourceTests
{
[Test]
public void ExtractAllAsync_WithDirectoryTraversalEntry_ThrowsInvalidOperation()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested a zip file with a symlink on Linux and Windows, and the SharpZipLib library we use just extracts the file with the contents set to the target for the symlink.

I had not understood just what you meant. So I did some testing today. I found that using SharpZipLib was not creating symlinks when it unzipped a zip file that did contain symlinks.

cd $(mktemp -d) && 
ln -s /etc/passwd && 
ln -s /var && 
zip --symlinks out.zip var passwd && 
mkdir output && 
mv out.zip output && 
cd output && 
TestTool unzip out.zip 
-rw-rw-r-- 1    311 2026-01-12 11:42 out.zip
-rw-rw-r-- 1      4 2026-01-12 11:42 var
-rw-rw-r-- 1     11 2026-01-12 11:42 passwd

$ file var passwd
var:    ASCII text, with no line terminators
passwd: ASCII text, with no line terminators
$ hd var
00000000  2f 76 61 72                                       |/var|
00000004
$ hd passwd
00000000  2f 65 74 63 2f 70 61 73  73 77 64                 |/etc/passwd|
0000000b

I am satisfied that we will not be creating symlinks.

@marksvc
Copy link
Collaborator

marksvc commented Jan 12, 2026

src/SIL.XForge.Scripture/Services/SFInstallableDblResource.cs line 425 at r4 (raw file):

            if (_fileSystemService.FileExists(entryPath))
                continue; // Don't overwrite

I didn't see why the current code won't be overwriting files. Maybe it does not overwrite files by default?

Copy link
Collaborator

@marksvc marksvc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marksvc reviewed 1 file and all commit messages, and resolved 2 discussions.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @Nateowami and @pmachapman).

@marksvc
Copy link
Collaborator

marksvc commented Jan 12, 2026

src/SIL.XForge.Scripture/Services/SFInstallableDblResource.cs line 425 at r4 (raw file):

Previously, marksvc wrote…

I didn't see why the current code won't be overwriting files. Maybe it does not overwrite files by default?

:) Nevermind

Copy link
Collaborator

@pmachapman pmachapman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pmachapman made 1 comment, resolved 3 discussions, and dismissed @github-advanced-security[bot] from 2 discussions.
Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @marksvc and @Nateowami).

public sealed class SFInstallableDblResourceTests
{
[Test]
public void ExtractAllAsync_WithDirectoryTraversalEntry_ThrowsInvalidOperation()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I've added checking for symlinks and a unit test for this.

Copy link
Collaborator

@marksvc marksvc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marksvc reviewed 4 files and all commit messages, made 1 comment, and resolved 1 discussion.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @Nateowami).


src/SIL.XForge.Scripture/Services/ZipEntryExtensions.cs line 16 at r5 (raw file):
After looking into it, I understand that checking with (entry.ExternalFileAttributes >> 16) & 0xF000 == 0xA000 is in principle a more accurate way to check.

The reason is because other file types could have a bitmask that is the symlink bit mask as well as some other bits set to 1. Such a file type bitmask would also be matched by the proposed PR code as a false positive, but not matched by the above.

Now, even if some other file types did cause such a false positive, the code as written will still detect symlinks. And it so happens that the other file types don't appear to have bitmasks that will result in a false positive :). So.. maybe I shouldn't be picky. But we may as well write it the prescribed way that has less potential to give an inaccurate result.

For reference, I see that bits/stat.h defines

#define	__S_IFMT	0170000	/* These bits determine file type.  */

/* File types.  */
#define	__S_IFDIR	0040000	/* Directory.  */
#define	__S_IFCHR	0020000	/* Character device.  */
#define	__S_IFBLK	0060000	/* Block device.  */
#define	__S_IFREG	0100000	/* Regular file.  */
#define	__S_IFIFO	0010000	/* FIFO.  */
#define	__S_IFLNK	0120000	/* Symbolic link.  */
#define	__S_IFSOCK	0140000	/* Socket.  */

And https://www.gnu.org/software/libc/manual/html_node/Testing-File-Type.html says

An alternate non-POSIX method of testing the file type is supported for compatibility with BSD. The mode can be bitwise AND-ed with S_IFMT to extract the file type code, and compared to the appropriate constant. For example,

S_ISCHR (mode)

is equivalent to:

((mode & S_IFMT) == S_IFCHR)

(and S_IFMT == 0xF000).

(Also, & 0xFFFF, rather than & 0xF000, may leave some bits set that are used by things other than file type, resulting in us return false even if the file type is in fact a symlink, but I haven't done enough looking around to confirm if 0x0FFF is the permission bits or what it is.)

@marksvc
Copy link
Collaborator

marksvc commented Jan 13, 2026

test/SIL.XForge.Scripture.Tests/Services/SFInstallableDblResourceTests.cs line 51 at r5 (raw file):

    [TestCase("../evil.txt")]
    [TestCase("foo/../../evil.txt")]
    public void ExtractAllAsync_WithDirectoryTraversalEntry_ThrowsInvalidOperation(string entryName)

BTW, FWIW I also tried with these and they also passed the test (by throwing an exception).

    [TestCase(".")]
    [TestCase("..")]
    [TestCase("..\\evil.txt")]
    [TestCase("foo\\..\\..\\evil.txt")]
    [TestCase("foo/..\\../evil.txt")]

@pmachapman pmachapman force-pushed the fix/dbl-resource-extraction-logic branch from 9a67163 to cad08e6 Compare January 13, 2026 21:32
Copy link
Collaborator

@pmachapman pmachapman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pmachapman reviewed all commit messages, made 1 comment, resolved 2 discussions, and dismissed @github-advanced-security[bot] from 2 discussions.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Nateowami).


test/SIL.XForge.Scripture.Tests/Services/SFInstallableDblResourceTests.cs line 51 at r5 (raw file):

Previously, marksvc wrote…

BTW, FWIW I also tried with these and they also passed the test (by throwing an exception).

    [TestCase(".")]
    [TestCase("..")]
    [TestCase("..\\evil.txt")]
    [TestCase("foo\\..\\..\\evil.txt")]
    [TestCase("foo/..\\../evil.txt")]

Done. I have added these test cases.

@pmachapman pmachapman force-pushed the fix/dbl-resource-extraction-logic branch from cad08e6 to db1374d Compare January 13, 2026 21:58
Copy link
Collaborator

@pmachapman pmachapman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pmachapman made 3 comments, resolved 3 discussions, and dismissed @github-advanced-security[bot] from a discussion.
Reviewable status: 2 of 4 files reviewed, all discussions resolved (waiting on @marksvc).


src/SIL.XForge.Scripture/Services/ZipEntryExtensions.cs line 16 at r5 (raw file):

Previously, marksvc wrote…

After looking into it, I understand that checking with (entry.ExternalFileAttributes >> 16) & 0xF000 == 0xA000 is in principle a more accurate way to check.

The reason is because other file types could have a bitmask that is the symlink bit mask as well as some other bits set to 1. Such a file type bitmask would also be matched by the proposed PR code as a false positive, but not matched by the above.

Now, even if some other file types did cause such a false positive, the code as written will still detect symlinks. And it so happens that the other file types don't appear to have bitmasks that will result in a false positive :). So.. maybe I shouldn't be picky. But we may as well write it the prescribed way that has less potential to give an inaccurate result.

For reference, I see that bits/stat.h defines

#define	__S_IFMT	0170000	/* These bits determine file type.  */

/* File types.  */
#define	__S_IFDIR	0040000	/* Directory.  */
#define	__S_IFCHR	0020000	/* Character device.  */
#define	__S_IFBLK	0060000	/* Block device.  */
#define	__S_IFREG	0100000	/* Regular file.  */
#define	__S_IFIFO	0010000	/* FIFO.  */
#define	__S_IFLNK	0120000	/* Symbolic link.  */
#define	__S_IFSOCK	0140000	/* Socket.  */

And https://www.gnu.org/software/libc/manual/html_node/Testing-File-Type.html says

An alternate non-POSIX method of testing the file type is supported for compatibility with BSD. The mode can be bitwise AND-ed with S_IFMT to extract the file type code, and compared to the appropriate constant. For example,

S_ISCHR (mode)

is equivalent to:

((mode & S_IFMT) == S_IFCHR)

(and S_IFMT == 0xF000).

(Also, & 0xFFFF, rather than & 0xF000, may leave some bits set that are used by things other than file type, resulting in us return false even if the file type is in fact a symlink, but I haven't done enough looking around to confirm if 0x0FFF is the permission bits or what it is.)

Done. You are right. I think I got confused. Thank you for spotting that!!!!!


await env.Resource.ExtractAllAsync(zip, env.DestinationRoot);

env.FileSystem.Received().CreateDirectory(Path.Combine(env.DestinationRoot, ".dbl"));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

await env.Resource.ExtractAllAsync(zip, env.DestinationRoot);

env.FileSystem.Received().CreateDirectory(Path.Combine(env.DestinationRoot, ".dbl"));
env.FileSystem.Received().CreateFile(Path.Combine(env.DestinationRoot, ".dbl", "dbl_id_here"));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Collaborator

@marksvc marksvc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marksvc reviewed 2 files and all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @Nateowami).

@Nateowami Nateowami merged commit 8f424fe into master Jan 14, 2026
27 of 28 checks passed
@Nateowami Nateowami deleted the fix/dbl-resource-extraction-logic branch January 14, 2026 02:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

e2e Run e2e tests for this pull request testing not required

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants