Generate more unique names for new book folders (BL-15608) by JohnThomson · Pull Request #7683 · BloomBooks/BloomDesktop

JohnThomson · 2026-02-11T20:28:19Z

This change is

devin-ai-integration

Devin Review found 2 potential issues.

View 4 additional findings in Devin Review.

src/BloomExe/Book/BookStorage.cs

src/BloomTests/Book/BookStarterTests.cs

src/BloomExe/Book/BookStorage.cs

devin-ai-integration · 2026-02-12T21:53:27Z

src/BloomExe/Book/BookStorage.cs

-            var copyNum = 1;
+            // see if this already has a name like "foo - Copy-abc12345" or "foo - Copy 3"
+            // (The second is obsolete, but might still be encountered.)
+            // If it does, remove that suffix so we get back to the base name


📝 Info: Greedy regex in Duplicate() could match unintended patterns in book names containing ' - Copy'

The regex ^(.+)(\s-\sCopy)(-[0-9a-f]+)?$ at line 3453 uses a greedy (.+) for the first group. If a book title legitimately contains - Copy (e.g., How to - Copy Documents), the regex will match and strip the suffix. However, this is the same behavior as the old regex ^(.+)(\s-\sCopy)(\s[0-9]+)?$, and since (.+) is greedy, it will match the last occurrence of - Copy in the string. For a name like How to - Copy Documents - Copy-abc12345, group 1 would capture How to - Copy Documents which is correct. For How to - Copy Documents (no copy suffix), the regex matches with group 1 = How to which would incorrectly strip part of the title. But this pre-existing edge case was also present in the old code, so it's not a regression.

Was this helpful? React with 👍 or 👎 to provide feedback.

If this is right, it seems like a pre-existing bug we should fix now.

It would not match the example given, because a match requires that " - Copy" is followed by either a space and decimal number or a hyphen and hex number. If the user duplicates a book whose actual title is "My book - Copy 3", Bloom will give the duplicate an initial name like "My book - Copy-1a2b3c4d". And then probably the user changes the title to "My book - Copy 4" if that's really what he wants, and the folder name will change to match.
Pathologically the book title could be something like "My book - Copy-bad" which would match the first alternative in the regex. Even then, it's not clear to me that somehow producing "My book - Copy-bad - Copy-1a2b3c4d" is better than "My book - Copy-1a2b3c4d" for a first-pass name that probably won't last.
Unless we compare the existing name with the book title (and that's only even relevant if the user isn't doing a custom folder name), I don't see how we can NOT match a title that happens to look like one of our de-duplicated names. I believe Devin is wrong that the greedy initial regex makes things worse; neither of the regex's in the final group will match "Copy" so either way the regex can only match the last occurrence of " - Copy" in the title, and then only if what follows matches a fairly restrictive pattern.
I don't think it's worth worrying about.

src/BloomExe/Book/BookStorage.cs

JohnThomson

@JohnThomson made 3 comments.
Reviewable status: 0 of 4 files reviewed, 3 unresolved discussions (waiting on @andrew-polk).

src/BloomExe/Book/BookStorage.cs

JohnThomson · 2026-02-13T17:31:41Z

src/BloomExe/Book/BookStorage.cs

-            var copyNum = 1;
+            // see if this already has a name like "foo - Copy-abc12345" or "foo - Copy 3"
+            // (The second is obsolete, but might still be encountered.)
+            // If it does, remove that suffix so we get back to the base name


It would not match the example given, because a match requires that " - Copy" is followed by either a space and decimal number or a hyphen and hex number. If the user duplicates a book whose actual title is "My book - Copy 3", Bloom will give the duplicate an initial name like "My book - Copy-1a2b3c4d". And then probably the user changes the title to "My book - Copy 4" if that's really what he wants, and the folder name will change to match.
Pathologically the book title could be something like "My book - Copy-bad" which would match the first alternative in the regex. Even then, it's not clear to me that somehow producing "My book - Copy-bad - Copy-1a2b3c4d" is better than "My book - Copy-1a2b3c4d" for a first-pass name that probably won't last.
Unless we compare the existing name with the book title (and that's only even relevant if the user isn't doing a custom folder name), I don't see how we can NOT match a title that happens to look like one of our de-duplicated names. I believe Devin is wrong that the greedy initial regex makes things worse; neither of the regex's in the final group will match "Copy" so either way the regex can only match the last occurrence of " - Copy" in the title, and then only if what follows matches a fairly restrictive pattern.
I don't think it's worth worrying about.

src/BloomExe/Book/BookStorage.cs

devin-ai-integration

Devin Review found 4 new potential issues.

View 6 additional findings in Devin Review.

src/BloomExe/Book/BookStorage.cs

devin-ai-integration · 2026-02-13T17:36:31Z

src/BloomExe/Book/BookStarter.cs

+        private string GetInitialName(string parentCollectionPath, string instanceId)
        {
            var name = BookStorage.SanitizeNameForFileSystem(UntitledBookName);
-            return BookStorage.GetUniqueFolderName(parentCollectionPath, name);
+            return BookStorage.GetUniqueBookFolderName(parentCollectionPath, name, instanceId);


🚩 Behavior change: New books ALWAYS get a GUID suffix in folder name

Previously, GetInitialName called GetUniqueFolderName which would return the plain name (e.g., "Book") if no conflict existed, only appending 2, 3, etc. on conflict. Now GetUniqueBookFolderName at src/BloomExe/Book/BookStorage.cs:3533 ALWAYS appends separator + instanceSuffix (e.g., "Book-abc12345"), even when there's no naming conflict. This is intentional per the PR description (BL-15608), but it's a significant UX change: every new book folder will have a seemingly random suffix. Reviewers should confirm this is the desired behavior for all new-book scenarios.

Was this helpful? React with 👍 or 👎 to provide feedback.

This is exactly what we're trying to accomplish: different users of a TC each attempting to create a different new book won't, even temporarily, have books with the same name.

devin-ai-integration · 2026-02-13T17:36:34Z

src/BloomExe/Book/BookStorage.cs

+            // Truncate and clean the base name if necessary
+            if (baseName.Length > maxBaseNameLength)
+            {
+                baseName = MiscUtils.TruncateSafely(baseName, maxBaseNameLength);
+                baseName = Regex.Replace(baseName, "[\\s.]+$", "", RegexOptions.Compiled);
+            }


🚩 Name truncation after suffix reservation could produce empty or whitespace-only base names

In GetUniqueBookFolderName, when baseName is truncated at line 3554 and then trailing whitespace/periods are stripped at line 3555, the result could theoretically become empty if the truncation point lands in the middle of trailing whitespace. This would produce folder names like "-abc12345" or " - Copy-abc12345". While unlikely in practice (names are sanitized before reaching this point), there's no guard against an empty baseName after cleanup. The SanitizeNameForFileSystem method at src/BloomExe/Book/BookStorage.cs:3248 does protect against empty names, but that runs before this truncation step, not after.

Was this helpful? React with 👍 or 👎 to provide feedback.

SanitizeNameForFileSystem is very careful not to return a name that will Trim() to an empty string.

JohnThomson

@JohnThomson made 2 comments.
Reviewable status: 0 of 4 files reviewed, 5 unresolved discussions (waiting on @andrew-polk).

JohnThomson · 2026-02-13T19:25:21Z

src/BloomExe/Book/BookStarter.cs

+        private string GetInitialName(string parentCollectionPath, string instanceId)
        {
            var name = BookStorage.SanitizeNameForFileSystem(UntitledBookName);
-            return BookStorage.GetUniqueFolderName(parentCollectionPath, name);
+            return BookStorage.GetUniqueBookFolderName(parentCollectionPath, name, instanceId);


This is exactly what we're trying to accomplish: different users of a TC each attempting to create a different new book won't, even temporarily, have books with the same name.

src/BloomExe/Book/BookStorage.cs

andrew-polk

@andrew-polk reviewed all commit messages and resolved 1 discussion.
Reviewable status: 0 of 4 files reviewed, 3 unresolved discussions (waiting on @JohnThomson).

JohnThomson

@JohnThomson made 4 comments.
Reviewable status: 0 of 4 files reviewed, 5 unresolved discussions (waiting on @andrew-polk).

src/BloomTests/Book/BookStorageTests.cs line 210 at r4 (raw file):

        [Test]
        public void MoveBookToSafeName_NameNotNFC_NewNameConflicts_Moves()

This test (in addition to the one above) is not useful any more; name conflicts are vanishingly unlikely and hard to simulate.

src/BloomTests/Book/BookStorageTests.cs line 1136 at r4 (raw file):

        [Test]
        public void SetBookName_FolderWithSanitizedNameAlreadyExists_AddsANumberToName()

This seems to be doing the same thing as the previous test, at least now that a conflict is so unlikely.

src/BloomExe/Book/BookStorage.cs

JohnThomson · 2026-02-13T21:11:31Z

src/BloomExe/Book/BookStorage.cs

+            // Truncate and clean the base name if necessary
+            if (baseName.Length > maxBaseNameLength)
+            {
+                baseName = MiscUtils.TruncateSafely(baseName, maxBaseNameLength);
+                baseName = Regex.Replace(baseName, "[\\s.]+$", "", RegexOptions.Compiled);
+            }


SanitizeNameForFileSystem is very careful not to return a name that will Trim() to an empty string.

devin-ai-integration

Devin Review found 2 new potential issues.

View 9 additional findings in Devin Review.

devin-ai-integration · 2026-02-13T21:11:32Z

src/BloomExe/Book/BookStorage.cs

+            var newPathForExtraBook = BookStorage.GetAvailableDirectoryPath(
+                desiredPath,
+                null,
+                " - "
+            );


🟡 MoveBookToAvailableName always appends GUID suffix even when the desired path is available

MoveBookToAvailableName now calls GetAvailableDirectoryPath, which delegates to GetUniqueBookFolderName. This function always appends separator + 8 hex chars — it never tries the base name alone first. This means that even when the desired path does not conflict with any existing folder, the book gets an unnecessary GUID suffix in its name.

Detailed Explanation and Impact

The old code called GetUniqueFolderPath(desiredPath) → GetUniqueFolderName(parent, name), which started with an empty suffix and only appended 2, 3, etc. if there was a conflict. So if the desired path was available, the book kept the clean name.

The new code at BookStorage.cs:3856-3860 calls:

var newPathForExtraBook = BookStorage.GetAvailableDirectoryPath( desiredPath, null, " - " );

which calls GetUniqueBookFolderName at BookStorage.cs:3569-3580, which always constructs baseName + separator + instanceSuffix without first checking if baseName alone is available.

This concretely affects MoveBookToSafeName (BookStorage.cs:3829). Example: a book with a non-NFC name like "Some nicé book" gets sanitized to NFC "Some nicé book". If that NFC folder doesn't exist, the old code would move the book to "Some nicé book". The new code moves it to "Some nicé book - 12345678", unnecessarily polluting the name.

Similarly, MoveBookToAvailableName is called from TeamCollection.cs:204 to resolve conflicts with existing books. If the desired path is available, the book name is needlessly modified.

The modified test at BookStorageTests.cs:203 acknowledges this with the comment: "Typically we added hyphen and a guid. Not sure if that's good."

Impact: Book folders that are moved for sanitization (e.g., NFC normalization) will always receive an unnecessary GUID suffix, making book names uglier and harder to identify. This also removes the deleted test case MoveBookToSafeName_NameNotNFC_NewNameConflicts_Moves which specifically validated the conflict scenario separately from the no-conflict scenario.

Prompt for agents

In MoveBookToAvailableName (around line 3843-3877 in BookStorage.cs), before calling GetAvailableDirectoryPath, first check if the desiredPath itself is available. If it doesn't already exist, use it directly (just call MoveDirectory to it and rename the HTM file). Only call GetAvailableDirectoryPath when the desiredPath already exists and we need to find an alternative. This preserves the old behavior where sanitized names are used directly when there's no conflict. The same logic applies within GetAvailableDirectoryPath/GetUniqueBookFolderName: consider adding an option or a separate code path for callers (like MoveBookToAvailableName) that want to try the base name first before adding a GUID suffix.

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-02-13T21:11:33Z

src/BloomTests/Book/BookStorageTests.cs


            var expectedFolder = bookFolder.Replace("e\x0301", "\x00e9");

-            Assert.That(BookStorage.MoveBookToSafeName(bookFolder), Is.EqualTo(expectedFolder));
-            Assert.That(Directory.Exists(expectedFolder));
-            var expectedBookPath = bookPath.Replace("e\x0301", "\x00e9");
-            Assert.That(File.ReadAllText(expectedBookPath), Is.EqualTo("this is a test"));
-        }
-
-        [Test]
-        public void MoveBookToSafeName_NameNotNFC_NewNameConflicts_Moves()
-        {
-            var oldName = "Some nice\x0301 book";
-            var bookFolder = Path.Combine(_fixtureFolder.Path, oldName);
-            Directory.CreateDirectory(bookFolder);
-            var bookPath = Path.Combine(bookFolder, oldName + ".htm");
-            File.WriteAllText(bookPath, "this is a test");
-
-            var desiredFolder = bookFolder.Replace("e\x0301", "\x00e9");
-            Directory.CreateDirectory(desiredFolder);
-            var expectedFolder = desiredFolder + " 2";
-
-            Assert.That(BookStorage.MoveBookToSafeName(bookFolder), Is.EqualTo(expectedFolder));
-            Assert.That(Directory.Exists(expectedFolder));
-            var expectedBookPath = Path.Combine(
-                expectedFolder,
-                Path.GetFileName(expectedFolder) + ".htm"
-            );
+            // Typically we added hyphen and a guid. Not sure if that's good.
+            var result = BookStorage.MoveBookToSafeName(bookFolder);
+            Assert.That(result, Does.StartWith(expectedFolder));
+            Assert.That(Directory.Exists(result));
+            var expectedBookPath = Path.Combine(result, Path.GetFileName(result) + ".htm");
            Assert.That(File.ReadAllText(expectedBookPath), Is.EqualTo("this is a test"));
        }


🚩 Deleted test MoveBookToSafeName_NameNotNFC_NewNameConflicts_Moves reduces coverage

The test MoveBookToSafeName_NameNotNFC_NewNameConflicts_Moves (old lines 209-228) was deleted entirely rather than updated. This test specifically validated the case where the sanitized name conflicts with an existing folder. With the new always-add-GUID behavior, conflict is extremely unlikely, but the test still provided value in verifying that the conflict-resolution loop in GetUniqueBookFolderName works. Consider adding a unit test for GetUniqueBookFolderName directly that validates collision handling (where the first proposed name already exists as a directory).

(Refers to lines 192-209)

Was this helpful? React with 👍 or 👎 to provide feedback.

JohnThomson added 2 commits February 11, 2026 13:54

Generate more unique names for new book folders (BL-15608)

c557e04

Unit test fixes (by CoPilot)

2c0c60a

devin-ai-integration bot reviewed Feb 11, 2026

View reviewed changes

src/BloomExe/Book/BookStorage.cs Outdated Show resolved Hide resolved

src/BloomTests/Book/BookStarterTests.cs Show resolved Hide resolved

More fixes suggested by Devin

d4d7229

devin-ai-integration bot reviewed Feb 12, 2026

View reviewed changes

src/BloomExe/Book/BookStorage.cs Show resolved Hide resolved

devin-ai-integration bot reviewed Feb 12, 2026

View reviewed changes

src/BloomExe/Book/BookStorage.cs Show resolved Hide resolved

JohnThomson commented Feb 13, 2026

View reviewed changes

devin-ai-integration bot reviewed Feb 13, 2026

View reviewed changes

JohnThomson force-pushed the uniqueBookNames branch from ac9c4c7 to f2c9f7c Compare February 13, 2026 19:25

JohnThomson commented Feb 13, 2026

View reviewed changes

Post-review improvements; add guids in more places

4346384

JohnThomson force-pushed the uniqueBookNames branch from f2c9f7c to 4346384 Compare February 13, 2026 21:06

andrew-polk reviewed Feb 13, 2026

View reviewed changes

JohnThomson commented Feb 13, 2026

View reviewed changes

devin-ai-integration bot reviewed Feb 13, 2026

View reviewed changes

Uh oh!

Conversation

JohnThomson commented Feb 11, 2026 • edited by hatton Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

devin-ai-integration bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

devin-ai-integration bot Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

andrew-polk Feb 12, 2026

Choose a reason for hiding this comment

Uh oh!

JohnThomson Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

JohnThomson left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

JohnThomson Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

devin-ai-integration bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

devin-ai-integration bot Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

JohnThomson Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

JohnThomson Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

JohnThomson left a comment

Choose a reason for hiding this comment

Uh oh!

JohnThomson Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

andrew-polk left a comment

Choose a reason for hiding this comment

Uh oh!

JohnThomson left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

JohnThomson Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

JohnThomson commented Feb 11, 2026 •

edited by hatton

Loading