Merged
--- a/wcag-validation/src/main/java/org/verapdf/gf/model/factory/chunks/ChunkParser.java
+++ b/wcag-validation/src/main/java/org/verapdf/gf/model/factory/chunks/ChunkParser.java
@@ -832,15 +832,13 @@ private List<Double> parseTextShowArgument(COSBase argument, StringBuilder unico
                         double shift = - obj.getReal() / 1000 *
                                 graphicsState.getTextState().getTextFontSize() *
                                 graphicsState.getTextState().getHorizontalScaling();
-                        if (-obj.getReal() >= TextChunkUtils.TEXT_CHUNK_SPACE_RATIO && StaticStorages.getIsAddSpacesBetweenTextPieces()) {
-                            textPieces.add(new TextPieces.TextPiece(" ", textPieces.getCurrentX(),
-                                    textPieces.getCurrentX() + shift));
-                        } else {
-                            textPieces.shiftCurrentX(shift);
-                        }
+                        textPieces.shiftCurrentX(shift);
                     }
                 }
             }
+            double threshold = graphicsState.getTextState().getTextFontSize() * TextChunkUtils.TEXT_LINE_SPACE_RATIO;
+            textPieces.addSpaces(threshold);
Comment on lines +839 to +840
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

rg -n "TEXT_LINE_SPACE_RATIO" --type java -C 3

Repository: veraPDF/veraPDF-validation

Length of output: 343


🏁 Script executed:

find . -name "TextPieces.java" -type f

Repository: veraPDF/veraPDF-validation

Length of output: 153


🏁 Script executed:

cat -n ./wcag-validation/src/main/java/org/verapdf/gf/model/factory/chunks/TextPieces.java

Repository: veraPDF/veraPDF-validation

Length of output: 4704


🏁 Script executed:

sed -n '830,850p' ./wcag-validation/src/main/java/org/verapdf/gf/model/factory/chunks/ChunkParser.java | cat -n

Repository: veraPDF/veraPDF-validation

Length of output: 1220


🏁 Script executed:

rg -n "TEXT_LINE_SPACE_RATIO\s*=" --type java

Repository: veraPDF/veraPDF-validation

Length of output: 52


🏁 Script executed:

rg -n "TEXT_LINE_SPACE_RATIO" --type java -B 2 -A 2

Repository: veraPDF/veraPDF-validation

Length of output: 679


🏁 Script executed:

find . -name "TextChunkUtils.java" -type f

Repository: veraPDF/veraPDF-validation

Length of output: 52


🏁 Script executed:

rg "TEXT_LINE_SPACE_RATIO" --type java

Repository: veraPDF/veraPDF-validation

Length of output: 270


🏁 Script executed:

find . -type f -name "*.java" | xargs grep -l "class TextChunkUtils"

Repository: veraPDF/veraPDF-validation

Length of output: 54


🏁 Script executed:

find . -type f -name "*.java" | xargs grep -l "TEXT_LINE_SPACE_RATIO"

Repository: veraPDF/veraPDF-validation

Length of output: 154


🏁 Script executed:

find . -name "*.java" -type f -exec grep -l "TEXT_LINE_SPACE_RATIO" {} \;

Repository: veraPDF/veraPDF-validation

Length of output: 154


🏁 Script executed:

find . -name "*.java" -type f | head -50 | xargs grep -l "TextChunkUtils"

Repository: veraPDF/veraPDF-validation

Length of output: 54


🏁 Script executed:

head -100 ./wcag-validation/src/main/java/org/verapdf/gf/model/factory/chunks/ChunkParser.java | grep -E "^import|^package"

Repository: veraPDF/veraPDF-validation

Length of output: 1561


🏁 Script executed:

rg "class TextChunkUtils" --type java -A 10

Repository: veraPDF/veraPDF-validation

Length of output: 52


🏁 Script executed:

find . -name "TextChunkUtils.java" -type f 2>/dev/null

Repository: veraPDF/veraPDF-validation

Length of output: 52


🏁 Script executed:

rg "TEXT_LINE_SPACE_RATIO" --type java -C 5

Repository: veraPDF/veraPDF-validation

Length of output: 1445


🏁 Script executed:

sed -n '860,880p' ./wcag-validation/src/main/java/org/verapdf/gf/model/factory/chunks/ChunkParser.java | cat -n

Repository: veraPDF/veraPDF-validation

Length of output: 1472


🏁 Script executed:

sed -n '890,900p' ./wcag-validation/src/main/java/org/verapdf/gf/model/factory/chunks/ChunkParser.java | cat -n

Repository: veraPDF/veraPDF-validation

Length of output: 591


Threshold for space detection doesn't account for horizontal scaling.

The gap between consecutive text pieces (line 92 in TextPieces.java) is measured in a coordinate space where both glyph widths (line 873) and character shifts (line 871) are already multiplied by horizontalScaling. The threshold computed on line 839, however, is only fontSize * TEXT_LINE_SPACE_RATIO and ignores horizontalScaling. When horizontalScaling is below 1.0, the gaps shrink while the threshold does not, so genuine spaces can be missed; when it is above 1.0, the gaps grow and spurious spaces can be inserted.

Proposed fix
-            double threshold = graphicsState.getTextState().getTextFontSize() * TextChunkUtils.TEXT_LINE_SPACE_RATIO;
+            double threshold = graphicsState.getTextState().getTextFontSize() *
+                    graphicsState.getTextState().getHorizontalScaling() * TextChunkUtils.TEXT_LINE_SPACE_RATIO;
🤖 Prompt for AI Agents
In
`@wcag-validation/src/main/java/org/verapdf/gf/model/factory/chunks/ChunkParser.java`
around lines 839 - 840, The space-detection threshold in ChunkParser currently
uses graphicsState.getTextState().getTextFontSize() *
TextChunkUtils.TEXT_LINE_SPACE_RATIO but ignores horizontal scaling; change the
computation to include the text state's horizontal scaling (e.g., multiply by
graphicsState.getTextState().getHorizontalScaling()) so the threshold becomes
fontSize * horizontalScaling * TextChunkUtils.TEXT_LINE_SPACE_RATIO, then pass
that adjusted threshold into textPieces.addSpaces(threshold) to ensure spaces
are detected correctly when horizontalScaling != 1.0.
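
To make the mismatch concrete, here is a minimal stand-alone sketch. It is not veraPDF code: the class `ScaledThresholdSketch`, its method names, and the ratio value 0.2 are assumptions chosen for illustration (the actual value of TEXT_LINE_SPACE_RATIO is not shown in this review). It compares a gap produced by a TJ adjustment against the threshold computed with and without horizontalScaling.

```java
// Hypothetical illustration only; none of these names exist in veraPDF.
final class ScaledThresholdSketch {
    // Assumed ratio for the example; the real TEXT_LINE_SPACE_RATIO may differ.
    static final double SPACE_RATIO = 0.2;

    /** Threshold as computed in the PR: independent of horizontal scaling. */
    static double unscaledThreshold(double fontSize) {
        return fontSize * SPACE_RATIO;
    }

    /** Threshold as proposed in the review: same scaled units as the gaps. */
    static double scaledThreshold(double fontSize, double horizontalScaling) {
        return fontSize * horizontalScaling * SPACE_RATIO;
    }

    /** Gap produced by a TJ adjustment, mirroring the shift formula in the diff. */
    static double gap(double adjustment, double fontSize, double horizontalScaling) {
        return -adjustment / 1000 * fontSize * horizontalScaling;
    }

    public static void main(String[] args) {
        double fontSize = 10.0;
        double adjustment = -300; // negative TJ number moves the next glyph to the right
        for (double scaling : new double[] {1.0, 0.5}) {
            double g = gap(adjustment, fontSize, scaling);
            System.out.println("scaling=" + scaling
                    + " spaceWithUnscaledThreshold=" + (g > unscaledThreshold(fontSize))
                    + " spaceWithScaledThreshold=" + (g > scaledThreshold(fontSize, scaling)));
        }
    }
}
```

With these assumed numbers, the same TJ adjustment counts as a space at scaling 1.0 under both formulas, but at scaling 0.5 only the scaled threshold still detects it.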


             unicodeValue.append(textPieces.getValue());
             if (!textPieces.isEmpty()) {
                 textMatrix.concatenate(Matrix.getTranslateInstance(textPieces.getStartX(), 0));
@@ -867,17 +865,18 @@ private void parseString(COSString string, StringBuilder unicodeValue, TextPiece
                             " in font" + graphicsState.getTextState().getTextFont().getName());
                     width = 0.0;
                 }
-                double shift = (width *
-                        graphicsState.getTextState().getTextFontSize() / 1000 +
-                        graphicsState.getTextState().getCharacterSpacing() + (code == 32 ?
+                double shift = (graphicsState.getTextState().getCharacterSpacing() + (code == 32 ?
                         graphicsState.getTextState().getWordSpacing() : 0)) *
                         graphicsState.getTextState().getHorizontalScaling();
-                String value = graphicsState.getTextState().getTextFont().toUnicode(code);
+                width = width *
+                        graphicsState.getTextState().getTextFontSize() / 1000 *
+                        graphicsState.getTextState().getHorizontalScaling();
+                String value = graphicsState.getTextState().getTextFont().toUnicode(code);
                 if (symbolEnds != null) {
                     if (symbolEnds.isEmpty()) {
-                        TextChunksHelper.updateSymbolEnds(symbolEnds, shift, 0, value != null ? value.length() : 0);
+                        TextChunksHelper.updateSymbolEnds(symbolEnds, shift + width, 0, value != null ? value.length() : 0);
                     } else {
-                        TextChunksHelper.updateSymbolEnds(symbolEnds, shift, symbolEnds.get(symbolEnds.size() - 1),
+                        TextChunksHelper.updateSymbolEnds(symbolEnds, shift + width, symbolEnds.get(symbolEnds.size() - 1),
                                 value != null ? value.length() : 0);
                     }
                 }
@@ -892,7 +891,8 @@ private void parseString(COSString string, StringBuilder unicodeValue, TextPiece
                     unicodeValue.append(result);
                 } else {
                     textPieces.add(new TextPieces.TextPiece(result, textPieces.getCurrentX(),
-                            textPieces.getCurrentX() + shift));
+                            textPieces.getCurrentX() + width));
+                    textPieces.shiftCurrentX(shift);
                 }
             }
         } catch (IOException e) {
--- a/wcag-validation/src/main/java/org/verapdf/gf/model/factory/chunks/TextPieces.java
+++ b/wcag-validation/src/main/java/org/verapdf/gf/model/factory/chunks/TextPieces.java
@@ -77,6 +77,26 @@ public List<Double> getSymbolEnds() {
         return ends;
     }
 
+    public void addSpaces(double threshold) {
+        List<TextPiece> spaces = new ArrayList<>();
+        Iterator<TextPiece> validation = textPieces.iterator();
+        if (!validation.hasNext()) {
+            return;
+        }
+        TextPiece prev = validation.next();
+        double previousEnd = prev.getEndX();
+
+        while (validation.hasNext()) {
+            TextPiece piece = validation.next();
+            double currentStart = piece.getStartX();
+            if (currentStart - previousEnd > threshold) {
+                spaces.add(new TextPieces.TextPiece(" ", previousEnd, currentStart));
+            }
+            previousEnd = piece.getEndX();
+        }
+        textPieces.addAll(spaces);
+    }
+
     public static class TextPiece {
         private final String value;
         private final double startX;
@@ -91,6 +111,10 @@ public TextPiece(String value, double startX, double endX) {
         public double getEndX() {
             return endX;
         }
+
+        public double getStartX() {
+            return startX;
+        }
     }
 
     public static class TextPieceComparator implements Comparator<TextPiece> {
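
The gap-based space insertion that `addSpaces` performs can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the PR's implementation: `Piece` and `GapSpacesSketch` are hypothetical stand-ins, the input is assumed to be already ordered by start X, and the sketch returns a new list instead of appending to the existing collection as the PR does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for TextPieces.TextPiece with only the fields the sketch needs.
final class Piece {
    final String value;
    final double startX;
    final double endX;

    Piece(String value, double startX, double endX) {
        this.value = value;
        this.startX = startX;
        this.endX = endX;
    }
}

final class GapSpacesSketch {
    /**
     * Returns a new list with a " " piece inserted wherever the gap between
     * two consecutive pieces exceeds the threshold.
     * Assumes the input list is already ordered by startX.
     */
    static List<Piece> withSpaces(List<Piece> ordered, double threshold) {
        List<Piece> result = new ArrayList<>();
        Piece previous = null;
        for (Piece current : ordered) {
            if (previous != null && current.startX - previous.endX > threshold) {
                // The space piece spans exactly the gap between the two neighbours.
                result.add(new Piece(" ", previous.endX, current.startX));
            }
            result.add(current);
            previous = current;
        }
        return result;
    }

    /** Concatenates the piece values in list order. */
    static String text(List<Piece> pieces) {
        StringBuilder sb = new StringBuilder();
        for (Piece p : pieces) {
            sb.append(p.value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Piece> pieces = new ArrayList<>();
        pieces.add(new Piece("a", 0.0, 5.0));
        pieces.add(new Piece("b", 5.5, 10.0));  // gap 0.5, below the threshold
        pieces.add(new Piece("c", 13.0, 18.0)); // gap 3.0, above the threshold
        System.out.println(text(withSpaces(pieces, 2.0))); // prints "ab c"
    }
}
```

Because the space piece takes its coordinates from the neighbouring pieces, it sorts between them in any collection ordered by start X, which is presumably why the PR can append the collected spaces after the loop.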