Skip to content

dolkensp/pdfix

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

PDF Inspector

This repository contains a .NET 8 console app that inspects a PDF and lists the key components of each page (text, images, vector paths, and graphics operations). It is geared toward locating and isolating vector content for later modification.

Quick start

# From the repo root
PATH="$HOME/.dotnet:$PATH" dotnet run --project PdfInspector.App -- sample.pdf \
  --output components.json \
  --vector-bbox 0,0,400,400 \
  --pages 1-2

Options:

  • --pages 1,3-4 – restricts inspection to specific pages.
  • --vector-bbox minX,minY,maxX,maxY – only include vector paths whose bounds intersect the provided rectangle (PDF coordinate space).
  • --line-color #ffcccc – only include stroked vector paths that match the provided hex color (RGB, with a small tolerance).
  • --edit-page <n> --edit-path <id> --edit-stroke-color #00ff00 --edit-line-start x,y --edit-line-end x,y --output-pdf edited.pdf – rewrite the PDF with an updated path (stroke color and optional endpoints). Path IDs match the per-page index from the inspection output.
  • --debug-randomize-lines --output-pdf edited.pdf – overlay every stroked path with a random color and print the page/path→hex mapping.
  • --find-color #ffcccc – list all stroked paths and text fragments that match the given color (with tolerance).
  • --output <file> – write the full JSON report to a file.

Output structure

The JSON report includes:

  • Document metadata (title, author, creator, producer, PDF version, creation/modification dates).
  • Per-page dimensions, rotation, a text preview, and operation counts (top graphics operators on the page).
  • Text entries (text content, bounding box, font name/size, orientation).
  • Images (bounds, pixel dimensions, bits per component, rendering intent, color space, mask flag).
  • Vector paths with stroke/fill details and subpath commands (moves, lines, Bezier curves) plus bounding boxes.

Use the --vector-bbox filter to zero in on vectors drawn inside a specific area (e.g., around a logo or illustration) before editing the PDF with another tool.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages