A small Python script for automatically searching through PDF files in specific folders for defined keywords.
Perfect for situations where you have many local PDFs and need to quickly find which ones contain relevant content.
- Path selection: Choose from predefined search paths
- Recursive folder search: Includes all subfolders
- Full-text search in PDFs (single or double keyword search)
- Simple result output in the terminal
- Error handling for unreadable PDFs
- Python 3.x
- Libraries:
pip install PyPDF2
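
With PyPDF2 installed, the recursive folder search and the error handling for unreadable PDFs listed above could look roughly like the sketch below. It is not the script's actual code: the function names `find_pdfs` and `read_pdf_text` are hypothetical, and it assumes a PyPDF2 release that provides the `PdfReader` class.

```python
import os


def find_pdfs(root):
    """Return the paths of all PDF files under root, including subfolders."""
    found = []
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            # Match the extension case-insensitively (.pdf, .PDF, ...).
            if name.lower().endswith(".pdf"):
                found.append(os.path.join(folder, name))
    return sorted(found)


def read_pdf_text(pdf_path):
    """Return the text of a PDF, or None if it cannot be read."""
    # Imported here so find_pdfs() stays usable without PyPDF2 installed.
    # Assumption: a PyPDF2 release that ships PdfReader.
    from PyPDF2 import PdfReader

    try:
        reader = PdfReader(pdf_path)
        # extract_text() can yield nothing for scanned pages, hence `or ""`.
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    except Exception as exc:  # broad on purpose: damaged or encrypted files
        print(f"Could not read {pdf_path}: {exc}")
        return None
```

Returning `None` for an unreadable file lets the caller skip it and continue with the remaining PDFs instead of aborting the whole search.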
- Clone the repository or download the script
- Adjust the search paths
path1 = 'YOUR\\PATH\\HERE'
path2 = 'YOUR\\PATH\\HERE'
path3 = 'YOUR\\PATH\\HERE'
- Run the script
python pdf_search.py
- Select a path
The terminal will prompt you to choose one of the three preset search paths.
- Enter keyword(s)
- Only the first keyword → single keyword search
- Two keywords → both must appear in the document
- View the results
The terminal will list the found files with their full paths.
Which path should be selected? (1: PATH1, 2: PATH2, 3: PATH3): 1
Keyword 1: invoice
Keyword 2: 2023
Searching for: ['invoice', '2023'] Number of terms: 2
Search results:
C:\Documents\Projects\Finance\invoice_april_2023.pdf
C:\Documents\Invoices\Clients\invoice_may_2023.pdf
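
The path selection and keyword rules behind a session like the one above can be sketched as follows. This is an illustration under assumptions, not the script's real implementation: the helper names `select_path`, `build_terms`, and `matches` are hypothetical, and the case-insensitive comparison is an assumption the original description does not confirm.

```python
def select_path(choice, paths):
    """Map the user's menu entry ('1', '2', '3') to one of the preset paths."""
    index = int(choice) - 1  # raises ValueError for non-numeric input
    if not 0 <= index < len(paths):
        raise ValueError(f"Choose a number between 1 and {len(paths)}")
    return paths[index]


def build_terms(keyword1, keyword2=""):
    """Collect the non-empty keywords; one term means a single keyword search."""
    return [k.strip() for k in (keyword1, keyword2) if k.strip()]


def matches(text, terms):
    """True only if every term occurs in the text.

    Assumption: matching ignores upper/lower case.
    """
    lowered = text.lower()
    return bool(terms) and all(term.lower() in lowered for term in terms)


# Placeholder paths, as in the configuration step.
paths = ["YOUR\\PATH\\HERE", "YOUR\\PATH\\HERE", "YOUR\\PATH\\HERE"]
root = select_path("1", paths)
terms = build_terms("invoice", "2023")
print(f"Searching for: {terms} Number of terms: {len(terms)}")
```

Leaving the second keyword empty produces a one-element list, so the same `matches` check covers both the single and the double keyword search.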
- Progress indicator for large numbers of files
- Export results to CSV or HTML
- Regular expression support for more complex searches
- GUI version for users unfamiliar with the command line
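
Two of the planned items, CSV export and regular expression support, need only the standard library. The sketch below shows one possible shape for them; nothing here exists in the script yet, and the names `regex_matches` and `export_csv` as well as the single `path` column are assumptions made for illustration.

```python
import csv
import re


def regex_matches(text, pattern):
    """True if the regular expression matches anywhere in the text."""
    # Assumption: patterns are matched case-insensitively.
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def export_csv(result_paths, csv_path):
    """Write a header row, then one row per matching PDF path."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["path"])
        for path in result_paths:
            writer.writerow([path])
```

A pattern such as `r"invoice\s+20\d{2}"` would then find "invoice" followed by any year from 2000 to 2099, which the plain keyword search cannot express.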