Indexing - PDFs to records

	1.	Extract Text from PDFs:
	•	Use a library like PyMuPDF, PyPDF2, or pdfminer to extract text from each PDF.
	2.	Preprocess the Text:
	•	Normalize the text: convert to lowercase, strip punctuation, and collapse extra whitespace.
	3.	Store and index the text using one of the following methods:
	•	Use SQLite for a simple, SQL-based index.
	•	Use libraries like Whoosh for full-text search.
	•	Use distributed systems like Elasticsearch for large-scale search.
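
The steps above can be sketched end to end with the SQLite option. This is a minimal illustration, not a definitive implementation: the function names, the `documents` table, and its `filename`/`content` columns are all hypothetical, and the search uses a plain `LIKE` substring match rather than a true full-text index. The `extract_text` helper assumes PyMuPDF is installed; the rest uses only the standard library.

```python
import re
import sqlite3


def extract_text(pdf_path):
    """Step 1: return the text of every page in a PDF (assumes PyMuPDF)."""
    import fitz  # PyMuPDF; imported lazily so the other steps run without it
    with fitz.open(pdf_path) as doc:
        return "\n".join(page.get_text() for page in doc)


def preprocess(text):
    """Step 2: lowercase, turn runs of non-alphanumerics into single spaces."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return text.strip()


def build_index(records, db_path=":memory:"):
    """Step 3: store (filename, raw_text) pairs in a hypothetical SQLite table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "filename TEXT PRIMARY KEY, content TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO documents VALUES (?, ?)",
        [(name, preprocess(text)) for name, text in records],
    )
    conn.commit()
    return conn


def search(conn, term):
    """Return filenames whose stored text contains the normalized term."""
    pattern = f"%{preprocess(term)}%"
    rows = conn.execute(
        "SELECT filename FROM documents WHERE content LIKE ? ORDER BY filename",
        (pattern,),
    )
    return [row[0] for row in rows]


# Example with in-memory strings standing in for extracted PDF text.
records = [
    ("a.pdf", "Quarterly Revenue Report"),
    ("b.pdf", "Meeting notes: revenue & costs"),
]
conn = build_index(records)
hits_revenue = search(conn, "Revenue")
hits_costs = search(conn, "costs")
```

With real files, `records` would be built as `[(path, extract_text(path)) for path in pdf_paths]`. Because `LIKE` scans every row and matches substrings, this approach suits small collections; Whoosh or Elasticsearch would be the natural next step when ranking or scale matters.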