PLAYA is intended to get objects out of PDFs, with no dependencies or further analysis. So, on top of PLAYA, this package provides PDF, Analyse et Visualisation simplifiÉeS.
Or, if you prefer, PDF Analysis and Visualization for dummiES.
The goal here is not to provide elaborate, enterprise-grade, battle-tested, cloud and AI-native, completely configurable and confoundingly complex classes for ETL. It's to give you some helpful functions that you can use to poke around in PDFs and get useful things out of them, often but not exclusively in the context of a Jupyter notebook.
See the documentation at https://dhdaines.github.io/paves for more information. Some helpful notebooks will also be available soon.
Install it from PyPI (as paves) with pip or uv, preferably in a
virtual environment. That's all. If you want to play around in the
source code you can use hatch or uv (your choice), for instance:
# with hatch
hatch shell
# with uv
uv venv
. .venv/bin/activate
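If you just want to use the package rather than hack on it, the PyPI install mentioned above might look like the following. This is a minimal sketch, assuming a POSIX shell and a `python3` on your path; the `.venv` directory name is only a convention, not something PAVÉS requires.

```shell
# create and activate a virtual environment (the directory name is arbitrary)
python3 -m venv .venv
. .venv/bin/activate
# install PAVÉS from PyPI with pip
pip install paves
# or, if you use uv instead of pip:
# uv pip install paves
```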
PAVÉS is distributed under the terms of the
MIT license.