Releases: bellingcat/auto-archiver
Deletion detection, file metadata filtering, bug fixes, dependency updates
What's Changed
- implementing default metadata omission/user metadata selection by @mjgaughan in #359
- Fix #335: Add comprehensive deletion detection for removed/unavailable content by @m4cd4r4 in #377
- 1.2.0 dependencies, small bugs, 1st time contributors by @msramalho in #385
New Contributors
Full Changelog: v1.1.6...v1.2.0
Facebook reels fix
v1.1.5 dependencies update
What's Changed
- Correction of small documentation typos by @mjgaughan in #345
- updating the style-checking code in the documentation by @mjgaughan in #352
- WIP small improvements on multiple fronts. by @msramalho in #357
New Contributors
- @mjgaughan made their first contribution in #345
Full Changelog: v1.1.2...v1.1.5
v1.1.4
What's Changed
- Correction of small documentation typos by @mjgaughan in #345
- updating the style-checking code in the documentation by @mjgaughan in #352
- WIP small improvements on multiple fronts. by @msramalho in #357
New Contributors
- @mjgaughan made their first contribution in #345
Full Changelog: v1.1.2...v1.1.4
v1.1.2
What's Changed
- Address several small bugs, includes tiktok photos extraction, and data-saving for proxy usage in generic_extractor. by @msramalho in #341
Full Changelog: v1.1.1...v1.1.2
v1.1.1
What's Changed
- Bump @types/react from 19.1.7 to 19.1.8 in /scripts/settings in the actions group across 1 directory by @dependabot in #326
- fix to configuration editor npm versions by @msramalho in #328
- installs ffmpeg in readthedocs by @msramalho in #329
- 1.1.1 multiple small fixes, and new logging strategy by @msramalho in #331
Full Changelog: v1.1.0...v1.1.1
v1.1.0
What's Changed
- Changed log level on gsheet_feeder_db started from warning to info by @djhmateer in #301
- Replaces ScreenshotEnricher with AntibotExtractorEnricher, removes VkExtractor by @msramalho in #311
- counter_screenshots to counter_warc_files in wacz_extractor so don't … by @djhmateer in #310
- Bump the actions group in /scripts/settings with 4 updates by @dependabot in #308
- Introduces more flexibility to the Antibot Extractor by @msramalho in #313
- Bump webrecorder/browsertrix-crawler from 1.6.1 to 1.6.2 by @dependabot in #315
- Adds RedditDropin and other flow improvements by @msramalho in #318
- Antibot Dropin for Linkedin by @msramalho in #319
- V1 dm changes including logging by @djhmateer in #320
- v1.1.0 WIP by @msramalho in #312
  - Bug fixes
  - Deprecates Vk Extractor due to sub-dependency issues
  - Deprecates ScreenshotEnricher
  - Adds AntibotExtractorEnricher, which does what ScreenshotEnricher did and more, including captcha evasion, and reduces Firefox/geckodriver maintenance hassle
  - NPM/poetry package updates
  - Adds VkDropin to replace VkExtractor to archive with logins
  - Adds RedditDropin to archive with logins
  - Adds LinkedinDropin to archive with logins
  - Auto-documentation of new Antibot Dropins
  - 1.0.1 to 1.1.0 migration notes
  - Adds JsonEnricher by @djhmateer
  - Adds new logging configurations
Full Changelog: v1.0.1...v1.1.0
v1.0.1
What's Changed
- catch for if self.comments are true but no actual comments in video by @djhmateer in #303
- v1.0.1 dependency updates, generic extractor improvements by @msramalho in #307
Full Changelog: v1.0.0...v1.0.1
v1.0.0
🎉 Auto Archiver v1.0.0
We're excited to announce a major stable release of the Auto Archiver tool!
This release comes with some major improvements in stability and flexibility, with a modular and extendable architecture supporting a wide range of archiving extractors, enrichers, feeders, databases and storage methods.
✨ New Features
- We now have full documentation:
  📖 https://auto-archiver.readthedocs.io/en/latest/
  - (Psst: If you were set up for v0.12 or below, check out our upgrade guide to adapt to the new config format.)
- Getting set up for new users should be much easier using the new config editor:
  🛠 https://auto-archiver.readthedocs.io/en/latest/installation/config_editor.html
- New print_pdf option in the screenshot enricher #159
- Added an unauthenticated Bluesky archiver #160
- Major enhancements to the Generic Extractor (previously youtubedl_archiver.py):
  - Extending it to extract targeted content from additional sites using a Dropin structure
  - Auto-updating the yt-dlp module to ensure latest fixes and compatibility
  - Universal support for valid youtube-dl URLs
  - New TruthSocial extractor
- New settings page UI #217
- Added support for yt-dlp PO Token clients #222
- New unofficial API-based TikTok extractor #237
- Added InstagrAPI server script authenticated access for the instagram_api_extractor.py #281
🧹 Stability and Modularity
- The new modular structure means only modules selected in your config are loaded, keeping the system lightweight and more resilient: if a module you don’t need breaks, your pipeline continues to work.
- Multiple authentication strategies have been added to reduce the likelihood of platform blocking.
- The setup process now validates your config and gives detailed error messages in the output log.
- We’ve increased test coverage and integrated it with GitHub Actions for CI.
- Dependency and packaging management has been migrated to Poetry.
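To illustrate the "only selected modules are loaded" idea above, here is a minimal sketch of lazy, selection-driven loading. It is not the Auto Archiver's actual loader: the REGISTRY dict, the load_selected function, and the module names json_tools and broken_module are all hypothetical and exist only for this example.

```python
import importlib
from typing import Any, Callable, Dict, List


# Hypothetical loaders: each one imports its module only when called.
def _load_json_tools() -> Any:
    return importlib.import_module("json")


def _load_broken() -> Any:
    # Stands in for a module whose dependency is missing or broken.
    return importlib.import_module("module_that_does_not_exist_xyz")


# Hypothetical registry mapping a module name to its loader.
# Nothing is imported when this dict is built.
REGISTRY: Dict[str, Callable[[], Any]] = {
    "json_tools": _load_json_tools,
    "broken_module": _load_broken,
}


def load_selected(selected: List[str]) -> Dict[str, Any]:
    """Import only the modules named in `selected`; the rest are never touched."""
    return {name: REGISTRY[name]() for name in selected}


if __name__ == "__main__":
    # broken_module is registered but not selected, so its failing import never runs.
    loaded = load_selected(["json_tools"])
    print(sorted(loaded))
```

Because the broken module is imported only if it is selected, a pipeline that does not list it is unaffected, which is the resilience property the release notes describe.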
Coming soon!
- Bellingcat will release an article and video guide on the Auto Archiver in the coming weeks.
💬 Keep in Touch!
Whether it's a question, a bug report, or a feature request, please get in touch!
- Join our Discord thread for tool support and community discussion:
  🔗 https://discord.com/channels/709752884257882135/1346596825611632770
- Found a bug or want to contribute? Open a GitHub issue or PR using our contribution guide:
  🤝 https://auto-archiver.readthedocs.io/en/latest/contributing.html
🙌 Thanks
A huge thank you to everyone who has contributed, as well as those who provided feedback and ideas throughout development.
@msramalho @pjrobertson
v0.13.9
What's Changed
- Timestamping enricher rewrite - now works with latest ubuntu + fixes various other issues by @pjrobertson in #224
- Add explicit dependabots for pip/poetry, GH actions and npm by @pjrobertson in #269
- Minor improvements by @pjrobertson in #268
- Force-pins cryptography to >44.0.1 to fix dependabot warning by @pjrobertson in #278
Full Changelog: v0.13.8...v0.13.9