@Yvette-0508

- Integrated LlamaParse for multi-modal document parsing (PDF, DOCX, XLSX, HTML, images)
- Added Supabase cloud storage for extraction results
- Created a universal test script (`test_extractor.py`) with a `--store` flag
- Added support for audio transcription via OpenAI Whisper
- Added chunking with table and formula extraction from Markdown
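
For reviewers, the chunking item in the list above could look roughly like the sketch below. This is not the code from this PR: `Chunk`, `chunk_markdown`, and the choice of `$$ ... $$` as the formula delimiter are assumptions made for illustration, and the real extractor may split content differently.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    kind: str      # "text", "table", or "formula" (hypothetical labels)
    content: str


# A Markdown table row starts and ends with a pipe character.
TABLE_ROW = re.compile(r"^\s*\|.*\|\s*$")


def chunk_markdown(markdown: str) -> list[Chunk]:
    """Split Markdown into text, table, and block-formula chunks."""
    chunks: list[Chunk] = []
    buffer: list[str] = []
    kind = "text"
    in_formula = False

    def flush() -> None:
        # Emit the buffered lines as one chunk of the current kind.
        content = "\n".join(buffer).strip()
        if content:
            chunks.append(Chunk(kind, content))
        buffer.clear()

    for line in markdown.splitlines():
        stripped = line.strip()

        if in_formula:
            if stripped.endswith("$$"):
                buffer.append(stripped[:-2])
                flush()
                kind, in_formula = "text", False
            else:
                buffer.append(line)
            continue

        if stripped.startswith("$$"):
            flush()
            kind = "formula"
            body = stripped[2:]
            if body.endswith("$$"):
                # Single-line formula: $$ ... $$
                buffer.append(body[:-2])
                flush()
                kind = "text"
            else:
                buffer.append(body)
                in_formula = True
            continue

        line_kind = "table" if TABLE_ROW.match(line) else "text"
        if line_kind != kind or not stripped:
            # A blank line or a change of kind closes the current chunk.
            flush()
            kind = line_kind
        if stripped:
            buffer.append(line)

    flush()
    return chunks
```

Calling `chunk_markdown(text)` on parser output returns the chunks in document order, so each table or formula can be stored or indexed separately from the surrounding prose.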
