Document intelligence framework for Python - Extract text, metadata, and structured data from PDFs, images, Office documents, and more. Built on Pandoc, PDFium, and Tesseract.
Updated Oct 19, 2025 - HTML
Documentation repository for the Egeria project.
Twitter, Instagram, and GeoTagging Media Intelligence. One-stop information about Social Networking sites.
A minimalist web server that aims to generate metadata for PDF files. It is also a utility program that contributes to Litterarum.
An Express app implementing a file metadata scanner.
A standalone desktop application for digital forensic analysis. This tool extracts file metadata and generates an interactive, visual timeline of file creation, modification, and access events. Built with Python, Flask, and SQLite.
Projects for IHRD internship screening.