Pull requests: Unstructured-IO/unstructured

Pull requests list

fix: update deps to resolve cve
#4093 opened Sep 9, 2025 by qued, updated Sep 9, 2025
⚡️ Speed up function group_broken_paragraphs by 30%
#4088 opened Aug 26, 2025 by aseembits93, updated Sep 5, 2025
⚡️ Speed up function _assign_hash_ids by 34%
#4089 opened Aug 28, 2025 by aseembits93, updated Sep 5, 2025
⚡️ Speed up method ElementHtml._get_children_html by 234%
#4087 opened Aug 26, 2025 by aseembits93, updated Sep 5, 2025
bugfix/fix missing extensions in file detection
#3926 opened Feb 18, 2025 by rbiseck3, updated Sep 4, 2025
fix: None text attribute when normalizing Picture to Image element
#4083 opened Aug 22, 2025 by ishahroz, updated Aug 22, 2025
Switch from pdfminer to paves to improve robustness and use multiple CPUs
#4067 opened Jul 19, 2025 by dhdaines, updated Jul 21, 2025
Feature/remove unnessary re for table ele in pdf
#3984 opened Apr 9, 2025 by JIAQIA, updated Jul 1, 2025
Config for VoyageAI's v3.5 embedding models
#4004 opened May 21, 2025 by fzowl, updated May 26, 2025
Prefer using provided filename over detection from file.name
#3786 opened Nov 19, 2024 by framp, updated Apr 4, 2025
Improve readability of the text by adding new line to the end of row
#3913 opened Feb 7, 2025 by Sheripov, updated Feb 7, 2025
fix: preserve text after line breaks in PowerPoint table cells
#3877 opened Jan 18, 2025 by yamazombie, updated Feb 5, 2025
feat: Allow deactivating OCR entirely with hi_res strategy
#3839 opened Dec 17, 2024 by dhdaines, updated Jan 30, 2025
Fix typing issue in inference_utils.py
#3716 opened Oct 12, 2024 by cckolon, updated Jan 22, 2025
Add password
#3876 opened Jan 18, 2025 by Coniferish (Draft), updated Jan 21, 2025
add post chunking strategy
#3869 opened Jan 16, 2025 by tbs17 (Draft), updated Jan 16, 2025
fix: Fix issue #3815
#3835 opened Dec 17, 2024 by PhorstenkampFuzzy, updated Dec 19, 2024
#3713 fix the wrong file path in README.md (label: documentation)
#3714 opened Oct 10, 2024 by shaofengshi, updated Dec 16, 2024
fix: when convert doc to docx, UnicodeDecodeError may be raised
#3830 opened Dec 14, 2024 by YooshiJay, updated Dec 14, 2024
Add a note to README.md about CHANGELOG.md
#3824 opened Dec 12, 2024 by dhdaines, updated Dec 12, 2024
chore: Switch to v4 of upload artifact
#3820 opened Dec 10, 2024 by fxdgear, updated Dec 10, 2024
update scarf_analytics() GET request with timeouts
#3780 opened Nov 13, 2024 by garyfanhku, updated Nov 13, 2024
fixed pdf path error.
#3777 opened Nov 9, 2024 by mzdz, updated Nov 9, 2024
build(deps): bump ruff from 0.4.10 to 0.7.2 in /requirements (labels: dependencies, python)
#3771 opened Nov 1, 2024 by dependabot bot, updated Nov 1, 2024