Starred repositories
This is a Hijab Detection Project Folder
A polyglot document intelligence framework with a Rust core. Extract text, metadata, and structured information from PDFs, Office documents, images, and 75+ formats. Available for Rust, Python, Rub…
PDF to markdown using vision LLMs — tables, layouts, and structure preserved
Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full desktops (macOS, Linux, Windows).
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal is…
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Bare metal to production-ready in minutes; your own fly server on your VPS.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
OpenPanel is an open-source web and product analytics platform, an open-source alternative to Mixpanel with optional self-hosting.
llama3.np is a pure NumPy implementation of the Llama 3 model.
A low-order, approximate aerodynamics model for rigid bodies simulated in Unity
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
The world's largest GitHub Repository for LLMs + Robotics
CVMHT: Complementary-View Multiple Human Tracking (AAAI 2020).
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Multi-camera live traffic and object counting with YOLO v4, Deep SORT, and Flask.
State-of-the-art 2D and 3D Face Analysis Project
A human detector and tracker written in Python, using YOLOv7 for detection and DeepSORT for tracking those detections.
A library for efficient similarity search and clustering of dense vectors.
The world's simplest facial recognition API for Python and the command line
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
