Krasis is a hybrid LLM runtime focused on efficiently running larger models on consumer-grade, VRAM-limited hardware.
Updated Mar 20, 2026 - Rust
Android inference engine running 20B+ parameter LLMs on 4GB-8GB RAM devices. Features proprietary Layer-by-Layer (LBL) streaming, zero-copy mmap loading, and native C++/Kotlin architecture.
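The zero-copy mmap loading mentioned above can be sketched in a few lines. This is a hypothetical illustration, not the engine's actual format: it assumes a raw file of little-endian float32 weights and shows that a memory-mapped view lets values be read in place without copying the file into process buffers first.

```python
import mmap
import os
import struct
import tempfile

# Write a small stand-in "weight file" of four float32 values.
path = os.path.join(tempfile.mkdtemp(), "weights.bin")
with open(path, "wb") as f:
    f.write(struct.pack("<4f", 1.0, 2.0, 3.0, 4.0))

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # A memoryview over the mapping: no bytes are copied until an
    # individual value is actually unpacked.
    view = memoryview(mm)
    first = struct.unpack_from("<f", view, 0)[0]
    view.release()
    mm.close()
```

Because the OS pages the mapping in on demand, a layer-streaming design can touch only the layer it needs next, which is what makes this approach attractive on 4GB-8GB devices.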
Convert and quantize LLM models.
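Quantization of the kind this tool performs can be sketched with a toy symmetric int8 scheme. This is an assumption-laden illustration: real converters (such as llama.cpp's GGUF quantizers) use block-wise formats with per-block scales, while this example quantizes one whole tensor with a single scale.

```python
def quantize_int8(values):
    # One scale for the whole tensor; real formats use per-block scales.
    scale = max(abs(v) for v in values) / 127 or 1.0
    q = [round(v / scale) for v in values]
    return q, scale

def dequantize_int8(q, scale):
    return [x * scale for x in q]

weights = [0.1, -0.5, 0.25, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# Quantization is lossy: each restored value differs from the original
# by at most half the scale step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The trade-off is the usual one: int8 storage is 4x smaller than float32 at the cost of a bounded per-weight rounding error.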
Privacy-first Local RAG Server: Chat with PDF & DOCX using GGUF models via llama.cpp and Qdrant. A lightweight, standalone FastAPI server with a clean HTML UI. High-performance, fully offline document intelligence. No Ollama, no cloud, no API keys.
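The retrieval half of such a RAG pipeline can be sketched as: embed the document chunks, embed the query, and rank chunks by cosine similarity. The bag-of-words "embedding" and fixed vocabulary below are stand-ins for illustration; a real server would use a proper embedding model and a vector store such as Qdrant.

```python
import math

def embed(text):
    # Toy embedding: word-count vector over a tiny fixed vocabulary.
    vocab = ["offline", "document", "pdf", "model", "server"]
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

chunks = [
    "the server runs fully offline",
    "upload a pdf document to chat with it",
]
query = "which document formats can I upload as a pdf"
ranked = sorted(chunks, key=lambda c: cosine(embed(c), embed(query)),
                reverse=True)
```

The top-ranked chunk is then stuffed into the LLM prompt as context, which is the "chat with your documents" step.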
Splinter is an atomic, lock-free, persistable shared-memory KV & vector store that runs LLM inference without socket, mutex, or memcpy() overhead; it ingests, stores, and optionally persists huge amounts of data with minimal latency. Splinter fits in the size of most modern CPU instruction caches (875 LOC) and ships with a CLI, tools, and tests.
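The core idea of a shared-memory store avoiding socket and memcpy() round-trips can be sketched as follows. This only shows the transport, and is not Splinter's implementation: its atomic layout, lock-free operations, and persistence are far more involved. Here, a second handle attaches to the same region by name and reads the same physical pages directly.

```python
from multiprocessing import shared_memory

# "Writer" side: create a named shared-memory region and store a value.
shm = shared_memory.SharedMemory(create=True, size=64)
try:
    payload = b"value-for-key-42"
    shm.buf[:len(payload)] = payload

    # "Reader" side: attach by name. Both handles map the same pages,
    # so no bytes cross a socket and nothing is copied between processes.
    reader = shared_memory.SharedMemory(name=shm.name)
    got = bytes(reader.buf[:len(payload)])
    reader.close()
finally:
    shm.close()
    shm.unlink()
```

In a real multi-process setup the reader would run in a separate process and coordinate with the writer via atomic flags in the region itself rather than locks, which is the lock-free part of the design.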
A simple Gradio app for local translation using the GGUF versions of MADLAD-400
Emotica AI is a compassionate and therapeutic virtual assistant designed to provide empathetic and supportive conversations. It integrates a local LLaMA model for text generation, a vision model for image captioning, a RAG system for information retrieval, and emotion detection to tailor its responses.
Nectar-X-Studio is a powerful, local AI-inferencing application that lets the user download, create, and run agents and large language models on their own machine. With no internet connection required, Nectar ensures privacy-first, high-performance inference using cutting-edge open-source models from Hugging Face, Ollama, and beyond.
Containerized LLM for any use-case big or small
AI tool that helps users research topics using local LLMs and automated web search.