Top 100 AI Open Source
100 Projects · 8 Categories
Updated 2025
100 Most-Starred AI & ML Open Source Repos
Curated by GitHub star count and organized by category: frameworks, agents, LLMs, image generation, voice, code assistants, RAG, and developer tooling.
5M+ Total Stars
AI Tooling
Platforms, SDKs & productivity tools
13 repos
★ 108k
Curated collection of the best ChatGPT prompt examples and personas for developers and power users.
★ 65k
LLM-enhanced academic research interface with paper reading, code editing, and LaTeX support.
★ 61k
Open-source LLM app development platform with visual orchestration, RAG pipeline, and model hub.
★ 47k
Extendable workflow automation with 200+ integrations, AI nodes, and self-hosting support.
★ 47k
Modern AI chat framework with plugin marketplace, vision, TTS, and multi-model support.
★ 33k
Local private knowledge Q&A system combining LangChain with local LLMs for enterprise use.
★ 22k
Official Python library for the OpenAI API with async support, streaming, and typed responses.
★ 17k
End-to-end NLP framework for building custom Q&A, document search, and RAG pipelines.
★ 14k
Framework for evaluating LLMs and LLM systems with built-in and custom evaluation metrics.
★ 18k
Constrained LLM generation with templates, output validation, and structured response control.
★ 9.4k
Build, evaluate, and deploy high-quality LLM flows from prototype to production with monitoring.
★ 16k
Parameter-efficient fine-tuning: LoRA, Prefix Tuning, P-Tuning for large pretrained models.
AI Agents
Autonomous agents, multi-agent frameworks & orchestration
15 repos
★ 167k
Experimental autonomous AI agent that chains GPT calls to accomplish complex long-horizon tasks without human input.
★ 92k
Build context-aware reasoning applications with LLMs using chains, agents, tools, and memory modules.
★ 55k
LLM-powered terminal that writes and runs code locally — a natural-language interface to your computer.
★ 44k
Multi-agent LLM framework assigning PM, architect, and engineer roles to build complete software projects.
★ 34k
Enable next-generation LLM applications via multi-agent conversations with human-in-the-loop support.
★ 34k
Platform for autonomous AI software agents that can write, edit, run, debug, and browse the web.
★ 24k
Connect ChatGPT with hundreds of Hugging Face expert models to solve complex AI tasks step-by-step.
★ 26k
Communicative agents for software development — LLM-powered team of PM, developer, and QA tester.
★ 22k
SDK integrating LLMs into .NET, Python, and Java apps with plugins, planners, and memory.
★ 22k
Framework for orchestrating role-based autonomous AI agent teams for complex collaborative tasks.
★ 17k
Lightweight, experimental multi-agent orchestration framework for educational purposes from OpenAI.
★ 15k
Build AI assistants with memory, knowledge, tools, and reasoning from a unified agent framework.
★ 9k
Automate browser workflows using LLMs and computer vision — no brittle CSS selectors needed.
★ 6k
Communicative Agents for Mind Exploration — role-playing multi-agent framework for LLM cooperation.
★ 5k
Build real-time multimodal voice and video AI agents with production-ready WebRTC infrastructure.
LLM & Models
Large language model repos, serving engines & fine-tuning
30 repos
★ 131k
State-of-the-art pretrained models for NLP, vision, and audio on PyTorch, TensorFlow, and JAX.
★ 95k
Run Llama 3, Mistral, Gemma, and 80+ large language models locally with a single command.
★ 70k
Run LLMs privately and offline on CPU and GPU; a privacy-first local AI assistant for everyone.
★ 66k
LLM inference in pure C/C++ with quantization; runs Llama, Mistral, and Gemma on CPU, no GPU required.
★ 57k
Open foundation language model from Meta — 7B to 65B parameters for research and production use.
★ 40k
Bilingual Chinese-English dialogue language model based on the General Language Model architecture.
★ 37k
Minimal GPT-2 training and inference codebase; the simplest, fastest repo for training medium-size GPTs.
★ 36k
Open platform for training, serving, and evaluating LLMs — home of Vicuna and Chatbot Arena.
★ 35k
Unified fine-tuning framework for 100+ LLMs with LoRA, QLoRA, full-parameter, and RLHF support.
★ 35k
High-throughput, memory-efficient LLM inference and serving engine powered by PagedAttention.
★ 28k
LLM training in pure C and CUDA with no PyTorch dependency — fast, educational GPT-2 implementation.
★ 26k
Next-generation open foundation models from Meta with 8B, 70B, and 405B parameter variants.
★ 24k
Free, open-source OpenAI-compatible local inference server for LLMs, image, and audio generation.
★ 22k
Fine-tune Llama 3, Mistral, and Gemma 2× faster with 50% less VRAM and no accuracy loss.
★ 20k
Clean, minimal PyTorch re-implementation of GPT training — ~300 lines, pure education-first codebase.
★ 20k
Visual instruction tuning for large multimodal models; connects a CLIP vision encoder to a language model.
★ 14k
Series of open language models from Alibaba with strong multilingual and coding capabilities.
★ 12k
Linear-time sequence model with selective state spaces — a compelling alternative to Transformers at scale.
★ 10k
Run any open-source LLM as an OpenAI-compatible REST API server with easy local and cloud deployment.
★ 9k
Reference implementation of Mistral 7B — an efficient, high-performance open-weights language model.
★ 8k
Edge-capable language models with strong performance for mobile and IoT — small but mighty LLMs.
★ 8k
Production-grade toolkit for deploying and serving Large Language Models at scale with low latency.
★ 8k
Mixture-of-Experts language model — 236B total parameters, 21B active, strong and economical inference.
★ 7k
20B parameter autoregressive language model trained on the Pile dataset — open and fully replicable.
★ 6k
Vision-language foundation models matching closed-source GPT-4V performance on major visual benchmarks.
★ 5k
Samples and guides for Microsoft's Phi-3 small language models — fine-tuning, inference, and deployment.
★ 5k
Lightweight open models built with Gemini research — capable yet compact LLMs for diverse use cases.
★ 4k
Large vision-language model supporting image, text, and multi-region visual understanding and generation.
★ 4k
PyTorch-native library for fine-tuning and experimenting with LLMs using simple, modular APIs.
★ 4k
General language model with all-tools capability — function calling, code, browsing, and image understanding.
Image Generation
Diffusion models, visual AI & creative tools
8 repos
★ 141k
Feature-rich browser interface for Stable Diffusion with extensions, scripts, and advanced controls.
★ 59k
Powerful modular node-based UI for Stable Diffusion workflows with custom extension support.
★ 37k
Latent text-to-image diffusion model for high-resolution photorealistic image synthesis from text prompts.
★ 29k
Neural network for adding spatial conditioning controls to Stable Diffusion via auxiliary encoders.
★ 28k
Practical algorithms for general real-world image and video super-resolution using enhanced ESRGAN.
★ 25k
State-of-the-art diffusion model library for image, audio, and 3D generation — training and inference.
★ 23k
Professional Stable Diffusion toolkit for creatives with a node editor, canvas, and custom workflows.
★ 7k
Large-scale text-to-video generation model pre-trained on billions of text-video aligned pairs.
Voice & Audio
Speech recognition, TTS, and voice cloning
8 repos
★ 72k
Robust general-purpose speech recognition trained on 680,000 hours of multilingual web audio data.
★ 37k
Voice cloning from 1 minute of audio with few-shot and zero-shot TTS; fine-tune your own voice model quickly and easily.
★ 35k
High-performance C/C++ port of OpenAI Whisper enabling real-time, on-device speech transcription.
★ 35k
Deep learning text-to-speech toolkit with pretrained models covering 1,100+ languages.
★ 34k
Text-prompted generative audio model producing voice, music, sound effects, and multilingual speech.
★ 29k
Instant, flexible voice cloning with precise tone, style, and language control for diverse speakers.
★ 12k
Multi-voice TTS system producing remarkably realistic speech; slow, but with exceptional audio quality.
★ 7k
Open-source audio, music, and speech generation toolkit providing a unified research platform.
ML Frameworks
Core training, distributed computing & scientific ML
11 repos
★ 185k
End-to-end open source ML platform for training and deploying models across every environment and device.
★ 82k
Tensors and dynamic neural networks with strong GPU acceleration — the go-to deep learning framework.
★ 34k
Deep learning optimization library enabling training of 100B+ parameter models with extreme efficiency.
★ 32k
Unified framework for scaling AI and Python applications from a laptop to a large distributed cluster.
★ 29k
Composable transformations of NumPy programs: autograd, JIT compilation, vmap, and pmap on CPUs, GPUs, and TPUs.
★ 26k
Simple deep learning framework with a ~1000-line Python core — for learning and fast inference on edge.
★ 18k
Open-source platform for the full ML lifecycle — experiment tracking, model registry, and deployment.
★ 12k
AI system predicting protein 3D structures with near-atomic accuracy — a landmark biology breakthrough.
★ 12k
A language and compiler for writing highly efficient custom GPU kernels without requiring CUDA expertise.
★ 8k
Run PyTorch training on any distributed configuration with minimal code changes and maximum performance.
★ 6k
Neural network library built on JAX — focused on flexibility and performance for deep learning research.
RAG & Vector Search
Retrieval-augmented generation, vector databases & document Q&A
9 repos
★ 53k
Ask questions to your documents using LLMs locally — full privacy, no data ever leaves your machine.
★ 36k
Data framework for connecting LLMs to external knowledge, databases, and structured data sources.
★ 31k
Cloud-native vector database built for scalable similarity search and AI-powered application backends.
★ 20k
Chat privately with your local documents using LLMs — zero data leaves your device at any time.
★ 20k
Clean, customizable RAG-based UI for chatting with your documents — easy to self-host and extend.
★ 20k
ChatGPT plugin enabling semantic document search using various vector database backends.
★ 20k
Vector database and search engine built in Rust — designed for high performance AI application backends.
★ 16k
Open-source AI-native embedding database for building LLM apps with blazing-fast semantic retrieval.
★ 11k
Open-source ML-native vector database — store objects and vectors for semantic search and RAG pipelines.
Code AI
AI coding assistants, code search & completion
4 repos
★ 22k
Self-hosted, open-source AI coding assistant — a privacy-friendly GitHub Copilot alternative you own.
★ 18k
Open-source AI code assistant integrating deeply into VS Code and JetBrains with any LLM backend.
★ 12k
Code-focused LLM trained on 2T tokens of code and natural language — strong fill-in-the-middle performance.
★ 10k
AI-powered code search and navigation — search your entire codebase using natural language queries.