Top Python libraries of 2025

technical|Dec 18, 2025

Alan Descoins

Chief Executive Officer (CEO)

Federico Bello

Machine Learning Engineer

Welcome to the 11th edition of our yearly roundup of the top Python libraries!

If 2025 felt like the year of Large Language Models (LLMs) and agents, it’s because it truly was. The ecosystem expanded at incredible speed, with new models, frameworks, tools, and abstractions appearing almost weekly.

That created an unexpected challenge for us: with so much momentum around LLMs, agent frameworks, retrievers, orchestrators, and evaluation tools, this year’s Top 10 could’ve easily turned into a full-on LLM list. We made a conscious effort to avoid that.

Instead, this year’s selection highlights two things:

The LLM world is evolving fast, and we surface the libraries that genuinely stood out.
But Python remains much broader than LLMs, with meaningful progress in data processing, scientific computing, performance, and overall developer experience.

The result is a balanced, opinionated selection featuring our Top 10 picks for each category, plus notable runners-up, reflecting how teams are actually building AI systems today by combining Python’s proven foundations with the new wave of agentic and LLM-driven tools.

Let’s dive into the libraries that shaped 2025.

Top 10 Python Libraries - General use

1. ty - a blazing-fast type checker built in Rust

Python's type system has become essential for modern development, but traditional type checkers can feel sluggish on larger codebases. Enter ty, an extremely fast Python type checker and language server written in Rust by Astral (creators of Ruff and uv).

ty prioritizes performance and developer experience from the ground up. Getting started is refreshingly simple: you can try the online playground or run uvx ty check to analyze your entire project. The tool automatically discovers your project structure, finds your virtual environment, and checks all Python files without extensive configuration. It respects your pyproject.toml, automatically detects .venv environments, and can target specific files or directories as needed.

Beyond raw speed, ty represents Astral's continued investment in modernizing Python's tooling ecosystem. The same team that revolutionized linting with Ruff and package management with uv is now tackling type checking with the same conviction: developer tools should be fast enough to fade into the background. As both a standalone type checker and a language server, ty provides real-time editor feedback. Notably, ty uses Salsa for function-level incremental analysis: when you modify a single function, only that function and its dependents are rechecked, not the entire module. This fine-grained approach delivers particularly responsive IDE experiences.
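To make the idea of function-level invalidation concrete, here is a minimal sketch of the bookkeeping involved. It is not ty's or Salsa's implementation (Salsa also memoizes results and can stop early when an output is unchanged); the function name and the dependency map are hypothetical, chosen only for illustration.

```python
from collections import defaultdict, deque


def functions_to_recheck(changed: str, calls: dict[str, set[str]]) -> set[str]:
    """Return the changed function plus everything that depends on it.

    `calls` is a hypothetical map from each function to the functions it uses.
    """
    # Invert the graph: for every function, record who depends on it.
    dependents: dict[str, set[str]] = defaultdict(set)
    for caller, callees in calls.items():
        for callee in callees:
            dependents[callee].add(caller)

    # Breadth-first walk from the edited function through its dependents.
    dirty = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in dirty:
                dirty.add(dependent)
                queue.append(dependent)
    return dirty


# Hypothetical module: main uses load and render, render uses format_row.
calls = {
    "main": {"load", "render"},
    "render": {"format_row"},
    "load": set(),
    "format_row": set(),
}
stale = functions_to_recheck("format_row", calls)
```

Editing format_row marks render and main as stale, while load keeps its cached result. That is the property that keeps editor feedback fast on large modules.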

Alongside Meta's recently released pyrefly, ty represents a new generation of Rust-powered type checkers—though with fundamentally different approaches. Where pyrefly pursues aggressive type inference that may flag working code, ty embraces the "gradual guarantee": removing type annotations should never introduce new errors, making it easier to adopt typing incrementally.

It's important to note that ty is currently in preview and not yet ready for production use. Expect bugs, missing features, and occasional issues. However, for personal projects or experimentation, ty provides valuable insight into the direction of Python tooling. With Astral's track record and ongoing development momentum, ty is worth keeping on your radar as it matures toward stable release.

2. complexipy - measures how hard it is to understand the code

Code complexity metrics have long been a staple of software quality analysis, but traditional approaches like cyclomatic complexity often miss the mark when it comes to human comprehension. complexipy takes a different approach: it uses cognitive complexity, a metric that aligns with how developers actually perceive code difficulty. Built in Rust for speed, this tool helps identify code that genuinely needs refactoring rather than flagging mathematically complex but readable patterns.

Cognitive complexity, originally researched by SonarSource, measures the mental effort required to understand code rather than the number of execution paths. This human-focused approach penalizes nested structures and interruptions in linear flow, which is where developers typically struggle. complexipy brings this methodology to Python with a straightforward interface: complexipy . analyzes your entire project, while complexipy path/to/code.py --max-complexity-allowed 10 lets you enforce custom thresholds. The tool supports both command-line usage and a Python API, making it adaptable to various workflows.
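To see why nesting matters, consider a deliberately simplified scorer built on the standard library's ast module. This is a sketch of the idea, not complexipy's algorithm or API: it only counts if/elif/else and loops, adds one extra point per nesting level, and ignores boolean operators, exception handlers, and recursion.

```python
import ast


def cognitive_complexity(source: str) -> int:
    """Simplified score: +1 per branch or loop, plus +1 per level of nesting."""
    return _score(ast.parse(source).body, nesting=0)


def _score(stmts, nesting: int) -> int:
    total = 0
    for node in stmts:
        if isinstance(node, (ast.For, ast.AsyncFor, ast.While)):
            total += 1 + nesting
            total += _score(node.body, nesting + 1)
            total += _score(node.orelse, nesting + 1)
        elif isinstance(node, ast.If):
            total += 1 + nesting
            total += _score(node.body, nesting + 1)
            total += _score_else(node.orelse, nesting)
        else:
            # Functions, classes, with/try blocks: descend without a penalty.
            for field in ("body", "orelse", "finalbody"):
                total += _score(getattr(node, field, []), nesting)
    return total


def _score_else(orelse, nesting: int) -> int:
    if len(orelse) == 1 and isinstance(orelse[0], ast.If):
        # `elif`: counts as one more branch, without an extra nesting penalty.
        branch = orelse[0]
        return 1 + _score(branch.body, nesting + 1) + _score_else(branch.orelse, nesting)
    if orelse:
        return 1 + _score(orelse, nesting + 1)  # plain `else`
    return 0


FLAT = """
def f(a, b):
    if a:
        return 1
    if b:
        return 2
    return 0
"""

NESTED = """
def f(a, b):
    if a:
        if b:
            return 2
    return 0
"""

flat_score = cognitive_complexity(FLAT)
nested_score = cognitive_complexity(NESTED)
```

Both functions contain two if statements, so a path-counting metric treats them alike. The nested version scores higher here because the inner branch has to be read in the context of the outer one.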

The project includes a GitHub Action for CI/CD pipelines, a pre-commit hook to catch complexity issues before they're committed, and a VS Code extension that provides real-time analysis with visual indicators as you code. Configuration is flexible through TOML files or pyproject.toml, and the tool can export results to JSON or CSV for further analysis. The Rust implementation ensures that even large codebases are analyzed quickly, a genuine advantage over pure-Python alternatives.

complexipy fills a specific niche: teams looking to enforce code maintainability standards with metrics that actually reflect developer experience. The default threshold of 15 aligns with SonarSource's research recommendations, though you can adjust this based on your team's tolerance. The tool is mature, with active maintenance and a growing community of contributors. For developers tired of debating subjective code quality, complexipy offers objective, research-backed measurement that feels intuitive rather than arbitrary.

If you care about maintainability grounded in actual developer experience, make room for this tool in your CI/CD pipeline.

3. Kreuzberg - extracts data from 50+ file formats

Working with documents in production often means choosing between convenience and control. Cloud-based solutions offer powerful extraction but introduce latency, costs, and privacy concerns. Local libraries provide autonomy but typically lock you into a single language ecosystem. Kreuzberg takes a different approach: a Rust-powered document intelligence framework that brings native performance to Python, TypeScript, Ruby, Go, and Rust itself, all from a single codebase.

At its core, Kreuzberg handles over 50 file format families—PDFs, Office documents, images, HTML, XML, emails, and archives—with consistent APIs across all supported languages. Language bindings follow ecosystem conventions while maintaining feature parity, so whether you're calling extract_file() in Python or the equivalent in TypeScript, you're accessing the same capabilities. This eliminates the common frustration of discovering that a feature exists in one binding but not another.

Kreuzberg's deployment flexibility stands out. Beyond standard library usage, it ships as a CLI tool, a REST API server with OpenAPI documentation, a Model Context Protocol server for AI assistants, and official Docker images. For teams working across different languages or deployment scenarios, this versatility means standardizing on one extraction tool rather than maintaining separate solutions. The OCR capabilities deserve attention too: built-in Tesseract support across all bindings, with Python additionally supporting EasyOCR and PaddleOCR. The framework includes intelligent table detection and reconstruction, while streaming parsers maintain constant memory usage even when processing multi-gigabyte files.

If your organization spans multiple languages and needs consistent, reliable extraction, Kreuzberg is well worth a serious look.

4. throttled-py - control request rates with five algorithms

Rate limiting is one of those unglamorous but essential features that every production application needs. Whether you're protecting your API from abuse, managing third-party API calls to avoid exceeding quotas, or ensuring fair resource allocation across users, proper rate limiting is non-negotiable. throttled-py addresses this need with a focused, high-performance library that brings together five proven algorithms and flexible storage options in a clean Python package.

What sets throttled-py apart is its comprehensive approach to algorithm selection. Rather than forcing you into a single strategy, it supports Fixed Window, Sliding Window, Token Bucket, Leaky Bucket, and Generic Cell Rate Algorithm (GCRA), each with its own trade-offs among precision, memory usage, and performance. This flexibility matters because different applications have different needs: a simple API might work fine with Fixed Window's minimal overhead, while a distributed system handling bursty traffic might benefit from Token Bucket or GCRA. The library makes it straightforward to switch between algorithms, letting you choose the right tool for your specific constraints.
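For readers new to these algorithms, a token bucket fits in a few lines. The class below is a standalone sketch, not throttled-py's API: the class name, parameters, and injectable clock are assumptions made so the behavior is easy to follow and test.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock          # injectable so tests can control time
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self, cost: int = 1) -> bool:
        now = self._clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


# Demo with a fake clock: a burst of two, then one more token per second.
fake_now = [0.0]
bucket = TokenBucket(rate=1, capacity=2, clock=lambda: fake_now[0])
burst = [bucket.allow(), bucket.allow(), bucket.allow()]
fake_now[0] = 1.0
after_one_second = bucket.allow()
```

The burst allowance is what distinguishes this from a fixed window: a client can spend saved-up tokens at once, but its sustained rate is still capped by the refill rate.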

Performance is another area where throttled-py delivers tangible benefits. Benchmarks show in-memory operations costing roughly 2.5-4.5x as much as a basic dictionary operation, while Redis-backed limiting performs comparably to raw Redis commands. Getting started takes just a few lines: install via pip, configure your quota and algorithm, and you're limiting requests. The API supports decorators, context managers, and direct function calls, with identical syntax for both synchronous and asynchronous code. Wait-and-retry behavior is available when you need automatic backoff rather than immediate rejection.

The library supports both in-memory storage (with built-in LRU eviction) and Redis, making it suitable for single-process applications and distributed systems alike. Thread safety is built in, and the straightforward configuration model means you can share rate limiters across different parts of your codebase by reusing the same storage backend. The documentation is clear and includes practical examples for common patterns like protecting API routes or throttling external service calls.

throttled-py is actively maintained and offers a modern, flexible approach to Python rate limiting. While it doesn’t yet have the ecosystem recognition of older libraries like Flask-Limiter, it brings contemporary Python practices—including full async support—to a space that hasn’t seen much innovation recently. For developers needing reliable rate limiting with algorithm flexibility and good performance characteristics, throttled-py offers a compelling option worth evaluating against your specific requirements.

A solid, modern option for teams that want rate limiting to be reliable, flexible, and out of the way.

5. httptap - timing HTTP requests with waterfall views

When troubleshooting HTTP performance issues or debugging API integrations, developers often find themselves reaching for curl and then manually parsing timing information or piecing together what went wrong. httptap addresses this diagnostic gap with a focused approach: it dissects HTTP requests into their constituent phases—DNS resolution, TCP connection, TLS handshake, server wait time, and response transfer—and presents the data in formats ranging from rich terminal visualizations to machine-readable metrics.

Built on httpcore’s trace hooks, httptap provides precise measurements for each phase of an HTTP transaction. The tool captures network-level details that matter for diagnosis: IPv4 or IPv6 addresses, TLS certificate information including expiration dates and cipher suites, and timing breakdowns that reveal whether slowness stems from DNS lookups, connection establishment, or server processing. Beyond simple GET requests, httptap supports all standard HTTP methods with request body handling, automatically detecting content types for JSON and XML payloads. The --follow flag tracks redirect chains with full timing data for each hop, making it straightforward to understand multi-step request flows.

The real utility emerges in httptap's output flexibility. The default rich mode presents a waterfall timeline in your terminal—immediately visual and informative for interactive debugging. Switch to --compact for single-line summaries suitable for log files, or --metrics-only for raw values that pipe cleanly into scripts for performance monitoring and regression testing. The --json export captures complete request data including redirect chains and response headers, enabling programmatic analysis or historical tracking of API performance baselines.
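As a rough illustration of what a waterfall view conveys, the sketch below renders per-phase durations as offset bars. It is not httptap's code or output format; the function name, layout, and timing numbers are made up for the example.

```python
def render_waterfall(phases: list[tuple[str, float]], width: int = 40) -> str:
    """Draw one bar per phase, offset by the time spent in earlier phases."""
    total = sum(duration for _, duration in phases) or 1.0  # avoid dividing by zero
    lines = []
    offset = 0.0
    for name, duration in phases:
        start = round(offset / total * width)
        length = max(1, round(duration / total * width))
        bar = " " * start + "#" * length
        lines.append(f"{name:<10}{bar} {duration * 1000:.0f} ms")
        offset += duration
    return "\n".join(lines)


# Hypothetical timings in seconds for a single HTTPS request.
phases = [
    ("dns", 0.020),
    ("connect", 0.030),
    ("tls", 0.050),
    ("wait", 0.080),
    ("transfer", 0.020),
]
waterfall = render_waterfall(phases)
print(waterfall)
```

Laid out this way, it is easy to see at a glance that server wait time, not DNS or the TLS handshake, dominates this imaginary request.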

For developers who need customization, httptap exposes clean protocol interfaces for DNS resolution, TLS inspection, and request execution. This extensibility allows you to swap in custom resolvers or modify request behavior without forking the project. The tool also includes practical features for real-world debugging: curl-compatible flag aliases for easy adoption, proxy support for routing traffic through development environments, and the ability to bypass TLS verification when working with self-signed certificates in test environments.

Your debugging sessions just got easier.

6. fastapi-guard - security middleware for FastAPI apps

Security in modern web applications is often an afterthought—bolted on through scattered middleware, manual IP checks, and reactive measures when threats are already at the door. FastAPI Guard takes a different approach, providing comprehensive security middleware that integrates directly into FastAPI applications to handle common threats systematically. If you've been piecing together various security solutions, this library offers a centralized approach to application-layer security.

At its core, FastAPI Guard addresses the fundamentals most APIs need: IP whitelisting and blacklisting, rate limiting, user agent filtering, and automatic IP banning after suspicious activity. The library includes penetration attempt detection that monitors for common attack signatures like SQL injection, path traversal, and XSS attempts. It also supports geographic filtering through IP geolocation, can block requests from cloud provider IP ranges, and manages comprehensive HTTP security headers following OWASP guidelines. Configuration is straightforward—define a SecurityConfig object with your rules and add the middleware to your application.
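The allow/deny logic behind these features is simple to picture. The sketch below uses only the standard library and is not FastAPI Guard's API: the class name, parameter names, and strike threshold are assumptions for illustration.

```python
import ipaddress
from collections import Counter


class IpGuard:
    """Blacklist wins, then an optional whitelist, plus auto-ban after N strikes."""

    def __init__(self, whitelist=(), blacklist=(), auto_ban_threshold=3):
        self.whitelist = [ipaddress.ip_network(n) for n in whitelist]
        self.blacklist = [ipaddress.ip_network(n) for n in blacklist]
        self.auto_ban_threshold = auto_ban_threshold
        self._strikes = Counter()
        self._banned = set()

    def report_suspicious(self, ip: str) -> None:
        # Called when a request matches an attack signature.
        self._strikes[ip] += 1
        if self._strikes[ip] >= self.auto_ban_threshold:
            self._banned.add(ip)

    def is_allowed(self, ip: str) -> bool:
        addr = ipaddress.ip_address(ip)
        if ip in self._banned or any(addr in net for net in self.blacklist):
            return False
        if self.whitelist:
            return any(addr in net for net in self.whitelist)
        return True


guard = IpGuard(blacklist=["203.0.113.0/24"])
blocked = guard.is_allowed("203.0.113.7")
allowed_before = guard.is_allowed("198.51.100.1")
for _ in range(3):
    guard.report_suspicious("198.51.100.1")
allowed_after = guard.is_allowed("198.51.100.1")
```

A middleware would run a check like is_allowed on every request and feed report_suspicious from its detection rules. In a multi-instance deployment the strike counts would need to live in shared storage such as Redis rather than in process memory.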

The deployment flexibility of FastAPI Guard makes it well-suited for real-world use. Single-instance deployments use efficient in-memory storage, while distributed systems can leverage optional Redis integration for shared security state across instances. The library also provides fine-grained control through decorators, letting you apply specific security rules to individual routes rather than enforcing everything globally. An admin endpoint might require HTTPS, limit access to internal IPs, and monitor for suspicious patterns, while public endpoints remain permissive.

While it won't prevent every sophisticated attack, it provides a solid foundation for common security concerns and integrates naturally into FastAPI without requiring architectural changes. For teams needing more than basic security but wanting to avoid managing multiple middleware solutions, FastAPI Guard consolidates essential protections into a single, well-designed package.

Security doesn't have to be complicated.

7. modshim - seamlessly enhance modules without monkey-patching

When you need to modify a third-party Python library's behavior, the traditional options are limited and filled with tradeoffs. Fork the entire repository and take on its maintenance burden, monkey-patch the module and risk polluting your application's global namespace, or vendor the code and deal with synchronization headaches when the upstream library updates. Enter modshim, a Python library that offers a fourth approach: overlay your modifications onto existing modules without touching their source code.

modshim works by creating virtual merged modules through Python's import system. You write your enhancements in a separate module that mirrors the structure of the target library, then use shim() to combine them into a new namespace. For instance, to add a prefix parameter to the standard library's textwrap.TextWrapper, you'd subclass the original class with your enhancement and mount it as a new module. The original textwrap remains completely untouched, while your shimmed version provides the extended functionality. This isolation is modshim's key advantage: your modifications exist in their own namespace, preventing the global pollution issues that plague monkey-patching.

Under the hood, modshim adds a custom finder to sys.meta_path that intercepts imports and builds virtual modules by running the original code and your enhancement code one after the other. It rewrites the AST to fix internal imports, supports merging submodules recursively, and keeps everything thread-safe. The author describes it as “OverlayFS for Python modules,” a reminder that this kind of import-system plumbing is powerful but requires careful use.
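The core trick, running the original code and the enhancement in one fresh namespace, can be imitated with the standard library. This is a heavily simplified sketch and not how modshim is implemented: there is no meta path finder, no AST rewriting, and no submodule handling, and mount_overlay and the module name prefixed_textwrap are hypothetical.

```python
import importlib.util
import sys
import textwrap
import types

# Hypothetical enhancement code. It runs in the same namespace as a fresh copy
# of textwrap, so the base class `TextWrapper` is the original definition.
OVERLAY_SOURCE = '''
class TextWrapper(TextWrapper):
    def __init__(self, *args, prefix="", **kwargs):
        super().__init__(*args, **kwargs)
        self.prefix = prefix

    def wrap(self, text):
        return [self.prefix + line for line in super().wrap(text)]
'''


def mount_overlay(lower: str, overlay_source: str, mount: str) -> types.ModuleType:
    """Execute `lower`'s source, then the overlay, inside a brand-new module."""
    spec = importlib.util.find_spec(lower)
    original_source = spec.loader.get_source(lower)
    merged = types.ModuleType(mount)
    exec(compile(original_source, spec.origin, "exec"), merged.__dict__)
    exec(compile(overlay_source, f"<overlay for {mount}>", "exec"), merged.__dict__)
    sys.modules[mount] = merged  # makes `import prefixed_textwrap` work
    return merged


prefixed = mount_overlay("textwrap", OVERLAY_SOURCE, "prefixed_textwrap")
lines = prefixed.wrap("hello world", width=7, prefix="> ")
```

Because the merged module has its own globals, its module-level wrap() helper picks up the enhanced TextWrapper automatically, while the real textwrap module is never modified.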

It may not be for every team, but in the right hands it offers a powerful alternative to forking or patching.

8. Spec Kit - executable specs that generate working code

As AI coding assistants have become ubiquitous in software development, a familiar pattern has emerged: developers describe what they want, receive plausible-looking code in seconds, and then spend considerable time debugging why it doesn't quite work. This vibe-coding approach, where vague prompts yield inconsistent implementations, highlights a fundamental mismatch between how we communicate with AI agents and how they actually work best. GitHub's spec-kit addresses this gap by introducing a structured workflow that treats specifications as the primary source of truth, turning them into executable blueprints that guide AI agents through implementation with clarity and consistency.

spec-kit operationalizes Spec-Driven Development through a command-line tool called Specify and a set of carefully designed templates. The process moves through distinct phases: establish a project constitution that codifies development principles, create detailed specifications capturing the "what" and "why," generate technical plans with your chosen stack, break down work into actionable tasks, and finally let the AI agent implement according to plan. Run uvx --from git+https://github.com/github/spec-kit.git specify init my-project and you'll have a structured workspace with slash commands like /speckit.constitution, /speckit.specify, and /speckit.implement ready to use with your AI assistant.

spec-kit's deliberate agent-agnostic design is particularly notable. Whether you're using GitHub Copilot, Claude Code, Gemini CLI, or a dozen other supported tools, the workflow remains consistent. The toolkit creates a .specify directory with templates and helper scripts that manage Git branching and feature tracking. This separation of concerns—stable intent in specifications, flexible implementation in code—enables generating multiple implementations from the same spec to explore architectural tradeoffs, or modernizing legacy systems by capturing business logic in fresh specifications while leaving technical debt behind.

Experimental or not, it hints at a smarter way to build with AI, and it’s worth paying close attention as it evolves.

9. skylos - detects dead code and security vulnerabilities

Dead code accumulates in every Python codebase: unused imports, forgotten functions, and methods that seemed essential at the time but now serve no purpose. Traditional static analysis tools struggle with Python's dynamic nature, often missing critical issues or flooding developers with false positives. Skylos approaches this challenge pragmatically: it's a static analysis tool specifically designed to detect dead code while acknowledging Python's inherent complexity and the limitations of static analysis.

Skylos aims to take a comprehensive approach to code health. Beyond identifying unused functions, methods, classes, and imports, it tackles two increasingly important concerns for modern Python development. First, it includes optional security scanning to detect dangerous patterns: SQL injection vulnerabilities, command injection risks, insecure pickle usage, and weak cryptographic hashes. Second, it addresses the rise of AI-generated code with pattern detection for common vulnerabilities introduced by vibe-coding, where code may execute but harbor security flaws. These features are opt-in via --danger and --secrets flags, keeping the tool focused on your specific needs.

The confidence-based system is particularly thoughtful. Rather than claiming absolute certainty, Skylos assigns confidence scores (0-100) to its findings, with lower scores indicating greater ambiguity. This is especially useful for framework code—Flask routes, Django models, or FastAPI endpoints may appear unused but are actually invoked externally. The default confidence of 60 provides safe cleanup suggestions, while lower thresholds enable more aggressive auditing. It's an honest approach that respects Python's dynamic features instead of pretending they don't exist.
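A toy version of confidence-scored dead-code detection shows why the threshold is useful. This sketch is not Skylos's analysis: it only looks at functions in a single file, and the scores of 100 and 40 are arbitrary values picked for the example.

```python
import ast


def find_unused_functions(source: str, min_confidence: int = 60):
    """Return (name, confidence) for functions never referenced in `source`."""
    tree = ast.parse(source)
    # Every bare name or attribute that appears anywhere counts as a "use".
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    used |= {n.attr for n in ast.walk(tree) if isinstance(n, ast.Attribute)}

    findings = []
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if node.name in used:
            continue
        # Decorated functions are often registered with a framework and
        # invoked externally, so we are much less sure they are dead.
        confidence = 40 if node.decorator_list else 100
        if confidence >= min_confidence:
            findings.append((node.name, confidence))
    return findings


SAMPLE = '''
@app.route("/health")
def health():
    return "ok"

def helper():
    return 1

def forgotten():
    return 2

print(helper())
'''

safe_to_remove = find_unused_functions(SAMPLE)
everything = find_unused_functions(SAMPLE, min_confidence=0)
```

At the default threshold only forgotten is reported. Lowering the threshold also surfaces the route handler, which is the kind of finding that needs a human to confirm.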

Skylos shows real maturity in practical use: its interactive mode lets you review and selectively remove flagged code, while a VS Code extension provides real-time feedback as you write. GitHub Actions and pre-commit hooks support CI/CD workflows with configurable strictness, all managed through pyproject.toml. At the same time, Skylos is clear about its limits: no static analyzer can perfectly handle Python’s metaprogramming, its security scanning is still proof-of-concept, and although benchmarks show it outperforming tools like Vulture, Flake8, and Pylint in certain cases, the maintainers note that real-world results will vary.

In the age of vibe-coded chaos, Skylos is the ally that keeps your codebase grounded.

10. FastOpenAPI - easy OpenAPI docs for any framework

If you've ever felt constrained by framework lock-in while trying to add proper API documentation to your Python web services, FastOpenAPI offers a practical solution. This library brings FastAPI's developer-friendly approach, automatic OpenAPI schema generation, Pydantic validation, and interactive documentation to a wider range of Python web frameworks. Rather than forcing you to rebuild your application on a specific stack, FastOpenAPI integrates directly with what you're already using.

The core idea is simple: FastOpenAPI provides decorator-based routing that mirrors FastAPI's familiar @router.get and @router.post syntax, but works across eight different frameworks including AioHTTP, Falcon, Flask, Quart, Sanic, Starlette, Tornado, and Django. This "proxy routing" approach registers endpoints in a FastAPI-like style while integrating seamlessly with your existing framework's routing system. You define your API routes with Pydantic models for validation, and FastOpenAPI handles the rest, generating OpenAPI schemas, validating requests, and serving interactive documentation at /docs and /redoc.
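To show how decorator-based registration can produce a schema, here is a framework-free sketch using only the standard library. It is not FastOpenAPI's implementation or API: SketchRouter is a hypothetical class, it handles only GET routes with query parameters, and it skips Pydantic entirely.

```python
import inspect
from typing import get_type_hints

# Minimal mapping from Python annotations to OpenAPI primitive types.
_OPENAPI_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


class SketchRouter:
    def __init__(self, title: str):
        self.title = title
        self.routes = {}

    def get(self, path: str):
        def decorator(func):
            self.routes[("get", path)] = func  # register, return func untouched
            return func
        return decorator

    def openapi(self) -> dict:
        paths = {}
        for (method, path), func in self.routes.items():
            hints = get_type_hints(func)
            params = [
                {
                    "name": name,
                    "in": "query",
                    # Parameters without a default value are required.
                    "required": param.default is inspect.Parameter.empty,
                    "schema": {"type": _OPENAPI_TYPES.get(hints.get(name), "string")},
                }
                for name, param in inspect.signature(func).parameters.items()
            ]
            paths.setdefault(path, {})[method] = {
                "summary": func.__name__,
                "parameters": params,
                "responses": {"200": {"description": "OK"}},
            }
        return {
            "openapi": "3.0.0",
            "info": {"title": self.title, "version": "0.1.0"},
            "paths": paths,
        }


router = SketchRouter("Demo API")


@router.get("/hello")
def hello(name: str, times: int = 1) -> dict:
    return {"message": " ".join([f"Hello, {name}!"] * times)}


schema = router.openapi()
```

A real integration additionally has to forward each registered function to the host framework's own router and validate incoming requests.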

In practice, using Flask as an example, you attach a FastOpenAPI router to the app, define a Pydantic model, and declare an endpoint with a decorator, with no extra boilerplate and no manual schema work.

What makes FastOpenAPI notable is its focus on framework flexibility without sacrificing the modern Python API development experience. Built with Pydantic v2 support, it provides the type safety and validation you'd expect from contemporary tooling. The library handles both request payload and response validation automatically, with built-in error handling that returns properly formatted JSON error messages.

Bridge the gap between your favorite framework and modern API docs.

Top 10 Python Libraries - AI/ML/Data

1. MCP Python SDK & FastMCP - connect LLMs to external data sources

As LLMs become more capable, connecting them to external data and tools has grown increasingly critical. The Model Context Protocol (MCP) addresses this by providing a standardized way for applications to expose resources and functionality to LLMs, similar to how REST APIs work for web services, but designed specifically for AI interactions. For Python developers building production MCP applications, the ecosystem centers on two complementary frameworks: the official MCP Python SDK as the core protocol implementation, and FastMCP 2.0 as the production framework with enterprise features.

The MCP Python SDK, maintained by Anthropic, provides the canonical implementation of the MCP specification. It handles protocol fundamentals: transports (stdio, SSE, Streamable HTTP), message routing, and lifecycle management. Resources expose data to LLMs, tools enable action-taking, and prompts provide reusable templates. With structured output validation, OAuth 2.1 support, and comprehensive client libraries, the SDK delivers a solid foundation for MCP development.

FastMCP 2.0 extends this foundation with production-oriented capabilities. Pioneered by Prefect, FastMCP 1.0 was incorporated into the official SDK. FastMCP 2.0 continues as the actively maintained production framework, adding enterprise authentication (Google, GitHub, Azure, Auth0, WorkOS with persistent tokens and auto-refresh), advanced patterns (server composition, proxying, OpenAPI/FastAPI generation), deployment tooling, and testing utilities. The developer experience is simple: adding the @mcp.tool decorator often suffices, with automatic schema generation from type hints.

FastMCP 2.0 and the MCP Python SDK naturally complement each other: FastMCP provides production-ready features like enterprise auth, deployment tooling, and advanced composition, while the SDK offers lower-level protocol control and minimal dependencies. Both share the same transports and can run locally, in the cloud, or via FastMCP Cloud.

Worth exploring for serious LLM integrations.

2. Token-Oriented Object Notation (TOON) - compact JSON encoding for LLMs

When working with LLMs, every token counts—literally. Whether you're building a RAG system, passing structured data to prompts, or handling large-scale information retrieval, JSON's verbosity can quickly inflate costs and consume valuable context window space. TOON (Token-Oriented Object Notation) addresses this practical concern with a focused solution: a compact, human-readable encoding that achieves significant token reduction while maintaining the full expressiveness of JSON's data model.

TOON's design philosophy combines the best aspects of existing formats. For nested objects, it uses YAML-style indentation to eliminate braces and reduce punctuation overhead. For uniform arrays—the format's sweet spot—it switches to a CSV-inspired tabular layout where field names are declared once in a header, and data flows in rows beneath. An array of employee records that might consume thousands of tokens in JSON can shrink by 40-60% in TOON, with explicit length declarations and field headers that actually help LLMs parse and validate the structure more reliably.

The format includes thoughtful details that matter in practice. Array headers declare both length and fields, providing guardrails that enable validation without requiring models to count rows or guess structure. Strings are quoted only when necessary, and commas, inner spaces, and Unicode characters pass through safely unquoted. Alternative delimiters (tabs or pipes) can provide additional token savings for specific datasets.
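The tabular layout for uniform arrays is easy to picture with a small encoder. The function below is a simplified sketch, not one of the TOON libraries: encode_table is a hypothetical helper that handles only flat rows of plain strings and integers, and it implements none of the format's quoting, escaping, or nesting rules.

```python
import json


def encode_table(key: str, rows: list[dict]) -> str:
    """Encode a uniform list of flat dicts as a header line plus one row each."""
    fields = list(rows[0])
    if any(list(row) != fields for row in rows):
        raise ValueError("tabular layout needs identical fields in every row")
    # The header declares the row count and the field names exactly once.
    header = f"{key}[{len(rows)}]{{{','.join(fields)}}}:"
    body = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *body])


rows = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
]
toon_text = encode_table("users", rows)
json_text = json.dumps({"users": rows})
print(toon_text)
```

Even on two rows the tabular form is roughly half the length of the JSON in characters, and the gap widens as rows are added because field names are never repeated. Actual token savings depend on the tokenizer, so measure them on your own data.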

TOON’s benchmarks show clear gains in comprehension and token use, with transparent notes on where it excels and where JSON or CSV remain better fits. The format is production-ready yet still evolving across multiple language implementations. For developers who need token-efficient, readable structures with reliable JSON round-tripping in LLM workflows, TOON offers a practical option.

TOON proves that sometimes the best format is the one optimized for its actual use case.

3. Deep Agents - framework for building sophisticated LLM agents

Building AI agents that can handle complex, multi-step tasks has become increasingly important as LLMs demonstrate growing capability with long-horizon work. Research shows that agent task length is doubling every seven months, but this progress brings challenges: dozens of tool calls create cost and reliability concerns that need practical solutions. LangChain's deepagents tackles these issues with an open-source agent harness that mirrors patterns used in systems like Claude Code and Manus, providing planning capabilities, filesystem access, and subagent delegation.

At its core, deepagents is built on LangGraph and provides three key capabilities out of the box. First, a planning tool (write_todos and read_todos) enables agents to break down complex tasks into discrete steps and track progress. Second, a complete filesystem toolkit (ls, read_file, write_file, edit_file, glob, grep) allows agents to offload large context to memory, preventing context window overflow. Third, a task tool enables spawning specialized subagents with isolated contexts for handling complex subtasks independently. These capabilities are delivered through a modular middleware architecture that makes them easy to customize or extend.

Getting started is straightforward. Install with pip install deepagents, and you can create an agent in just a few lines, using any LangChain-compatible model. You can add custom tools alongside the built-in capabilities, provide domain-specific system prompts, and configure subagents for specialized tasks. The create_deep_agent function returns a standard LangGraph StateGraph, so it integrates naturally with streaming, human-in-the-loop workflows, and persistent memory through LangGraph's ecosystem.

The pluggable backend system makes deepagents particularly useful. Files can be stored in ephemeral state (default), on local disk, in persistent storage via LangGraph Store, or through composite backends that route different paths to different storage systems. This flexibility enables use cases like long-term memory, where working files remain ephemeral but knowledge bases persist across conversations, or hybrid setups that combine local filesystem access with cloud storage. The middleware architecture also handles automatic context management, summarizing conversations when they exceed 170K tokens and caching prompts to reduce costs with Anthropic models.
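The routing idea behind a composite backend can be sketched in plain Python. The classes below are hypothetical stand-ins, not deepagents' backend classes or their interfaces: two dictionaries play the roles of ephemeral and persistent storage, and a router picks one by path prefix.

```python
class DictBackend:
    """Stand-in for a storage backend; a real one might wrap disk or a store."""

    def __init__(self):
        self.files = {}

    def write(self, path: str, content: str) -> None:
        self.files[path] = content

    def read(self, path: str) -> str:
        return self.files[path]


class PrefixRouter:
    """Send each path to the backend with the longest matching prefix."""

    def __init__(self, default, routes: dict):
        self.default = default
        # Longest prefixes first, so the most specific route wins.
        self.routes = sorted(routes.items(), key=lambda kv: len(kv[0]), reverse=True)

    def _pick(self, path: str):
        for prefix, backend in self.routes:
            if path.startswith(prefix):
                return backend
        return self.default

    def write(self, path: str, content: str) -> None:
        self._pick(path).write(path, content)

    def read(self, path: str) -> str:
        return self._pick(path).read(path)


ephemeral = DictBackend()    # scratch files for the current conversation
persistent = DictBackend()   # would survive across conversations in a real setup
fs = PrefixRouter(default=ephemeral, routes={"/memories/": persistent})

fs.write("/notes/draft.md", "working notes")
fs.write("/memories/user_prefs.md", "prefers concise answers")
```

The agent sees a single filesystem, while the router decides which writes are worth keeping. That is the pattern behind the long-term memory use case described above.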

It's worth noting that deepagents sits in a specific niche within LangChain's ecosystem. Where LangGraph excels at building custom workflows combining agents and logic, and core LangChain provides flexible agent loops from scratch, deepagents targets developers who want autonomous, long-running agents with built-in planning and filesystem capabilities.

If you’re developing autonomous or long-running agents, deepagents is well worth a closer look.

4. smolagents - agent framework that executes actions as code

Building AI agents that can reason through complex tasks and interact with external tools has become a critical capability, but existing frameworks often layer on abstractions that obscure what's actually happening under the hood. smolagents, an open-source library from Hugging Face, takes a different approach: distilling agent logic into roughly 1,000 lines of focused code that developers can actually understand and modify. For Python developers tired of framework bloat or looking for a clearer path into agentic AI, smolagents offers a refreshingly transparent foundation.

At its core, smolagents implements multi-step agents that execute tasks through iterative reasoning loops: observing, deciding, and acting until a goal is reached. What distinguishes the library is its first-class support for code agents, where the LLM writes actions as Python code snippets rather than JSON blobs. This might seem like a minor detail, but research shows it matters: code agents use roughly 30% fewer steps and achieve better performance on benchmarks compared to traditional tool-calling approaches. The reason is straightforward: Python was designed to express computational actions clearly, with natural support for loops, conditionals, and function composition that JSON simply can't match.
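The difference between the two action styles is easiest to see side by side. This sketch does not use smolagents: the tool, the hard-coded "model outputs", and both runner functions are hypothetical, and the bare exec call is not a sandbox.

```python
import json


def get_price(item: str) -> int:
    """Hypothetical tool exposed to the agent."""
    return {"apple": 2, "pear": 3}[item]


TOOLS = {"get_price": get_price}

# JSON style: the model emits one tool call per step, so three items take
# three round trips (plus another step to add up the results).
json_actions = [
    '{"tool": "get_price", "args": {"item": "apple"}}',
    '{"tool": "get_price", "args": {"item": "pear"}}',
    '{"tool": "get_price", "args": {"item": "apple"}}',
]


def run_json_actions(actions):
    results = []
    for raw in actions:
        call = json.loads(raw)
        results.append(TOOLS[call["tool"]](**call["args"]))
    return results


# Code style: a single action can loop and compose tool calls by itself.
code_action = "total = sum(get_price(item) for item in ['apple', 'pear', 'apple'])"


def run_code_action(code: str) -> int:
    # One shared namespace with only the tools and `sum` visible.
    # Illustration only: real code agents need a proper sandbox.
    namespace = {"__builtins__": {"sum": sum}, **TOOLS}
    exec(code, namespace)
    return namespace["total"]


json_results = run_json_actions(json_actions)
code_total = run_code_action(code_action)
```

Both routes reach the same total, but the code action gets there in one step instead of several. That is the intuition behind the step savings reported for code agents.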

The library provides genuine flexibility in how you deploy these agents. You can use any LLM, whether that's a model hosted on Hugging Face, GPT-4 via OpenAI, Claude via Anthropic, or even local models through Transformers. Tools are equally flexible: define custom tools with simple decorated functions, import from LangChain, connect to MCP servers, or even use Hugging Face Spaces as tools. Security considerations are addressed through multiple execution environments, including E2B sandboxes, Docker containers, and WebAssembly isolation. For teams already invested in the Hugging Face ecosystem, smolagents integrates naturally, letting you share agents and tools as Spaces.

smolagents positions itself as the successor to transformers.agents and represents Hugging Face's evolving perspective on what agent frameworks should be: simple enough to understand fully, powerful enough for real applications, and honest about their design choices.

In a field obsessed with bigger models and bigger stacks, smolagents wins by being the one you can understand.

5. LlamaIndex Workflows - building complex AI workflows with ease

Building complex AI applications often means wrestling with intricate control flow: managing loops, branches, parallel execution, and state across multiple LLM calls and API interactions. Traditional approaches like directed acyclic graphs (DAGs) have attempted to solve this problem, but they come with notable limitations: logic gets encoded into edges rather than code, parameter passing becomes convoluted, and the resulting structure feels unnatural for developers building sophisticated agentic systems. LlamaIndex Workflows addresses these challenges with an event-driven framework that brings clarity and control to multi-step AI application development.

At its core, Workflows organizes applications around two simple primitives: steps and events. Steps are async functions decorated with @step that handle incoming events and emit new ones. Events are user-defined Pydantic objects that carry data between steps. This event-driven pattern makes complex behaviors, like reflection loops, parallel execution, and conditional branching, feel natural to implement. The framework automatically infers which steps handle which events through type annotations, providing early validation before your workflow even runs.
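
To illustrate the steps-and-events pattern without depending on the library, the following is a toy imitation built on the standard library only. It is not LlamaIndex's implementation or API: the `step` decorator, `ToyWorkflow`, and the `Start`, `Draft`, and `Stop` events are hypothetical stand-ins, and events are plain dataclasses rather than Pydantic models.

```python
import asyncio
import typing
from dataclasses import dataclass


@dataclass
class Start:
    topic: str


@dataclass
class Draft:
    text: str


@dataclass
class Stop:
    result: str


def step(func):
    """Mark an async method as a step (toy stand-in for a real @step decorator)."""
    func.is_step = True
    return func


class ToyWorkflow:
    @step
    async def write(self, ev: Start) -> Draft:
        return Draft(text=f"notes on {ev.topic}")

    @step
    async def review(self, ev: Draft) -> Stop:
        return Stop(result=ev.text.upper())

    def _routes(self):
        # Infer "event type -> step" from the annotation on each step's `ev` parameter.
        routes = {}
        for name in dir(self):
            method = getattr(self, name)
            if getattr(method, "is_step", False):
                routes[typing.get_type_hints(method)["ev"]] = method
        return routes

    async def run(self, **kwargs):
        routes = self._routes()
        event = Start(**kwargs)
        # Keep dispatching each emitted event to its handler until a Stop arrives.
        while not isinstance(event, Stop):
            event = await routes[type(event)](event)
        return event.result


result = asyncio.run(ToyWorkflow().run(topic="asyncio"))
```

The routing table is derived entirely from type annotations, so adding a branch or a loop means adding a step that accepts or emits a different event type rather than rewiring edges by hand.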

What makes Workflows particularly valuable is its async-first architecture built on Python's asyncio. Since LLM calls and API requests are inherently I/O-bound, the framework handles concurrent execution naturally: steps can run in parallel when appropriate, and you can stream results as they're generated. The Context object provides elegant state management, allowing workflows to maintain data across steps, serialize their state, and even resume from checkpoints.

Workflows makes complex AI behavior feel less like orchestration and more like real software design.

6. Batchata - unified batch processing for AI providers

When working with LLMs at scale, cost efficiency matters. Most major AI providers offer batch APIs that process requests asynchronously at 50% of the cost of real-time endpoints, a substantial saving for data processing workloads that don't require immediate responses. The challenge lies in managing these batch operations: tracking jobs across different providers, monitoring costs, handling failures gracefully, and mapping structured outputs back to source documents. Batchata addresses this orchestration problem with a unified Python API that makes batch processing straightforward across Anthropic, OpenAI, and Google Gemini.

Batchata focuses on production workflow details. Beyond basic job submission, the library provides cost limiting to prevent budget overruns, dry-run modes for estimating expenses before execution, and time constraints to ensure batches complete within acceptable windows. State persistence means network interruptions won't lose your progress. The library handles the mechanics of batch API interaction (polling for completion, retrieving results, managing retries) while exposing a clean interface that feels natural to Python developers.
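
The arithmetic behind a dry run with a cost limit is simple enough to sketch. The code below is not Batchata's API: `Request`, `estimate_cost`, `dry_run`, and the per-token prices are hypothetical, and the only figure taken from the text is the 50% batch rate.

```python
from dataclasses import dataclass

BATCH_DISCOUNT = 0.5  # batch requests billed at 50% of the real-time rate


@dataclass
class Request:
    input_tokens: int
    max_output_tokens: int


def estimate_cost(requests, usd_per_1k_input, usd_per_1k_output, batch=True):
    """Worst-case cost in USD, assuming every request uses its full output budget."""
    realtime = sum(
        r.input_tokens / 1000 * usd_per_1k_input
        + r.max_output_tokens / 1000 * usd_per_1k_output
        for r in requests
    )
    return realtime * BATCH_DISCOUNT if batch else realtime


def dry_run(requests, cost_limit_usd, **prices):
    """Report the estimate and whether it fits the budget, without sending anything."""
    estimate = estimate_cost(requests, **prices)
    return {"estimated_usd": estimate, "within_limit": estimate <= cost_limit_usd}


requests = [Request(input_tokens=2000, max_output_tokens=1000) for _ in range(100)]
prices = {"usd_per_1k_input": 0.5, "usd_per_1k_output": 1.0}  # made-up prices
report = dry_run(requests, cost_limit_usd=75.0, **prices)
```

With these made-up numbers, the same workload would cost 200 USD in real time and 100 USD in batch, which still exceeds the 75 USD limit, so a cost-limited run would stop before submitting anything.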

The structured output support deserves particular attention. Using Pydantic models, you can define exactly what shape your results should take, and Batchata will validate them accordingly.

Developer experience is solid throughout. Installation is simple via pip or uv, configuration uses environment variables or .env files, and the API follows familiar patterns. The interactive progress display shows job completion, batch status, current costs against limits, and elapsed time. Results are saved to JSON files with clear organization, making post-processing straightforward.

Batch smarter, spend less, and save your focus for bachata nights.

7. MarkItDown - convert any file to clean Markdown

Working with documents in Python often means wrestling with multiple file formats like PDFs, Word documents, Excel spreadsheets, images, and more, each requiring different libraries and approaches. For developers building LLM-powered applications or text analysis pipelines, converting these varied formats into a unified, machine-readable structure has become a common bottleneck. MarkItDown, a Python utility from Microsoft, addresses this challenge by providing a single tool that converts diverse file types into Markdown, the format that modern language models understand best.

What makes MarkItDown practical is its breadth of format support and its focus on preserving document structure rather than just extracting raw text. The library handles PowerPoint presentations, Word documents, Excel spreadsheets, PDFs, images (with OCR), audio files (with transcription), HTML, and text-based formats like CSV and JSON. It even processes ZIP archives by iterating through their contents. Unlike general-purpose extraction tools, MarkItDown specifically preserves important structural elements, like headings, lists, tables, and links, in Markdown format, making the output immediately useful for LLM consumption without additional preprocessing.

Getting started is simple: install it with pip install 'markitdown[all]' for full format support or use selective extras like [pdf, docx, pptx]. You can convert files through the intuitive CLI (markitdown file.pdf > output.md) or through the Python API by instantiating MarkItDown() and calling convert(). It also integrates with Azure Document Intelligence for advanced PDF parsing, can use LLM clients to describe images in presentations, and supports MCP servers for seamless use with tools like Claude Desktop, making it a strong choice for building AI-ready document processing workflows.

MarkItDown is actively maintained and already seeing adoption in the Python community, but it's worth noting that it's optimized for machine consumption rather than high-fidelity human-readable conversions. The Markdown output is clean and structured, designed to be token-efficient and LLM-friendly, but may not preserve every formatting detail needed for presentation-quality documents. For developers building RAG systems, document analysis tools, or any application that needs to ingest diverse document types into text pipelines, MarkItDown provides a practical, well-integrated solution that eliminates much of the format-juggling complexity.

If your work touches documents and language models, MarkItDown belongs in your stack.

8. Data Formulator - AI-powered data exploration through natural language

Creating compelling data visualizations often requires wrestling with two distinct challenges: designing the right chart and transforming messy data into the format your visualization tools expect. Most analysts bounce between separate tools: pandas for data wrangling, then moving to Tableau or matplotlib for charting, losing momentum with each context switch. Data Formulator from Microsoft Research addresses this friction by unifying data transformation and visualization authoring into a single, AI-powered workflow that feels natural rather than constraining.

What makes Data Formulator distinct is its blended interaction model. Rather than forcing you to describe everything through text prompts, it combines a visual drag-and-drop interface with natural language when you need it. You specify chart designs through a familiar encoding shelf, dragging fields to visual channels like any modern visualization tool. The difference? You can reference fields that don't exist yet. Type "profit_margin" or "top_5_regions" into the encoding shelf, optionally add a natural language hint about what you mean, and Data Formulator's AI backend generates the necessary transformation code automatically. The system handles reshaping, filtering, aggregation, and complex derivations while you focus on the analytical questions that matter.

The tool shines particularly in iterative exploration, where insights from one chart naturally lead to the next. Data Formulator maintains a "data threads" history, letting you branch from any previous visualization without starting over. Want to see only the top performers from that sales chart? Select it from your history, add a filter instruction, and move forward. The architecture separates data transformation from chart specification cleanly, using Vega-Lite for visualization and delegating transformation work to LLMs that generate pandas or SQL code. You can inspect the generated code, transformed data, and resulting charts at every step—full transparency with none of the tedious implementation work.

Data Formulator is an active research project rather than a production-ready commercial tool, which means you should expect occasional rough edges and evolving interfaces. However, it's already usable for exploratory analysis and represents a genuinely thoughtful approach to AI-assisted data work. By respecting that analysts think visually but work iteratively, and by letting AI handle transformation drudgery while keeping humans in control of analytical direction, Data Formulator points toward what the next generation of data tools might become. For Python developers doing exploratory data analysis, it's worth experimenting with—not as a replacement for your existing toolkit, but as a complement that might change how you approach certain analytical workflows.

9. LangExtract - extract key details from any document

Extracting structured data from unstructured text has long been a pain point for developers working with clinical notes, research papers, legal documents, and other text-heavy domains. While LLMs excel at understanding natural language, getting them to reliably output consistent, traceable structured information remains challenging. LangExtract, an open-source Python library from Google, addresses this problem with a focused approach: few-shot learning, precise source grounding, and built-in optimization for long documents.

What sets LangExtract apart is its emphasis on traceability. Every extracted entity is mapped back to its exact character position in the source text, enabling visual highlighting that makes verification straightforward. This feature proves particularly valuable in domains like healthcare, where accuracy and auditability are non-negotiable. The library enforces consistent output schemas through few-shot examples, leveraging controlled generation in models like Gemini to ensure robust, structured results. You define your extraction task with a simple prompt and one or two quality examples—no model fine-tuning required.
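
Source grounding can be pictured as attaching character offsets to every extracted span and rejecting anything that does not appear verbatim. The sketch below uses only the standard library and is not LangExtract's implementation: the `Grounded` record, the `ground` helper, and the sample clinical note are invented for illustration, and exact substring matching is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Grounded:
    label: str
    text: str
    start: int  # inclusive character offset in the source
    end: int    # exclusive character offset in the source


def ground(source: str, extractions: list[tuple[str, str]]) -> list[Grounded]:
    """Attach character offsets to (label, text) pairs; drop text not found verbatim."""
    grounded = []
    cursor = 0
    for label, text in extractions:
        start = source.find(text, cursor)
        if start == -1:
            start = source.find(text)  # fall back to searching from the beginning
        if start == -1:
            continue  # ungrounded extraction: cannot be traced back to the source
        grounded.append(Grounded(label, text, start, start + len(text)))
        cursor = start + len(text)
    return grounded


note = "Patient was given 250 mg of amoxicillin twice daily."
spans = ground(note, [("dosage", "250 mg"), ("medication", "amoxicillin"), ("route", "IV")])
```

Here the hypothetical "route: IV" extraction is dropped because "IV" never occurs in the note, while the other two spans can be highlighted directly using their offsets.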

LangExtract tackles the "needle-in-a-haystack" problem that plagues information retrieval from large documents. Rather than relying on a single pass over lengthy text, it employs an optimized strategy combining text chunking, parallel processing, and multiple extraction passes. This approach significantly improves recall when extracting multiple entities from documents spanning thousands of characters. The library also generates interactive HTML visualizations that make it easy to explore hundreds or even thousands of extracted entities in their original context.
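
The chunk, parallelize, and repeat strategy can be sketched with the standard library alone. This is a toy model rather than LangExtract's code: `chunk`, `lossy_extract`, and `extract` are hypothetical names, and the "LLM call" is a regex that deliberately misses every other name to mimic imperfect recall on a single pass.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def chunk(text: str, sentences_per_chunk: int) -> list[tuple[int, str]]:
    """Group sentences into chunks, keeping each chunk's start offset in the text."""
    sentences = [(m.start(), m.group()) for m in re.finditer(r"[^.]+\.?\s*", text)]
    chunks = []
    for i in range(0, len(sentences), sentences_per_chunk):
        group = sentences[i:i + sentences_per_chunk]
        chunks.append((group[0][0], "".join(body for _, body in group)))
    return chunks


def lossy_extract(job: tuple[int, int, str]) -> set[tuple[int, str]]:
    """Stand-in for one LLM call; skips every other match depending on the pass number."""
    pass_no, offset, body = job
    matches = list(re.finditer(r"\b[A-Z][a-z]+\b", body))
    return {
        (offset + m.start(), m.group())
        for i, m in enumerate(matches)
        if (i + pass_no) % 2 == 0
    }


def extract(text: str, sentences_per_chunk: int = 1, passes: int = 2, workers: int = 4):
    """Run every (pass, chunk) job concurrently and merge the results by offset."""
    jobs = [
        (p, offset, body)
        for p in range(passes)
        for offset, body in chunk(text, sentences_per_chunk)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        found = set().union(*pool.map(lossy_extract, jobs))
    return sorted(found)


doc = "Alice met Bob. Carol called Dave. Erin left."
one_pass = extract(doc, passes=1)
two_pass = extract(doc, passes=2)
```

In this toy setup a single pass recovers three of the five names and the second pass picks up the rest. Because results are keyed by character offset, entities found in more than one pass are merged instead of duplicated.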

The developer experience is notably clean. Installation is straightforward via pip, and the API is intuitive: you provide text, a prompt description, and examples, then call lx.extract(). LangExtract supports various LLM providers including Gemini models (via the Gemini API or Vertex AI), OpenAI, and local models via Ollama. A lightweight plugin system allows custom providers without modifying core code.

For developers working with unstructured text who need reliable, traceable structured outputs, LangExtract offers a practical solution worth exploring.

10. GeoAI - bridging AI and geospatial data analysis

Applying machine learning to geospatial data has become essential across fields from environmental monitoring to urban planning, yet the path from satellite imagery to actionable insights remains surprisingly fragmented. Researchers and practitioners often find themselves stitching together general-purpose ML libraries with specialized geospatial tools, navigating steep learning curves and wrestling with preprocessing pipelines before any real analysis begins. GeoAI, a Python package from the Open Geospatial Solutions community, addresses this friction by providing a unified interface that connects modern AI frameworks with geospatial workflows—making sophisticated analyses accessible without sacrificing technical depth.

At its core, GeoAI integrates PyTorch, Transformers, and specialized libraries like PyTorch Segmentation Models into a cohesive framework designed specifically for geographic data. The package handles five essential capabilities: searching and downloading remote sensing imagery, preparing datasets with automated chip generation and labeling, training models for classification and segmentation tasks, running inference on new data, and visualizing results through Leafmap integration. This end-to-end approach means you can move from raw satellite imagery to trained models with considerably less boilerplate than traditional workflows require.
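
Chip generation, the step that cuts a large raster into fixed-size training tiles, comes down to computing a grid of pixel windows. The helper below is a standard-library sketch of that idea and not GeoAI's API: `chip_windows` and its parameters are hypothetical, and clipping the edge tiles (rather than padding or dropping them) is an assumption made for simplicity.

```python
def chip_windows(width: int, height: int, chip_size: int, stride: int):
    """Yield (col_off, row_off, w, h) pixel windows covering a raster; edge chips are clipped."""
    for row in range(0, height, stride):
        for col in range(0, width, stride):
            yield (col, row, min(chip_size, width - col), min(chip_size, height - row))


# A 1000 x 600 pixel image cut into non-overlapping 512-pixel chips.
windows = list(chip_windows(width=1000, height=600, chip_size=512, stride=512))
```

Choosing a stride smaller than the chip size would produce overlapping chips, a common way to avoid cutting objects such as buildings at tile borders.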

What makes GeoAI practical is its focus on common geospatial tasks. Building footprint extraction, land cover classification, and change detection—analyses that typically demand extensive setup—become straightforward with high-level APIs that abstract complexity without hiding it. The package supports standard geospatial formats (GeoTIFF, GeoJSON, GeoPackage) and automatically manages GPU acceleration when available. With over 10 modules and extensive Jupyter notebook examples and tutorials, GeoAI serves both as a research tool and an educational resource. Installation is simple via pip or conda, and the comprehensive documentation at opengeoai.org includes video tutorials that walk through real-world applications.

For Python developers working at the intersection of AI and geospatial analysis, GeoAI offers a practical path forward, reducing the friction between having satellite data and actually doing something useful with it. Worth exploring for your next geospatial project!

Runners-up – General use

AuthTuna – Security framework designed for modern async Python applications with first-class FastAPI support but framework-agnostic core capabilities. Features comprehensive authentication systems including traditional login flows, social SSO integration (Google, GitHub), multi-factor authentication with TOTP and email verification, role-based access control (RBAC), and fine-grained permission checking. Includes session management with device fingerprinting, database-backed storage, configurable lifetimes, and security controls for device/IP/region restrictions. Provides built-in user dashboard, email verification systems, WebAuthn support, and extensive configuration options for deployment in various environments from development to production with secrets manager integration.
FastRTC – Real-time communication library that transforms Python functions into audio and video streams over WebRTC or WebSockets. Features automatic voice detection and turn-taking for conversational applications, built-in Gradio UI for testing, automatic WebRTC and WebSocket endpoints when mounted on FastAPI apps, and telephone support with free temporary phone numbers. Supports both audio and video streaming modalities with customizable backends, making it suitable for building voice assistants, video chat applications, real-time transcription services, and computer vision applications. The library integrates seamlessly with popular AI services like OpenAI, Anthropic Claude, and Google Gemini for creating intelligent conversational interfaces.
hexora – Static analysis tool specifically designed to identify malicious and harmful patterns in Python code for security auditing purposes. Features over 30 detection rules covering code execution, obfuscation, data exfiltration, suspicious imports, and malicious payloads, with confidence-based scoring to distinguish between legitimate and malicious usage. Supports auditing individual files, directories, and virtual environments with customizable output formats and filtering options. Particularly useful for supply-chain attack detection, dependency auditing, and analyzing potentially malicious scripts from various sources including PyPI packages and security incidents.
opentemplate – All-in-one Python project template that provides a complete development environment with state-of-the-art tooling for code quality, security, and automation. The template includes comprehensive code formatting and linting with ruff and basedpyright, automated testing across Python versions with pytest, MkDocs documentation with automatic deployment, and extensive security features including SLSA Level 3 compliance, SBOMs, and static security analysis. Features a unified configuration system through pyproject.toml that controls pre-commit hooks, GitHub Actions, and all development tools, along with automated dependency updates, release management, and comprehensive GitHub repository setup with templates, labels, and security policies.
PyByntic – Extension to Pydantic that enables binary serialization of models using custom binary types and annotations. Features include type-safe binary field definitions with precise control over numeric types (Int8, UInt32, Float64, etc.), string handling with variable and fixed-length options, date/time serialization, and support for nested models and lists. The package offers significant size efficiency compared to JSON serialization, making it ideal for applications requiring compact data storage or network transmission. Development includes comprehensive testing, compression support, and custom encoder capabilities for specialized use cases.
pyochain – Functional-style method chaining library that brings fluent, declarative APIs to Python iterables and dictionaries. It provides core components including Iter[T] for lazy operations on iterators, Seq[T] for eager evaluation of sequences, Dict[K, V] for chainable dictionary manipulation, Result[T, E] for explicit error handling, and Option[T] for safe optional value handling. The library emphasizes type safety through extensive use of generics and overloads, operates with lazy evaluation for efficiency on large datasets, and encourages functional paradigms by composing simple, reusable functions rather than implementing custom classes.
Pyrefly – Type checker and language server that combines lightning-fast type checking with comprehensive IDE features including code navigation, semantic highlighting, and code completion. Built in Rust for performance, it features advanced type inference capabilities, flow-sensitive type analysis, and module-level incrementality with optimized parallelism. The tool supports both command-line usage and editor integration, with particular focus on large-scale codebases through its modular architecture that handles strongly connected components of modules efficiently. Pyrefly draws inspiration from established type checkers like Pyre, Pyright, and MyPy while making distinct design choices around type inference, flow types, and incremental checking strategies.
reaktiv – State management library that enables declarative reactive programming through automatic dependency tracking and updates. It provides three core building blocks (Signal for reactive values, Computed for derived state, and Effect for side effects) that work together like an Excel spreadsheet, where changing one value automatically recalculates all dependent formulas. The library features lazy evaluation, smart memoization, fine-grained reactivity that only updates what changed, and full type safety support. It addresses common state management problems by eliminating forgotten updates, preventing inconsistent data, and making state relationships explicit and centralized.
Scraperr – Self-hosted web scraping solution designed for extracting data from websites without requiring any coding knowledge. Features XPath-based element targeting, queue management for multiple scraping jobs, domain spidering capabilities, custom headers support, automatic media downloads, and results visualization in structured table formats. Built with FastAPI backend and Next.js frontend, it provides data export options in markdown and CSV formats, notification channels for job completion, and a user-friendly interface for managing scraping operations. The platform emphasizes ethical scraping practices and includes comprehensive documentation for deployment using Docker or Helm.
Skills – Repository of example skills for Claude's skills system that demonstrates various capabilities ranging from creative applications like art and music to technical tasks such as web app testing and MCP server generation. The skills are self-contained folders with SKILL.md files containing instructions and metadata that Claude loads dynamically to improve performance on specialized tasks. The repository includes both open-source example skills under Apache 2.0 license and source-available document creation skills that power Claude's production document capabilities, serving as reference implementations for developers creating their own custom skills.
textcase – Text case conversion utility that transforms strings between various naming conventions and formatting styles such as snake_case, kebab-case, camelCase, PascalCase, and others. The utility accurately handles complex word boundaries including acronyms and supports non-ASCII characters without making language-specific inferences. It features an extensible architecture that allows custom word boundaries and cases to be defined, operates without external dependencies using regex-free algorithms for efficient performance, and provides full type annotations with comprehensive test coverage for reliable text processing workflows.

Runners-up – AI/ML/Data

Agent Development Kit (ADK) – Code-first framework that applies software development principles to AI agent creation, designed to simplify building, deploying, and orchestrating agent workflows from simple tasks to complex systems. Features a rich tool ecosystem with pre-built tools, OpenAPI specs, and MCP tools integration, modular multi-agent system design for scalable applications, and flexible deployment options including Cloud Run and Vertex AI Agent Engine. The framework is model-agnostic and deployment-agnostic while being optimized for Gemini, includes a built-in development UI for testing and debugging, and supports agent evaluation workflows. It integrates with the Agent2Agent (A2A) protocol for remote agent communication and provides both single-agent and multi-agent coordinator patterns.
Archon – Command center for AI coding assistants that serves as an MCP server enabling AI agents to access shared knowledge, context, and tasks. Features smart web crawling for documentation sites, document processing for PDFs and markdown files, vector search with semantic embeddings, and hierarchical project management with AI-assisted task creation. Built with microservices architecture including React frontend, FastAPI backend, MCP server interface, and PydanticAI agents service, all connected through real-time WebSocket updates and collaborative workflows. Integrates with popular AI coding assistants like Claude Code, Cursor, and Windsurf to enhance their capabilities with custom knowledge bases and structured task management.
Attachments – File processing pipeline designed to extract text and images from diverse file formats for large language model consumption. Supports PDFs, Microsoft Office documents, images, web pages, CSV files, repositories, and archives through a unified API with DSL syntax for advanced operations. Features extensible plugin architecture with loaders, modifiers, presenters, refiners, and adapters for customizing processing pipelines. Includes built-in integrations for OpenAI, Anthropic Claude, and DSPy frameworks, plus advanced capabilities like CSS selector highlighting for web scraping and image transformations.
Claude Agent SDK – SDK for integrating with Claude Agent that provides both simple query operations and advanced conversational capabilities through bidirectional communication. Features async query functions for basic interactions, custom tools implemented as in-process MCP servers for defining Python functions that Claude can invoke, and hooks for automated feedback and deterministic processing during the Claude agent loop. Supports tool management with both internal and external MCP servers, working directory configuration, permission modes, and comprehensive error handling for building sophisticated Claude-powered applications.
df2tables – Utility designed for converting Pandas and Polars DataFrames into interactive HTML tables powered by the DataTables JavaScript library. The tool focuses on web framework integration with seamless embedding capabilities for Flask, Django, FastAPI, and other web frameworks. It renders tables directly from JavaScript arrays to deliver fast performance and compact file sizes, enabling smooth browsing of large datasets while maintaining full responsiveness. The utility includes features like filtering, sorting, column control, customizable DataTables configuration through Python, and minimal dependencies requiring only pandas or polars.
FlashMLA – Optimized attention kernels library specifically designed for Multi-head Latent Attention (MLA) computations, powering DeepSeek-V3 and DeepSeek-V3.2-Exp models. The library implements both sparse and dense attention kernels for prefill and decoding stages, featuring DeepSeek Sparse Attention (DSA) with token-level optimization and FP8 KV cache support. It provides high-performance implementations for SM90 and SM100 GPU architectures, achieving up to 660 TFlops in compute-bound configurations on H800 GPUs and supporting both Multi-Query Attention and Multi-Head Attention modes. The library is optimized for inference workloads and includes specialized kernels for memory-bound and computation-bound scenarios.
Flowfile – Visual ETL tool and library suite that combines drag-and-drop workflow building with the speed of Polars dataframes for high-performance data processing. It operates as three interconnected services including a visual designer (Electron + Vue), ETL engine (FastAPI), and computation worker, representing each flow as a directed acyclic graph (DAG) where nodes represent data operations. The platform supports complex data transformations like fuzzy matching joins, text processing, filtering, grouping, and custom formulas, while enabling users to export visual flows as standalone Python/Polars code for production deployment. Flowfile includes both a desktop application and a programmatic FlowFrame API that provides a Polars-like interface for creating data pipelines in Python code.
Gitingest – Git repository text converter specifically designed to transform any Git repository into a format optimized for Large Language Model prompts. The tool intelligently processes repository content to create structured text digests that include file and directory structure, size statistics, and token count information. It supports both local directories and remote GitHub repositories (including private ones with token authentication), offers both command-line interface and Python package integration, and includes smart formatting features like .gitignore respect and submodule handling. The package is particularly valuable for developers working with AI tools who need to provide repository context to LLMs in an efficient, structured format.
gpt-oss – Open-weight language models released in two variants: gpt-oss-120b (117B parameters with 5.1B active) for production use on single 80GB GPUs, and gpt-oss-20b (21B parameters with 3.6B active) for lower latency and local deployment. Both models feature configurable reasoning effort, full chain-of-thought access, native function calling capabilities, web browsing and Python code execution tools, and MXFP4 quantization for efficient memory usage. The models require the harmony response format and include Apache 2.0 licensing for commercial deployment.
MaxText – High performance, highly scalable LLM library written in pure Python/JAX targeting Google Cloud TPUs and GPUs for training. The library includes pre-built implementations of major models like Gemma, Llama, DeepSeek, Qwen, and Mistral, supporting both pre-training (up to tens of thousands of chips) and scalable post-training techniques such as Supervised Fine-Tuning (SFT) and Group Relative Policy Optimization (GRPO). MaxText achieves high Model FLOPs Utilization (MFU) and tokens/second performance from single host to very large clusters while maintaining simplicity through the power of JAX and XLA compiler. The library serves as both a reference implementation for building models from scratch and a scalable framework for post-training existing models, positioning itself as a launching point for ambitious LLM projects in both research and production environments.
Memvid – AI memory storage system that converts text chunks into QR codes embedded in video frames, leveraging video compression codecs to achieve 50-100× smaller storage than traditional vector databases. The system encodes text as QR codes in MP4 files while maintaining millisecond-level semantic search capabilities through smart indexing that maps embeddings to frame numbers. Features include PDF processing, interactive web UI, parallel processing, and offline-first design with zero infrastructure requirements. Performance includes processing ~10K chunks/second during indexing, sub-100ms search times for 1M chunks, and dramatic storage reduction from 100MB text to 1-2MB video files.
nanochat – Complete implementation of a large language model similar to ChatGPT in a single, minimal, hackable codebase that handles the entire pipeline from tokenization through web serving. The training system is designed to run on GPU clusters, with configurable model sizes for training budgets ranging from $100 to $1000, producing models with up to 1.9 billion parameters trained on tens of billions of tokens. Features include distributed training capabilities, evaluation metrics, reinforcement learning, synthetic data generation for customization, and a web-based chat interface. The framework serves as the capstone project for the LLM101n course and emphasizes accessibility through cognitive simplicity while maintaining performance comparable to historical models like GPT-2.
OmniParser – Screen parsing tool designed to parse user interface screenshots into structured and easy-to-understand elements, significantly enhancing the ability of vision-language models like GPT-4V to generate actions that can be accurately grounded in corresponding interface regions. The tool features interactive region detection, icon functional description capabilities, and fine-grained element detection including small icons and interactability prediction. It includes OmniTool for controlling Windows 11 VMs and supports integration with various large language models including OpenAI, DeepSeek, Qwen, and Anthropic Computer Use. OmniParser has achieved state-of-the-art results on GUI grounding benchmarks and is particularly effective for building pure vision-based GUI agents.
OpenAI Agents SDK – Framework for building multi-agent workflows that supports OpenAI APIs and 100+ other LLMs through a provider-agnostic approach. Core features include agents configured with instructions, tools, and handoffs for transferring control between agents, configurable guardrails for input/output validation, automatic session management for conversation history, and built-in tracing for debugging and optimization. The framework enables complex agent patterns including deterministic flows and iterative loops, with support for long-running workflows through Temporal integration and human-in-the-loop capabilities. Session memory can be backed by SQLite, Redis, or a custom implementation to maintain conversation context across multiple agent runs.
OpenManus – Open-source framework for building general AI agents that can perform computer use tasks and web automation without requiring invite codes or restricted access. The framework offers multiple agent types, including general-purpose agents and specialized data analysis agents, with support for browser automation through Playwright integration. It integrates with various LLM APIs, including OpenAI GPT models, and offers both single-agent and multi-agent execution modes. The project includes reinforcement learning capabilities through OpenManus-RL for advanced agent training and optimization.
OWL – Multi-agent collaboration framework designed for general assistance and task automation in real-world scenarios. The framework leverages dynamic agent interactions to enable natural, efficient, and robust automation across diverse domains including web interaction, document processing, code execution, and multimedia analysis. Built on top of the CAMEL-AI Framework, it provides a comprehensive toolkit ecosystem with capabilities for browser automation, search integration, and specialized tools for various domains. OWL has achieved top performance on the GAIA benchmark, ranking #1 among open-source frameworks with advanced features for workforce learning and optimization.
Parlant – AI agent framework that addresses the core problem of LLM unpredictability by ensuring agents follow instructions rather than hoping they will. Instead of relying on complex system prompts, it uses behavioral guidelines, conversational journeys, tool integration, and domain adaptation to create predictable, consistent agent behavior. The framework includes features like dynamic guideline matching, built-in guardrails to prevent hallucinations, conversation analytics, and full explainability of agent decisions. It's particularly suited for production environments where reliability and compliance are critical, such as financial services, healthcare, e-commerce, and legal applications.
TensorFlow Optimizers Collection – Comprehensive library implementing state-of-the-art optimization algorithms for deep learning in TensorFlow. The collection includes adaptive optimizers like AdaBelief, AdamP, and RAdam; second-order methods like Sophia and Shampoo; hybrid approaches like Ranger variants combining multiple techniques; memory-efficient optimizers like AdaFactor and SM3; distributed training optimizers like LAMB and Muon; and experimental methods like EmoNavi with emotion-driven updates. Many optimizers support advanced features including gradient centralization, lookahead mechanisms, subset normalization for memory efficiency, and automatic step-size adaptation.
trackio – Lightweight experiment tracking library designed as a drop-in replacement for wandb, with API compatibility for the wandb.init, wandb.log, and wandb.finish functions. Features a local-first design that runs dashboards locally by default while persisting logs in a local SQLite database, with optional deployment to Hugging Face Spaces for remote hosting. Includes a Gradio-based dashboard for visualizing experiments that can be embedded in websites and blog posts, with customizable query parameters for filtering projects, metrics, and display options. Built with extensibility in mind in fewer than 5,000 lines of Python code, making it easy for developers to fork and add custom functionality while keeping everything free, including Hugging Face hosting.
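
To make the local-first tracking idea from the trackio entry more concrete, here is a minimal sketch of an init/log/finish workflow that persists metrics to SQLite, using only the standard library. It is not trackio's implementation: the `Run` class, the `init` helper, the `history` method, and the table schema are all hypothetical names chosen for illustration, and the in-memory database is an assumption to keep the example self-contained.

```python
import json
import sqlite3
import time


class Run:
    """Hypothetical stand-in for a tracked run (not trackio's actual class)."""

    def __init__(self, project, db_path=":memory:"):
        self.project = project
        self.step = 0
        # Local-first: every metric lands in a local SQLite database.
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS metrics ("
            "project TEXT, step INTEGER, ts REAL, payload TEXT)"
        )

    def log(self, metrics):
        """Store one dict of metrics as a JSON payload and advance the step."""
        self.conn.execute(
            "INSERT INTO metrics VALUES (?, ?, ?, ?)",
            (self.project, self.step, time.time(), json.dumps(metrics)),
        )
        self.conn.commit()
        self.step += 1

    def history(self):
        """Return (step, metrics) pairs for this project, ordered by step."""
        rows = self.conn.execute(
            "SELECT step, payload FROM metrics WHERE project = ? ORDER BY step",
            (self.project,),
        )
        return [(step, json.loads(payload)) for step, payload in rows]

    def finish(self):
        """Close the database connection."""
        self.conn.close()


def init(project, db_path=":memory:"):
    """Hypothetical entry point mirroring the init/log/finish call shape."""
    return Run(project, db_path)


run = init(project="demo")
for epoch in range(3):
    run.log({"epoch": epoch, "loss": 1.0 / (epoch + 1)})
history = run.history()
run.finish()
```

A dashboard in this design only needs to read the same SQLite file, which is what keeps the setup free of external infrastructure. Passing a file path instead of `":memory:"` would persist the metrics between runs.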

Long tail

In addition to our top choices, many underrated libraries also stand out. We examined hundreds of them and organized everything into categories with short, helpful summaries for easy discovery.


Category	Library	Description
AI Agents	agex	Python-native agentic framework that enables AI agents to work directly with existing libraries and codebases.
	agex-ui	Framework extension that enables AI agents to create dynamic, interactive user interfaces at runtime using NiceGUI components through direct API access.
	Grasp Agents	Modular framework for building agentic AI pipelines and applications with granular control over LLM handling and agent communication.
	IntentGraph	AI-native codebase intelligence library that provides pre-digested, structured code analysis with natural language interfaces for autonomous coding agents.
	Linden	Framework for building AI agents with multi-provider LLM support, persistent memory, and function calling capabilities.
	mcp-agent	Framework for building AI agents using Model Context Protocol (MCP) servers with composable patterns and durable execution capabilities.
	Notte	Web agent framework for building AI agents that interact with websites through natural language tasks and structured outputs.
	Pybotchi	Deterministic, intent-based AI agent builder with nested supervisor agent architecture.
AI Security	RESK-LLM	Security toolkit for Large Language Models providing protection against prompt injections, data leakage, and malicious use across multiple LLM providers.
AI Security	Rival AI	AI safety framework providing guardrails for production AI systems through real-time malicious query detection and automated red teaming capabilities.
AI Toolkits	Pipelex	Open-source language for building and running repeatable AI workflows with structured data types and validation.
AI Toolkits	RocketRAG	High-performance Retrieval-Augmented Generation (RAG) system focused on speed, simplicity, and extensibility.
Asynchronous Tools	CMQ	Cloud Multi Query library and CLI tool for running queries across multiple cloud accounts in parallel.
	throttlekit	Lightweight, asyncio-based rate limiting library providing flexible and efficient rate limiting solutions with Token Bucket and Leaky Bucket algorithms.
	transfunctions	Code generation library that eliminates sync/async code duplication by generating multiple function types from single templates.
	Wove	Async task execution framework for running high-latency concurrent operations with improved user experience over asyncio.
Caching and Persistence	TursoPy	Lightweight, dependency-minimal client for Turso databases with simple CRUD operations and batch processing support.
Command-Line Tools	Envyte	Command-line tool and API helper for auto-loading environment variables from .env files before running Python scripts or commands.
	FastAPI Cloud CLI	Command-line interface for cloud operations with FastAPI applications.
	gs-batch-pdf	Command-line tool for batch processing PDF files using Ghostscript with parallel execution.
	Mininterface	Universal interface library that provides automatic GUI, TUI, web, CLI, and config file access from a single codebase using dataclasses.
	SSHUP	Command-line SSH connection manager with interactive terminal interface for managing multiple SSH servers.
Computer Vision	Otary	Image processing and 2D geometry manipulation library with unified API for computer vision tasks.
Data Handling	fastquadtree	Rust-optimized quadtree data structure with spatial indexing capabilities for points and bounding boxes.
	molabel	Annotation widget for labeling examples with speech recognition support.
	Python Pest	PEG (Parsing Expression Grammar) parser generator ported from the Rust pest library.
	SeedLayer	Declarative fake data seeder for SQLAlchemy ORM models that generates realistic test data using Faker.
	SPDL	Data loading library designed for scalable and performant processing of array data. By Meta.
	Swizzle	Decorator-based utility for multi-attribute access and manipulation of Python objects using simple attribute syntax.
Data Interoperability	Archivey	Unified interface for reading various archive formats with automatic format detection.
	KickApi	Client library for integrating with the Kick streaming platform API to retrieve channel, video, clip, and chat data.
	pyro-mysql	High-performance MySQL driver for Python backed by Rust.
	StupidSimple Dataclasses Codec	Serialization codec for converting Python dataclasses to and from various formats including JSON.
Data Processing	calc-workbook	Excel file processor that loads spreadsheets, computes all formulas, and provides a clean API for accessing calculated cell values.
	Elusion	DataFrame data engineering library built on the DataFusion query engine with end-to-end capabilities including connectors for the Microsoft stack (Fabric OneLake, SharePoint, Azure Blob), databases, APIs, and automated pipeline scheduling.
	Eruo Data Studio	Integrated data platform that combines Excel-like flexibility, business intelligence visualization, and ETL data preparation capabilities in a single environment.
	lilpipe	Lightweight, typed, sequential pipeline engine for building and running workflows.
	Parmancer	Text parsing library using parser combinators with comprehensive type annotations for structured data extraction.
	PipeFunc	Computational workflow library for creating and executing function pipelines represented as directed acyclic graphs (DAGs).
	Pipevine	Lightweight async pipeline library for building fast, concurrent dataflows with backpressure control, retries, and flexible worker orchestration.
	PydSQL	Lightweight utility that generates SQL CREATE TABLE statements directly from Pydantic models.
	trendspyg	Real-time Google Trends data extraction library with support for 188,000+ configuration options across RSS feeds and CSV exports.
DataFrame Tools	smartcols	Utilities for reordering and grouping pandas DataFrame columns without index gymnastics.
Database Extensions	Coffy	Local-first embedded database engine supporting NoSQL, SQL, and Graph models in pure Python.
Desktop Applications	MotionSaver	Windows screensaver application that displays video wallpapers with customizable widgets and security features.
	WinUp	Modern UI framework that wraps PySide6 (Qt) in a simple, declarative, and developer-friendly API for building beautiful desktop applications.
	Zypher	Windows-based video and audio downloader with GUI interface powered by yt_dlp.
Jupyter Tools	Erys	Terminal interface for opening, creating, editing, running, and saving Jupyter Notebooks in the terminal.
LLM Interfaces	ell	Lightweight, functional prompt engineering framework for language model programs with automatic versioning and multimodal support.
	flowmark	Markdown auto-formatter designed for better LLM workflows, clean git diffs, and flexible use from CLI, IDEs, or as a library.
	mcputil	Lightweight library that converts MCP (Model Context Protocol) tools into Python function-like objects.
	OpenAI Harmony	Response format implementation for OpenAI's open-weight gpt-oss model series. By OpenAI.
	ProML (Prompt Markup Language)	Structured markup language for Large Language Model prompts with a complete toolchain including parser, runtime, CLI, and registry.
	Prompt Components	Template-based component system using dataclasses for creating reusable, type-safe text components with support for standard string formatting and Jinja2 templating.
	Prompture	API-first library for extracting structured JSON and Pydantic models from LLMs with schema validation and multi-provider support.
	SimplePrompts	Minimal library for constructing LLM prompts with Python-native syntax and dynamic control flow.
	Universal Tool Calling Protocol (UTCP)	Secure, scalable standard for defining and interacting with tools across communication protocols using a modular plugin-based architecture.
ML Development	Fast-LLM	Open-source library for training large language models with optimized speed, scalability, and flexibility. By ServiceNow.
	TorchSystem	PyTorch-based framework for building scalable AI training systems using domain-driven design principles, dependency injection, and message patterns.
	Tsururu (TSForesight)	Time series forecasting strategies framework providing multi-series and multi-point-ahead prediction strategies compatible with any underlying model including neural networks.
ML Testing & Evaluation	DL Type	Runtime type checking library for PyTorch tensors and NumPy arrays with shape validation and symbolic dimension support.
	Python Testing Tools MCP Server	Model Context Protocol (MCP) server providing AI-powered Python testing capabilities including unit test generation, fuzz testing, coverage analysis, and mutation testing.
	treemind	High-performance library for interpreting tree-based models through feature analysis and interaction detection.
	Verdict	Declarative framework for specifying and executing compound LLM-as-a-judge systems with hierarchical reasoning capabilities.
Multi-Agent Systems	MCP Kit Python	Toolkit for developing and optimizing multi-agent AI systems using the Model Context Protocol (MCP).
Multi-Agent Systems	npcpy	Framework for building natural language processing pipelines and LLM-powered agent systems with support for multi-agent teams, fine-tuning, and evolutionary algorithms.
NLP	doespythonhaveit	Library search engine that allows natural language queries to discover Python packages.
NLP	tenets	NLP CLI tool that automatically finds and builds the most relevant context from codebases using statistical algorithms and optional deep learning techniques.
Networking and Communication	Cap'n Web Python	Complete implementation of the Cap'n Web protocol, providing capability-based RPC system with promise pipelining, structured errors, and multiple transport support.
	httpmorph	HTTP client library focused on mimicking browser fingerprints with Chrome 142 TLS fingerprint matching capabilities.
	Miniappi	Client library for the Miniappi app server that enables Python applications to interact with the Miniappi platform.
	PyWebTransport	Async-native WebTransport stack providing full protocol implementation with high-level frameworks for server applications and client management.
	robinzhon	High-performance library for concurrent S3 object transfers using Rust-optimized implementation.
	WebPath	HTTP client library that reduces boilerplate when interacting with APIs, built on httpx and jmespath.
Neural Networks	thoad	Lightweight reverse-mode automatic differentiation engine for computing arbitrary-order partial derivatives on PyTorch computational graphs.
Niche Tools	Clockwork	Infrastructure as Code framework that provides composable primitives with AI-powered assistance.
	Cybersecurity Psychology Framework (CPF)	Psychoanalytic-cognitive framework for assessing pre-cognitive security vulnerabilities in human behavior.
	darkcore	Lightweight functional programming toolkit bringing Functor/Applicative/Monad abstractions and classic monads like Maybe, Either/Result, Reader, Writer, and State with an expressive operator DSL.
	DiscoveryLastFM	Music discovery automation tool that integrates Last.fm, MusicBrainz, Headphones, and Lidarr to automatically discover and queue new albums based on listening history.
	Fusebox	Lightweight dependency injection container built for simplicity and minimalism with automatic dependency resolution.
	Injectipy	Dependency injection library that uses explicit scopes instead of global state, providing type-safe dependency resolution with circular dependency detection.
	Klyne	Privacy-first analytics platform for tracking Python package usage, version adoption, OS distribution, and custom events.
	MIDI Scripter	Framework for filtering, modifying, routing and handling MIDI, Open Sound Control (OSC), keyboard and mouse input and output.
	numeth	Numerical methods library implementing core algorithms for engineering and applied mathematics with educational clarity.
	PAR CLI TTS	Command-line text-to-speech tool supporting multiple TTS providers (ElevenLabs, OpenAI, and Kokoro ONNX) with intelligent voice caching and flexible output options.
	pycaps	Tool for adding CSS-styled subtitles to videos with automated transcription and customizable animations.
	PyDepends	Lightweight dependency injection library with decorator-based API supporting both synchronous and asynchronous code in a FastAPI-like style.
	Pylan	Library for calculating and analyzing the combined impact of recurring events such as financial projections, investment gains, and savings.
	Python for Nonprofits	Educational guide for applying Python programming in nonprofit organizations, covering data analysis, visualization, and reporting techniques.
	Quantium	Lightweight library for unit-safe scientific and mathematical computation with dimensional analysis.
	Reduino	Python-to-Arduino transpiler that converts Python code into Arduino C++ and optionally uploads it to microcontrollers via PlatformIO.
	TiBi	GUI application for performing Tight Binding calculations with graphical system construction.
	Torch Lens Maker	Differentiable geometric optics library based on PyTorch for designing complex optical systems using automatic differentiation and numerical optimization.
	torch-molecule	Deep learning framework for molecular discovery featuring predictive, generative, and representation models with a sklearn-style interface.
	TurtleSC	Mini-language extension for Python's turtle module that provides shortcut instructions for function calls.
OCR	bbox-align	Library that reorders bounding boxes from OCR engines into logical lines and correct reading order for document processing.
	Morphik	AI-native toolset for processing, searching, and managing visually rich documents and multimodal data.
	OCR-StringDist	String distance library for learning, modeling, explaining and correcting OCR errors using weighted Levenshtein distance algorithms.
Optimization Tools	ConfOpt	Hyperparameter optimization library using conformal uncertainty quantification and multiple surrogate models for machine learning practitioners.
	Functioneer	Batch runner for function analysis and optimization with parameter sweeps.
	generalized-dual	Minimal library for generalized dual numbers and automatic differentiation supporting arbitrary-order derivatives, complex numbers, and vectorized operations.
	Solvex	REST API service for solving Linear Programming optimization problems using SciPy.
Reactive Programming and State Management	python-cq	Lightweight library for separating code according to Command and Query Responsibility Segregation principles.
System Utilities	cogeol	Python version management tool that automatically aligns projects with supported Python versions using endoflife.date data.
	comver	Tool for calculating semantic versioning using commit messages without requiring Git tags.
	dirstree	Directory traversal library with advanced filtering, cancellation token support, and multiple crawling methods.
	loadfig	One-liner Python pyproject config loader with root auto-discovery and VCS awareness.
	pipask	Drop-in replacement for pip that performs security checks before installing Python packages.
	pywinselect	Windows utility for detecting selected files and folders in File Explorer and Desktop.
	TripWire	Environment variable management system with import-time validation, type inference, secret detection, and team synchronization capabilities.
	veld	Terminal-based file manager with tileable panels and file previews built on Textual.
	venv-rs	High-level Python virtual environment manager with terminal user interface for inspecting and managing virtual environments.
	venv-stack	Lightweight PEP 668-compliant tool for creating layered Python virtual environments that can share dependencies across multiple base environments.
Testing, Debugging & Profiling	dowhen	Code instrumentation library for executing arbitrary code at specific points in applications with minimal overhead.
	GrapeQL	GraphQL security testing tool for detecting vulnerabilities in GraphQL APIs.
	lintkit	Framework for building custom linters and code checking rules.
	notata	Minimal library for structured filesystem logging of scientific runs.
	pretty-dir	Enhanced debugging tool providing organized and colorized output for Python's built-in `dir` function.
	Request Speed Test	High-throughput HTTP load testing project demonstrating over 20,000 requests per second using the Rust-based rnet library with optimized system configurations.
	structlog-journald	Structlog processor for sending logs to journald.
	Trevis	Console visualization tool for recursive function execution flows.
Time and Date Utilities	Temporals	Minimalistic utility library for working with time and date periods on top of Python's datetime module.
Visualization	detroit	Python implementation of the D3.js data visualization library.
Visualization	RowDump	Structured table output library with ASCII box drawing, custom formatting, and flexible column definitions.
Web Crawling & Scraping	proxyutils	Proxy parser and formatter for handling various proxy formats and integration with web automation tools.
Web Crawling & Scraping	PyBA	Browser automation software that uses AI to perform web testing, form filling, and exploratory web tasks without requiring exact inputs.
Web Development	AirFlask	Production deployment tool for Flask web applications using nginx and gunicorn.
	APIException	Standardized exception handling library for FastAPI that provides consistent JSON responses and improved Swagger documentation.
	ecma426	Source map implementation supporting both decoding and encoding according to the ECMA-426 specification.
	Fast Channels	WebSocket messaging library that brings Django Channels-style consumers and channel layers to FastAPI, Starlette, and other ASGI frameworks for real-time applications.
	fastapi-async-storages	Async-ready cloud object storage backend for FastAPI applications.
	Func To Web	Web application generator that converts Python functions with type hints into interactive web UIs with minimal boilerplate.
	html2pic	HTML and CSS to image converter that renders web markup to high-quality images without requiring a browser engine.
	Lazy Ninja	Django library that simplifies the generation of API endpoints using Django Ninja through dynamic model scanning and automatic Pydantic schema creation.
	panel-material-ui	Extension library that integrates Material UI design components and theming capabilities into Panel applications.
	pyeasydeploy	Simple server deployment toolkit for deploying applications to remote servers with minimal setup.
	Python Hiccup	Library for representing HTML using plain Python data structures with Hiccup syntax.
	WEP — Web Embedded Python	Lightweight server-side template engine and micro-framework for embedding native Python directly inside HTML using .wep files and <wep> tags.