Specialized AI Agent Tools
Domain-specific agent tools designed for particular use cases like customer service, sales, marketing, data analysis, security, and DevOps automation.
Open-source technical standard developed by a coalition including Scope3, Yahoo, and PubMatic, designed to allow AI agents from advertisers, publishers, and ad tech platforms to communicate and autonomously execute advertising tasks.
Agentic AI assistant integrated into Adobe Express that allows users to create and edit designs through natural, conversational language, aimed at non-design professionals.
Enterprise tool integration platform designed to connect AI agents with business applications and workflows. Provides secure, managed connections between agents and enterprise systems with authentication handling, compliance controls, and centralized tool management for production deployments.
Official direct benchmark harness from the AutoGPT project for evaluating autonomous agent performance across diverse tasks without Agent Protocol server overhead. Provides standardized challenge suites, scoring workflows, and CLI tooling for comparing agent capabilities on planning, reasoning, tool use, and task completion.
Comprehensive benchmark for evaluating LLM-as-Agent capabilities across 8 distinct environments including coding, gaming, web browsing, and household tasks. Provides standardized evaluation protocols, multi-dimensional metrics, and leaderboards for comparing agent performance across diverse real-world scenarios.
Monitoring and analytics platform designed specifically for autonomous AI agents. Provides real-time tracking of agent behaviors, decision patterns, and performance metrics with anomaly detection and comprehensive dashboards for production agent systems.
Tool library and integration framework providing 70+ reusable tools for AI agents including image processing, OCR, search, and data analysis capabilities. Offers standardized tool interfaces compatible with LangChain, Transformers Agents, and other frameworks with easy-to-use APIs and comprehensive documentation.
Open-source Python SDK for AI agent observability providing session replays, metrics, and monitoring for LangChain, CrewAI, and AutoGen. Features detailed agent execution tracking, LLM call logging, cost analysis, and performance metrics for debugging and optimizing multi-agent systems.
Open-source observability framework for monitoring and debugging AI agent systems. Features execution tracing, state inspection, and visual debugging tools for understanding agent behaviors and optimizing performance in multi-agent environments.
Modular benchmark and development platform for evaluating and building LLM agents. Features customizable evaluation pipelines, standardized metrics, and tooling for systematic agent testing across reasoning, planning, and execution capabilities.
Local-first TUI and CLI for AI coding-agent trace logs. Agenttrace turns local agent traces into cost, token, latency, and health regression reports for debugging coding-agent workflows.
Open-source Base Sepolia testnet settlement rails for humans and AI agents to hire AI agents with EIP-712 signed offers, USDC escrow, proof submission, and programmable release/refund/dispute lifecycle. Includes a CLI, JavaScript SDK, read-only MCP server, and x402 compatibility notes for composing pay-per-call access with escrowed outcome-based work.
Official Apify MCP server that lets agents run Apify Actors for web scraping, crawling, search, maps, ecommerce, and social-media data extraction through the MCP tool interface.
Open-source AI observability platform for evaluating, troubleshooting, and monitoring LLM applications and agents. Provides experiment tracking, prompt tracing, retrieval analysis, and LLM evaluations with support for traces, spans, and comprehensive debugging tools for production AI systems.
Official Asana MCP server for giving AI assistants access to the Asana Work Graph. It supports authenticated interactions with tasks, projects, portfolios, teams, and project-management workflows.
LLM monitoring and evaluation platform for production AI applications. Athina provides real-time monitoring, batch evaluations, hallucination checks, PII checks, dataset evaluation, and RAG pipeline quality workflows.
Python SDK for Atla Insights, a platform for monitoring and improving AI agents. It helps developers instrument agent workflows, capture evaluation data, and feed results into Atla's agent monitoring stack.
Atlassian's Rovo MCP server for connecting agents to Jira, Confluence, Compass, and Atlassian work data. It supports search, summarization, issue creation, page updates, and remote MCP access with enterprise controls.
Automated evaluation and governance platform for generative AI systems. Aymara generates policy-grounded safety, accuracy, fairness, and compliance evaluations, scores model or application responses, and helps teams monitor and improve deployed AI behavior.
Open-source Python library for synthetic data curation, post-training data generation, and structured data extraction. Bespoke Curator helps teams build scalable LLM-powered data pipelines with async execution, caching, fault recovery, interactive viewing, and dataset curation recipes.
Enterprise voice AI platform for automating inbound and outbound phone calls. Bland AI provides phone agents, call APIs, webhooks, workflow integrations, simulations, regression testing, analytics, and custom voice experiences for large-scale customer communication.
Enterprise-grade AI product stack providing evaluations, prompt playground, logging, and dataset management for AI agents. Offers end-to-end workflow for building reliable AI products with continuous evaluation, prompt optimization, and production monitoring capabilities.
AI-powered phone call automation for scheduling built into Cal.com, featuring customizable human-like conversations that reduce no-shows and boost conversions. Allows users to assign dedicated phone numbers, write custom script prompts, define agent personality and tone, trigger calls on form submission or before meetings, and automate booking workflows at $0.29 per minute.
Real-time speech generation API optimized for conversational voice agents. Cartesia Sonic provides low-latency text-to-speech, expressive voices, voice cloning, multilingual output, and streaming APIs used as the voice layer in interactive agent pipelines.
End-to-end quality assurance platform for conversational AI agents providing automated testing, observability, and monitoring for voice and chat bots. Covers full agent lifecycle from pre-production simulation to post-deployment analytics with real-time failure alerts and regression tracking.
Open-source search infrastructure and embedding database for AI applications. Chroma supports vector, full-text, metadata, and hybrid search locally or through Chroma Cloud, making it common infrastructure for RAG prototypes and production applications.
Open-source document intelligence API for layout analysis, OCR, and semantic chunking. Chunkr converts PDFs, presentations, Word documents, and images into structured HTML, markdown, or JSON chunks for RAG and LLM pipelines.
Integrated edge computing platform, announced in November 2025, for distributed agentic AI workloads, combining compute, networking, and storage into a single modular system. Features CPU and GPU configurations, up to 120TB storage, redundant power and cooling, integrated 25-gigabit networking, zero-touch deployment, and pre-validated blueprints designed for real-time AI inferencing in settings ranging from retail stores to healthcare facilities and factory floors.
Specialized version of Anthropic's Claude model aimed at supporting the entire scientific process, featuring new connectors to scientific platforms like Benchling to assist with research and discovery.
API gateway for AI agents that exposes developer services such as web scraping, screenshots, DNS lookup, geolocation, crypto prices, code execution, storage, scheduling, and x402/USDC payments through a unified OpenAPI-described service surface.
Official ClickUp MCP server for connecting AI assistants to ClickUp workspace data such as tasks, lists, folders, docs, and project workflows through authenticated MCP access.
Cloudflare's AI gateway proxy for monitoring, controlling, and optimizing traffic between applications and model providers. It provides request logs, analytics, caching, rate limiting, retries, model fallback, and edge deployment options for production AI applications.
Official Cloudflare MCP server exposing Cloudflare services as tools for agents and AI assistants. It enables natural-language workflows for Workers, storage, domains, and other Cloudflare platform resources.
Platform and SDK for connecting AI agents to external tools, authenticated apps, and sandboxed workbench environments. Composio provides toolkits, tool search, context management, auth handling, and integrations for production agents that need to act across third-party services.
Open-source LLM evaluation framework with a pytest-like interface. DeepEval provides metrics for RAG, hallucination, answer relevance, bias, and custom criteria, with Confident AI offering a managed evaluation platform.
Unified voice agent API that combines Deepgram speech-to-text, text-to-speech, and LLM orchestration for real-time conversational AI. It supports streaming audio, interruption handling, function calls, and developer controls for building responsive voice agents.
Open-source framework from Argilla for building synthetic data and AI feedback pipelines. Distilabel generates and labels datasets with LLMs, supports preference and evaluation data workflows, and provides scalable pipeline primitives for fine-tuning and alignment datasets.
ElevenLabs platform for building low-latency voice and chat agents with human-like speech, configurable behavior, knowledge sources, web and phone deployment, SDKs, and monitoring. It combines ElevenLabs speech models with agent orchestration for customer-facing voice experiences.
Open-source decision tree-based agentic RAG framework by Weaviate that dynamically displays data, learns from user feedback, and chunks documents on demand. Features intelligent tool selection with transparent decision-making, context-aware on-the-fly document chunking, and feedback-driven learning without cross-user contamination. Available as both a full frontend interface and a pip-installable Python package.
AI meeting assistant and meeting agent for recording, transcribing, summarizing, searching, and acting on meeting content. Fellow can generate follow-ups, update CRM fields, and integrate meeting insights with project tools.
Official Figma MCP server that exposes design context to AI coding agents and allows agents to read design information or write native Figma content back to the canvas.
Non-profit research lab building AI agents to automate and scale scientific research, with a primary focus on accelerating discovery in biology and other complex sciences.
Enterprise LLM evaluation and observability platform for testing, monitoring, and improving AI applications. Galileo focuses on RAG quality, data quality, hallucination detection, and production evaluation workflows.
GitHub's official MCP server for repositories, issues, pull requests, code search, Actions, and related GitHub API operations. It enables MCP-compatible agents to inspect and operate on GitHub resources.
Official GitLab MCP server for exposing GitLab resources such as projects, repositories, issues, merge requests, and CI/CD information to compatible AI agents and editor assistants.
AI-powered observability agent within Grafana Cloud that assists with investigations, incident response, and system monitoring. Uses LLMs to analyze metrics, logs, and traces, providing intelligent insights and automated root cause analysis for complex distributed systems.
Open-source framework from Zep for building real-time temporal knowledge graphs for AI agents. Graphiti extracts entities, relationships, facts, and time-aware memory from conversations and external data.
Open-source observability platform for AI agents offering one-line integration for logging, monitoring, and debugging LLM applications. Features request logging, cost tracking, latency monitoring, caching, rate limiting, and prompt versioning with support for all major LLM providers.
GTM Intelligence platform with AI agents (Odin and Nova) that analyze buyer journeys, connect to the GTM tech stack, and provide account and lead scoring, touchpoint analysis, and actionable recommendations without coding.
AI evaluation and observability platform for production LLM apps. HoneyHive records traces, manages evaluation datasets, supports human annotation, and runs regression tests for prompt and agent changes.
Hume's Empathic Voice Interface for building voice AI that can understand and respond to vocal emotion in real time. EVI combines speech recognition, emotion understanding, language modeling, and voice output for emotionally responsive conversational agents.
AI agent built with Google Gemini models embedded in the Xvantage distribution platform, designed to provide actionable daily briefs and data-driven recommendations to sales teams.
Open-source framework for large language model evaluations from the UK AI Safety Institute. Inspect AI supports multi-turn tasks, agent evaluations, sandboxed code execution, scorers, datasets, and reproducible eval runs.
AI-powered email outreach platform that automates sales prospecting with unlimited email account connections, AI personalization, and campaign analytics. Focuses on scaling cold email outreach with deliverability optimization.
JetBrains MCP server plugin for exposing IDE context, project structure, files, and development actions from JetBrains IDEs to MCP-compatible AI assistants and coding agents.
AI gateway capability in Kong's API platform for routing AI requests through a provider-agnostic API. Kong AI Gateway centralizes credentials, request routing, prompt and response controls, semantic caching, token-aware policies, and enterprise governance for AI API traffic.
Open-source embedded retrieval library and vector database for multimodal AI applications. LanceDB is built on the Lance columnar format and supports vector search, full-text search, hybrid search, and local or cloud retrieval workflows.
Serverless AI developer platform for building and deploying AI agents, apps, and features. Langbase provides composable AI primitives, memory, tools, model routing, and infrastructure for production LLM applications.
Open-source LLM observability and analytics platform providing tracing, prompt management, evaluation, and analytics for AI agents. Features detailed execution traces, cost tracking, quality metrics, and collaborative prompt versioning for debugging and optimizing agentic systems in production.
LangChain's observability, tracing, and evaluation platform for LLM applications and agents. LangSmith records chain and tool traces, manages datasets, runs evaluations, and supports debugging and regression testing.
Open-source agent engineering, prompt management, and evaluation platform. Latitude version-controls prompts, runs automated evaluations, tracks regressions, and supports collaboration around LLM and agent workflows.
Linear's MCP integration for connecting Claude and other compatible agents to Linear issues, projects, comments, and project-management workflows through secure authenticated access.
Open-source Python SDK and proxy server that exposes a unified OpenAI-compatible API for 100+ LLM providers. LiteLLM Proxy acts as an AI gateway with logging, cost tracking, retries, rate limits, load balancing, guardrails, and provider failover.
LlamaIndex's managed document parsing service for turning complex documents into AI-ready data. LlamaParse handles PDFs, tables, charts, handwriting, checkboxes, images, and many file formats, returning clean markdown, text, or JSON for RAG and agent pipelines.
LLM observability and prompt management platform for tracking prompts, traces, costs, user feedback, evaluations, and analytics across AI products. Lunary provides SDKs and hosted monitoring for production LLM applications.
AI-native search and discovery platform for ecommerce teams. Marqo uses semantic search, personalization, clickstream, purchase, and event data to improve product search relevance, recommendations, conversion, and merchandising workflows.
Official visual testing and debugging tool for MCP servers. MCP Inspector provides a web UI for connecting to a server, browsing tools and resources, and manually executing calls while developing MCP integrations.
Official repository of Model Context Protocol reference server implementations. It includes examples for common integrations such as filesystem, databases, search, messaging, and browser-adjacent tool access.
Open-source framework for connecting MCP servers to LLM applications and agent clients beyond Claude. mcp-use helps developers build MCP apps and integrate tools with OpenAI, Anthropic, local models, and agent frameworks.
Universal self-improving memory layer for AI agents and LLM applications, enabling personalized AI interactions with just three lines of code. Features long-term, short-term, semantic, and episodic memory types, integrates with OpenAI, LangGraph, and CrewAI, and was selected as the exclusive memory provider for the AWS Agent SDK. Reports a 26% improvement in LLM-as-a-Judge metrics with 91% lower p95 latency and 90% token cost savings.
Open-source, cloud-native vector database for scalable approximate nearest-neighbor search over high-dimensional data. Milvus supports large-scale vector indexing, distributed deployments, multimodal search, and managed Zilliz Cloud deployments for RAG and AI search workloads.
Open-source analytics and evaluation platform for voice AI agents, positioned as a Mixpanel for conversational AI, with automatically generated interactive call flow visualizations. Enables developers to analyze, visualize, evaluate, and optimize conversational AI performance by understanding common user paths, behaviors, and agent interaction patterns.
Serverless cloud platform for running AI workloads, agents, sandboxes, batch jobs, and model inference from Python. Modal is commonly used to host code execution, tool execution, and GPU-backed agent infrastructure.
Open protocol for connecting AI applications and agents to external tools, data sources, and prompts. MCP defines a client-server architecture that lets models discover and call capabilities exposed by compatible servers.
Official SDK collection for implementing MCP clients and servers across languages including Python, TypeScript, Kotlin, Java, C#, Go, Ruby, and Rust. These SDKs provide the base libraries for protocol-compliant MCP integrations.
Zero-code open-source platform for auto-generating intelligent agents from natural language prompts through a simple workflow: prompt -> plan -> execute. Eliminates complex orchestration and drag-and-drop requirements while offering powerful agent running control, data processing capabilities, and MCP tool integration for building sophisticated agents without technical expertise.
Agent-first search engine that indexes 8,000+ MCP servers and other agent-readable services, ranked across 7 agentic readiness signals (llms.txt, OpenAPI, ai-plugin, MCP, structured API, robots.txt, schema.org). Useful as an agent-discovery primitive: one agent can query NHS to find another agent to delegate work to. Includes a `verify_mcp` live JSON-RPC probe. Queryable via MCP, REST API, or browser. Listed in the official MCP registry as `ai.nothumansearch/search`.
Official Notion MCP server that gives compatible agents access to Notion pages, databases, blocks, and workspace content for knowledge retrieval and write-back workflows.
NVIDIA family of open models, training data, and recipes for building specialized AI agents and generating training data. Nemotron includes open weights and model families used for agentic reasoning, synthetic data generation, reward modeling, and fine-tuning workflows.
Open-source benchmark framework for evaluating web operators and agents on their ability to complete web tasks. Provides transparent, reproducible performance evaluations with the WebVoyager30 benchmark dataset, which covers 30 diverse web tasks.
Autonomous AI security agent powered by GPT-5 that operates as an agentic security researcher to continuously monitor repositories, discover vulnerabilities, assess exploitability, and propose targeted patches.
OpenAI API surface for low-latency, realtime model interactions over live audio and other streaming inputs. The Realtime API is commonly used to build speech-to-speech voice agents in browsers or servers with WebRTC, WebSocket, tool calling, and multimodal interaction patterns.
Autonomous research agent specifically tailored for the analysis of health and medical data.
Unified API and model marketplace for accessing hundreds of AI models through an OpenAI-compatible endpoint. OpenRouter supports model discovery, provider routing, fallbacks, price comparison, and pay-per-token access across major model providers.
Open-source LLM evaluation and observability platform from Comet. Opik traces agentic workflows, RAG systems, and LLM applications, then supports automated evaluation, dashboards, and production monitoring.
LLMOps platform for prompt management, model routing, experimentation, observability, and deployment workflows. Orq.ai helps teams manage prompt changes, monitor usage, and route requests across models from one platform.
AI meeting assistant and conversational knowledge engine for meetings. Otter joins meetings, records and transcribes conversations, summarizes action items, answers questions over meeting history, and supports meeting-agent workflows.
Automated evaluation, testing, and red-teaming platform for LLM and agent applications. Patronus provides evaluators for hallucination, safety, policy compliance, off-topic behavior, PII, and custom production quality criteria.
Fully managed vector database for production AI applications. Pinecone provides serverless vector search, metadata filtering, namespaces, automatic indexing, and managed scaling for RAG, agent memory, semantic search, and recommendation workloads.
Managed MCP server from Pipedream that exposes app integrations and workflow actions to AI agents through a hosted MCP endpoint, avoiding local server setup for common SaaS integrations.
Enterprise conversational voice AI platform for contact centers. PolyAI builds customer-led voice agents for phone support, reservations, account servicing, payments, and other high-volume customer-service workflows across regulated and global businesses.
Open-source AI gateway and production platform for routing requests across LLM providers. Portkey adds retries, fallbacks, load balancing, caching, observability, guardrails, prompt management, and model catalog support for production LLM and agent applications.
Prompt management and LLM observability platform for logging requests, versioning prompts, running prompt experiments, tracking metadata, and monitoring production performance over time.
AI model access and routing platform for configuring model selection across Pulze spaces. Pulze supports model and router configuration, custom routing policies, provider access, cost controls, and reliability settings for AI applications.
Observability platform from Pydantic designed specifically for Python applications and AI agents. Provides structured logging, tracing, and monitoring with type-safe instrumentation, seamless integration with Pydantic models, and powerful debugging capabilities for production systems.
Open-source vector similarity search engine and vector database written in Rust. Qdrant provides payload filtering, vector search APIs, production indexing, cloud hosting, and managed on-prem options for retrieval and RAG applications.
Modular framework for building Retrieval-Augmented Generation pipelines with support for Agentic RAG featuring multi-step reasoning and tool usage. Includes seamless MCP Server integration for external tool interaction, customizable LLM providers (OpenAI, Ollama), vector store integration, and support for multiple knowledge sources including local folders and GitHub repositories.
Document parsing and extraction API for converting complex PDFs, spreadsheets, presentations, and scanned documents into structured output for RAG and LLM workflows. Reducto focuses on layout-aware parsing, tables, figures, OCR, splitting, and extraction.
Low-code platform for building AI sales agents and teams that automate lead generation, research, and follow-up processes. Features specialized sales prospecting agents with CRM integration and customizable workflow automation.
Voice AI platform for building low-latency phone agents for sales, support, scheduling, and service workflows. Retell provides turn-taking, interruption handling, telephony integrations, testing, analytics, and APIs for production inbound and outbound call automation.
Autonomous AI agent focused on sales automation, designed to handle sales tasks and customer interactions to close deals.
MCP server for connecting AI agents to Sentry projects, issues, stack traces, releases, and performance data. Sentry MCP lets agents inspect production errors and assist with debugging workflows.
Open-source Conversational Speech Model from Sesame for generating natural conversational speech from text and audio inputs. The model underpins Sesame's voice companion demos and provides a research-grade speech generation component for voice agent experiments.
Enterprise AI data development platform for curating training data, evaluating models, optimizing RAG pipelines, and fine-tuning LLMs. Snorkel Flow supports programmatic labeling, SME collaboration, annotation, data quality workflows, and synthetic-data-oriented development.
Stripe's official toolkit for building AI-powered products and connecting agents to Stripe payments, customers, invoices, subscriptions, refunds, and related financial workflows, including MCP-compatible tooling.
Healthcare AI employee platform for hospitals and medical practices. Sully automates clinical, administrative, and patient-operations workflows with agents for scribing, reception, nursing, review replies, review insights, and integrations across EHRs, payments, forms, communications, and analytics systems.
No-code enterprise platform for creating and deploying natural-sounding voice agents with multilingual support for 30+ languages and dialects. Features sub-100ms latency with in-house telephony, HIPAA and GDPR compliance, 200+ enterprise integrations including Salesforce and HubSpot, and white-label capabilities for handling customer support, appointment scheduling, and complex workflows at scale.
Enterprise-grade AI speech-to-text platform reporting a Word Error Rate under 4%, with emotion detection across 7 emotions and purchase intent analysis. Provides secure deployment options across on-premises, public, private, or hybrid cloud, with advanced capabilities including dialogue summarization, topic extraction, and PII redaction for customer interaction insights.
Provides advanced AI agents for data analysis including Discover agents for research, Chain of Thought agents for complex problem-solving, and Analyst agents for real-time financial analysis. Features comprehensive workflow automation from data gathering to insight generation.
Comprehensive benchmark for evaluating LLM agents on tool usage and API interaction capabilities. Features 16,000+ real-world APIs, standardized evaluation metrics, and test scenarios covering tool selection, parameter filling, and multi-step tool orchestration.
Open-source platform for search, recommendations, RAG, and analytics delivered through APIs. Trieve combines vector search, keyword search, ranking, chunk management, and hosted infrastructure for teams adding retrieval to AI products.
Open-source platform for building and running long-running workflows, background jobs, and AI agents in TypeScript and Python. Trigger.dev provides durable execution, retries, queues, observability, and hosted deployment for agentic workloads.
Enterprise AI gateway from TrueFoundry that provides a proxy layer between applications, LLM providers, MCP servers, and agents. It offers unified access, observability, governance, access control, routing policies, budget controls, and deployment integration for organization-wide AI usage.
Serverless vector and full-text search engine built on object storage. turbopuffer supports hybrid search, metadata filtering, automatic scaling, and low-latency retrieval over billions of vectors for RAG, semantic search, and AI application workloads.
Open-source AI framework for semantic search, RAG, LLM orchestration, and language-model workflows. txtai combines vector search, sparse retrieval, graph networks, relational storage, pipelines, and workflow orchestration in a Python library.
Open-source document ETL toolkit and enterprise data platform for transforming complex files into clean, structured inputs for language models. Unstructured supports parsing, chunking, enrichment, embedding, and connectors for production RAG pipelines.
Developer-focused voice AI platform for building advanced voice agents with enterprise infrastructure, featuring response times under 500ms and support for 100+ languages. VAPI provides Flow Studio for visual conversational logic design, highly customizable STT/LLM/TTS provider selection, and scalable phone operations for inbound and outbound calls across industries like healthcare, finance, and travel.
Enterprise AI agent and RAG platform for grounded search, retrieval, governed agent workflows, and factual-consistency enforcement. Vectara provides managed retrieval, citations, hallucination evaluation, policy controls, and deployment options for trusted AI applications.
Open-source library for building voice-based LLM agents and real-time streaming conversations. Vocode provides abstractions and integrations for speech recognition, language models, text-to-speech, telephony, phone calls, meetings, and voice assistants.
All-in-one platform for building, testing, and deploying AI voice agents with access to the latest highly realistic voice models, including Sesame CSM-1B, Dia, and Orpheus. Features optimized compute for real-time inference with sub-200ms time-to-first-token, supports both zero-shot voice cloning and fine-tuning, and provides a unified API for integrating multiple voice models.
Open-source, cloud-native vector database for storing objects and vectors together. Weaviate supports vector search, hybrid keyword and vector retrieval, structured filtering, RAG, reranking, and managed Weaviate Cloud deployments.
W&B's LLM observability and evaluation toolkit for tracing AI application calls, capturing inputs and outputs, managing evaluation datasets, and comparing model or prompt behavior inside the broader Weights & Biases platform.
Zapier's MCP endpoint for giving AI agents access to Zapier's large library of app actions and automations. It lets agents use business apps through a managed, authenticated MCP tool surface.