NLWeb: Building the AI Web with Natural Language Interfaces
Overview
NLWeb is an open-source framework from Microsoft that simplifies building conversational interfaces for websites. It represents a foundational shift in how we think about web interaction—moving from traditional search and navigation to natural language conversations with structured data.
Key Innovation: NLWeb natively supports Model Context Protocol (MCP), allowing the same natural language APIs to serve both humans and AI agents. As the project states: “NLWeb is to MCP/A2A what HTML is to HTTP.”
The Vision: An AI-Native Web
Just as HTML revolutionized document sharing in the 1990s, NLWeb aims to establish a foundational layer for the AI Web. The framework leverages existing web standards—particularly Schema.org markup used by over 100 million websites—to enable natural language interfaces without requiring sites to rebuild their entire infrastructure.
Core Principles
- Leverage Existing Standards - Schema.org and RSS are already widely adopted
- Conversational by Default - Natural language as a first-class interface
- Hallucination-Free Results - All responses come from actual database records
- Extensible Architecture - Tools, prompts, and workflows can be customized
- Platform Agnostic - Works on data centers, laptops, and (soon) mobile devices
How It Works
NLWeb has two primary components:
1. Simple Natural Language Protocol
A RESTful API that accepts natural language queries and returns responses in JSON using Schema.org vocabulary:
POST /ask
{
"query": "Find vegan recipes for a summer party",
"site": "recipe-site",
"mode": "list"
}
Response Format:
{
"query_id": "abc123",
"results": [
{
"url": "https://example.com/recipes/grilled-veggie-skewers",
"name": "Grilled Veggie Skewers",
"score": 0.95,
"description": "Perfect summer appetizer with seasonal vegetables",
"schema_object": { /* Full Schema.org Recipe object */ }
}
]
}
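A client consuming this response might keep only high-confidence matches. A minimal sketch in Python (the 0.8 score threshold is an arbitrary assumption, not an NLWeb default):

```python
import json

def top_results(response_json, min_score=0.8):
    """Parse an /ask response and keep results at or above min_score, best first."""
    payload = json.loads(response_json)
    hits = [r for r in payload.get("results", []) if r.get("score", 0) >= min_score]
    return sorted(hits, key=lambda r: r["score"], reverse=True)

sample = ('{"query_id": "abc123", "results": ['
          '{"url": "https://example.com/a", "name": "A", "score": 0.95}, '
          '{"url": "https://example.com/b", "name": "B", "score": 0.55}]}')
print([r["name"] for r in top_results(sample)])  # ['A']
```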
2. Straightforward Implementation
The framework uses existing markup on sites with structured lists (products, recipes, events, reviews) and provides:
- Vector database integration (Qdrant, Milvus, Snowflake, Postgres, Elasticsearch, Azure AI Search, Cloudflare AutoRAG)
- LLM connectors (OpenAI, DeepSeek, Gemini, Anthropic, HuggingFace)
- Web server front-end with sample UI
- Tools for ingesting Schema.org JSONL and RSS feeds
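Schema.org JSONL is simply one JSON entity per line. A sketch of producing such a file for ingestion (the field names mirror the Recipe examples in this article; the exact record layout NLWeb's loader expects is an assumption):

```python
import json

def to_jsonl_line(schema_object):
    """Serialize one Schema.org entity as a single JSONL line."""
    return json.dumps(schema_object, ensure_ascii=False)

recipes = [
    {"@type": "Recipe", "name": "Grilled Veggie Skewers", "url": "https://example.com/r/1"},
    {"@type": "Recipe", "name": "Chocolate Chip Cookies", "url": "https://example.com/r/2"},
]
jsonl = "\n".join(to_jsonl_line(r) for r in recipes)
print(jsonl.count("\n") + 1)  # 2 lines, one entity each
```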
Life of a Chat Query
NLWeb processes queries through a sophisticated pipeline that mirrors modern web search but uses LLMs for tasks that previously required specialized algorithms:
┌─────────────────────────────────────────────────────────────┐
│ User submits query │
└────────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Parallel Analysis (Step 2) │
│ • Check relevancy │
│ • Decontextualize based on conversation history │
│ • Determine memory requirements │
│ • Fast-track check (most queries skip heavy processing) │
└────────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Tool Selection & Execution (Step 3) │
│ • LLM selects appropriate tool from manifest │
│ • Extract parameters │
│ • Execute tool (Search, Item Details, Ensemble Queries) │
└────────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Result Scoring & Snippet Generation (Step 4) │
│ • Score results with LLM calls │
│ • Generate appropriate snippets │
│ • Collect top N results above threshold │
└────────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Optional Post-Processing (Step 4a) │
│ • Summarize results │
│ • Generate answers from results │
└────────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Return results in specified format │
└─────────────────────────────────────────────────────────────┘
Performance Note: Processing a single query might involve over 50 LLM API calls. However, these calls are narrow, specific, and can use different models optimized for each task.
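The parallel analysis step above can be mimicked with concurrent futures over small, independent checks. The check functions below are stand-ins for illustration only; in NLWeb each would be a narrow, task-specific LLM call:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in checks; NLWeb runs these as small LLM calls, not string heuristics.
def check_relevancy(query):
    return bool(query.strip())

def decontextualize(query, history):
    return query if not history else f"{query} (re: {history[-1]})"

def needs_memory(query):
    return "remember" in query.lower()

def analyze(query, history):
    """Run the independent analysis steps concurrently, as the pipeline does."""
    with ThreadPoolExecutor() as pool:
        rel = pool.submit(check_relevancy, query)
        dec = pool.submit(decontextualize, query, history)
        mem = pool.submit(needs_memory, query)
        return {"relevant": rel.result(), "query": dec.result(), "memory": mem.result()}

print(analyze("vegan recipes", ["summer party ideas"]))
```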
Built-in Tools
NLWeb includes three primary tools out of the box:
1. Search Tool
Traditional search flow with AI enhancements:
- Query sent to vector database (similarity search over embeddings)
- Results returned as Schema.org JSON objects
- LLM scoring with snippet generation
- Top N results above threshold collected
2. Item Details Tool
Retrieves specific information about items:
- Items specified by name, description, or context
- Vector database query for candidates
- LLM scoring to match candidates
- Detail extraction via LLM calls
3. Ensemble Queries Tool
Combines multiple items of different types:
- Handles complex queries: “appetizer, entree and dessert, Asian fusion themed”
- Extracts separate queries for each item type
- Independent vector database queries
- LLM ranking for appropriateness
- Creates ensembles from top 2-3 of each query
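The final ensemble step can be sketched as crossing the top few candidates from each per-type query (the items and the top-2 cutoff here are illustrative):

```python
from itertools import product

def build_ensembles(ranked, per_type=2):
    """Cross the top candidates of each item type into candidate ensembles."""
    shortlists = [items[:per_type] for items in ranked.values()]
    return list(product(*shortlists))

ranked = {
    "appetizer": ["spring rolls", "edamame", "gyoza"],
    "entree": ["pad thai", "bibimbap"],
    "dessert": ["mochi", "mango sticky rice"],
}
ensembles = build_ensembles(ranked)
print(len(ensembles))  # 2 * 2 * 2 = 8 candidate menus
```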
MCP Integration
Every NLWeb instance acts as an MCP server and supports core MCP methods:
- list_tools - Enumerate available tools
- list_prompts - Show available prompts
- call_tool - Execute a specific tool
- get_prompt - Retrieve a prompt template
The /mcp endpoint returns responses in MCP-compatible format, making NLWeb instances discoverable and usable by any MCP client.
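On the wire, MCP uses JSON-RPC 2.0; a tool invocation can be sketched as below. This article lists the method as call_tool while the MCP specification spells it tools/call; the body here follows the spec's spelling and should be treated as illustrative:

```python
import json

def mcp_call_tool(tool, arguments, request_id=1):
    """Build a JSON-RPC 2.0 request body for an MCP tool call (illustrative)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

body = mcp_call_tool("ask", {"query": "vegan recipes", "site": "recipe-site"})
print(body)
```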
Future Vision: NLWeb will enable calling other NLWeb/MCP servers, allowing distributed tool execution across different services.
Platform Support
Operating Systems
- Windows
- macOS
- Linux
Vector Stores
- Qdrant (local and cloud)
- Snowflake
- Milvus
- Azure AI Search
- Elasticsearch
- Postgres (with pgvector)
- Cloudflare AutoRAG
LLM Providers
- OpenAI (GPT-4, GPT-4 Turbo, GPT-3.5)
- DeepSeek
- Google Gemini
- Anthropic Claude
- Inception
- HuggingFace models
Quick Start Example
Prerequisites
- Python 3.10+
- API key for your preferred LLM provider
Setup (5 minutes)
# Clone repository
git clone https://github.com/nlweb-ai/NLWeb.git
cd NLWeb
# Create virtual environment
python -m venv myenv
source myenv/bin/activate # Windows: myenv\Scripts\activate
# Install dependencies
cd code/python
pip install -r requirements.txt
# Configure environment
cd ../../
cp .env.template .env
# Edit .env with your LLM API keys
# Verify configuration
cd code/python
python testing/check_connectivity.py
# Load sample data (podcast RSS feed)
python -m data_loading.db_load https://feeds.libsyn.com/121695/rss Behind-the-Tech
# Start server
python app-aiohttp.py
# Visit http://localhost:8000/
You now have a working conversational interface for podcast episodes!
REST API
Endpoints
- /ask - Returns results in standard JSON format
- /mcp - Returns results in MCP-compatible format
Required Parameter
- query - The current query in natural language
Optional Parameters
- site - Token for a subset of data (multi-site support)
- prev - Comma-separated list of previous queries (conversation context)
- decontextualized_query - Pre-decontextualized query (skips server-side processing)
- streaming - Enable/disable streaming (default: true)
- query_id - Custom query ID (auto-generated if not provided)
- mode - Response mode: list (default, top matches from backend), summarize (summary + list), or generate (RAG-style answer generation)
Response Format
{
"query_id": "unique-id",
"results": [
{
"url": "https://example.com/item",
"name": "Item Name",
"site": "site-token",
"score": 0.95,
"description": "LLM-generated description",
"schema_object": { /* Full Schema.org object */ }
}
]
}
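A follow-up query carries conversation history through the prev parameter described above; a small sketch of building such a request body:

```python
import json

def ask_payload(query, prev=None, site=None, mode="list"):
    """Build an /ask request body, threading prior queries for decontextualization."""
    body = {"query": query, "mode": mode}
    if prev:
        body["prev"] = ",".join(prev)  # comma-separated, per the API docs
    if site:
        body["site"] = site
    return json.dumps(body)

payload = ask_payload("make them gluten-free",
                      prev=["vegan chocolate desserts"], site="RecipeSite")
print(payload)
```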
Hallucination-Free Guarantee
Critical Feature: Since all returned items come directly from the database, results cannot be hallucinated. Each result includes the full schema_object from the data store.
- Results may be less than perfectly relevant
- Results may be ranked sub-optimally
- But results will never be fabricated
Note: Post-processing (summarize/generate modes) may degrade this guarantee, so test carefully.
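One way to spot-check generate mode is to verify that every URL cited in a generated answer is backed by a result actually returned from the store. A simple harness sketch (not part of NLWeb):

```python
import re

def cited_urls_grounded(answer, results):
    """True if every URL mentioned in the answer appears among the returned results."""
    known = {r["url"] for r in results}
    cited = set(re.findall(r"https?://\S+", answer))
    return cited <= known

results = [{"url": "https://example.com/recipes/skewers"}]
ok = cited_urls_grounded("Try https://example.com/recipes/skewers for the party.", results)
print(ok)  # True
```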
Architecture Insights
Customization Points
- Prompts - Declaratively specialized for object types (Recipe vs. Real Estate)
- Tools - Domain-specific tools with additional knowledge (e.g., recipe substitutions)
- Control Flow - Modify query processing pipeline
- User Interface - Replace sample UI with custom design
- Memory - Add conversation memory and context retention
Production Considerations
Most production deployments will include:
- Custom UI - Replace sample interface with branded design
- Direct Integration - Integrate NLWeb into application environment
- Live Database Connection - Connect to production databases (avoid data freshness issues)
- Multi-Model Strategy - Use different LLMs for different tasks (cost optimization)
- Caching & Performance - Implement query caching and result optimization
Use Cases
E-Commerce
Natural language product search with filtering:
- “Find wireless headphones under $200 with noise cancellation”
- “Show me vegan protein powders with chocolate flavor”
Recipe Sites
Dietary restriction handling and meal planning:
- “Gluten-free desserts for a birthday party”
- “Plan a week of dinners under 500 calories”
Real Estate
Property search with complex criteria:
- “3 bedroom homes near good schools under $500k”
- “Condos with mountain views and low HOA fees”
Content Discovery
Podcast, blog, and video recommendations:
- “Episodes about AI ethics from the last 6 months”
- “Articles explaining quantum computing for beginners”
Event Platforms
Smart event discovery and planning:
- “Family-friendly events this weekend downtown”
- “Networking events for software engineers”
Technical Deep-Dive: Schema.org Integration
NLWeb exploits a key insight: LLMs understand Schema.org markup very well because it’s prevalent in their training data (100+ million websites use it).
Why Schema.org Works
- Common Vocabulary - Standardized types and properties across domains
- Rich Semantics - Detailed descriptions of entities and relationships
- LLM Native - Models trained on billions of pages with Schema.org markup
- Type Hierarchy - Inheritance allows specialized and generalized handling
Example: Recipe Schema
{
"@type": "Recipe",
"name": "Chocolate Chip Cookies",
"recipeIngredient": [
"2 cups all-purpose flour",
"1 cup butter",
"1 cup chocolate chips"
],
"recipeInstructions": [
{"@type": "HowToStep", "text": "Preheat oven to 350°F"},
{"@type": "HowToStep", "text": "Mix butter and sugar"}
],
"nutrition": {
"@type": "NutritionInformation",
"calories": "150 calories"
},
"suitableForDiet": "https://schema.org/VegetarianDiet"
}
LLMs can:
- Extract dietary restrictions (suitableForDiet)
- Calculate serving sizes (nutrition)
- Suggest substitutions (domain knowledge + schema structure)
- Generate cooking instruction summaries
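The same fields are also trivially machine-readable without an LLM; a sketch pulling the diet label and calorie count straight from the Recipe object above:

```python
recipe = {
    "@type": "Recipe",
    "name": "Chocolate Chip Cookies",
    "nutrition": {"@type": "NutritionInformation", "calories": "150 calories"},
    "suitableForDiet": "https://schema.org/VegetarianDiet",
}

def diet_label(obj):
    """Strip the schema.org prefix from a suitableForDiet URL."""
    return obj.get("suitableForDiet", "").rsplit("/", 1)[-1]

print(diet_label(recipe), recipe["nutrition"]["calories"])  # VegetarianDiet 150 calories
```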
Comparison to Traditional RAG
Feature | Traditional RAG | NLWeb |
---|---|---|
Data Format | Unstructured text chunks | Structured Schema.org objects |
Hallucination Risk | High (LLM generates freely) | Low (results from database) |
Result Granularity | Passage-level | Entity-level |
Multi-faceted Queries | Limited | Native support (ensemble queries) |
Conversation Context | Basic | Decontextualization pipeline |
Tool Ecosystem | Custom per deployment | Extensible tool manifest |
Agent Compatibility | Manual integration | Native MCP support |
Development Roadmap
Current Status
- ✅ REST API (/ask and /mcp endpoints)
- ✅ MCP server implementation
- ✅ Multiple vector store connectors
- ✅ Multiple LLM provider support
- ✅ Docker deployment
- ✅ Azure deployment guides
Coming Soon
- 🚧 A2A (Agent-to-Agent) protocol support
- 🚧 Distributed NLWeb/MCP server calling
- 🚧 Mobile device deployment
- 🚧 GCP deployment guides
- 🚧 AWS deployment guides
- 🚧 CI/CD pipeline templates
Learning Resources
Official Documentation
- GitHub Repository: https://github.com/nlweb-ai/NLWeb
- Hello World Tutorial: Getting Started Guide
- Life of a Chat Query: Architecture Deep-Dive
- REST API Docs: API Reference
Integration Examples
Example 1: Recipe Site Integration
# Load recipe data from RSS feed
python -m data_loading.db_load https://example.com/recipes.rss RecipeSite
# Query for vegan desserts
curl -X POST http://localhost:8000/ask \
-H "Content-Type: application/json" \
-d '{
"query": "vegan chocolate desserts",
"site": "RecipeSite",
"mode": "list"
}'
Example 2: E-Commerce Product Search
# Load product catalog (Schema.org JSONL)
python -m data_loading.db_load products.jsonl MyStore
# Search with filters
curl -X POST http://localhost:8000/ask \
-H "Content-Type: application/json" \
-d '{
"query": "wireless headphones with ANC under $200",
"site": "MyStore",
"mode": "summarize"
}'
Example 3: MCP Client Integration
# Using NLWeb as an MCP server (illustrative; mcp_client is a placeholder, not a published package)
import mcp_client
server = mcp_client.connect("http://localhost:8000/mcp")
# List available tools
tools = server.list_tools()
# Call search tool
result = server.call_tool(
"search",
query="Find episodes about machine learning",
site="Behind-the-Tech"
)
Performance Optimization
Multi-Model Strategy
Different tasks have different requirements:
# config_llm.yaml example
tasks:
relevancy_check:
model: gpt-4o-mini # Fast, cheap for simple classification
decontextualization:
model: gpt-4o # Better context understanding
scoring:
model: gpt-4o-mini # Simple scoring task
snippet_generation:
model: gpt-4o # Creative text generation
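Resolving which model serves a task can then be a simple lookup with a cheap default. The mapping mirrors the config example above; the fallback behavior is an assumption, not NLWeb's documented logic:

```python
# Task-to-model mapping, mirroring the config_llm.yaml example above.
TASK_MODELS = {
    "relevancy_check": "gpt-4o-mini",
    "decontextualization": "gpt-4o",
    "scoring": "gpt-4o-mini",
    "snippet_generation": "gpt-4o",
}

def model_for(task, default="gpt-4o-mini"):
    """Pick the model configured for a pipeline task, falling back to a cheap default."""
    return TASK_MODELS.get(task, default)

print(model_for("scoring"), model_for("unknown_task"))  # gpt-4o-mini gpt-4o-mini
```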
Caching Strategies
- Query Caching - Cache decontextualized queries
- Embedding Caching - Cache vector embeddings
- Result Caching - Cache scored results for common queries
- LLM Response Caching - Cache LLM responses for identical prompts
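The last strategy can key a cache on a hash of the full prompt, so identical prompts never hit the provider twice. A sketch with an in-memory dict (the cache backend and eviction policy are deployment choices):

```python
import hashlib

_cache = {}

def cached_llm_call(prompt, call_fn):
    """Return a cached response for an identical prompt, else invoke and store."""
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_fn(prompt)
    return _cache[key]

calls = []
def fake_llm(prompt):  # stand-in for a real provider call
    calls.append(prompt)
    return prompt.upper()

cached_llm_call("score this result", fake_llm)
cached_llm_call("score this result", fake_llm)
print(len(calls))  # 1: the second call was served from cache
```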
Fast-Track Optimization
The “fast-track” path bypasses heavy processing for simple queries:
- Lightweight relevancy check
- Skip decontextualization if not needed
- Parallel execution with full pipeline
- Results blocked until validation completes
Impact: 2-3x speedup for 60-70% of queries.
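The race between the fast track and validation can be sketched with two futures, where the fast result is released only after the relevancy check passes (stub functions; not NLWeb's actual control flow):

```python
from concurrent.futures import ThreadPoolExecutor

def fast_path(query):  # stand-in for lightweight retrieval
    return [f"quick hit for {query}"]

def relevancy_ok(query):  # stand-in for the lightweight relevancy check
    return bool(query.strip())

def answer(query):
    """Run retrieval and validation in parallel; block results on validation."""
    with ThreadPoolExecutor() as pool:
        results = pool.submit(fast_path, query)
        valid = pool.submit(relevancy_ok, query)
        return results.result() if valid.result() else []

print(answer("vegan recipes"))  # ['quick hit for vegan recipes']
```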
Security & Privacy
Data Privacy
- No Server-Side State - Conversation context passed by client
- Local Deployment - Run entirely on-premises if required
- Data Isolation - Multi-site support with access controls
API Security
- OAuth integration available
- GitHub OAuth example included
- Token-based authentication supported
Content Safety
- Relevancy checks prevent off-topic queries
- Domain-specific tools limit scope
- Database-only results prevent hallucinated content
Community & Contribution
Contributing
NLWeb is open source under the MIT License. Contributions welcome:
- Code Contributions - New tools, connectors, optimizations
- Documentation - Guides, tutorials, examples
- Testing - Vector store testing, LLM provider testing
- Use Cases - Share production deployments and lessons learned
Contact: NLWebSup@microsoft.com
License
MIT License
Copyright (c) Microsoft Corporation.
Full license: https://github.com/nlweb-ai/NLWeb/blob/main/LICENSE
Why This Matters
NLWeb represents a paradigm shift in web architecture:
- Democratizes AI Interfaces - Any site with structured data can add conversational UI
- Builds on Standards - Schema.org and RSS provide instant data readiness
- Enables Agent Ecosystem - MCP compatibility makes sites agent-accessible
- Prevents Hallucination - Database-backed results ensure accuracy
- Extensible by Design - Tools, prompts, and flows are customizable
The Vision: Just as HTML enabled document sharing across the internet, NLWeb aims to enable conversational interaction across the AI Web—with shared protocols, sample implementations, and community participation.
Getting Started Checklist
- Clone NLWeb repository
- Set up Python 3.10+ virtual environment
- Configure .env with LLM API keys
- Choose vector store (Qdrant local for testing)
- Run connectivity check script
- Load sample data (RSS feed or Schema.org JSONL)
- Start server and test at http://localhost:8000
- Explore sample UIs in static/ directory
- Read Life of a Chat Query docs
- Experiment with custom prompts and tools
Attribution
Project: NLWeb - Natural Language Interfaces for Websites
Organization: Microsoft Corporation
Repository: https://github.com/nlweb-ai/NLWeb
License: MIT License
Documentation: https://github.com/nlweb-ai/NLWeb/tree/main/docs
This article is an educational resource created for Start AI Tools. All credit for NLWeb development goes to Microsoft Corporation and the NLWeb contributors. For official project information, please visit the GitHub repository.
Next Steps
- Explore the Documentation - Deep-dive into Life of a Chat Query
- Run Hello World - Follow the 5-minute setup guide
- Join the Community - Star the repo and contribute
- Build Something - Create a conversational interface for your site
- Share Your Experience - Document your use case and lessons learned
Ready to build the AI Web? Start with NLWeb today.
Last Updated: October 9, 2025
Research & Curriculum Article by Jeremy Longshore
Start AI Tools - Presented by Intent Solutions