Rename to AIS (#25100)
* Rename from ARAG to AIS
* Making sure nothing's broken
* model support and deprecation strategy
* resolve conflicts
* update name
* PCX: Minor fixes
* Adding splat redirect for AutoRAG to AI Search
* Fixing broken links
* Moving over content from PR #25332
* update model section
* update crawler section name
* updated model list

Co-authored-by: Anni Wang <anni@cloudflare.com>
.github/CODEOWNERS
@@ -26,7 +26,7 @@
 /src/content/release-notes/workers-ai.yaml @kathayl @mchenco @kodster28 @cloudflare/pcx-technical-writing
 /src/content/release-notes/ai-gateway.yaml @kathayl @mchenco @kodster28 @cloudflare/pcx-technical-writing
 /src/content/release-notes/vectorize.yaml @elithrar @mchenco @sejoker @cloudflare/pcx-technical-writing
-/src/content/docs/autorag/ @rita3ko @irvinebroque @aninibread @cloudflare/pcx-technical-writing
+/src/content/docs/ai-search/ @rita3ko @irvinebroque @aninibread @cloudflare/pcx-technical-writing

 # Analytics & Logs
.github/workflows/publish-production.yml
@@ -67,7 +67,7 @@ jobs:
           --config bin/rclone.conf \
           distmd \
           zt:zt-dashboard-dev-docs
-      - name: Upload vendored Markdown files to AutoRAG DevDocs bucket
+      - name: Upload vendored Markdown files to AI Search DevDocs bucket
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AUTORAG_DEVDOCS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AUTORAG_DEVDOCS_SECRET_ACCESS_KEY }}
@@ -245,8 +245,8 @@
 /api-shield/security/sequence-mitigation/configure/ /api-shield/security/sequence-mitigation/api/ 301

 #autorag
-/autorag/usage/recipes/ /autorag/how-to/ 301
-/autorag/configuration/metadata-filtering/ /autorag/configuration/metadata/ 301
+/autorag/usage/recipes/ /ai-search/how-to/ 301
+/autorag/configuration/metadata-filtering/ /ai-search/configuration/metadata/ 301

 # bots
 /bots/about/plans/ /bots/plans/ 301
@@ -2322,6 +2322,9 @@
 # AI Crawl Control
 /ai-audit/* /ai-crawl-control/:splat 301

+# AutoRAG to AI search
+/autorag/* /ai-search/:splat 301
+
 # Cloudflare One / Zero Trust
 /cloudflare-one/connections/connect-networks/install-and-setup/tunnel-guide/local/as-a-service/* /cloudflare-one/connections/connect-networks/configure-tunnels/local-management/as-a-service/:splat 301
 /cloudflare-one/connections/connect-apps/install-and-setup/deployment-guides/* /cloudflare-one/connections/connect-networks/deployment-guides/:splat 301
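The `:splat` placeholder in the redirect rule above forwards whatever the `*` wildcard matched onto the new prefix, so every legacy `/autorag/...` deep link lands on the corresponding `/ai-search/...` page. A minimal sketch of that mapping (the `splatRedirect` helper is illustrative, not part of the PR):

```typescript
// Sketch of how a `/autorag/* -> /ai-search/:splat 301` rule maps paths.
// `splatRedirect` is a hypothetical helper for illustration only.
function splatRedirect(path: string): string | null {
  const prefix = "/autorag/";
  if (!path.startsWith(prefix)) return null; // rule does not match
  // ":splat" is replaced by whatever "*" captured after the prefix
  return "/ai-search/" + path.slice(prefix.length);
}

// Legacy deep links keep their sub-paths:
const redirected = splatRedirect("/autorag/usage/rest-api/");
// redirected === "/ai-search/usage/rest-api/"
```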
@@ -2,22 +2,22 @@
 title: Create fully-managed RAG pipelines for your AI applications with AutoRAG
 description: AutoRAG lets you create fully-managed, retrieval-augmented generation (RAG) pipelines that continuously update and scale on Cloudflare.
 products:
-  - autorag
+  - ai-search
   - vectorize
 date: 2025-04-07
 ---

-[AutoRAG](/autorag) is now in open beta, making it easy for you to build fully-managed retrieval-augmented generation (RAG) pipelines without managing infrastructure. Just upload your docs to [R2](/r2/get-started/), and AutoRAG handles the rest: embeddings, indexing, retrieval, and response generation via API.
+[AutoRAG](/ai-search/) is now in open beta, making it easy for you to build fully-managed retrieval-augmented generation (RAG) pipelines without managing infrastructure. Just upload your docs to [R2](/r2/get-started/), and AutoRAG handles the rest: embeddings, indexing, retrieval, and response generation via API.

-![AutoRAG open beta demo](~/assets/images/changelog/autorag/autorag-open-beta.gif)
+![AutoRAG open beta demo](~/assets/images/changelog/autorag/autorag-open-beta.gif)

 With AutoRAG, you can:

 - **Customize your pipeline:** Choose from [Workers AI](/workers-ai) models, configure chunking strategies, edit system prompts, and more.
 - **Instant setup:** AutoRAG provisions everything you need, from [Vectorize](/vectorize) and [AI Gateway](/ai-gateway) to pipeline logic, so you can go from zero to a working RAG pipeline in seconds.
 - **Keep your index fresh:** AutoRAG continuously syncs your index with your data source to ensure responses stay accurate and up to date.
-- **Ask questions:** Query your data and receive grounded responses via a [Workers binding](/autorag/usage/workers-binding/) or [API](/autorag/usage/rest-api/).
+- **Ask questions:** Query your data and receive grounded responses via a [Workers binding](/ai-search/usage/workers-binding/) or [API](/ai-search/usage/rest-api/).

 Whether you're building internal tools, AI-powered search, or a support assistant, AutoRAG gets you from idea to deployment in minutes.

-Get started in the [Cloudflare dashboard](https://dash.cloudflare.com/?to=/:account/ai/autorag) or check out the [guide](/autorag/get-started/) for instructions on how to build your RAG pipeline today.
+Get started in the [Cloudflare dashboard](https://dash.cloudflare.com/?to=/:account/ai/autorag) or check out the [guide](/ai-search/get-started/) for instructions on how to build your RAG pipeline today.
@@ -2,13 +2,13 @@
 title: Metadata filtering and multitenancy support in AutoRAG
 description: Add metadata filters to AutoRAG queries to enable multitenancy and control the scope of retrieved results.
 products:
-  - autorag
+  - ai-search
 date: 2025-04-23
 ---

-You can now filter [AutoRAG](/autorag) search results by `folder` and `timestamp` using [metadata filtering](/autorag/configuration/metadata) to narrow down the scope of your query.
+You can now filter [AutoRAG](/ai-search/) search results by `folder` and `timestamp` using [metadata filtering](/ai-search/configuration/metadata) to narrow down the scope of your query.

-This makes it easy to build [multitenant experiences](/autorag/how-to/multitenancy/) where each user can only access their own data. By organizing your content into per-tenant folders and applying a `folder` filter at query time, you ensure that each tenant retrieves only their own documents.
+This makes it easy to build [multitenant experiences](/ai-search/how-to/multitenancy/) where each user can only access their own data. By organizing your content into per-tenant folders and applying a `folder` filter at query time, you ensure that each tenant retrieves only their own documents.

 **Example folder structure:**

@@ -33,4 +33,4 @@ const response = await env.AI.autorag("my-autorag").search({

 You can use metadata filtering by creating a new AutoRAG or reindexing existing data. To reindex all content in an existing AutoRAG, update any chunking setting and select **Sync index**. Metadata filtering is available for all data indexed on or after **April 21, 2025**.

-If you are new to AutoRAG, get started with the [Get started AutoRAG guide](/autorag/get-started/).
+If you are new to AutoRAG, get started with the [Get started AutoRAG guide](/ai-search/get-started/).
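The per-tenant folder pattern described above can be sketched as a small helper that scopes a query with a `folder` comparison filter. The `{ type, key, value }` filter shape follows the metadata filtering docs; the `buildTenantFilter` helper and the `customer-<id>/` folder convention are illustrative assumptions, not part of the product API:

```typescript
// Hypothetical helper: scope an AI Search query to one tenant's folder.
// The comparison-filter shape mirrors the metadata filtering docs;
// the tenant folder naming convention is an assumption for this sketch.
type ComparisonFilter = { type: "eq"; key: "folder"; value: string };

function buildTenantFilter(tenantId: string): ComparisonFilter {
  // Content is organized as "customer-<id>/..." in the R2 bucket
  return { type: "eq", key: "folder", value: `customer-${tenantId}/` };
}

// The result would be passed as the `filters` option of a search call,
// e.g. env.AI.autorag("my-autorag").search({ query, filters })
const filters = buildTenantFilter("42");
```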
@ -2,11 +2,11 @@
|
|||
title: View custom metadata in responses and guide AI-search with context in AutoRAG
|
||||
description: You can now view custom metadata in AutoRAG search responses and use a context field to provide additional guidance to AI-generated answers.
|
||||
products:
|
||||
- autorag
|
||||
- ai-search
|
||||
date: 2025-06-19
|
||||
---
|
||||
|
||||
In [AutoRAG](/autorag/), you can now view your object's custom metadata in the response from [`/search`](/autorag/usage/workers-binding/) and [`/ai-search`](/autorag/usage/workers-binding/), and optionally add a `context` field in the custom metadata of an object to provide additional guidance for AI-generated answers.
|
||||
In [AutoRAG](/ai-search/), you can now view your object's custom metadata in the response from [`/search`](/ai-search/usage/workers-binding/) and [`/ai-search`](/ai-search/usage/workers-binding/), and optionally add a `context` field in the custom metadata of an object to provide additional guidance for AI-generated answers.
|
||||
|
||||
You can add [custom metadata](/r2/api/workers/workers-api-reference/#r2putoptions) to an object when uploading it to your R2 bucket.
|
||||
|
||||
|
|
@ -46,4 +46,4 @@ For example:
|
|||
|
||||
This gives you more control over how your content is interpreted, without requiring you to modify the original contents of the file.
|
||||
|
||||
Learn more in AutoRAG's [metadata filtering documentation](/autorag/configuration/metadata).
|
||||
Learn more in AutoRAG's [metadata filtering documentation](/ai-search/configuration/metadata).
|
||||
|
@@ -2,11 +2,11 @@
 title: Filter your AutoRAG search by file name
 description: You can now filter AutoRAG search queries by file name, allowing you to control which files can be retrieved for a given query.
 products:
-  - autorag
+  - ai-search
 date: 2025-06-19
 ---

-In [AutoRAG](/autorag/), you can now [filter](/autorag/configuration/metadata/) by an object's file name using the `filename` attribute, giving you more control over which files are searched for a given query.
+In [AutoRAG](/ai-search/), you can now [filter](/ai-search/configuration/metadata/) by an object's file name using the `filename` attribute, giving you more control over which files are searched for a given query.

 This is useful when your application has already determined which files should be searched. For example, you might query a PostgreSQL database to get a list of files a user has access to based on their permissions, and then use that list to limit what AutoRAG retrieves.

@@ -25,4 +25,4 @@ const response = await env.AI.autorag("my-autorag").search({

 This allows you to connect your application logic with AutoRAG's retrieval process, making it easy to control what gets searched without needing to reindex or modify your data.

-Learn more in AutoRAG's [metadata filtering documentation](/autorag/configuration/metadata/).
+Learn more in AutoRAG's [metadata filtering documentation](/ai-search/configuration/metadata/).
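The permission-driven pattern described above, where an external lookup decides which files may be searched, can be sketched as a helper that turns an allowed-file list into a compound filter. The `or`/`eq` filter shapes follow the metadata filtering docs; the permission lookup itself and the `buildFilenameFilter` name are assumptions for illustration:

```typescript
// Sketch: turn a permission-derived file list into a compound filter
// on the `filename` attribute. The "or"/"eq" shapes mirror the
// metadata filtering docs; fetching the allowed list (e.g. from
// PostgreSQL) is assumed to happen elsewhere.
function buildFilenameFilter(allowedFiles: string[]) {
  return {
    type: "or" as const,
    filters: allowedFiles.map((name) => ({
      type: "eq" as const,
      key: "filename",
      value: name,
    })),
  };
}

// Would be passed as `filters` in env.AI.autorag("my-autorag").search({ ... })
const filter = buildFilenameFilter(["report-q1.pdf", "report-q2.pdf"]);
```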
@@ -2,13 +2,13 @@
 title: Faster indexing and new Jobs view in AutoRAG
 description: Track your indexing pipeline in real time with 3–5× faster indexing and a new Jobs dashboard.
 products:
-  - autorag
+  - ai-search
 date: 2025-07-08
 ---

 You can now expect **3-5× faster indexing** in AutoRAG, and with it, a brand new **Jobs view** to help you monitor indexing progress.

-With each AutoRAG, indexing jobs are automatically triggered to sync your data source (i.e. R2 bucket) with your Vectorize index, ensuring new or updated files are reflected in your query results. You can also trigger jobs manually via the [Sync API](/api/resources/autorag/subresources/rags/) or by clicking “Sync index” in the dashboard.
+With each AutoRAG, indexing jobs are automatically triggered to sync your data source (i.e. R2 bucket) with your Vectorize index, ensuring new or updated files are reflected in your query results. You can also trigger jobs manually via the [Sync API](/api/resources/ai-search/subresources/rags/) or by clicking “Sync index” in the dashboard.

 With the new jobs observability, you can now:

@@ -16,7 +16,7 @@ With the new jobs observability, you can now:
 - Inspect real-time logs of job events (e.g. `Starting indexing data source...`)
 - See a history of past indexing jobs under the Jobs tab of your AutoRAG

-![AutoRAG jobs view](~/assets/images/changelog/autorag/autorag-jobs-view.png)
+![AutoRAG jobs view](~/assets/images/changelog/autorag/autorag-jobs-view.png)

 This makes it easier to understand what’s happening behind the scenes.
@@ -2,13 +2,13 @@
 title: New Metrics View in AutoRAG
 description: Track file indexing, search activity, and top retrievals to understand how your AutoRAG instance is being used.
 products:
-  - autorag
+  - ai-search
 date: 2025-09-19
 ---

-[AutoRAG](/autorag/) now includes a **Metrics** tab that shows how your data is indexed and searched. Get a clear view of the health of your indexing pipeline, compare usage between `ai-search` and `search`, and see which files are retrieved most often.
+[AutoRAG](/ai-search/) now includes a **Metrics** tab that shows how your data is indexed and searched. Get a clear view of the health of your indexing pipeline, compare usage between `ai-search` and `search`, and see which files are retrieved most often.

-![AutoRAG metrics view](~/assets/images/changelog/autorag/autorag-metrics-view.png)
+![AutoRAG metrics view](~/assets/images/changelog/autorag/autorag-metrics-view.png)

 You can find these metrics within each AutoRAG instance:

@@ -16,4 +16,4 @@ You can find these metrics within each AutoRAG instance:
 - Search breakdown: Compare usage between `ai-search` and `search` endpoints.
 - Top file retrievals: Identify which files are most frequently retrieved in a given period.

-Try it today in [AutoRAG](/autorag/get-started/).
+Try it today in [AutoRAG](/ai-search/get-started/).
@@ -8,13 +8,13 @@ We're excited to be a launch partner alongside [Google](https://developers.googl

 [`@cf/google/embeddinggemma-300m`](/workers-ai/models/embeddinggemma-300m/) is a 300M parameter embedding model from Google, built from Gemma 3 and the same research used to create Gemini models. This multilingual model supports 100+ languages, making it ideal for RAG systems, semantic search, content classification, and clustering tasks.

-**Using EmbeddingGemma in AutoRAG:**
-Now you can leverage EmbeddingGemma directly through AutoRAG for your RAG pipelines. EmbeddingGemma's multilingual capabilities make it perfect for global applications that need to understand and retrieve content across different languages with exceptional accuracy.
+**Using EmbeddingGemma in AI Search:**
+Now you can leverage EmbeddingGemma directly through AI Search for your RAG pipelines. EmbeddingGemma's multilingual capabilities make it perfect for global applications that need to understand and retrieve content across different languages with exceptional accuracy.

-To use EmbeddingGemma for your AutoRAG projects:
-1. Go to **Create** in the [AutoRAG dashboard](https://dash.cloudflare.com/?to=/:account/ai/autorag)
+To use EmbeddingGemma for your AI Search projects:
+1. Go to **Create** in the [AI Search dashboard](https://dash.cloudflare.com/?to=/:account/ai/ai-search)
 2. Follow the setup flow for your new RAG instance
 3. In the **Generate Index** step, open up **More embedding models** and select `@cf/google/embeddinggemma-300m` as your embedding model
-4. Complete the setup to create an AutoRAG
+4. Complete the setup to create an AI Search

 Try it out and let us know what you think!
@@ -300,7 +300,7 @@
   "parent": ["AI"]
 },
 {
-  "name": "AutoRAG",
+  "name": "AI Search",
   "deeplink": "/?to=/:account/ai/autorag",
   "parent": ["AI"]
 },
@@ -25,7 +25,7 @@ These MCP servers allow your MCP Client to read configurations from your account
 | [Browser rendering server](https://github.com/cloudflare/mcp-server-cloudflare/tree/main/apps/browser-rendering) | Fetch web pages, convert them to markdown and take screenshots | `https://browser.mcp.cloudflare.com/sse` |
 | [Logpush server](https://github.com/cloudflare/mcp-server-cloudflare/tree/main/apps/logpush) | Get quick summaries for Logpush job health | `https://logs.mcp.cloudflare.com/sse` |
 | [AI Gateway server](https://github.com/cloudflare/mcp-server-cloudflare/tree/main/apps/ai-gateway) | Search your logs, get details about the prompts and responses | `https://ai-gateway.mcp.cloudflare.com/sse` |
-| [AutoRAG server](https://github.com/cloudflare/mcp-server-cloudflare/tree/main/apps/autorag) | List and search documents on your AutoRAGs | `https://autorag.mcp.cloudflare.com/sse` |
+| [AI Search server](https://github.com/cloudflare/mcp-server-cloudflare/tree/main/apps/autorag) | List and search documents on your AI Search instances | `https://autorag.mcp.cloudflare.com/sse` |
 | [Audit Logs server](https://github.com/cloudflare/mcp-server-cloudflare/tree/main/apps/auditlogs) | Query audit logs and generate reports for review | `https://auditlogs.mcp.cloudflare.com/sse` |
 | [DNS Analytics server](https://github.com/cloudflare/mcp-server-cloudflare/tree/main/apps/dns-analytics) | Optimize DNS performance and debug issues based on current set up | `https://dns-analytics.mcp.cloudflare.com/sse` |
 | [Digital Experience Monitoring server](https://github.com/cloudflare/mcp-server-cloudflare/tree/main/apps/dex-analysis) | Get quick insight on critical applications for your organization | `https://dex.mcp.cloudflare.com/sse` |
@@ -1,7 +1,7 @@
 ---
 pcx_content_type: navigation
 title: REST API
-external_link: /api/resources/autorag/
+external_link: /api/resources/ai-search/
 sidebar:
   order: 9
 ---
src/content/docs/ai-search/concepts/how-ai-search-works.mdx (new file)
@@ -0,0 +1,44 @@
---
pcx_content_type: concept
title: How AI Search works
sidebar:
  order: 2
---

AI Search (formerly AutoRAG) is Cloudflare’s managed search service. You can connect your data, such as websites or unstructured content, and it automatically creates a continuously updating index that you can query with natural language in your applications or AI agents.

AI Search consists of two core processes:

- **Indexing:** An asynchronous background process that monitors your data source for changes and converts your data into vectors for search.
- **Querying:** A synchronous process triggered by user queries. It retrieves the most relevant content and generates context-aware responses.

## How indexing works

Indexing begins automatically when you create an AI Search instance and connect a data source.

Here is what happens during indexing:

1. **Data ingestion:** AI Search reads from your connected data source.
2. **Markdown conversion:** AI Search uses [Workers AI’s Markdown Conversion](/workers-ai/features/markdown-conversion/) to convert [supported data types](/ai-search/configuration/data-source/) into structured Markdown. This ensures consistency across diverse file types. For images, Workers AI is used to perform object detection followed by vision-to-language transformation to convert images into Markdown text.
3. **Chunking:** The extracted text is [chunked](/ai-search/configuration/chunking/) into smaller pieces to improve retrieval granularity.
4. **Embedding:** Each chunk is embedded using Workers AI’s embedding model to transform the content into vectors.
5. **Vector storage:** The resulting vectors, along with metadata like the file name, are stored in a [Vectorize](/vectorize/) database created on your Cloudflare account.

After the initial data set is indexed, AI Search will regularly check for updates in your data source (for example, additions, updates, or deletions) and index the changes to ensure your vector database is up to date.

![AI Search indexing](~/assets/images/autorag/autorag-indexing.png)
## How querying works

Once indexing is complete, AI Search is ready to respond to end-user queries in real time.

Here is how the querying pipeline works:

1. **Receive a query:** The query workflow begins when you send a request to either of AI Search’s [AI Search](/ai-search/usage/rest-api/#ai-search) or [Search](/ai-search/usage/rest-api/#search) endpoints.
2. **Query rewriting (optional):** AI Search provides the option to [rewrite the input query](/ai-search/configuration/query-rewriting/) using one of Workers AI’s LLMs to improve retrieval quality by transforming the original query into a more effective search query.
3. **Embedding the query:** The rewritten (or original) query is transformed into a vector via the same embedding model used to embed your data, so that it can be compared against your vectorized data to find the most relevant matches.
4. **Querying the Vectorize index:** The query vector is [queried](/vectorize/best-practices/query-vectors/) against the stored vectors in the Vectorize database associated with your AI Search.
5. **Content retrieval:** Vectorize returns the metadata of the most relevant chunks, and the original content is retrieved from the R2 bucket. If you are using the Search endpoint, the content is returned at this point.
6. **Response generation:** If you are using the AI Search endpoint, a text-generation model from Workers AI is used to generate a response using the retrieved content and the original user’s query, combined via a [system prompt](/ai-search/configuration/system-prompt/). The context-aware response from the model is returned.

![AI Search Querying process through binding or api](~/assets/images/autorag/autorag-query-fix.png)
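The six query steps above can be sketched end to end in a few lines. Every function here is a local stub standing in for a Workers AI or Vectorize call; none of it is the real service API:

```typescript
// Illustrative query pipeline: each stub stands in for a managed call
// (LLM rewrite, embedding model, Vectorize query, text generation).
type Chunk = { text: string; score: number };

const rewriteQuery = (q: string) => q.trim().toLowerCase();       // step 2 (stub)
const embed = (q: string) => q.split(/\s+/).map((w) => w.length); // step 3 (stub)
const queryVectorize = (_v: number[]): Chunk[] =>                 // steps 4-5 (stub)
  [{ text: "AI Search indexes R2 content.", score: 0.91 }];
const generate = (q: string, ctx: Chunk[]) =>                     // step 6 (stub)
  `Answer to "${q}" based on ${ctx.length} retrieved chunk(s).`;

function aiSearch(query: string): string {
  const rewritten = rewriteQuery(query);  // optional query rewriting
  const vector = embed(rewritten);        // embed the query
  const matches = queryVectorize(vector); // retrieve relevant chunks
  return generate(rewritten, matches);    // grounded response
}
```

The Search endpoint corresponds to stopping after `queryVectorize` and returning `matches` directly, while the AI Search endpoint continues through `generate`.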
@@ -19,10 +19,10 @@ Here’s a simplified overview of the RAG pipeline:

 The resulting response should be accurate, relevant, and based on your own data.

-![RAG overview](~/assets/images/autorag/rag-overview.png)
+![RAG overview](~/assets/images/autorag/rag-overview.png)

-:::note[How does AutoRAG work]
-To learn more details about how AutoRAG uses RAG under the hood, reference [How AutoRAG works](/autorag/concepts/how-autorag-works/).
+:::note[How does AI Search work]
+To learn more details about how AI Search uses RAG under the hood, reference [How AI Search works](/ai-search/concepts/how-ai-search-works/).
 :::

 ## Why use RAG?
@@ -5,13 +5,13 @@ sidebar:
   order: 6
 ---

-Similarity-based caching in AutoRAG lets you serve responses from Cloudflare’s cache for queries that are similar to previous requests, rather than creating new, unique responses for every request. This speeds up response times and cuts costs by reusing answers for questions that are close in meaning.
+Similarity-based caching in AI Search lets you serve responses from Cloudflare’s cache for queries that are similar to previous requests, rather than creating new, unique responses for every request. This speeds up response times and cuts costs by reusing answers for questions that are close in meaning.

 ## How It Works

 Unlike basic caching, which creates a new response with every request, this is what happens when a request is received using similarity-based caching:

-1. AutoRAG checks if a _similar_ prompt (based on your chosen threshold) has been answered before.
+1. AI Search checks if a _similar_ prompt (based on your chosen threshold) has been answered before.
 2. If a match is found, it returns the cached response instantly.
 3. If no match is found, it generates a new response and caches it.
@@ -27,14 +27,14 @@ Consider these behaviors when using similarity caching:

 ## How similarity matching works

-AutoRAG’s similarity cache uses **MinHash and Locality-Sensitive Hashing (LSH)** to find and reuse responses for prompts that are worded similarly.
+AI Search’s similarity cache uses **MinHash and Locality-Sensitive Hashing (LSH)** to find and reuse responses for prompts that are worded similarly.

 Here’s how it works when a new prompt comes in:

 1. The prompt is split into small overlapping chunks of words (called shingles), like “what’s the” or “the weather.”
 2. These shingles are turned into a “fingerprint” using MinHash. The more overlap two prompts have, the more similar their fingerprints will be.
-3. Fingerprints are placed into LSH buckets, which help AutoRAG quickly find similar prompts without comparing every single one.
-4. If a past prompt in the same bucket is similar enough (based on your configured threshold), AutoRAG reuses its cached response.
+3. Fingerprints are placed into LSH buckets, which help AI Search quickly find similar prompts without comparing every single one.
+4. If a past prompt in the same bucket is similar enough (based on your configured threshold), AI Search reuses its cached response.
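The shingling and fingerprinting steps above can be sketched concretely. This is a toy MinHash, not the production implementation: real systems use many independent hash permutations and tuned LSH banding, and the hash function here is an arbitrary stand-in:

```typescript
// Toy MinHash sketch of the steps above. Real implementations use many
// independent hash permutations and carefully tuned LSH banding.

// 1. Split a prompt into overlapping word shingles ("what's the", "the weather")
function shingles(prompt: string, size = 2): string[] {
  const words = prompt.toLowerCase().split(/\s+/).filter(Boolean);
  const out: string[] = [];
  for (let i = 0; i + size <= words.length; i++) {
    out.push(words.slice(i, i + size).join(" "));
  }
  return out;
}

// Simple deterministic string hash (stand-in for one MinHash permutation)
function hash(s: string, seed: number): number {
  let h = seed;
  for (const c of s) h = (h * 31 + c.charCodeAt(0)) >>> 0;
  return h;
}

// 2. Fingerprint = minimum hash per seed; similar prompts share minima
function minhash(prompt: string, seeds = [1, 2, 3, 4]): number[] {
  const sh = shingles(prompt);
  return seeds.map((seed) => Math.min(...sh.map((s) => hash(s, seed))));
}

// Estimated similarity = fraction of matching fingerprint positions,
// which approximates the Jaccard similarity of the shingle sets
function similarity(a: string, b: string): number {
  const fa = minhash(a);
  const fb = minhash(b);
  return fa.filter((v, i) => v === fb[i]).length / fa.length;
}
```

In the real service, step 3 would place these fingerprints into LSH buckets so only prompts in the same bucket are ever compared.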
## Choosing a threshold
@@ -47,4 +47,4 @@ The similarity threshold decides how close two prompts need to be to reuse a cached answer:
 | Broad | Moderate match, more hits | "What’s the weather like today?" matches with "Tell me today’s weather" |
 | Loose | Low similarity, max reuse | "What’s the weather like today?" matches with "Give me the forecast" |

-Test these values to see which works best with your [RAG application](/autorag/).
+Test these values to see which works best with your [RAG application](/ai-search/).
@@ -5,7 +5,7 @@ sidebar:
   order: 6
 ---

-Chunking is the process of splitting large data into smaller segments before embedding them for search. AutoRAG uses **recursive chunking**, which breaks your content at natural boundaries (like paragraphs or sentences), and then further splits it if the chunks are too large.
+Chunking is the process of splitting large data into smaller segments before embedding them for search. AI Search uses **recursive chunking**, which breaks your content at natural boundaries (like paragraphs or sentences), and then further splits it if the chunks are too large.

 ## What is recursive chunking
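The idea of breaking at natural boundaries first, then splitting further only when pieces are still too large, can be sketched as follows. This is a simplification: word count stands in for real model-token counting, and the boundary list is an assumption, not the service's actual splitter:

```typescript
// Minimal sketch of recursive chunking: split at natural boundaries,
// recurse only when a piece is still too large. Word count approximates
// token count here; the real service counts model tokens.
function recursiveChunk(text: string, maxWords = 50): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  if (words.length <= maxWords) return text.trim() ? [text.trim()] : [];

  // Prefer paragraph breaks, then sentence breaks
  for (const sep of ["\n\n", ". "]) {
    const parts = text.split(sep);
    if (parts.length > 1) {
      return parts.flatMap((p) => recursiveChunk(p, maxWords));
    }
  }
  // No natural boundary left: hard-split the word list in half
  const mid = Math.ceil(words.length / 2);
  return [
    ...recursiveChunk(words.slice(0, mid).join(" "), maxWords),
    ...recursiveChunk(words.slice(mid).join(" "), maxWords),
  ];
}
```

Because the recursion only fires for oversized pieces, short paragraphs come back untouched while long unbroken runs are cut down to size.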
@@ -18,7 +18,7 @@ This way, chunks are easy to embed and retrieve, without cutting off thoughts mid-sentence.

 ## Chunking controls

-AutoRAG exposes two parameters to help you control chunking behavior:
+AI Search exposes two parameters to help you control chunking behavior:

 - **Chunk size**: The number of tokens per chunk.
   - Minimum: `64`
@@ -0,0 +1,13 @@
---
title: Data source
pcx_content_type: how-to
sidebar:
  order: 2
---

AI Search can directly ingest data from the following sources:

| Data Source | Description |
| ----------- | ----------- |
| [Website](/ai-search/configuration/data-source/website/) | Connect a domain you own to index website pages. |
| [R2 Bucket](/ai-search/configuration/data-source/r2/) | Connect a Cloudflare R2 bucket to index stored documents. |
@@ -9,11 +9,11 @@ import { Render } from "~/components";

 You can use Cloudflare R2 to store data for indexing. To get started, [configure an R2 bucket](/r2/get-started/) containing your data.

-AutoRAG will automatically scan and process supported files stored in that bucket. Files that are unsupported or exceed the size limit will be skipped during indexing and logged as errors.
+AI Search will automatically scan and process supported files stored in that bucket. Files that are unsupported or exceed the size limit will be skipped during indexing and logged as errors.

 ## File limits

-AutoRAG has different file size limits depending on the file type:
+AI Search has different file size limits depending on the file type:

 - **Plain text files:** Up to **4 MB**
 - **Rich format files:** Up to **4 MB**
|
|||
|
||||
## File types
|
||||
|
||||
AutoRAG can ingest a variety of different file types to power your RAG. The following plain text files and rich format files are supported.
|
||||
AI Search can ingest a variety of different file types to power your RAG. The following plain text files and rich format files are supported.
|
||||
|
||||
### Plain text file types
|
||||
|
||||
AutoRAG supports the following plain text file types:
|
||||
AI Search supports the following plain text file types:
|
||||
|
||||
| Format | File extensions | Mime Type |
|
||||
| ---------- | ------------------------------------------------------------------------------ | --------------------------------------------------------------------- |
|
||||
|
|
@@ -55,6 +55,6 @@ AutoRAG supports the following plain text file types:

 ### Rich format file types

-AutoRAG uses [Markdown Conversion](/workers-ai/features/markdown-conversion/) to convert rich format files to markdown. The following table lists the supported formats that will be converted to Markdown:
+AI Search uses [Markdown Conversion](/workers-ai/features/markdown-conversion/) to convert rich format files to markdown. The following table lists the supported formats that will be converted to Markdown:

 <Render file="markdown-conversion-support" product="workers-ai" />
@@ -0,0 +1,71 @@
---
title: Website
pcx_content_type: how-to
sidebar:
  order: 2
---

import { DashButton, Steps } from "~/components"

The Website data source allows you to connect a domain you own so its pages can be crawled, stored, and indexed.

:::note
You can only crawl domains that you have onboarded onto the same Cloudflare account.

Refer to [Onboard a domain](/fundamentals/manage-domains/add-site/) for more information on adding a domain to your Cloudflare account.
:::

:::caution[Bot protection may block crawling]
If you use Cloudflare products that control or restrict bot traffic, such as [Bot Management](/bots/), [Web Application Firewall (WAF)](/waf/), or [Turnstile](/turnstile/), the same rules will apply to the AI Search (AutoRAG) crawler. Make sure to configure an exception or an allow-list for the AutoRAG crawler in your settings.
:::
## How website crawling works

When you connect a domain, the crawler looks for your website’s sitemap to determine which pages to visit:

1. The crawler first checks `robots.txt` for listed sitemaps. If any are listed, it reads all of those sitemaps.
2. If no sitemaps are listed in `robots.txt`, the crawler falls back to checking for a sitemap at `/sitemap.xml`.
3. If no sitemap is available, the domain cannot be crawled.

Pages are visited in the order given by the `<priority>` attribute in the sitemap, when that field is defined.
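The discovery order and priority-based visiting described above can be sketched as two small functions. The inputs (`robotsSitemaps`, `hasDefaultSitemap`) are illustrative stand-ins for an actual fetch of the site, not a real crawler API:

```typescript
// Sketch of the discovery order above: robots.txt sitemaps first,
// then the /sitemap.xml convention, otherwise nothing to crawl.
// The boolean/array inputs stand in for real HTTP checks.
type SitemapUrl = { loc: string; priority?: number };

function pickSitemaps(robotsSitemaps: string[], hasDefaultSitemap: boolean): string[] {
  if (robotsSitemaps.length > 0) return robotsSitemaps; // 1. robots.txt listings
  if (hasDefaultSitemap) return ["/sitemap.xml"];       // 2. conventional fallback
  return [];                                            // 3. domain cannot be crawled
}

// Pages with a defined <priority> are visited first, highest first;
// pages without one default to the lowest ordering here.
function crawlOrder(urls: SitemapUrl[]): string[] {
  return [...urls]
    .sort((a, b) => (b.priority ?? 0) - (a.priority ?? 0))
    .map((u) => u.loc);
}
```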
|
||||
## How to set WAF rules to allowlist the crawler
|
||||
|
||||
If you have Security rules configured to block bot activity, you can add a rule to allowlist the crawler bot.
|
||||
|
||||
<Steps>

1. In the Cloudflare dashboard, go to the **Security rules** page of your account and domain.

   <DashButton url="/?to=/:account/:zone/security/security-rules" />

2. To create a new empty rule, select **Create rule** > **Custom rules**.
3. Enter a descriptive name for the rule in **Rule name**, such as `Allow AI Search`.
4. Under **When incoming requests match**, use the **Field** drop-down list to choose _Bot Detection ID_. For **Operator**, select _equals_. For **Value**, enter `122933950`.
5. Under **Then take action**, in the **Choose action** drop-down list, choose _Skip_.
6. Under **Place at**, set the order of the rule in the **Select order** drop-down list to _First_. Setting the order to _First_ applies this rule before subsequent rules.
7. To save and deploy your rule, select **Deploy**.

</Steps>
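For reference, the rule built in the steps above corresponds roughly to a custom rule with the **Skip** action and an expression along these lines (the field name is an assumption based on the WAF's Bot Management fields; verify it in the expression editor):

```txt
(cf.bot_management.detection_ids contains 122933950)
```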
## Parsing options

You can choose how pages are parsed during crawling:

- **Static sites**: Downloads the raw HTML for each page.
- **Rendered sites**: Loads pages with a headless browser and downloads the fully rendered version, including dynamic JavaScript content. Note that the [Browser Rendering](/browser-rendering/platform/pricing/) limits and billing apply.
## Storage

During setup, AI Search creates a dedicated R2 bucket in your account to store the pages that have been crawled and downloaded as HTML files. This bucket is automatically managed and is used only for content discovered by the crawler. Any files or objects that you add directly to this bucket will not be indexed.

:::note
We recommend not modifying the bucket, as doing so may disrupt the indexing flow and cause content to not update properly.
:::
## Sync and updates

During scheduled or manual [sync jobs](/ai-search/configuration/indexing/), the crawler checks for changes to the `<lastmod>` attribute in your sitemap. If it has changed to a date after the last sync, the page is crawled again, the updated version is stored in the R2 bucket, and the content is automatically reindexed so that your search results always reflect the latest content.

If the `<lastmod>` attribute is not defined, then AI Search will automatically crawl each link defined in the sitemap once a day.
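The sync logic above can be sketched as follows (a simplified illustration, not AI Search's actual implementation):

```javascript
// Decide which sitemap entries need re-crawling, given the last sync time.
// Entries with a <lastmod> newer than the last sync are re-crawled;
// entries without <lastmod> fall back to the daily crawl schedule instead.
function entriesToRecrawl(entries, lastSync) {
	return entries.filter(
		(e) => e.lastmod !== undefined && new Date(e.lastmod) > lastSync,
	);
}

const lastSync = new Date("2024-06-01T00:00:00Z");
const entries = [
	{ loc: "https://example.com/a", lastmod: "2024-06-15" }, // changed → re-crawl
	{ loc: "https://example.com/b", lastmod: "2024-05-01" }, // unchanged → skip
	{ loc: "https://example.com/c" }, // no lastmod → daily schedule
];

console.log(entriesToRecrawl(entries, lastSync).map((e) => e.loc));
// → ["https://example.com/a"]
```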
## Limits

The regular AI Search [limits](/ai-search/platform/limits-pricing/) apply when using the Website data source.

The crawler will download and index pages only up to the maximum object limit supported for an AI Search instance, processing the first set of pages it visits until that limit is reached. In addition, any files that are downloaded but exceed the file size limit will not be indexed.
src/content/docs/ai-search/configuration/index.mdx (new file)
---
pcx_content_type: navigation
title: Configuration
sidebar:
  order: 5
---

import { MetaInfo, Type } from "~/components";

You can customize how your AI Search instance indexes your data, and retrieves and generates responses for queries. Some settings can be updated after the instance is created, while others are fixed at creation time.
The table below lists all available configuration options:

| Configuration | Editable after creation | Description |
| --- | --- | --- |
| [Data source](/ai-search/configuration/data-source/) | no | The source where your knowledge base is stored |
| [Chunk size](/ai-search/configuration/chunking/) | yes | Number of tokens per chunk |
| [Chunk overlap](/ai-search/configuration/chunking/) | yes | Number of overlapping tokens between chunks |
| [Embedding model](/ai-search/configuration/models/) | no | Model used to generate vector embeddings |
| [Query rewrite](/ai-search/configuration/query-rewriting/) | yes | Enable or disable query rewriting before retrieval |
| [Query rewrite model](/ai-search/configuration/models/) | yes | Model used for query rewriting |
| [Query rewrite system prompt](/ai-search/configuration/system-prompt/) | yes | Custom system prompt to guide query rewriting behavior |
| [Match threshold](/ai-search/configuration/retrieval-configuration/) | yes | Minimum similarity score required for a vector match |
| [Maximum number of results](/ai-search/configuration/retrieval-configuration/) | yes | Maximum number of vector matches returned (`top_k`) |
| [Generation model](/ai-search/configuration/models/) | yes | Model used to generate the final response |
| [Generation system prompt](/ai-search/configuration/system-prompt/) | yes | Custom system prompt to guide response generation |
| [Similarity caching](/ai-search/configuration/cache/) | yes | Enable or disable caching of responses for similar (not just exact) prompts |
| [Similarity caching threshold](/ai-search/configuration/cache/) | yes | Controls how similar a new prompt must be to a previous one to reuse its cached response |
| [AI Gateway](/ai-gateway) | yes | AI Gateway for monitoring and controlling model usage |
| AI Search name | no | Name of your AI Search instance |
| Service API token | yes | API token granted to AI Search so it can configure resources on your account |

:::note[API token]
The Service API token is different from the AI Search API token that you create to interact with your AI Search. The Service API token is used only by AI Search to obtain permissions to configure resources on your account.
:::
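Among the settings above, chunk size and chunk overlap together determine roughly how many chunks a document produces. A quick estimate (a simplification; actual chunking is token- and content-dependent):

```javascript
// Rough estimate of chunk count for a document, given chunk size and overlap.
// Each chunk after the first advances by (chunkSize - overlap) tokens.
function estimateChunkCount(totalTokens, chunkSize, overlap) {
	const stride = chunkSize - overlap;
	return Math.max(1, Math.ceil((totalTokens - overlap) / stride));
}

// A 1,000-token document with 256-token chunks and 64-token overlap:
console.log(estimateChunkCount(1000, 256, 64)); // → 5
```

Larger overlap improves continuity between chunks at the cost of more chunks (and therefore more embeddings) per document.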
AI Search automatically indexes your data into vector embeddings optimized for semantic search. Once a data source is connected, indexing runs continuously in the background to keep your knowledge base fresh and queryable.
## Jobs

AI Search automatically monitors your data source for updates and reindexes your content every **6 hours**. During each cycle, new or modified files are reprocessed to keep your Vectorize index up to date.

You can monitor the status and history of all indexing activity in the **Jobs** tab, including real-time logs for each job to help you troubleshoot and verify successful syncs.

You can control indexing behavior through the following actions on the dashboard:

- **Sync Index**: Force AI Search to scan your data source for new or modified files and initiate an indexing job to update the associated Vectorize index. A new indexing job can be initiated every 30 seconds.
- **Pause Indexing**: Temporarily stop all scheduled indexing checks and reprocessing. Useful for debugging or freezing your knowledge base.
## Performance

The total time to index depends on the number and type of files in your data source.

To ensure smooth and reliable indexing:

- Make sure your files are within the [**size limit**](/ai-search/platform/limits-pricing/#limits) and in a supported format to avoid being skipped.
- Keep your Service API token valid to prevent indexing failures.
- Regularly clean up outdated or unnecessary content in your knowledge base to avoid hitting [Vectorize index limits](/vectorize/platform/limits/).
Use metadata to filter documents before retrieval and provide context to guide AI Search responses.

## Metadata filtering

Metadata filtering narrows down search results based on metadata, so only relevant content is retrieved. The filter is applied prior to retrieval, so you only query the scope of documents that matter.

Here is an example of metadata filtering using [Workers Binding](/ai-search/usage/workers-binding/), but it can be easily adapted to use the [REST API](/ai-search/usage/rest-api/) instead.

```js
// Query and filter values below are illustrative; replace with your own.
const answer = await env.AI.autorag("my-autorag").search({
	query: "When did I sign my agreement contract?",
	filters: {
		type: "eq",
		key: "folder",
		value: "customer-a/contracts/",
	},
});
```
src/content/docs/ai-search/configuration/models/index.mdx (new file)
---
title: Models
pcx_content_type: how-to
sidebar:
  order: 2
---

AI Search uses models at multiple stages. You can configure which models are used, or let AI Search automatically select a smart default for you.

## Model usage

AI Search leverages Workers AI models in the following stages:

- **Image-to-Markdown conversion** (if images are in your data source): Converts image content to Markdown using object detection and captioning models.
- **Embedding**: Transforms your documents and queries into vector representations for semantic search.
- **Query rewriting** (optional): Reformulates the user’s query to improve retrieval accuracy.
- **Generation**: Produces the final response from retrieved context.
## Model providers

All AI Search instances support models from [Workers AI](/workers-ai). You can use other providers (such as OpenAI or Anthropic) in AI Search by adding their API keys to an [AI Gateway](/ai-gateway) and connecting that gateway to your AI Search.

To use AI Search with other model providers:

1. Add provider keys to AI Gateway:
   - Go to **AI** > **AI Gateway** in the dashboard.
   - Select or create an AI gateway.
   - In **Provider Keys**, choose your provider, select **Add**, and enter the key.
2. Connect the gateway to AI Search:
   - When creating a new AI Search, select the AI Gateway with your provider keys.
   - For an existing AI Search, go to **Settings** and, under **Resources**, switch to a gateway that has your keys.
3. Select models:
   - Embedding model: Can only be changed when creating a new AI Search.
   - Generation model: Can be selected when creating a new AI Search and changed at any time in **Settings**.

AI Search supports a subset of models that have been selected to provide the best experience. See the list of [supported models](/ai-search/configuration/models/supported-models/).

### Smart default

If you choose **Smart Default** in your model selection, AI Search will select a Cloudflare-recommended model and update it automatically for you over time. You can switch to explicit model configuration at any time in **Settings**.

### Per-request generation model override

While the generation model can be set globally at the AI Search instance level, you can also override it on a per-request basis in the [AI Search API](/ai-search/usage/rest-api/#ai-search). This is useful if your [RAG application](/ai-search/) requires dynamic selection of generation models based on context or user preferences.
## Model deprecation

AI Search may deprecate support for a given model in order to support better-performing models with improved capabilities. When a model is being deprecated, we announce the change and provide an end-of-life date after which the model will no longer be accessible. Applications that depend on AI Search may therefore require occasional updates to continue working reliably.

### Model lifecycle

AI Search models follow a defined lifecycle to ensure stability and predictable deprecation:

1. **Production**: The model is actively supported and recommended for use. It is included in Smart Defaults and receives ongoing updates and maintenance.
2. **Announcement & Transition**: The model remains available but has been marked for deprecation. An end-of-life date is communicated through documentation, release notes, and other official channels. During this phase, users are encouraged to migrate to the recommended replacement model.
3. **Automatic Upgrade (if applicable)**: If you have selected the Smart Default option, AI Search will automatically upgrade requests to a recommended replacement.
4. **End of life**: The model is no longer available. Any requests to the retired model return a clear error message, and the model is removed from documentation and Smart Defaults.

See models and their lifecycle status in [supported models](/ai-search/configuration/models/supported-models/).

### Best practices

- Regularly check the [release notes](/ai-search/platform/release-note/) for updates.
- Plan migration efforts according to the communicated end-of-life date.
- Migrate to and test the recommended replacement models before the end-of-life date.
---
title: Supported models
pcx_content_type: how-to
sidebar:
  order: 2
---

This page lists all models supported by AI Search and their lifecycle status.

:::note[Request model support]
If you would like to use a model that is not currently supported, reach out to us on [Discord](https://discord.gg/cloudflaredev) to request it.
:::

## Production models

Production models are the actively supported and recommended models that are stable and fully available.
### Text generation

| Provider | Alias | Context window (tokens) |
| --- | --- | --- |
| **Anthropic** | `anthropic/claude-3-7-sonnet` | 200,000 |
| | `anthropic/claude-sonnet-4` | 200,000 |
| | `anthropic/claude-opus-4` | 200,000 |
| | `anthropic/claude-3-5-haiku` | 200,000 |
| **Cerebras** | `cerebras/qwen-3-235b-a22b-instruct` | 64,000 |
| | `cerebras/qwen-3-235b-a22b-thinking` | 65,000 |
| | `cerebras/llama-3.3-70b` | 65,000 |
| | `cerebras/llama-4-maverick-17b-128e-instruct` | 8,000 |
| | `cerebras/llama-4-scout-17b-16e-instruct` | 8,000 |
| | `cerebras/gpt-oss-120b` | 64,000 |
| **Google AI Studio** | `google-ai-studio/gemini-2.5-flash` | 1,048,576 |
| | `google-ai-studio/gemini-2.5-pro` | 1,048,576 |
| **Grok (x.ai)** | `grok/grok-4` | 256,000 |
| **Groq** | `groq/llama-3.3-70b-versatile` | 131,072 |
| | `groq/llama-3.1-8b-instant` | 131,072 |
| **OpenAI** | `openai/gpt-5` | 400,000 |
| | `openai/gpt-5-mini` | 400,000 |
| | `openai/gpt-5-nano` | 400,000 |
| **Workers AI** | `@cf/meta/llama-3.3-70b-instruct-fp8-fast` | 24,000 |
| | `@cf/meta/llama-3.1-8b-instruct-fast` | 60,000 |
| | `@cf/meta/llama-3.1-8b-instruct-fp8` | 32,000 |
| | `@cf/meta/llama-4-scout-17b-16e-instruct` | 131,000 |

### Embedding

| Provider | Alias | Vector dims | Input tokens | Metric |
| --- | --- | --- | --- | --- |
| **Google AI Studio** | `google-ai-studio/gemini-embedding-001` | 1,536 | 2048 | cosine |
| **OpenAI** | `openai/text-embedding-3-small` | 1,536 | 8192 | cosine |
| | `openai/text-embedding-3-large` | 1,536 | 8192 | cosine |
| **Workers AI** | `@cf/baai/bge-m3` | 1,024 | 512 | cosine |
| | `@cf/baai/bge-large-en-v1.5` | 1,024 | 512 | cosine |

## Transition models

There are currently no models marked for end-of-life.
Query rewriting is an optional step in the AI Search pipeline that improves retrieval quality by transforming the original user query into a more effective search query.

Instead of embedding the raw user input directly, AI Search can use a large language model (LLM) to rewrite the query based on a system prompt. The rewritten query is then used to perform the vector search.
## Why use query rewriting?

## How it works

If query rewriting is enabled, AI Search performs the following steps:

1. Sends the **original user query** and the **query rewrite system prompt** to the configured LLM.
2. Receives the **rewritten query** from the model.
3. Embeds the rewritten query using the selected embedding model.
4. Performs a vector search in your AI Search’s Vectorize index.

For details on how to guide model behavior during this step, see the [system prompt](/ai-search/configuration/system-prompt/) documentation.
AI Search allows you to configure how content is retrieved from your vector index and used to generate a final response. Two options control this behavior:

- **Match threshold**: Minimum similarity score required for a vector match to be considered relevant.
- **Maximum number of results**: Maximum number of top-matching results to return (`top_k`).

AI Search uses the [`query()`](/vectorize/best-practices/query-vectors/) method from [Vectorize](/vectorize/) to perform semantic search. This function compares the embedded query vector against the stored vectors in your index and returns the most similar results.

## Match threshold
## How they work together

AI Search's retrieval step follows this sequence:

1. Your query is embedded using the configured Workers AI model.
2. `query()` is called to search the Vectorize index, with `topK` set to the `maximum_number_of_results`.
3. Results are filtered using the `match_threshold`.
4. The filtered results are passed into the generation step as context.

If no results meet the threshold, AI Search will not generate a response.

## Configuration

These values can be configured at the AI Search instance level or overridden on a per-request basis using the [REST API](/ai-search/usage/rest-api/) or the [Workers Binding](/ai-search/usage/workers-binding/).

Use the parameters `match_threshold` and `max_num_results` to customize retrieval behavior per request.
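The interaction between the two parameters can be illustrated with a small standalone sketch (a simplification, not the actual Vectorize implementation):

```javascript
// Simulate retrieval: take the topK most similar matches, then drop any
// whose similarity score falls below the match threshold.
function retrieve(matches, { matchThreshold, maxNumResults }) {
	return [...matches]
		.sort((a, b) => b.score - a.score)
		.slice(0, maxNumResults)
		.filter((m) => m.score >= matchThreshold);
}

const matches = [
	{ id: "chunk-1", score: 0.92 },
	{ id: "chunk-2", score: 0.81 },
	{ id: "chunk-3", score: 0.55 },
	{ id: "chunk-4", score: 0.4 },
];

const results = retrieve(matches, { matchThreshold: 0.6, maxNumResults: 3 });
// Keeps chunk-1 and chunk-2: chunk-3 survives topK but is cut by the
// threshold, and chunk-4 falls outside the top 3 results entirely.
console.log(results.map((m) => m.id));
```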
System prompts allow you to guide the behavior of the text-generation models used by AI Search at query time. AI Search supports system prompt configuration in two steps:

- **Query rewriting**: Reformulates the original user query to improve semantic retrieval. A system prompt can guide how the model interprets and rewrites the query.
- **Generation**: Generates the final response from retrieved context. A system prompt can help define how the model should format, filter, or prioritize information when constructing the answer.
## How to set your system prompt

The system prompt for your AI Search can be set after it has been created by:

1. Navigating to the [Cloudflare dashboard](https://dash.cloudflare.com/?to=/:account/ai/autorag) and going to **AI** > **AI Search**.
2. Selecting your AI Search.
3. Going to the **Settings** page and finding the **System prompt** setting for either Query rewrite or Generation.

### Default system prompt

When configuring your AI Search instance, you can provide your own system prompts. If you do not provide a system prompt, AI Search will use the **default system prompt** provided by Cloudflare.

You can view the effective system prompt used for any AI Search model call through AI Gateway logs, where model inputs and outputs are recorded.

:::note
The default system prompt can change and evolve over time to improve performance and quality.
:::
src/content/docs/ai-search/get-started.mdx (new file)
---
title: Getting started
pcx_content_type: get-started
sidebar:
  order: 2
head:
  - tag: title
    content: Get started with AI Search
description: Get started creating fully-managed, retrieval-augmented generation pipelines with Cloudflare AI Search.
---

import { DashButton } from "~/components";

AI Search (formerly AutoRAG) is Cloudflare’s managed search service. You can connect your data, such as websites or unstructured content, and it automatically creates a continuously updating index that you can query with natural language in your applications or AI agents.

## Prerequisite

AI Search integrates with R2 for storing your data. You must have an **active R2 subscription** before creating your first AI Search. You can purchase the subscription on the Cloudflare R2 dashboard.

<DashButton url="/?to=/:account/r2/overview" />
## 1. Create an AI Search

To create a new AI Search:

1. In the Cloudflare dashboard, go to the **AI Search** page.

   <DashButton url="/?to=/:account/ai/autorag" />

2. Select **Create**.
3. In **Create a RAG**, select **Get Started**.
4. Choose how you want to connect your data:
   - **R2 bucket**: Index the content from one of your R2 buckets.
   - **Website**: Provide a domain from your Cloudflare account and AI Search will automatically crawl your site, store the content in R2, and index it.
5. Configure the AI Search and complete the setup process.
6. Select **Create**.
## 2. Monitor indexing

After setup, AI Search creates a Vectorize index in your account and begins indexing the data.

To monitor progress:

1. From the **AI Search** page in the dashboard, locate and select your AI Search.
2. Navigate to the **Overview** page to view the current indexing status.

## 3. Try it out

Once indexing is complete, you can run your first query:

1. From the **AI Search** page in the dashboard, locate and select your AI Search.
2. Navigate to the **Playground** tab.
3. Select **Search with AI** or **Search**.
4. Enter a **query** to test out its response.

## 4. Add to your application

Once you are ready, go to **Connect** for instructions on how to connect AI Search to your application.

There are multiple ways you can connect:

- [Workers Binding](/ai-search/usage/workers-binding/)
- [REST API](/ai-search/usage/rest-api/)
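For a first query from a Worker, a minimal sketch of the Workers Binding path might look like this (the `AI` binding name, the instance name `my-ai-search`, and the `aiSearch()` call are assumptions to verify against the Workers Binding page):

```js
export default {
	async fetch(request, env) {
		// Assumes an `AI` binding in your Wrangler config and an
		// AI Search instance named "my-ai-search".
		const answer = await env.AI.autorag("my-ai-search").aiSearch({
			query: "How do I get started?",
		});
		return Response.json(answer);
	},
};
```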
	TypeScriptExample,
} from "~/components";

When using `AI Search`, AI Search leverages a Workers AI model to generate the response. If you want to use a model outside of Workers AI, you can use the `search` method and generate responses with an external model.

:::note
AI Search now supports [bringing your own models natively](/ai-search/configuration/models/). You can attach provider keys through AI Gateway and select third-party models directly in your AI Search settings. The example below still works, but the recommended way is to configure your external model through AI Gateway.
:::

Here is an example of how you can use an OpenAI model to generate your responses. This example uses [Workers Binding](/ai-search/usage/workers-binding/), but can be easily adapted to use the [REST API](/ai-search/usage/rest-api/) instead.

<TypeScriptExample>
	const userQuery =
		url.searchParams.get("query") ??
		"How do I train a llama to deliver coffee?";

	// Search for documents in AI Search
	const searchResult = await env.AI.autorag("my-rag").search({
		query: userQuery,
	});
import { FileTree } from "~/components"

AI Search supports multitenancy by letting you segment content by tenant, so each user, customer, or workspace can only access their own data. This is typically done by organizing documents into per-tenant folders and applying [metadata filters](/ai-search/configuration/metadata/) at query time.

## 1. Organize Content by Tenant
Example folder structure:

- contracts/

</FileTree>

When indexing, AI Search will automatically store the folder path as metadata under the `folder` attribute. It is recommended to enforce folder separation during upload or indexing to prevent accidental data access across tenants.

## 2. Search Using Folder Filters

To ensure a tenant only retrieves their own documents, apply a `folder` filter when performing a search.

Example using [Workers Binding](/ai-search/usage/workers-binding/):

```js
const response = await env.AI.autorag("my-autorag").search({
	// ...
});
```
To filter across multiple folders, or to add date-based filtering, you can use a compound filter with an array of [comparison filters](/ai-search/configuration/metadata/#compound-filter).
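For illustration, a compound filter might be shaped like the following (the comparison types and keys should be checked against the metadata filtering documentation; the values are hypothetical):

```javascript
// Combine a folder match with a date cutoff in a single "and" filter.
// Keys, comparison types, and values here are illustrative assumptions.
const filters = {
	type: "and",
	filters: [
		{ type: "eq", key: "folder", value: "customer-a/contracts/" },
		{ type: "gte", key: "timestamp", value: "2024-01-01" },
	],
};

console.log(filters.filters.length); // two comparison filters combined
```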
## Tip: Use "Starts with" filter
While an `eq` filter targets files at the specific folder, you'll often want to include files in its subfolders as well.

- contract-1.pdf

</FileTree>

To achieve this [starts with](/ai-search/configuration/metadata/#starts-with-filter-for-folders) behavior, use a compound filter like:

```js
filters: {
Enable conversational search on your website with NLWeb and Cloudflare AI Search. This template crawls your site, indexes the content, and deploys NLWeb-standard endpoints to serve both people and AI agents.

:::note
This is a public preview ideal for experimentation. If you're interested in running this in production workflows, contact us at nlweb@cloudflare.com.
:::
## How to use it

You can deploy NLWeb on your website directly through the AI Search dashboard:

1. Log in to your [Cloudflare dashboard](https://dash.cloudflare.com/).
2. Go to **Compute & AI** > **AI Search**.
3. Select **Create AI Search**, then choose the **NLWeb Website** option.
4. Select your domain from your Cloudflare account.
5. Select **Start indexing**.

Once complete, AI Search will crawl and index your site, then deploy an NLWeb Worker for you.

## What this template includes

Choosing the NLWeb Website option extends a normal AI Search by tailoring it for content-heavy websites and giving you everything required to adopt NLWeb as the standard for conversational search on your site. Specifically, the template provides:

- **Website as a data source:** Uses the [Website](/ai-search/configuration/data-source/website/) data source option to crawl and ingest pages with the Rendered Sites option.
- **Defaults for content-heavy websites:** Applies tuned embedding and retrieval configurations ideal for publishing and content-rich websites.
- **NLWeb Worker deployment:** Automatically spins up a Cloudflare Worker from the [NLWeb Worker template](https://github.com/cloudflare/templates).
@ -54,7 +54,7 @@ These endpoints give both people and agents structured access to your content.
|
|||
To integrate NLWeb search directly into your site you can:
|
||||
|
||||
1. Find your deployed Worker in the [Cloudflare dashboard](https://dash.cloudflare.com/):
|
||||
- Go to **Compute & AI** > **AutoRAG**.
|
||||
- Go to **Compute & AI** > **AI Search**.
|
||||
- Select **Connect**, then go to the **NLWeb** tab.
|
||||
- Select **Go to Worker**.
|
||||
2. Add a [custom domain](/workers/configuration/routing/custom-domains/) to your Worker (for example, ask.example.com)
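If you manage the Worker with Wrangler rather than the dashboard flow, the custom domain can also be declared in the Worker's config file. This is a sketch under assumptions: `ask.example.com` is the placeholder hostname used elsewhere on this page, and the `routes` table goes in the NLWeb Worker's own `wrangler.toml`:

```toml
# Attach the NLWeb Worker to a custom domain on your zone.
# `custom_domain = true` tells Cloudflare to provision the DNS record
# and certificate for this hostname automatically.
routes = [
  { pattern = "ask.example.com", custom_domain = true }
]
```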
@@ -66,14 +66,14 @@ You can also use the embeddable snippet to add a search UI directly into your we
 <!-- Add css on head -->
 <link rel="stylesheet" href="https://ask.example.com/nlweb-dropdown-chat.css">
 <link rel="stylesheet" href="https://ask.example.com/common-chat-styles.css">
 
 <!-- Add container on body -->
 <div id="docs-search-container"></div>
 
 <!-- Include JavaScript -->
 <script type="module">
   import { NLWebDropdownChat } from 'https://ask.example.com/nlweb-dropdown-chat.js';
 
   const chat = new NLWebDropdownChat({
     containerId: 'docs-search-container',
     site: 'https://ask.example.com',
@@ -96,10 +96,10 @@ The simplest way to apply changes or updates is to redeploy the Worker template:
 
 [](https://deploy.workers.cloudflare.com/?url=https://github.com/cloudflare/templates/tree/main/nlweb-template)
 
-To do so:
+To do so:
 
 1. Select the **Deploy to Cloudflare** button from above to deploy the Worker template to your Cloudflare account.
-2. Enter the name of your AutoRAG in the `RAG_ID` environment variable field.
+2. Enter the name of your AI Search in the `RAG_ID` environment variable field.
 3. Click **Deploy**.
 4. Select the **GitHub/GitLab** icon on the Workers Dashboard.
 4. Clone the repository that is created for your Worker.
@@ -7,12 +7,12 @@ sidebar:
 
 import { TypeScriptExample } from "~/components";
 
-By using the `search` method, you can implement a simple but fast search engine. This example uses [Workers Binding](/autorag/usage/workers-binding/), but can be easily adapted to use the [REST API](/autorag/usage/rest-api/) instead.
+By using the `search` method, you can implement a simple but fast search engine. This example uses [Workers Binding](/ai-search/usage/workers-binding/), but can be easily adapted to use the [REST API](/ai-search/usage/rest-api/) instead.
 
 To replicate this example remember to:
 
 - Disable `rewrite_query`, as you want to match the original user query
-- Configure your AutoRAG to have small chunk sizes, usually 256 tokens is enough
+- Configure your AI Search to have small chunk sizes, usually 256 tokens is enough
 
 <TypeScriptExample>
@@ -2,14 +2,14 @@
 pcx_content_type: overview
 title: Overview
 
-description: Build scalable, fully-managed RAG applications with Cloudflare AutoRAG. Create retrieval-augmented generation pipelines to deliver accurate, context-aware AI without managing infrastructure.
+description: Build scalable, fully-managed RAG applications with Cloudflare AI Search. Create retrieval-augmented generation pipelines to deliver accurate, context-aware AI without managing infrastructure.
 tags:
   - AI
 sidebar:
   order: 1
 head:
   - tag: title
-    content: Cloudflare AutoRAG
+    content: Cloudflare AI Search
 ---
 
 import {
@@ -23,27 +23,24 @@ import {
 } from "~/components";
 
 <Description>
-Create fully-managed RAG applications that continuously update and scale on Cloudflare.
+Create AI-powered search for your data
 </Description>
 
 <Plan type="all" />
 
-AutoRAG lets you create retrieval-augmented generation (RAG) pipelines that power your AI applications with accurate and up-to-date information. Create RAG applications that integrate context-aware AI without managing infrastructure.
+AI Search (formerly AutoRAG) is Cloudflare’s managed search service. You can connect your data such as websites or unstructured content, and it automatically creates a continuously updating index that you can query with natural language in your applications or AI agents. It natively integrates with Cloudflare’s developer platform tools like Vectorize, AI Gateway, R2, and Workers AI, while also supporting third-party providers and open standards.
 
-You can use AutoRAG to build:
-
-- **Product Chatbot:** Answer customer questions using your own product content.
-- **Docs Search:** Make documentation easy to search and use.
+It supports retrieval-augmented generation (RAG) patterns, enabling you to build enterprise search, natural language search, and AI-powered chat without managing infrastructure.
 
 <div>
-  <LinkButton href="/autorag/get-started">Get started</LinkButton>
+  <LinkButton href="/ai-search/get-started">Get started</LinkButton>
   <LinkButton
     target="_blank"
     variant="secondary"
     icon="external"
     href="https://www.youtube.com/watch?v=JUFdbkiDN2U"
   >
-    Watch AutoRAG demo
+    Watch AI Search demo
   </LinkButton>
 </div>
@@ -51,25 +48,25 @@ You can use AutoRAG to build:
 
 ## Features
 
-<Feature header="Automated indexing" href="/autorag/configuration/indexing/" cta="View indexing">
+<Feature header="Automated indexing" href="/ai-search/configuration/indexing/" cta="View indexing">
 
 Automatically and continuously index your data source, keeping your content fresh without manual reprocessing.
 
 </Feature>
 
-<Feature header="Multitenancy support" href="/autorag/how-to/multitenancy/" cta="Add filters">
+<Feature header="Multitenancy support" href="/ai-search/how-to/multitenancy/" cta="Add filters">
 
 Create multitenancy by scoping search to each tenant’s data using folder-based metadata filters.
 
 </Feature>
 
-<Feature header="Workers Binding" href="/autorag/usage/workers-binding/" cta="Add to Worker">
+<Feature header="Workers Binding" href="/ai-search/usage/workers-binding/" cta="Add to Worker">
 
-Call your AutoRAG instance for search or AI Search directly from a Cloudflare Worker using the native binding integration.
+Call your AI Search instance for search or AI Search directly from a Cloudflare Worker using the native binding integration.
 
 </Feature>
 
-<Feature header="Similarity caching" href="/autorag/configuration/cache/" cta="Use caching">
+<Feature header="Similarity caching" href="/ai-search/configuration/cache/" cta="Use caching">
 
 Cache repeated queries and results to improve latency and reduce compute on repeated requests.
@@ -7,7 +7,7 @@ sidebar:
 
 ## Pricing
 
-During the open beta, AutoRAG is **free to enable**. When you create an AutoRAG instance, it provisions and runs on top of Cloudflare services in your account. These resources are **billed as part of your Cloudflare usage**, and includes:
+During the open beta, AI Search is **free to enable**. When you create an AI Search instance, it provisions and runs on top of Cloudflare services in your account. These resources are **billed as part of your Cloudflare usage**, and include:
 
 | Service & Pricing | Description |
 | ------------------------------------------------ | ----------------------------------------------------------------------------------------- |
@@ -15,18 +15,18 @@ During the open beta, AutoRAG is **free to enable**. When you create an AutoRAG
 | [**Vectorize**](/vectorize/platform/pricing/) | Stores vector embeddings and powers semantic search |
 | [**Workers AI**](/workers-ai/platform/pricing/) | Handles image-to-Markdown conversion, embedding, query rewriting, and response generation |
 | [**AI Gateway**](/ai-gateway/reference/pricing/) | Monitors and controls model usage |
-| [**Browser Rendering**](/browser-rendering/platform/pricing/) | Loads dynamic JavaScript content during [website](/autorag/configuration/data-source/website/) crawling with the Render option |
+| [**Browser Rendering**](/browser-rendering/platform/pricing/) | Loads dynamic JavaScript content during [website](/ai-search/configuration/data-source/website/) crawling with the Render option |
 
-For more information about how each resource is used within AutoRAG, reference [How AutoRAG works](/autorag/concepts/how-autorag-works/).
+For more information about how each resource is used within AI Search, reference [How AI Search works](/ai-search/concepts/how-ai-search-works/).
 
 ## Limits
 
-The following limits currently apply to AutoRAG during the open beta:
+The following limits currently apply to AI Search during the open beta:
 
 | Limit | Value |
 | --------------------------------- | ------- |
-| Max AutoRAG instances per account | 10 |
-| Max files per AutoRAG | 100,000 |
+| Max AI Search instances per account | 10 |
+| Max files per AI Search | 100,000 |
 | Max file size | 4 MB |
 
-These limits are subject to change as AutoRAG evolves beyond open beta.
+These limits are subject to change as AI Search evolves beyond open beta.
18
src/content/docs/ai-search/platform/release-note.mdx
Normal file
@@ -0,0 +1,18 @@
+---
+pcx_content_type: release-notes
+title: Release note
+release_notes_file_name:
+  - ai-search
+sidebar:
+  order: 8
+head: []
+description: Review recent changes to Cloudflare AI Search.
+---
+
+import { ProductReleaseNotes } from "~/components";
+
+This release notes section covers regular updates and minor fixes. For major feature releases or significant updates, see the [changelog](/changelog).
+
+{/* <!-- Actual content lives in /src/content/release-notes/ai-search.yaml. Update the file there for new entries to appear here. For more details, refer to https://developers.cloudflare.com/style-guide/documentation-content-strategy/content-types/changelog/#yaml-file --> */}
+
+<ProductReleaseNotes />
@@ -7,11 +7,11 @@ sidebar:
 
 import { TypeScriptExample } from "~/components";
 
-AutoRAG is designed to work out of the box with data in R2 buckets. But what if your content lives on a website or needs to be rendered dynamically?
+AI Search is designed to work out of the box with data in R2 buckets. But what if your content lives on a website or needs to be rendered dynamically?
 
 :::note
 
-AutoRAG now lets you use your [website](/autorag/configuration/data-source/website/) as a data source. When enabled, AutoRAG will automatically crawl and parse your site content for you.
+AutoRAG now lets you use your [website](/ai-search/configuration/data-source/website/) as a data source. When enabled, AutoRAG will automatically crawl and parse your site content for you.
 
 :::
 
@@ -20,7 +20,13 @@ In this tutorial, we’ll walk through how to:
 
 1. Render your website using Cloudflare's Browser Rendering API
 2. Store the rendered HTML in R2
-3. Connect it to AutoRAG for querying
+3. Connect it to AI Search for querying
+
+:::note
+
+AI Search now lets you use your [website](/ai-search/configuration/data-source/website/) as a data source. When enabled, AI Search will automatically crawl and parse your site content for you.
+
+:::
 
 ## Step 1. Create a Worker to fetch webpages and upload into R2
 
@@ -144,34 +150,30 @@ npx wrangler deploy
 ```bash
 curl -X POST https://browser-r2-worker.<YOUR_SUBDOMAIN>.workers.dev \
   -H "Content-Type: application/json" \
-  -d '{"url": "https://developers.cloudflare.com/autorag/tutorial/brower-rendering-autorag-tutorial/"}'
+  -d '{"url": "https://developers.cloudflare.com/ai-search/tutorial/brower-rendering-autorag-tutorial/"}'
 ```
 
-## Step 2. Create your AutoRAG and monitor the indexing
+## Step 2. Create your AI Search and monitor the indexing
 
-Now that you have created your R2 bucket and filled it with your content that you’d like to query from, you are ready to create an AutoRAG instance:
+Now that you have created your R2 bucket and filled it with your content that you’d like to query from, you are ready to create an AI Search instance:
 
-1. In your [Cloudflare Dashboard](https://dash.cloudflare.com/?to=/:account/ai/autorag), navigate to AI > AutoRAG
-2. Select Create AutoRAG and complete the setup process:
+1. In your [Cloudflare Dashboard](https://dash.cloudflare.com/?to=/:account/ai/autorag), navigate to AI > AI Search
+2. Select Create AI Search and complete the setup process:
    1. Select the **R2 bucket** which contains your knowledge base, in this case, select the `html-bucket`.
   2. Select an **embedding model** used to convert your data to vector representation. It is recommended to use the Default.
   3. Select an **LLM** to use to generate your responses. It is recommended to use the Default.
   4. Select or create an **AI Gateway** to monitor and control your model usage.
-   5. **Name** your AutoRAG as `my-rag`
-   6. Select or create a **Service API** token to grant AutoRAG access to create and access resources in your account.
-3. Select Create to spin up your AutoRAG.
+   5. **Name** your AI Search as `my-rag`
+   6. Select or create a **Service API** token to grant AI Search access to create and access resources in your account.
+3. Select Create to spin up your AI Search.
 
-Once you’ve created your AutoRAG, it will automatically create a Vectorize database in your account and begin indexing the data.
-
-You can view the progress of your indexing job in the Overview page of your AutoRAG.
-
-
+Once you’ve created your AI Search, it will automatically create a Vectorize database in your account and begin indexing the data.
 
 ## Step 3. Test and add to your application
 
-Once AutoRAG finishes indexing your content, you’re ready to start asking it questions. You can open up your AutoRAG instance, navigate to the Playground tab, and ask a question based on your uploaded content, like “What is AutoRAG?”.
+Once AI Search finishes indexing your content, you’re ready to start asking it questions. You can open up your AI Search instance, navigate to the Playground tab, and ask a question based on your uploaded content, like “What is AI Search?”.
 
-Once you’re happy with the results in the Playground, you can integrate AutoRAG directly into the application that you are building. If you are using a Worker to build your [RAG application](/autorag/), then you can use the AI binding to directly call your AutoRAG:
+Once you’re happy with the results in the Playground, you can integrate AI Search directly into the application that you are building. If you are using a Worker to build your [RAG application](/ai-search/), then you can use the AI binding to directly call your AI Search:
 
 ```jsonc
 {
@@ -181,12 +183,12 @@ Once you’re happy with the results in the Playground, you can integrate AutoRA
 }
 ```
 
-Then, query your AutoRAG instance from your Worker code by calling the `aiSearch()` method.
+Then, query your AI Search instance from your Worker code by calling the `aiSearch()` method.
 
 ```javascript
 const answer = await env.AI.autorag("my-rag").aiSearch({
-  query: "What is AutoRAG?",
+  query: "What is AI Search?",
});
```
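The snippet above shows the call itself; wired into a Worker-style handler it might look like the following. This is a sketch under assumptions: `my-rag` is the instance name chosen in Step 2, the handler mirrors a standard module Worker's default export, and the query parameter name `q` is an arbitrary choice for illustration.

```javascript
// Minimal Worker-style handler; in a real Worker this object would be the
// module's default export, and `env.AI` comes from the AI binding shown above.
const worker = {
	async fetch(request, env) {
		const { searchParams } = new URL(request.url);
		const query = searchParams.get("q") ?? "What is AI Search?";
		// aiSearch() retrieves matching chunks and generates an answer;
		// the generated text is returned on the `response` property.
		const answer = await env.AI.autorag("my-rag").aiSearch({ query });
		return new Response(answer.response, {
			headers: { "content-type": "text/plain" },
		});
	},
};
```

During local development the binding proxies to your deployed instance, so the same code works under `wrangler dev`.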
 
-For more information on how to add AutoRAG into your application, go to your AutoRAG then navigate to Use AutoRAG for more instructions.
+For more information on how to add AI Search into your application, go to your AI Search then navigate to Use AI Search for more instructions.
@@ -17,18 +17,22 @@ import {
 	DashButton
 } from "~/components";
 
-This guide will instruct you through how to use the AutoRAG REST API to make a query to your AutoRAG.
+This guide will instruct you through how to use the AI Search REST API to make a query to your AI Search.
 
-## Prerequisite: Get AutoRAG API token
+:::note[AI Search is the new name for AutoRAG]
+API endpoints may still reference `autorag` for the time being. Functionality remains the same, and support for the new naming will be introduced gradually.
+:::
 
-You need an API token with the `AutoRAG - Read` and `AutoRAG Edit` permissions to use the REST API. To create a new token:
+## Prerequisite: Get AI Search API token
 
-1. In the Cloudflare dashboard, go to the **AutoRAG** page.
+You need an API token with the `AI Search - Read` and `AI Search Edit` permissions to use the REST API. To create a new token:
+
+1. In the Cloudflare dashboard, go to the **AI Search** page.
 
   <DashButton url="/?to=/:account/ai/autorag" />
 
-2. Select your AutoRAG.
-3. Select **Use AutoRAG** and then select **API**.
+2. Select your AI Search.
+3. Select **Use AI Search** and then select **API**.
 4. Select **Create an API Token**.
 5. Review the prefilled information then select **Create API Token**.
 6. Select **Copy API Token** and save that value for future use.
@@ -39,7 +43,7 @@ This REST API searches for relevant results from your data source and generates
 
 ```bash
-curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/autorag/rags/{AUTORAG_NAME}/ai-search \
+curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai-search/rags/{AUTORAG_NAME}/ai-search \
 	-H 'Content-Type: application/json' \
 	-H "Authorization: Bearer {API_TOKEN}" \
 	-d '{
@@ -63,13 +67,13 @@ You can get your `ACCOUNT_ID` by navigating to [Workers & Pages on the dashboard
 
 ### Parameters
 
-<Render file="ai-search-api-params" product="autorag" />
+<Render file="ai-search-api-params" product="ai-search" />
 
 ### Response
 
 This is the response structure without `stream` enabled.
 
-<Render file="ai-search-response" product="autorag" />
+<Render file="ai-search-response" product="ai-search" />
 
 ## Search
 
@@ -77,7 +81,7 @@ This REST API searches for results from your data source and returns the relevan
 
 ```bash
-curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/autorag/rags/{AUTORAG_NAME}/search \
+curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai-search/rags/{AUTORAG_NAME}/search \
 	-H 'Content-Type: application/json' \
 	-H "Authorization: Bearer {API_TOKEN}" \
 	-d '{
@@ -99,8 +103,8 @@ You can get your `ACCOUNT_ID` by navigating to Workers & Pages on the dashboard,
 
 ### Parameters
 
-<Render file="search-api-params" product="autorag" />
+<Render file="search-api-params" product="ai-search" />
 
 ### Response
 
-<Render file="search-response" product="autorag" />
+<Render file="search-response" product="ai-search" />
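Once the Search call returns, the matched chunks can be pulled out of the JSON body. The sample below is hypothetical: the `result.data` nesting and the `filename`/`score` fields are assumptions based on the response schema rendered above, trimmed to just the fields used here.

```javascript
// Hypothetical sample of a Search response body (shape assumed from the
// rendered response schema; not an exhaustive field list).
const body = {
	success: true,
	result: {
		data: [
			{ filename: "docs/pricing.md", score: 0.61 },
			{ filename: "docs/faq.md", score: 0.82 },
		],
	},
};

// List the source files of the matched chunks, best match first.
const sources = [...body.result.data]
	.sort((a, b) => b.score - a.score)
	.map((chunk) => chunk.filename);

console.log(sources); // → ["docs/faq.md", "docs/pricing.md"]
```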
@@ -20,7 +20,7 @@ import {
 
 Cloudflare’s serverless platform allows you to run code at the edge to build full-stack applications with [Workers](/workers/). A [binding](/workers/runtime-apis/bindings/) enables your Worker or Pages Function to interact with resources on the Cloudflare Developer Platform.
 
-To use your AutoRAG with Workers or Pages, create an AI binding either in the Cloudflare dashboard (refer to [AI bindings](/pages/functions/bindings/#workers-ai) for instructions), or you can update your [Wrangler file](/workers/wrangler/configuration/). To bind AutoRAG to your Worker, add the following to your Wrangler file:
+To use your AI Search with Workers or Pages, create an AI binding either in the Cloudflare dashboard (refer to [AI bindings](/pages/functions/bindings/#workers-ai) for instructions), or you can update your [Wrangler file](/workers/wrangler/configuration/). To bind AI Search to your Worker, add the following to your Wrangler file:
 
 <WranglerConfig>
 
@@ -31,9 +31,13 @@ binding = "AI" # i.e. available in your Worker on env.AI
 
 </WranglerConfig>
 
+:::note[AI Search is the new name for AutoRAG]
+API endpoints may still reference `autorag` for the time being. Functionality remains the same, and support for the new naming will be introduced gradually.
+:::
+
 ## `aiSearch()`
 
-This method searches for relevant results from your data source and generates a response using your default model and the retrieved context, for an AutoRAG named `my-autorag`:
+This method searches for relevant results from your data source and generates a response using your default model and the retrieved context, for an AI Search named `my-autorag`:
 
 ```js
 const answer = await env.AI.autorag("my-autorag").aiSearch({
@@ -50,7 +54,7 @@ const answer = await env.AI.autorag("my-autorag").aiSearch({
 
 ### Parameters
 
-<Render file="ai-search-api-params" product="autorag" />
+<Render file="ai-search-api-params" product="ai-search" />
 
 ### Response
 
@@ -103,7 +107,7 @@ This is the response structure without `stream` enabled.
 
 ## `search()`
 
-This method searches for results from your corpus and returns the relevant results, for the AutoRAG instance named `my-autorag`:
+This method searches for results from your corpus and returns the relevant results, for the AI Search instance named `my-autorag`:
 
 ```js
 const answer = await env.AI.autorag("my-autorag").search({
@@ -118,7 +122,7 @@ const answer = await env.AI.autorag("my-autorag").search({
 
 ### Parameters
 
-<Render file="search-api-params" product="autorag" />
+<Render file="search-api-params" product="ai-search" />
 
 ### Response
 
@@ -168,4 +172,4 @@ const answer = await env.AI.autorag("my-autorag").search({
 
 ## Local development
 
-Local development is supported by proxying requests to your deployed AutoRAG instance. When running in local mode, your application forwards queries to the configured remote AutoRAG instance and returns the generated responses as if they were served locally.
+Local development is supported by proxying requests to your deployed AI Search instance. When running in local mode, your application forwards queries to the configured remote AI Search instance and returns the generated responses as if they were served locally.
@@ -1,44 +0,0 @@
----
-pcx_content_type: concept
-title: How AutoRAG works
-sidebar:
-  order: 2
----
-
-AutoRAG sets up and manages your RAG pipeline for you. It connects the tools needed for indexing, retrieval, and generation, and keeps everything up to date by syncing with your data with the index regularly. Once set up, AutoRAG indexes your content in the background and responds to queries in real time.
-
-AutoRAG consists of two core processes:
-
-- **Indexing:** An asynchronous background process that monitors your data source for changes and converts your data into vectors for search.
-- **Querying:** A synchronous process triggered by user queries. It retrieves the most relevant content and generates context-aware responses.
-
-## How indexing works
-
-Indexing begins automatically when you create an AutoRAG instance and connect a data source.
-
-Here is what happens during indexing:
-
-1. **Data ingestion:** AutoRAG reads from your connected data source.
-2. **Markdown conversion:** AutoRAG uses [Workers AI’s Markdown Conversion](/workers-ai/features/markdown-conversion/) to convert [supported data types](/autorag/configuration/data-source/) into structured Markdown. This ensures consistency across diverse file types. For images, Workers AI is used to perform object detection followed by vision-to-language transformation to convert images into Markdown text.
-3. **Chunking:** The extracted text is [chunked](/autorag/configuration/chunking/) into smaller pieces to improve retrieval granularity.
-4. **Embedding:** Each chunk is embedded using Workers AI’s embedding model to transform the content into vectors.
-5. **Vector storage:** The resulting vectors, along with metadata like file name, are stored in a the [Vectorize](/vectorize/) database created on your Cloudflare account.
-
-After the initial data set is indexed, AutoRAG will regularly check for updates in your data source (e.g. additions, updates, or deletes) and index changes to ensure your vector database is up to date.
-
-
-## How querying works
-
-Once indexing is complete, AutoRAG is ready to respond to end-user queries in real time.
-
-Here is how the querying pipeline works:
-
-1. **Receive query from AutoRAG API:** The query workflow begins when you send a request to either the AutoRAG’s [AI Search](/autorag/usage/rest-api/#ai-search) or [Search](/autorag/usage/rest-api/#search) endpoints.
-2. **Query rewriting (optional):** AutoRAG provides the option to [rewrite the input query](/autorag/configuration/query-rewriting/) using one of Workers AI’s LLMs to improve retrieval quality by transforming the original query into a more effective search query.
-3. **Embedding the query:** The rewritten (or original) query is transformed into a vector via the same embedding model used to embed your data so that it can be compared against your vectorized data to find the most relevant matches.
-4. **Querying Vectorize index:** The query vector is [queried](/vectorize/best-practices/query-vectors/) against stored vectors in the associated Vectorize database for your AutoRAG.
-5. **Content retrieval:** Vectorize returns the metadata of the most relevant chunks, and the original content is retrieved from the R2 bucket. If you are using the Search endpoint, the content is returned at this point.
-6. **Response generation:** If you are using the AI Search endpoint, then a text-generation model from Workers AI is used to generate a response using the retrieved content and the original user’s query, combined via a [system prompt](/autorag/configuration/system-prompt/). The context-aware response from the model is returned.
-
-
@@ -1,13 +0,0 @@
----
-title: Data source
-pcx_content_type: how-to
-sidebar:
-  order: 2
----
-
-AutoRAG can directly ingest data from the following sources:
-
-| Data Source | Description |
-|---------------|-------------|
-| [Website](/autorag/configuration/data-source/website/) | Connect a domain you own to index website pages. |
-| [R2 Bucket](/autorag/configuration/data-source/r2/) | Connect a Cloudflare R2 bucket to index stored documents. |
@ -1,63 +0,0 @@
|
|||
---
|
||||
title: Website
|
||||
pcx_content_type: how-to
|
||||
sidebar:
|
||||
order: 2
|
||||
---
|
||||
|
||||
The Website data source allows you to connect a domain you own so its pages can be crawled, stored, and indexed.
|
||||
|
||||
:::note
|
||||
You can only crawl domains that you have onboarded onto the same Cloudflare account.
|
||||
|
||||
Refer to [Onboard a domain](/fundamentals/manage-domains/add-site/) for more information on adding a domain to your Cloudflare account.
|
||||
:::
|
||||
|
||||
:::caution[Bot protection may block crawling]
|
||||
If you use Cloudflare products that control or restrict bot traffic such as [Bot Management](/bots/), [WAF](/waf/), or [Turnstile](/turnstile/), the same rules will apply to the AutoRAG crawler. Make sure to configure an exception or an allow-list for the AutoRAG crawler in your settings.
|
||||
:::
|
||||
|
||||
## How website crawling works
|
||||
When you connect a domain, the crawler looks for your website’s sitemap to determine which pages to visit:
|
||||
|
||||
1. The crawler first checks the `robots.txt` for listed sitemaps. If it exists, it reads all sitemaps existing inside.
|
||||
2. If no `robots.txt` is found, the crawler first checks for a sitemap at `/sitemap.xml`.
|
||||
3. If no sitemap is available, the domain cannot be crawled.
|
||||
|
||||
Pages are visited, according to the `<priority>` attribute set on the sitemaps, if this field is defined.
|
||||
|
||||
## How to set WAF rules to allowlist AutoRAG crawler
|
||||
|
||||
If you have Security rules configured to block bot activity, you can add a rule to allowlist AutoRAG's crawler bot.
|
||||
|
||||
1. Log in to the [Cloudflare dashboard](https://dash.cloudflare.com/) and select your account and domain.
|
||||
2. Go to **Security** > **Security rules**.
|
||||
3. To create a new empty rule, select **Create rule** > **Custom rules**.
|
||||
4. Enter a descriptive name for the rule in **Rule name**, such as `Allow AutoRAG`.
|
||||
5. Under **When incoming requests match**, use the **Field** drop-down list to choose _Bot Detection ID_. For **Operator**, select _equals_. For **Value**, enter `122933950`.
|
||||
6. Under **Then take action**, in the **Choose action** dropdown, choose _Skip_.
|
||||
7. Under **Place at**, select the order of the rule in the **Select order** dropdown to be _First_. Setting the order as _First_ allows this rule to be applied before subsequent rules.
|
||||
8. To save and deploy your rule, select **Deploy**.
|
||||
|
||||
## Parsing options
|
||||
You can choose how pages are parsed during crawling:
|
||||
|
||||
- **Static sites**: Downloads the raw HTML for each page.
|
||||
- **Rendered sites**: Loads pages with a headless browser and downloads the fully rendered version, including dynamic JavaScript content. Note that the [Browser Rendering](/browser-rendering/platform/pricing/) limits and billing apply.
|
||||
|
||||
## Storage
|
||||
During setup, AutoRAG creates a dedicated R2 bucket in your account to store the pages that have been crawled and downloaded as HTML files. This bucket is automatically managed and is used only for content discovered by the crawler. Any files or objects that you add directly to this bucket will not be indexed.
|
||||
|
||||
:::note
|
||||
We recommend not modifying this bucket, as doing so may disrupt the indexing flow and prevent content from being updated properly.
|
||||
:::
|
||||
|
||||
## Sync and updates
|
||||
During scheduled or manual [sync jobs](/autorag/configuration/indexing/), the crawler will check for changes to the `<lastmod>` attribute in your sitemap. If it has been changed to a date after the last sync date, the page is crawled again, the updated version is stored in the R2 bucket, and the content is automatically reindexed so that your search results always reflect the latest content.
|
||||
|
||||
If the `<lastmod>` attribute is not defined, then AutoRAG will automatically crawl each link defined in the sitemap once a day.
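The re-crawl decision described above can be sketched roughly as follows (an illustration of the documented behavior, not AutoRAG's actual implementation):

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Decide whether a sitemap URL should be crawled again.
// lastmod: Date | undefined (from the sitemap); lastSync and lastCrawl: Date.
function shouldRecrawl(lastmod, lastSync, lastCrawl, now = new Date()) {
  if (lastmod !== undefined) {
    // <lastmod> present: re-crawl only if the page changed after the last sync.
    return lastmod.getTime() > lastSync.getTime();
  }
  // No <lastmod>: fall back to crawling the page roughly once a day.
  return now.getTime() - lastCrawl.getTime() >= DAY_MS;
}
```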
|
||||
|
||||
## Limits
|
||||
The regular AutoRAG [limits](/autorag/platform/limits-pricing/) apply when using the Website data source.
|
||||
|
||||
The crawler will download and index pages only up to the maximum object limit supported for an AutoRAG instance, and it processes the first set of pages it visits until that limit is reached. In addition, any files that are downloaded but exceed the file size limit will not be indexed.
|
||||
|
|
@ -1,35 +0,0 @@
|
|||
---
|
||||
pcx_content_type: navigation
|
||||
title: Configuration
|
||||
sidebar:
|
||||
order: 5
|
||||
---
|
||||
|
||||
import { MetaInfo, Type } from "~/components";
|
||||
|
||||
When creating an AutoRAG instance, you can customize how your RAG pipeline ingests, processes, and responds to data using a set of configuration options. Some settings can be updated after the instance is created, while others are fixed at creation time.
|
||||
|
||||
The table below lists all available configuration options:
|
||||
|
||||
| Configuration | Editable after creation | Description |
|
||||
| ---------------------------------------------------------------------------- | ----------------------- | ------------------------------------------------------------------------------------------ |
|
||||
| [Data source](/autorag/configuration/data-source/) | no | The source where your knowledge base is stored |
|
||||
| [Chunk size](/autorag/configuration/chunking/) | yes | Number of tokens per chunk |
|
||||
| [Chunk overlap](/autorag/configuration/chunking/) | yes | Number of overlapping tokens between chunks |
|
||||
| [Embedding model](/autorag/configuration/models/) | no | Model used to generate vector embeddings |
|
||||
| [Query rewrite](/autorag/configuration/query-rewriting/) | yes | Enable or disable query rewriting before retrieval |
|
||||
| [Query rewrite model](/autorag/configuration/models/) | yes | Model used for query rewriting |
|
||||
| [Query rewrite system prompt](/autorag/configuration/system-prompt/) | yes | Custom system prompt to guide query rewriting behavior |
|
||||
| [Match threshold](/autorag/configuration/retrieval-configuration/) | yes | Minimum similarity score required for a vector match |
|
||||
| [Maximum number of results](/autorag/configuration/retrieval-configuration/) | yes | Maximum number of vector matches returned (`top_k`) |
|
||||
| [Generation model](/autorag/configuration/models/) | yes | Model used to generate the final response |
|
||||
| [Generation system prompt](/autorag/configuration/system-prompt/) | yes | Custom system prompt to guide response generation |
|
||||
| [Similarity caching](/autorag/configuration/cache/) | yes | Enable or disable caching of responses for similar (not just exact) prompts |
|
||||
| [Similarity caching threshold](/autorag/configuration/cache/) | yes | Controls how similar a new prompt must be to a previous one to reuse its cached response |
|
||||
| [AI Gateway](/ai-gateway) | yes | AI Gateway for monitoring and controlling model usage |
|
||||
| AutoRAG name | no | Name of your AutoRAG instance |
|
||||
| Service API token | yes | API token granted to AutoRAG to give it permission to configure resources on your account. |
|
||||
|
||||
:::note[API token]
|
||||
The Service API token is different from the AutoRAG API token that you can create to interact with your AutoRAG. The Service API token is used only by AutoRAG to obtain permissions to configure resources on your account.
|
||||
:::
|
||||
|
|
@ -1,39 +0,0 @@
|
|||
---
|
||||
pcx_content_type: concept
|
||||
title: Models
|
||||
sidebar:
|
||||
order: 4
|
||||
---
|
||||
|
||||
AutoRAG uses models at multiple steps of the RAG pipeline. You can configure which models are used, or let AutoRAG automatically select defaults optimized for general use.
|
||||
|
||||
## Models used
|
||||
|
||||
AutoRAG leverages Workers AI models in the following stages:
|
||||
|
||||
- **Image to markdown conversion (if images are in data source)**: Converts image content to Markdown using object detection and captioning models.
|
||||
- **Embedding**: Transforms your documents and queries into vector representations for semantic search.
|
||||
- **Query rewriting (optional)**: Reformulates the user’s query to improve retrieval accuracy.
|
||||
- **Generation**: Produces the final response from retrieved context.
|
||||
|
||||
## Model providers
|
||||
|
||||
AutoRAG currently only supports [Workers AI](/workers-ai/) as the model provider. Usage of models through AutoRAG contributes to your Workers AI usage and is billed as part of your account.
|
||||
|
||||
If you have connected your project to [AI Gateway](/ai-gateway), all model calls triggered by AutoRAG can be tracked in AI Gateway. This gives you full visibility into inputs, outputs, latency, and usage patterns.
|
||||
|
||||
## Choosing a model
|
||||
|
||||
When configuring your AutoRAG instance, you can specify the exact model to use for each of the embedding, query rewriting, and generation steps. You can find the models available to AutoRAG in your instance's **Settings**.
|
||||
|
||||
:::note
|
||||
AutoRAG supports a subset of Workers AI models that have been selected to provide the best experience for RAG.
|
||||
:::
|
||||
|
||||
### Smart default
|
||||
|
||||
If you choose **Smart Default** in your model selection, then AutoRAG will select a Cloudflare recommended model. These defaults may change over time as Cloudflare evaluates and updates model choices. You can switch to explicit model configuration at any time by visiting **Settings**.
|
||||
|
||||
### Per-request generation model override
|
||||
|
||||
While the generation model can be set globally at the AutoRAG instance level, you can also override it on a per-request basis in the [AI Search API](/autorag/usage/rest-api/#ai-search). This is useful if your [RAG application](/autorag/) requires dynamic selection of generation models based on context or user preferences.
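A per-request override is passed as the `model` field in the request body. The sketch below builds such a request; the endpoint path follows the REST API documentation, while the account ID, instance name, API token, and model name are placeholders:

```javascript
// Build a request to the AI Search endpoint with a per-request model override.
function buildAiSearchRequest(accountId, ragName, query, model) {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/autorag/rags/${ragName}/ai-search`,
    init: {
      method: "POST",
      headers: {
        Authorization: "Bearer <API_TOKEN>",
        "Content-Type": "application/json",
      },
      // `model` overrides the instance-level generation model for this request only.
      body: JSON.stringify({ query, model }),
    },
  };
}

const req = buildAiSearchRequest(
  "<ACCOUNT_ID>",
  "my-autorag",
  "How do I configure chunk size?",
  "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
);
// Send with: await fetch(req.url, req.init)
```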
|
||||
|
|
@ -1,69 +0,0 @@
|
|||
---
|
||||
title: Getting started
|
||||
pcx_content_type: get-started
|
||||
sidebar:
|
||||
order: 2
|
||||
head:
|
||||
- tag: title
|
||||
content: Get started with AutoRAG
|
||||
description: Get started creating fully-managed, retrieval-augmented generation pipelines with Cloudflare AutoRAG.
|
||||
---
|
||||
|
||||
import { DashButton } from "~/components";
|
||||
|
||||
AutoRAG allows developers to create fully managed retrieval-augmented generation (RAG) pipelines to power AI applications with accurate and up-to-date information without needing to manage infrastructure.
|
||||
|
||||
## 1. Upload data or use existing data in R2
|
||||
|
||||
AutoRAG integrates with R2 for data import. Create an R2 bucket if you do not have one and upload your data.
|
||||
|
||||
:::note
|
||||
Before you create your first bucket, you must purchase R2 from the Cloudflare dashboard.
|
||||
:::
|
||||
|
||||
To create and upload objects to your bucket from the Cloudflare dashboard:
|
||||
|
||||
1. In the Cloudflare dashboard, go to the **R2** page.
|
||||
|
||||
<DashButton url="/?to=/:account/r2/overview" />
|
||||
|
||||
2. Select **Create bucket**, name the bucket, and select **Create bucket**.
|
||||
3. Choose to either drag and drop your file into the upload area or **select from computer**. Review the [file limits](/autorag/configuration/data-source/) when creating your knowledge base.
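If you prefer the command line, the same bucket setup can be done with Wrangler (the bucket and file names below are examples):

```sh
# Create a bucket and upload a file with Wrangler.
npx wrangler r2 bucket create my-autorag-data
npx wrangler r2 object put my-autorag-data/docs/changelog.xml --file ./changelog.xml
```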
|
||||
|
||||
_If you need inspiration for what document to use to make your first AutoRAG, try downloading and uploading the [RSS](/changelog/rss/index.xml) of the [Cloudflare Changelog](/changelog/)._
|
||||
|
||||
## 2. Create an AutoRAG
|
||||
|
||||
To create a new AutoRAG:
|
||||
|
||||
1. In the Cloudflare dashboard, go to the **AutoRAG** page.
|
||||
|
||||
<DashButton url="/?to=/:account/ai/autorag" />
|
||||
|
||||
2. Select **Create AutoRAG**, configure the AutoRAG, and complete the setup process.
|
||||
3. Select **Create**.
|
||||
|
||||
## 3. Monitor indexing
|
||||
|
||||
Once created, AutoRAG will create a Vectorize index in your account and begin indexing the data.
|
||||
|
||||
To monitor the indexing progress:
|
||||
|
||||
1. From the **AutoRAG** page in the dashboard, locate and select your AutoRAG.
|
||||
2. Navigate to the **Overview** page to view the current indexing status.
|
||||
|
||||
## 4. Try it out
|
||||
|
||||
Once indexing is complete, you can run your first query:
|
||||
|
||||
1. From the **AutoRAG** page in the dashboard, locate and select your AutoRAG.
|
||||
2. Navigate to the **Playground** tab.
|
||||
3. Select **Search with AI** or **Search**.
|
||||
4. Enter a **query** to test out its response.
|
||||
|
||||
## 5. Add to your application
|
||||
|
||||
There are multiple ways you can create [RAG applications](/autorag/) with Cloudflare AutoRAG:
|
||||
|
||||
- [Workers Binding](/autorag/usage/workers-binding/)
|
||||
- [REST API](/autorag/usage/rest-api/)
|
||||
|
|
@ -1,18 +0,0 @@
|
|||
---
|
||||
pcx_content_type: release-notes
|
||||
title: Release note
|
||||
release_notes_file_name:
|
||||
- autorag
|
||||
sidebar:
|
||||
order: 8
|
||||
head: []
|
||||
description: Review recent changes to Cloudflare AutoRAG.
|
||||
---
|
||||
|
||||
import { ProductReleaseNotes } from "~/components";
|
||||
|
||||
This release notes section covers regular updates and minor fixes. For major feature releases or significant updates, see the [changelog](/changelog).
|
||||
|
||||
{/* <!-- Actual content lives in /src/content/release-notes/autorag.yaml. Update the file there for new entries to appear here. For more details, refer to https://developers.cloudflare.com/style-guide/documentation-content-strategy/content-types/changelog/#yaml-file --> */}
|
||||
|
||||
<ProductReleaseNotes />
|
||||
|
|
@ -4,7 +4,7 @@ pcx_content_type: reference-architecture-diagram
|
|||
tags:
|
||||
- AI
|
||||
products:
|
||||
- AutoRAG
|
||||
- AI Search
|
||||
- Workers AI
|
||||
- Workers
|
||||
- Queues
|
||||
|
|
@ -26,7 +26,7 @@ Examples for application of these technique includes for instance customer servi
|
|||
In the context of Retrieval-Augmented Generation (RAG), knowledge seeding involves incorporating external information from pre-existing sources into the generative process, while querying refers to the mechanism of retrieving relevant knowledge from these sources to inform the generation of coherent and contextually accurate text. Both are shown below.
|
||||
|
||||
:::note[Looking for a managed option?]
|
||||
[AutoRAG](/autorag) offers a fully managed way to build RAG pipelines on Cloudflare, handling ingestion, indexing, and querying out of the box. [Get started with AutoRAG](/autorag/get-started/).
|
||||
[AI Search](/ai-search/) offers a fully managed way to build RAG pipelines on Cloudflare, handling ingestion, indexing, and querying out of the box. [Get started with AI Search](/ai-search/get-started/).
|
||||
:::
|
||||
|
||||
## Knowledge Seeding
|
||||
|
|
@ -54,6 +54,6 @@ In the context of Retrieval-Augmented Generation (RAG), knowledge seeding involv
|
|||
## Related resources
|
||||
|
||||
- [Tutorial: Build a RAG AI](/workers-ai/guides/tutorials/build-a-retrieval-augmented-generation-ai/)
|
||||
- [Get started with AutoRAG](/autorag/get-started/)
|
||||
- [Get started with AI Search](/ai-search/get-started/)
|
||||
- [Workers AI: Text embedding models](/workers-ai/models/)
|
||||
- [Workers AI: Text generation models](/workers-ai/models/)
|
||||
|
|
|
|||
|
|
@ -52,9 +52,9 @@ Learn how to use Vectorize to generate vector embeddings using Workers AI.
|
|||
|
||||
</Feature>
|
||||
|
||||
<Feature header="Search using Vectorize and AutoRAG" href="/autorag/" cta="Build a RAG with Vectorize">
|
||||
<Feature header="Search using Vectorize and AI Search" href="/ai-search/" cta="Build a RAG with Vectorize">
|
||||
|
||||
Learn how to automatically index your data and store it in Vectorize, then query it to generate context-aware responses using AutoRAG.
|
||||
Learn how to automatically index your data and store it in Vectorize, then query it to generate context-aware responses using AI Search.
|
||||
|
||||
</Feature>
|
||||
|
||||
|
|
|
|||
|
|
@ -52,7 +52,7 @@ When a user initiates a prompt, instead of passing it (without additional contex
|
|||
3. These vectors are used to look up the content they relate to (if not embedded directly alongside the vectors as metadata).
|
||||
4. This content is provided as context alongside the original user prompt, providing additional context to the LLM and allowing it to return an answer that is likely to be far more contextual than the standalone prompt.
|
||||
|
||||
[Create a RAG application today with AutoRAG](/autorag/) to deploy a fully managed RAG pipeline in just a few clicks. AutoRAG automatically sets up Vectorize, handles continuous indexing, and serves responses through a single API.
|
||||
[Create a RAG application today with AI Search](/ai-search/) to deploy a fully managed RAG pipeline in just a few clicks. AI Search automatically sets up Vectorize, handles continuous indexing, and serves responses through a single API.
|
||||
|
||||
<sup>1</sup> You can learn more about the theory behind RAG by reading the [RAG
|
||||
paper](https://arxiv.org/abs/2005.11401).
|
||||
|
|
|
|||
|
|
@ -21,7 +21,7 @@ import { Details, Render, PackageManagers, WranglerConfig } from "~/components";
|
|||
This guide will instruct you through setting up and deploying your first application with Cloudflare AI. You will build a fully-featured AI-powered application, using tools like Workers AI, Vectorize, D1, and Cloudflare Workers.
|
||||
|
||||
:::note[Looking for a managed option?]
|
||||
[AutoRAG](/autorag) offers a fully managed way to build RAG pipelines on Cloudflare, handling ingestion, indexing, and querying out of the box. [Get started](/autorag/get-started/).
|
||||
[AI Search](/ai-search/) offers a fully managed way to build RAG pipelines on Cloudflare, handling ingestion, indexing, and querying out of the box. [Get started](/ai-search/get-started/).
|
||||
:::
|
||||
|
||||
At the end of this tutorial, you will have built an AI tool that allows you to store information and query it using a Large Language Model. This pattern, known as Retrieval Augmented Generation, or RAG, is a useful project you can build by combining multiple aspects of Cloudflare's AI toolkit. You do not need to have experience working with AI tools to build this application.
|
||||
|
|
|
|||
|
|
@ -10,7 +10,7 @@ The input query.
|
|||
|
||||
`model` <Type text="string" /> <MetaInfo text="optional" />
|
||||
|
||||
The text-generation model that is used to generate the response for the query. For a list of valid options, check the AutoRAG Generation model Settings. Defaults to the generation model selected in the AutoRAG Settings.
|
||||
The text-generation model that is used to generate the response for the query. For a list of valid options, check the AI Search Generation model Settings. Defaults to the generation model selected in the AI Search Settings.
|
||||
|
||||
`rewrite_query` <Type text="boolean" /> <MetaInfo text="optional" />
|
||||
|
||||
|
|
@ -33,4 +33,4 @@ Returns a stream of results as they are available. Defaults to `false`.
|
|||
|
||||
`filters` <Type text="object" /> <MetaInfo text="optional" />
|
||||
|
||||
Narrow down search results based on metadata, like folder and date, so only relevant content is retrieved. For more details, refer to [Metadata filtering](/autorag/configuration/metadata/).
|
||||
Narrow down search results based on metadata, like folder and date, so only relevant content is retrieved. For more details, refer to [Metadata filtering](/ai-search/configuration/metadata/).
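For example, a `filters` object restricting results to a single folder might look like the following (illustrative only; refer to the Metadata filtering page for the exact schema and supported comparison types):

```json
{
  "filters": {
    "type": "eq",
    "key": "folder",
    "value": "docs/"
  }
}
```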
|
||||
|
|
@ -25,4 +25,4 @@ Configurations for customizing result ranking. Defaults to `{}`.
|
|||
|
||||
`filters` <Type text="object" /> <MetaInfo text="optional" />
|
||||
|
||||
Narrow down search results based on metadata, like folder and date, so only relevant content is retrieved. For more details, refer to [Metadata filtering](/autorag/configuration/metadata).
|
||||
Narrow down search results based on metadata, like folder and date, so only relevant content is retrieved. For more details, refer to [Metadata filtering](/ai-search/configuration/metadata).
|
||||
|
|
@ -1,14 +1,14 @@
|
|||
name: AutoRAG
|
||||
name: AI Search
|
||||
|
||||
product:
|
||||
title: AutoRAG
|
||||
url: /autorag/
|
||||
title: AI Search
|
||||
url: /ai-search/
|
||||
group: Developer platform
|
||||
additional_groups: [AI]
|
||||
tags: [AI]
|
||||
|
||||
meta:
|
||||
title: Cloudflare AutoRAG docs
|
||||
title: Cloudflare AI Search docs
|
||||
description: Create fully managed RAG pipelines for your AI applications
|
||||
author: "@cloudflare"
|
||||
|
||||
|
|
@ -1,16 +1,16 @@
|
|||
---
|
||||
link: "/autorag/platform/release-note/"
|
||||
productName: AutoRAG
|
||||
productLink: "/autorag/"
|
||||
link: "/ai-search/platform/release-note/"
|
||||
productName: AI Search
|
||||
productLink: "/ai-search/"
|
||||
entries:
|
||||
- publish_date: "2025-08-20"
|
||||
title: Increased maximum query results to 50
|
||||
description: |-
|
||||
The maximum number of results returned from a query has been increased from **20** to **50**. This allows you to surface more relevant matches in a single request.
|
||||
The maximum number of results returned from a query has been increased from **20** to **50**. This allows you to surface more relevant matches in a single request.
|
||||
- publish_date: "2025-07-16"
|
||||
title: Deleted files now removed from index on next sync
|
||||
description: |-
|
||||
When a file is deleted from your R2 bucket, its corresponding chunks are now automatically removed from the Vectorize index linked to your AutoRAG instance during the next sync.
|
||||
When a file is deleted from your R2 bucket, its corresponding chunks are now automatically removed from the Vectorize index linked to your AI Search instance during the next sync.
|
||||
- publish_date: "2025-07-08"
|
||||
title: Reduced cooldown between syncs
|
||||
description: |-
|
||||
|
|
@ -24,13 +24,13 @@ entries:
|
|||
description: |-
|
||||
The dashboard now includes a new “Processing” step for the indexing pipeline that displays the files currently being processed.
|
||||
- publish_date: "2025-06-12"
|
||||
title: Sync AutoRAG REST API published
|
||||
title: Sync AI Search REST API published
|
||||
description: |-
|
||||
You can now trigger a sync job for an AutoRAG using the [Sync REST API](/api/resources/autorag/subresources/rags/methods/sync/). This scans your data source for changes and queues updated or previously errored files for indexing.
|
||||
You can now trigger a sync job for an AI Search using the [Sync REST API](/api/resources/ai-search/subresources/rags/methods/sync/). This scans your data source for changes and queues updated or previously errored files for indexing.
|
||||
- publish_date: "2025-06-10"
|
||||
title: Files modified in the data source will now be updated
|
||||
description: |-
|
||||
Files modified in your source R2 bucket will now be updated in the AutoRAG index during the next sync. For example, if you upload a new version of an existing file, the changes will be reflected in the index after the subsequent sync job.
|
||||
Files modified in your source R2 bucket will now be updated in the AI Search index during the next sync. For example, if you upload a new version of an existing file, the changes will be reflected in the index after the subsequent sync job.
|
||||
Please note that deleted files are not yet removed from the index. We are actively working on this functionality.
|
||||
- publish_date: "2025-05-31"
|
||||
title: Errored files will now be retried in next sync
|
||||
|
|
@ -43,12 +43,12 @@ entries:
|
|||
- publish_date: "2025-05-25"
|
||||
title: EU jurisdiction R2 buckets now supported
|
||||
description: |-
|
||||
AutoRAG now supports R2 buckets configured with European Union (EU) jurisdiction restrictions. Previously, files in EU-restricted R2 buckets would not index when linked. This issue has been resolved, and all EU-restricted R2 buckets should now function as expected.
|
||||
AI Search now supports R2 buckets configured with European Union (EU) jurisdiction restrictions. Previously, files in EU-restricted R2 buckets would not index when linked. This issue has been resolved, and all EU-restricted R2 buckets should now function as expected.
|
||||
- publish_date: "2025-04-23"
|
||||
title: Response streaming in AutoRAG binding added
|
||||
title: Response streaming in AI Search binding added
|
||||
description: |-
|
||||
AutoRAG now supports response streaming in the `AI Search` method of the [Workers binding](/autorag/usage/workers-binding/), allowing you to stream results as they’re retrieved by setting `stream: true`.
|
||||
AI Search now supports response streaming in the `AI Search` method of the [Workers binding](/ai-search/usage/workers-binding/), allowing you to stream results as they’re retrieved by setting `stream: true`.
|
||||
- publish_date: "2025-04-07"
|
||||
title: AutoRAG is now in open beta!
|
||||
title: AI Search is now in open beta!
|
||||
description: |-
|
||||
AutoRAG allows developers to create fully-managed retrieval-augmented generation (RAG) pipelines powered by Cloudflare allowing developers to integrate context-aware AI into their applications without managing infrastructure. Get started today on the [Cloudflare Dashboard](https://dash.cloudflare.com/?to=/:account/ai/autorag).
|
||||
AI Search allows developers to create fully-managed retrieval-augmented generation (RAG) pipelines powered by Cloudflare allowing developers to integrate context-aware AI into their applications without managing infrastructure. Get started today on the [Cloudflare Dashboard](https://dash.cloudflare.com/?to=/:account/ai/autorag).
|
||||
|
|
@ -31,7 +31,7 @@ entries:
|
|||
- publish_date: "2025-09-05"
|
||||
title: Introducing EmbeddingGemma from Google
|
||||
description: |-
|
||||
- We’re excited to be a launch partner alongside Google to bring their newest embedding model to Workers AI. We're excited to introduce EmbeddingGemma delivers best-in-class performance for its size, enabling RAG and semantic search use cases. Take a look at [`@cf/google/embeddinggemma-300m`](/workers-ai/models/embeddinggemma-300m) for more details. Now available to use for embedding in AutoRAG too.
|
||||
- We’re excited to be a launch partner alongside Google to bring their newest embedding model to Workers AI. EmbeddingGemma delivers best-in-class performance for its size, enabling RAG and semantic search use cases. Take a look at [`@cf/google/embeddinggemma-300m`](/workers-ai/models/embeddinggemma-300m) for more details. Now available for embedding in AI Search too.
|
||||
- publish_date: "2025-08-27"
|
||||
title: Introducing Partner models to the Workers AI catalog
|
||||
description: |-
|
||||
|
|
|
|||
|
Before Width: | Height: | Size: 808 B After Width: | Height: | Size: 808 B |
|
|
@ -72,7 +72,7 @@ const topCards = [
|
|||
{
|
||||
title: "AI Products",
|
||||
links: [
|
||||
{ text: "AutoRAG", href: "/autorag/" },
|
||||
{ text: "AI Search", href: "/ai-search/" },
|
||||
{ text: "Workers AI", href: "/workers-ai/" },
|
||||
{ text: "Vectorize", href: "/vectorize/" },
|
||||
{ text: "AI Gateway", href: "/ai-gateway/" },
|
||||
|
|
|
|||