Search
Every file processed by Roset produces a searchable-index variant (plain text) and chunked embeddings (vector). These power three search modes:
| Mode | How it works | Best for |
|---|---|---|
| `text` | Postgres full-text search on the searchable-index variant | Exact keyword matching |
| `vector` | Cosine similarity via Cloudflare Vectorize on chunked embeddings | Semantic/conceptual search |
| `hybrid` | Runs both in parallel, merges via Reciprocal Rank Fusion | Best overall relevance (default) |
Quick Start
```python
import os
from roset import Client

client = Client(api_key=os.getenv("ROSET_API_KEY"))

# Hybrid search (default)
results = client.search.query(query="payment terms")
for r in results["results"]:
    print(f"{r['fileId']} -- score: {r['score']}")
    if r.get("snippet"):
        print(f"  {r['snippet']}")
```
Search Modes
Text Search
Uses Postgres `tsvector` full-text search on the searchable-index variant. Returns results ranked by `ts_rank`, with highlighted snippets.
- Works immediately after file processing completes
- Best for exact keyword and phrase matching
- Supports pagination via `offset`
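As a rough illustration of `offset` pagination in text mode, the helper below walks through pages until the results run out. It is a sketch, not part of the Roset SDK: `iter_text_results` is a hypothetical name, and it assumes the search call accepts `mode`, `limit`, and `offset` as keyword arguments (the parameters listed under Parameters) and returns the `results` and `total` fields shown under Response.

```python
from typing import Any, Callable, Dict, Iterator


def iter_text_results(
    search: Callable[..., Dict[str, Any]],
    query: str,
    page_size: int = 20,
) -> Iterator[Dict[str, Any]]:
    """Yield every text-mode hit for `query`, one page at a time.

    `search` is any callable taking the assumed keyword arguments
    (query, mode, limit, offset), for example `client.search.query`.
    """
    offset = 0
    while True:
        page = search(query=query, mode="text", limit=page_size, offset=offset)
        hits = page.get("results", [])
        if not hits:
            return  # an empty page means there is nothing left
        yield from hits
        offset += len(hits)
        # Stop once the reported total has been reached, if one is given.
        if offset >= page.get("total", float("inf")):
            return
```

With the Quick Start client this might be called as `for hit in iter_text_results(client.search.query, "payment terms"): ...`, provided the client accepts those keyword arguments.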
Vector Search
Embeds your query using OpenAI `text-embedding-3-small`, then queries Cloudflare Vectorize for the closest chunk vectors.
- Uses OpenAI embeddings (managed by default, or your own key via `/v1/org/provider-keys`)
- Best for semantic queries where exact words may not appear in the document
- Results include the matched chunk text and chunk index
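To make "closest chunk vectors" concrete, here is a minimal pure-Python sketch of cosine similarity ranking. It only illustrates the metric: the real lookup happens inside Cloudflare Vectorize, and the toy 3-dimensional vectors and the `rank_chunks` helper are made up for this example (real embeddings have far more dimensions).

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_chunks(
    query_vec: Sequence[float], chunks: Dict[int, Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (chunk_index, similarity) pairs, most similar first."""
    scored = [(idx, cosine_similarity(query_vec, vec)) for idx, vec in chunks.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: three chunk vectors keyed by chunk index.
chunks = {0: [1.0, 0.0, 0.0], 1: [0.7, 0.7, 0.0], 2: [0.0, 0.0, 1.0]}
ranking = rank_chunks([1.0, 0.1, 0.0], chunks)
for chunk_index, similarity in ranking:
    print(f"chunk {chunk_index}: {similarity:.3f}")
```

Because cosine similarity depends only on the angle between vectors, a short query and a long chunk can still score highly when they point in the same direction.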
Hybrid Search
Runs text and vector search in parallel, then merges the two result lists using Reciprocal Rank Fusion (RRF, k=60). This typically gives better relevance than either mode alone.
- Uses the managed OpenAI key by default; falls back to text-only search if neither a managed key nor a bring-your-own-key (BYOK) key is available
- Default mode when `mode` is omitted
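A minimal sketch of Reciprocal Rank Fusion with k=60, to show how two ranked lists might be merged. Each file scores the sum of 1 / (k + rank) over the lists it appears in, with ranks starting at 1. The function name and the sample file IDs are illustrative, and Roset's server-side merge may differ in details such as tie-breaking.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple


def reciprocal_rank_fusion(
    rankings: Iterable[Sequence[str]], k: int = 60
) -> List[Tuple[str, float]]:
    """Merge ranked lists of IDs; each list contributes 1 / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, file_id in enumerate(ranking, start=1):
            scores[file_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


text_hits = ["file-a", "file-b", "file-c"]    # ranked by text search
vector_hits = ["file-b", "file-d", "file-a"]  # ranked by vector search
merged = reciprocal_rank_fusion([text_hits, vector_hits])
for file_id, score in merged:
    print(f"{file_id}: {score:.4f}")
```

Files that appear in both lists (`file-a`, `file-b`) end up ahead of files found by only one mode, which is the main effect of the fusion. Only ranks are used, so the raw `ts_rank` and cosine scores never need to be put on a common scale.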
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `query` | string | required | The search query |
| `mode` | string | `"hybrid"` | `"text"`, `"vector"`, or `"hybrid"` |
| `space` | string | all spaces | Scope search to a specific space |
| `limit` | number | 20 | Max results (up to 100) |
| `offset` | number | 0 | Pagination offset (text mode only) |
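The constraints in the table can be checked client-side before a request is sent. The sketch below is hypothetical: `build_search_params` is not part of the Roset SDK, the lower bound on `limit` is an assumption, and since this page does not say whether the API rejects or ignores an `offset` outside text mode, the helper simply refuses it.

```python
from typing import Any, Dict, Optional

VALID_MODES = ("text", "vector", "hybrid")
MAX_LIMIT = 100  # "up to 100" per the parameter table


def build_search_params(
    query: str,
    mode: str = "hybrid",
    space: Optional[str] = None,
    limit: int = 20,
    offset: int = 0,
) -> Dict[str, Any]:
    """Validate search arguments and return them as a dict of parameters."""
    if not query:
        raise ValueError("query is required")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")
    # A lower bound of 1 is an assumption; the table only gives the maximum.
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if offset and mode != "text":
        raise ValueError("offset is only supported in text mode")
    params: Dict[str, Any] = {"query": query, "mode": mode, "limit": limit}
    if space is not None:
        params["space"] = space  # omit to search all spaces
    if mode == "text":
        params["offset"] = offset
    return params
```

The returned dict could then be unpacked into the search call, for example `client.search.query(**build_search_params("payment terms", mode="text", offset=20))`, assuming the client accepts these keyword arguments.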
Response
```json
{
  "results": [
    {
      "fileId": "abc-123",
      "score": 0.034,
      "snippet": "The <b>payment</b> <b>terms</b> are net 30 days..."
    }
  ],
  "total": 42,
  "query": "payment terms",
  "mode": "hybrid"
}
```
Note
Vector and hybrid modes use OpenAI embeddings. Roset provides a managed key by default. You can optionally configure your own via the console or `PUT /v1/org/provider-keys`.
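Snippets in the response carry `<b>` highlight tags. Where results are shown in a context that does not render HTML, the tags can be stripped first. This is a small sketch that assumes `<b>` and `</b>` are the only tags in a snippet (they are the only ones in the example response); `plain_snippet` is an illustrative helper, not an SDK function.

```python
import json
import re

_HIGHLIGHT_TAG = re.compile(r"</?b>")


def plain_snippet(result: dict) -> str:
    """Return the result's snippet without <b> highlight tags ('' if absent)."""
    return _HIGHLIGHT_TAG.sub("", result.get("snippet") or "")


# The example response from the Response section, parsed from JSON.
response = json.loads("""
{
  "results": [
    {"fileId": "abc-123", "score": 0.034,
     "snippet": "The <b>payment</b> <b>terms</b> are net 30 days..."}
  ],
  "total": 42,
  "query": "payment terms",
  "mode": "hybrid"
}
""")

for r in response["results"]:
    print(r["fileId"], plain_snippet(r))
```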
Next Steps
- Q&A -- ask questions about your files and get answers with citations.
- Webhooks -- get notified when new variants are ready for search.
- Build a Knowledge Base -- end-to-end search + Q&A workflow.