Search

Every file processed by Roset produces a searchable-index variant (plain text) and chunked embeddings (vector). These power three search modes:

| Mode | How it works | Best for |
| --- | --- | --- |
| text | Postgres full-text search on the searchable-index variant | Exact keyword matching |
| vector | Cosine similarity via Cloudflare Vectorize on chunked embeddings | Semantic/conceptual search |
| hybrid | Runs both in parallel, merges via Reciprocal Rank Fusion | Best overall relevance (default) |

Quick Start

```python
import os
from roset import Client

client = Client(api_key=os.getenv("ROSET_API_KEY"))

# Hybrid search (default)
results = client.search.query(query="payment terms")

for r in results["results"]:
    print(f"{r['fileId']} -- score: {r['score']}")
    if r.get("snippet"):
        print(f"  {r['snippet']}")
```

Search Modes

Text mode (mode="text") uses Postgres tsvector full-text search on the searchable-index variant. Results are ranked by ts_rank and include highlighted snippets.

  • Works immediately after file processing completes
  • Best for exact keyword and phrase matching
  • Supports pagination via offset
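Paging through text-mode results can be sketched as follows. This is a minimal sketch, assuming the Python client accepts `mode`, `limit`, and `offset` as keyword arguments that mirror the documented parameters, and that `total` in the response counts all matches. `iter_text_results` is a hypothetical helper, not part of the SDK.

```python
def iter_text_results(client, query, page_size=20):
    """Yield every text-mode result for `query`, one page at a time.

    Hypothetical helper: assumes client.search.query accepts mode/limit/offset
    keyword arguments and returns a dict with "results" and (optionally) "total".
    """
    offset = 0
    while True:
        page = client.search.query(
            query=query, mode="text", limit=page_size, offset=offset
        )
        results = page["results"]
        if not results:
            break  # empty page: nothing left to fetch
        yield from results
        offset += len(results)
        total = page.get("total")
        if total is not None and offset >= total:
            break  # reached the reported total
```

With a configured client, `for r in iter_text_results(client, "payment terms"): ...` would walk all matches without holding every page in memory at once.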

Vector mode (mode="vector") embeds your query using OpenAI text-embedding-3-small, then queries Cloudflare Vectorize for the closest chunk vectors.

  • Uses OpenAI embeddings (managed by default, or your own key via /v1/org/provider-keys)
  • Best for semantic queries where exact words may not appear in the document
  • Results include the matched chunk text and chunk index
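To make "closest chunk vectors" concrete, here is a small sketch of cosine-similarity ranking. Vectorize performs this lookup server-side; the three-dimensional vectors below are made-up toy values rather than real embeddings, and `rank_chunks` is a hypothetical name used only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunks):
    """Return (chunk_index, similarity) pairs, most similar first.

    `chunks` is a list of (chunk_index, vector) pairs.
    """
    scored = [(idx, cosine_similarity(query_vec, vec)) for idx, vec in chunks]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: chunk 0 points the same way as the query, chunk 2 is orthogonal.
chunks = [(0, [1.0, 0.0, 0.0]), (1, [0.6, 0.8, 0.0]), (2, [0.0, 0.0, 1.0])]
ranked = rank_chunks([1.0, 0.0, 0.0], chunks)
```

Because cosine similarity depends only on direction, a chunk can rank highly even when it shares no exact words with the query, as long as its embedding points in a similar direction.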

Hybrid mode (mode="hybrid") runs text and vector search in parallel, then merges the two result lists using Reciprocal Rank Fusion (RRF, k=60). This generally gives better relevance than either mode alone.

  • Uses the managed OpenAI key by default; falls back to text-only search if neither a managed key nor a bring-your-own key (BYOK) is available
  • Default mode when mode is omitted
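The merge step can be illustrated with the standard RRF formula, in which each item scores the sum of 1 / (k + rank) over every list it appears in. This is a sketch of the general technique with k=60, not Roset's actual implementation; details such as tie-breaking or how chunk hits map to files are not described here, and the IDs are placeholders.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first ranked lists of IDs into one scored list.

    Each appearance of an ID at 1-based position `rank` adds 1 / (k + rank).
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Placeholder IDs: "b" ranks well in both lists, so it ends up first.
text_hits = ["a", "b", "c"]
vector_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([text_hits, vector_hits])
```

RRF uses only rank positions, so the ts_rank scores from text search and the cosine similarities from vector search never need to be put on a common scale.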

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | string | required | The search query |
| mode | string | "hybrid" | "text", "vector", or "hybrid" |
| space | string | all spaces | Scope search to a specific space |
| limit | number | 20 | Max results (up to 100) |
| offset | number | 0 | Pagination offset (text mode only) |
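The constraints above can be checked client-side before a request is sent. The helper below is a hypothetical sketch, not part of the SDK. How the API itself treats out-of-range values (rejecting or clamping them) is not stated here, so rejecting them locally is an assumption, as is requiring `limit` to be at least 1.

```python
VALID_MODES = {"text", "vector", "hybrid"}
MAX_LIMIT = 100  # documented upper bound for `limit`


def build_search_params(query, mode="hybrid", space=None, limit=20, offset=0):
    """Validate search arguments and return them as a dict (hypothetical helper)."""
    if not query:
        raise ValueError("query is required")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if offset and mode != "text":
        raise ValueError("offset is only supported in text mode")

    params = {"query": query, "mode": mode, "limit": limit}
    if space is not None:
        params["space"] = space  # omit to search all spaces
    if mode == "text":
        params["offset"] = offset
    return params
```

If the Python client accepts these names as keyword arguments, the result could be passed along as `client.search.query(**build_search_params("payment terms", mode="text", offset=20))`.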

Response

```json
{
  "results": [
    {
      "fileId": "abc-123",
      "score": 0.034,
      "snippet": "The <b>payment</b> <b>terms</b> are net 30 days..."
    }
  ],
  "total": 42,
  "query": "payment terms",
  "mode": "hybrid"
}
```
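Snippets in the sample response mark matched terms with `<b>` tags, which is convenient for HTML but noisy in a terminal or log. The sketch below strips them for plain-text display. It assumes `<b>`/`</b>` are the only markup present, as in the sample above, and `plain_snippet` is a hypothetical helper name.

```python
import re

# Matches the opening and closing highlight tags seen in the sample response.
_HIGHLIGHT_TAG = re.compile(r"</?b>")


def plain_snippet(result):
    """Return the result's snippet without <b> highlight tags, or None if absent."""
    snippet = result.get("snippet")
    if snippet is None:
        return None
    return _HIGHLIGHT_TAG.sub("", snippet)


sample = {
    "fileId": "abc-123",
    "score": 0.034,
    "snippet": "The <b>payment</b> <b>terms</b> are net 30 days...",
}
text = plain_snippet(sample)
```

When rendering in a web page instead, the tags can be left in place so the matched terms stay highlighted.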
Note

Vector and hybrid modes use OpenAI embeddings. Roset provides a managed key by default. To use your own key instead, configure it in the console or via PUT /v1/org/provider-keys.

Next Steps

  • Q&A -- ask questions about your files and get answers with citations.
  • Webhooks -- get notified when new variants are ready for search.
  • Build a Knowledge Base -- end-to-end search + Q&A workflow.