Quickstart
Upload a document, let Roset orchestrate extraction, and retrieve structured markdown -- all in under 5 minutes. This guide walks through the complete flow: upload, wait for processing, and retrieve variants.
Prerequisites
Before you start, make sure you have:
- A Roset account at console.roset.dev
- An API key (starts with rsk_) from Settings > API Keys
- A Reducto API key for document extraction, added in Settings > Provider Keys
- Node.js 18+ (for the TypeScript SDK)
Step 1: Install the SDK
npm install @roset/sdk
Step 2: Upload a File
Upload a document to Roset. The API creates a file record and a processing job automatically. Roset routes the file to the appropriate extraction provider based on content type.
Step 3: Wait for Processing
Roset processes files asynchronously. The processing job moves through a state machine: queued -> processing -> completed or failed. Poll the file status until it reaches a terminal state.
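The polling loop can be sketched without committing to a specific SDK call. In the minimal example below, `JobStatus`, `PollOptions`, `pollUntilTerminal`, and the `getStatus` callback are illustrative names and are not taken from @roset/sdk; `getStatus` stands in for whatever request returns the file's current status. Only the four state names come from this guide.

```typescript
// Job states named in this guide; "completed" and "failed" are terminal.
type JobStatus = "queued" | "processing" | "completed" | "failed";

function isTerminal(status: JobStatus): boolean {
  return status === "completed" || status === "failed";
}

// Hypothetical options for this sketch (not an SDK type).
interface PollOptions {
  intervalMs: number;  // delay between status checks
  maxAttempts: number; // give up after this many checks
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// getStatus is a placeholder for the call that fetches the file's status.
async function pollUntilTerminal(
  getStatus: () => Promise<JobStatus>,
  options: PollOptions,
): Promise<JobStatus> {
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    const status = await getStatus();
    if (isTerminal(status)) return status;
    // Wait before the next check, but not after the final attempt.
    if (attempt < options.maxAttempts) await sleep(options.intervalMs);
  }
  throw new Error(`Job did not finish after ${options.maxAttempts} status checks`);
}
```

Capping the number of attempts keeps a stuck job from blocking the caller indefinitely; a fixed interval is the simplest choice, and a backoff schedule could be substituted if status checks are rate limited.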
For production use, register a webhook instead of polling. Roset will POST to your endpoint when processing completes.
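On the receiving side, a webhook endpoint usually validates the request body before acting on it. The sketch below assumes a JSON payload with `fileId` and `status` fields; that shape, the `ProcessingEvent` type, and the `parseProcessingEvent` function are hypothetical and used only for illustration. The actual payload format is defined in the Webhooks documentation.

```typescript
// Hypothetical payload shape, assumed for this sketch only.
interface ProcessingEvent {
  fileId: string;
  status: "completed" | "failed";
}

// Parse a raw webhook body; return null when it does not match the assumed shape.
function parseProcessingEvent(rawBody: string): ProcessingEvent | null {
  let data: unknown;
  try {
    data = JSON.parse(rawBody);
  } catch (_err) {
    return null; // body was not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;

  const record = data as Record<string, unknown>;
  const fileId = record["fileId"];
  const status = record["status"];
  if (typeof fileId !== "string") return null;
  if (status === "completed" || status === "failed") {
    return { fileId, status };
  }
  return null; // unknown or non-terminal status
}
```

Returning null for anything unexpected lets the HTTP handler respond with a 400 and skip further work instead of throwing on malformed input.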
Step 4: Retrieve Results
Once processing completes, the extracted content is available as variants on the file. Variants are the outputs of the extraction pipeline -- typically markdown and optionally vector embeddings.
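To make the idea of variants concrete, the sketch below models them as a tagged union and picks out the markdown output. The `Variant` type, its field names (`type`, `content`, `vectors`), and the `getMarkdown` helper are assumptions made for illustration; the SDK's real response types may differ.

```typescript
// Hypothetical model of the two variant kinds described in this guide.
type Variant =
  | { type: "markdown"; content: string }
  | { type: "embeddings"; vectors: number[][] };

// Return the markdown text if a markdown variant exists, otherwise undefined.
function getMarkdown(variants: Variant[]): string | undefined {
  for (const variant of variants) {
    if (variant.type === "markdown") return variant.content;
  }
  return undefined;
}

// Example data standing in for what a file's variants might look like.
const exampleVariants: Variant[] = [
  { type: "embeddings", vectors: [[0.1, 0.2], [0.3, 0.4]] },
  { type: "markdown", content: "# Invoice\n\nTotal: 42.00" },
];

const markdown = getMarkdown(exampleVariants);
```

Because embeddings are optional, code that reads variants should handle the case where only the markdown variant is present.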
What Happened
- You uploaded a document to Roset.
- Roset created a file metadata record and a processing job.
- The job was routed to Reducto (for PDFs/documents), Gemini (for images), or Whisper (for audio) based on content type.
- The extraction provider returned structured markdown, which Roset stored as a variant on the file.
- If you configured an OpenAI provider key, vector embeddings were also generated as a second variant.
Roset never touched the file bytes directly -- it orchestrated the extraction pipeline and stored the resulting metadata.
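The routing step can be pictured as a lookup on the file's content type. The function below is a rough sketch of that idea, not Roset's actual routing logic: `pickProvider` is a hypothetical name, matching on MIME type prefixes is an assumption, and so is treating every non-image, non-audio type as a document.

```typescript
// Extraction providers named in this guide.
type Provider = "reducto" | "gemini" | "whisper";

// Assumed rule: images go to Gemini, audio to Whisper, everything else to Reducto.
function pickProvider(contentType: string): Provider {
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const semicolon = contentType.indexOf(";");
  const base = semicolon === -1 ? contentType : contentType.slice(0, semicolon);
  const mime = base.trim().toLowerCase();

  if (mime.startsWith("image/")) return "gemini";
  if (mime.startsWith("audio/")) return "whisper";
  return "reducto"; // PDFs and other documents (assumed default)
}
```

For example, `pickProvider("application/pdf")` selects Reducto under these assumptions, while `pickProvider("audio/mpeg")` selects Whisper.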
Next Steps
- Storage Connections -- connect your S3, GCS, or Azure Blob Storage buckets for direct upload.
- Webhooks -- get notified when processing completes instead of polling.
- TypeScript SDK -- full TypeScript client reference.
- API Reference -- complete endpoint documentation.
- Authentication -- details on API keys and bearer tokens.