Public API

API documentation

Integrate SotaOCR into your AI agents and LLM pipelines. The API is asynchronous: upload a document, poll job status, then fetch the final result.

Authentication

All API requests require Bearer token authentication. You can create an API key in your dashboard.

Header

Authorization: Bearer YOUR_API_KEY
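
As a minimal sketch of attaching this header from Python, the snippet below builds a request object with the standard library only. The helper name `authorized_request` and the `API_BASE` constant are illustrative assumptions, not part of the SotaOCR API; the host is taken from the cURL examples on this page.

```python
import urllib.request

# Host taken from the cURL examples on this page.
API_BASE = "https://api.sotaocr.com"


def authorized_request(path: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Build a request for `path` that carries the Bearer token header.

    Hypothetical helper: it only constructs the request and sends nothing.
    """
    return urllib.request.Request(
        API_BASE + path,
        method=method,
        headers={"Authorization": f"Bearer {api_key}"},
    )


# Building the request does not touch the network.
req = authorized_request("/v1/jobs/job_123456789", "YOUR_API_KEY")
```

Passing `req` to `urllib.request.urlopen` would send it; in practice the key would be read from an environment variable rather than written into source.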

Rate limits and polling

Poll job status at most once per second. If you exceed this limit, the API responds with 429 Too Many Requests.
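
One way to stay within this limit is a polling loop that never waits less than one second and slows down after a 429. The sketch below is an illustration under stated assumptions: `fetch_status` is a hypothetical callable returning an `(http_status, json_body)` pair, a successful status call is assumed to return HTTP 200, and the doubling backoff is a design choice of this example, not documented API behavior. Only the `completed` status is treated as final because no failure status is documented on this page.

```python
import time
from typing import Callable, Tuple

MIN_POLL_INTERVAL = 1.0  # seconds; the documented ceiling is one status request per second


def poll_until_done(
    fetch_status: Callable[[], Tuple[int, dict]],
    interval: float = MIN_POLL_INTERVAL,
    max_backoff: float = 30.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status() until the job reports status "completed".

    fetch_status is a hypothetical callable returning (http_status, json_body).
    """
    interval = max(interval, MIN_POLL_INTERVAL)  # never poll faster than the limit
    delay = interval
    waited = 0.0
    while waited <= timeout:
        code, body = fetch_status()
        if code == 429:
            # Rate limited: double the wait before the next attempt (capped).
            delay = min(delay * 2, max_backoff)
        elif code == 200:  # assumption: success is signalled with HTTP 200
            if body.get("status") == "completed":
                return body
            delay = interval  # healthy response, return to the base interval
        else:
            raise RuntimeError(f"unexpected HTTP status {code}")
        sleep(delay)
        waited += delay
    raise TimeoutError("job did not complete before the timeout")
```

The `sleep` parameter exists so the loop can be exercised without real waiting; production code would leave it at the default.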

1. Upload document

POST /v1/extract

Uploads a PDF or image for OCR. You can optionally limit processing to specific pages.

  • file: Document file (PDF, PNG, JPG).
  • page_ranges: (Optional) JSON string with an array of page ranges. Example: '[{"start":1,"end":3}]'

cURL · Request
curl -X POST https://api.sotaocr.com/v1/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@document.pdf" \
  -F 'page_ranges=[{"start":1,"end":5}]'
JSON · Response
{
  "id": "job_123456789",
  "status": "pending",
  "page_count": 0,
  "created_at": "2026-03-24T12:00:00Z"
}
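
Because page_ranges is sent as a JSON string inside a multipart form field, it can help to generate that string rather than write it by hand. The following is a small sketch; the function name `encode_page_ranges` is hypothetical, and the validation assumes pages are numbered from 1 with inclusive ends, which the examples suggest but this page does not state.

```python
import json
from typing import Iterable, Tuple


def encode_page_ranges(ranges: Iterable[Tuple[int, int]]) -> str:
    """Serialize (start, end) pairs into the JSON string for page_ranges."""
    payload = []
    for start, end in ranges:
        # Assumption: pages are 1-based and a range must not run backwards.
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {start}-{end}")
        payload.append({"start": start, "end": end})
    # Compact separators reproduce the spacing used in the documented example.
    return json.dumps(payload, separators=(",", ":"))


print(encode_page_ranges([(1, 5)]))  # prints [{"start":1,"end":5}]
```

The returned string is what goes into the page_ranges form field next to file.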

2. Check status

GET /v1/jobs/{job_id}

Returns the current processing status. Use page_count and pages_completed to track progress.

cURL · Request
curl -X GET https://api.sotaocr.com/v1/jobs/job_123456789 \
  -H "Authorization: Bearer YOUR_API_KEY"
JSON · Response
{
  "id": "job_123456789",
  "status": "running",
  "page_count": 5,
  "pages_completed": 2,
  "created_at": "2026-03-24T12:00:00Z"
}
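
The two counters in the status response can be turned into a progress fraction. This is a sketch with a hypothetical helper name, `progress_fraction`. It treats a missing pages_completed or a page_count of 0 (as in the upload response shown on this page) as zero progress, which is an assumption about how to read those states.

```python
def progress_fraction(job: dict) -> float:
    """Return completed/total pages as a value between 0.0 and 1.0."""
    total = job.get("page_count", 0)
    done = job.get("pages_completed", 0)
    if total <= 0:
        # Page count not known yet (for example, a freshly created job).
        return 0.0
    return min(done / total, 1.0)


status = {"id": "job_123456789", "status": "running", "page_count": 5, "pages_completed": 2}
print(f"{progress_fraction(status):.0%}")  # prints 40%
```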

3. Fetch result

GET /v1/jobs/{job_id}/result?format=markdown

Returns extracted text. Available only when the job status is completed.

  • format: (Optional) Response format: json, markdown, or text. Defaults to json.

cURL · Request
curl -X GET "https://api.sotaocr.com/v1/jobs/job_123456789/result?format=markdown" \
  -H "Authorization: Bearer YOUR_API_KEY"
JSON · Response
{
  "job_id": "job_123456789",
  "format": "markdown",
  "page_count": 5,
  "content": "# Annual report\n\nDocument text..."
}
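
To avoid typos in the format parameter, the result URL can be assembled and checked in code before the request is sent. The sketch below uses only the standard library; `result_url`, `API_BASE`, and `ALLOWED_FORMATS` are illustrative names, with the allowed values copied from the parameter description above.

```python
from urllib.parse import quote, urlencode

# Host taken from the cURL examples on this page.
API_BASE = "https://api.sotaocr.com"
# Values listed for the format parameter.
ALLOWED_FORMATS = ("json", "markdown", "text")


def result_url(job_id: str, fmt: str = "json") -> str:
    """Build the result URL for a job, rejecting unsupported formats early."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}, got {fmt!r}")
    # Percent-encode the job ID so unexpected characters cannot alter the path.
    path = f"/v1/jobs/{quote(job_id, safe='')}/result"
    return f"{API_BASE}{path}?{urlencode({'format': fmt})}"


print(result_url("job_123456789", "markdown"))
# prints https://api.sotaocr.com/v1/jobs/job_123456789/result?format=markdown
```

Since results are available only once the job is completed, this URL would normally be requested after a status check has reported that state.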

Ready to integrate?

Create an API key in the dashboard and get free pages for testing.