API documentation
The Handwriting OCR API provides a simple, reliable way to extract text and data from documents and images. Using state-of-the-art OCR technology, it can process handwritten text, printed documents, and structured data like tables. The API is RESTful, uses JSON for response data, and requires authentication via API tokens.
Key Features
- Handwriting recognition and text extraction
- Table structure detection and data extraction
- Support for PDF and common image formats (JPG, PNG, TIFF, etc.)
- Multiple export formats (TXT, DOCX, JSON, CSV, XLSX)
Basic Process
1. Upload Document
Start by uploading your document with a specified action:
- transcribe: Extract all text from the document
- tables: Extract data from tables
- extractor: Extract structured data using a Custom Extractor
2. Check Status
After upload, your document enters the processing queue. Check its status using the document ID returned in step 1.
3. Download Results
Once processing is complete, download the results in your preferred format:
- Transcription: TXT, DOCX, or JSON
- Tables: XLSX or JSON
- Extractor: XLSX, CSV, or JSON.
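The check-status step can be sketched as a small polling loop. This is only an illustration: `fetch_status` is a hypothetical callable that performs the status request and returns the status string, the status names are the ones listed for the status filter of the list endpoint, and the two-second interval is an assumption.

```python
import time
from typing import Callable

# Statuses after which polling can stop (names taken from the list endpoint's status filter).
TERMINAL_STATUSES = {"processed", "failed"}


def wait_until_done(
    fetch_status: Callable[[], str],
    poll_interval: float = 2.0,  # assumption: not a documented value
    max_polls: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the document reaches a terminal status, then return that status.

    fetch_status is a hypothetical callable that runs the status check (step 2)
    and returns one of: new, processing, processed, failed.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_interval)
    raise TimeoutError(f"document not finished after {max_polls} polls")
```

Only a return value of processed means the results are ready to download; a webhook avoids this loop entirely.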
Getting Started
- Create an account at handwritingocr.com
- Generate an API token in the dashboard
- For Custom Extractors, create and test an Extractor first.
- Test the API with a sample document
- Monitor results in the dashboard
Authentication
The API uses token-based authentication. Each request must include a valid API token in the Authorization header. API tokens provide full access to the document management API for a specific user account.
- Tokens are generated through the web interface at https://www.handwritingocr.com/settings/api
- Tokens never expire but can be revoked or replaced at any time.
- Multiple active tokens are not supported
- Token permissions cannot be customized - each token has full access to all API endpoints
Authentication Header
Include your API token in all requests using the Bearer authentication scheme:
Authorization: Bearer your-api-token
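A minimal sketch of attaching the header with Python's standard library follows. The `BASE_URL` value is an assumption (the endpoint sections only give paths), and `auth_headers` and `build_request` are hypothetical helper names used for illustration.

```python
import urllib.request

# Assumption: the API is served from the main site; only the paths are documented.
BASE_URL = "https://www.handwritingocr.com"


def auth_headers(token: str) -> dict:
    """Return the headers that the endpoint tables mark as required."""
    if not token:
        raise ValueError("an API token is required")
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    }


def build_request(path: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) an authenticated request for the given API path."""
    return urllib.request.Request(BASE_URL + path, headers=auth_headers(token), method=method)
```

Passing the result to `urllib.request.urlopen` would send it. Because a token grants full access to the account, it is safer to load it from configuration than to hard-code it.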
Rate limits
To ensure fair usage, protect our services from abuse, and maintain high availability for all users, our API enforces rate limits. Familiarizing yourself with these limits will help you build robust and efficient integrations.
Standard Rate Limit
- Limit: We enforce a global rate limit of 2 requests per second (RPS).
- Scope: This limit is applied at the account level and is shared across all API endpoints. It is not a per-endpoint limit.
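One way to stay under the limit is to space out requests on the client side. The sketch below is an assumption about how a client might do this, not part of the API; `Throttle` is a hypothetical helper, and it only coordinates calls within a single process.

```python
import time


class Throttle:
    """Spaces out calls so that at most `rate` calls start per second."""

    def __init__(self, rate: float = 2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate  # 0.5 s between calls at 2 RPS
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next call is allowed and return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        # Reserve the next slot one interval after this call's slot.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

Call `wait()` before every request, whatever the endpoint. Because the limit is shared across the whole account, several workers using the same account would need to share one throttle or split the budget between them.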
Detecting Rate Limits
When your application exceeds the rate limit, the API responds with:
- HTTP Status Code: 429 Too Many Requests
Rate Limit Headers
To help you manage your request volume and anticipate when limits might be reached, the API includes the following headers in its responses:
- X-RateLimit-Limit: The maximum number of requests allowed within the current time window.
- X-RateLimit-Remaining: The number of requests remaining in the current time window.
- Retry-After: Sent with a 429 Too Many Requests response, this header indicates the number of seconds your application should wait before attempting another request. It is crucial to respect this header to allow your connection to recover.
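The headers above could be read with a helper along these lines. `rate_limit_info` is a hypothetical name, and the sketch assumes the header values are plain integers, as the descriptions suggest.

```python
from typing import Mapping, Optional


def rate_limit_info(status: int, headers: Mapping[str, str]) -> dict:
    """Summarize the rate-limit headers of a response.

    Header names are matched case-insensitively. Values that are missing or
    not integers are reported as None.
    """
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_int(name: str) -> Optional[int]:
        value = lowered.get(name)
        try:
            return int(value) if value is not None else None
        except ValueError:
            return None

    return {
        "limited": status == 429,
        "limit": as_int("x-ratelimit-limit"),
        "remaining": as_int("x-ratelimit-remaining"),
        # Retry-After is only documented for 429 responses.
        "retry_after": as_int("retry-after") if status == 429 else None,
    }
```

A client might slow down when `remaining` reaches zero and must pause for `retry_after` seconds when `limited` is true.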
Best Practices for Managing Rate Limits
To operate efficiently within these limits, especially when processing multiple documents, we recommend the following best practices:
- Use the document list endpoint: To check the status of multiple documents, call the document list endpoint instead of polling each document individually. It returns the status of many documents in a single API call.
- Process sequentially based on status: Only attempt to download a document's results once its status has changed to processed.
- Implement exponential backoff: When you receive a 429 status code, use the Retry-After header value to pause before retrying. If Retry-After is not present, or as a general error handling strategy, implement an exponential backoff mechanism for retries. This helps reduce pressure on the API during busy periods.
- Cache responses: Cache responses from the API where appropriate to avoid requesting the same data repeatedly.
- Utilize Webhooks: By setting up a webhook, our service will proactively send your results to your specified URL as soon as they are ready. This eliminates the need for you to poll for status updates or use the API to download your results, significantly reducing your API call volume. You can set a webhook in your user dashboard's settings page.
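The backoff practice above can be sketched as follows. The base delay, cap, and attempt count are assumptions rather than documented values, and `send` is a hypothetical callable that performs one request and returns a `(status, retry_after, body)` tuple.

```python
import random
import time
from typing import Callable, Optional, Tuple


def retry_delay(attempt: int, retry_after: Optional[float] = None,
                base: float = 0.5, cap: float = 30.0, jitter: float = 0.0) -> float:
    """Seconds to wait before retrying after failed attempt number `attempt` (0-based)."""
    if retry_after is not None:
        return float(retry_after)  # the server's instruction takes priority
    delay = min(cap, base * (2 ** attempt))  # 0.5, 1, 2, 4, ... capped at `cap`
    return delay + random.uniform(0, jitter)  # optional jitter spreads out retries


def call_with_retries(send: Callable[[], Tuple[int, Optional[float], object]],
                      max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep):
    """Call `send` until it returns a status other than 429 or attempts run out."""
    for attempt in range(max_attempts):
        status, retry_after, body = send()
        if status != 429:
            return status, body
        if attempt + 1 < max_attempts:
            sleep(retry_delay(attempt, retry_after))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Setting `jitter` above zero is a common refinement when several clients may retry at the same moment.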
Increasing Rate Limits
For users with consistently higher throughput requirements, we offer increased rate limits for Enterprise subscribers. These limits are determined on a case-by-case basis by negotiation.
If you anticipate needing a higher rate limit than the standard offering, please contact our support team to discuss your specific needs.
Support
- For technical support or questions, contact support@handwritingocr.com
List documents
Retrieves a paginated list of documents belonging to the authenticated user. Documents are sorted by creation date in descending order.
Endpoint
GET /api/v3/documents
Headers
Key | Value | Required | Notes |
---|---|---|---|
Authorization | Bearer your-api-token | Yes | |
Accept | application/json | Yes | |
Request Parameters
Name | Type | Required | Notes |
---|---|---|---|
per_page | integer | No | Number of items per page. Default is 50. Maximum 200. |
page | integer | No | The page number for pagination. Defaults to 1. |
action | string | No | Filter results by action. Options are transcribe, tables, extractor. |
status | string | No | Filter results by status. Options are new, processing, processed, failed. |
Response Codes
Code | Explanation |
---|---|
200 | Success - Returns list of documents |
401 | Unauthorized - Invalid or missing API token |
422 | Validation Error - Invalid parameters |
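As an illustration, the request URL for this endpoint could be assembled and checked on the client before sending. The `BASE_URL` host and the lower bounds on `per_page` and `page` are assumptions; `list_documents_url` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://www.handwritingocr.com"  # assumption: host is not given with the endpoint
ACTIONS = {"transcribe", "tables", "extractor"}
STATUSES = {"new", "processing", "processed", "failed"}


def list_documents_url(per_page: int = 50, page: int = 1,
                       action: Optional[str] = None,
                       status: Optional[str] = None) -> str:
    """Build the list-documents URL, rejecting values the API would answer with 422."""
    if not 1 <= per_page <= 200:  # the lower bound of 1 is an assumption
        raise ValueError("per_page must be between 1 and 200")
    if page < 1:  # assumption: pages are numbered from 1, matching the default
        raise ValueError("page must be 1 or greater")
    if action is not None and action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if status is not None and status not in STATUSES:
        raise ValueError(f"unknown status: {status}")

    params = {"per_page": per_page, "page": page}
    if action is not None:
        params["action"] = action
    if status is not None:
        params["status"] = status
    return f"{BASE_URL}/api/v3/documents?{urlencode(params)}"
```

For example, `list_documents_url(per_page=200, status="processing")` gives a single URL for checking every document that is still in progress, which is cheaper than one request per document.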
Upload Document
Upload a new document for processing. Supports PDF files and various image formats. The API will automatically check the page count of the submitted document against your credit balance before queueing for processing.
Endpoint
POST /api/v3/documents
Headers
Key | Value | Required | Notes |
---|---|---|---|
Authorization | Bearer your-api-token | Yes | |
Accept | application/json | Yes | |
Content-Type | multipart/form-data | Yes | |
Request Parameters
Name | Type | Required | Notes |
---|---|---|---|
file | file | Yes | The document to process. Valid file types are PDF, JPG, PNG, TIFF, HEIC, GIF. Maximum file size is 20MB. |
action | string | Yes | The action to perform on the document. Valid values are transcribe to extract text, tables to extract tables, and extractor to apply a custom extractor. |
delete_after | integer | No | Seconds until auto-deletion. Overrides the auto-deletion period set in your user settings. Minimum is 300 seconds. Maximum is 1209600 seconds (14 days). |
extractor_id | string | No | A 10-character alphanumeric string e.g. Ks08XVPyMd. Create and test an extractor in the dashboard to get the extractor ID. Required when action is extractor. |
prompt_id | string | No | A 10-character alphanumeric string e.g. Ab08RsPyMd. For custom prompts. Requires Enterprise subscription. |
Response Codes
Code | Explanation |
---|---|
201 | Success - Document created and queued for processing |
400 | Bad Request - Missing required fields |
401 | Unauthorized - Invalid or missing API token |
403 | Forbidden - Insufficient page credits |
415 | Unsupported Media Type |
422 | Validation Error - Invalid parameters |
429 | Too many requests - Rate limited. |
500 | Server Error - File storage or processing failed |
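The sketch below shows one way to build the multipart upload request with the standard library only, applying the documented parameter limits before anything is sent. The `BASE_URL` host, the reading of 20MB as 20 MiB, and the per-file Content-Type guessed from the filename are assumptions; `build_upload_request` is a hypothetical helper name.

```python
import mimetypes
import urllib.request
import uuid
from typing import Optional

BASE_URL = "https://www.handwritingocr.com"  # assumption: host is not given with the endpoint
MAX_FILE_BYTES = 20 * 1024 * 1024  # assumption: "20MB" read as 20 MiB


def build_upload_request(token: str, filename: str, content: bytes, action: str,
                         extractor_id: Optional[str] = None,
                         delete_after: Optional[int] = None) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data upload request."""
    if action not in ("transcribe", "tables", "extractor"):
        raise ValueError(f"unknown action: {action}")
    if action == "extractor" and not extractor_id:
        raise ValueError("extractor_id is required when action is extractor")
    if delete_after is not None and not 300 <= delete_after <= 1209600:
        raise ValueError("delete_after must be between 300 and 1209600 seconds")
    if len(content) > MAX_FILE_BYTES:
        raise ValueError("file exceeds the maximum size")

    fields = {"action": action}
    if extractor_id:
        fields["extractor_id"] = extractor_id
    if delete_after is not None:
        fields["delete_after"] = str(delete_after)

    boundary = uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    chunks = []
    for name, value in fields.items():  # one multipart part per text field
        chunks.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    chunks.append(  # header of the file part, followed by the raw bytes
        (
            f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
            f'filename="{filename}"\r\nContent-Type: {file_type}\r\n\r\n'
        ).encode()
    )
    chunks += [content, f"\r\n--{boundary}--\r\n".encode()]

    return urllib.request.Request(
        f"{BASE_URL}/api/v3/documents",
        data=b"".join(chunks),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Sending the request with `urllib.request.urlopen` should yield a 201 response on success. An HTTP client library with built-in multipart support would make the body assembly unnecessary.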
Download result
Retrieve the status of a document or download the processed results. The format extension is optional: if it is omitted, the endpoint returns a JSON response; if it is provided, the endpoint downloads the processed document in the specified format.
Image thumbnail URLs are provided for each page. Requests to download these images must be authenticated with your API token.
Webhooks
We strongly encourage using a webhook instead of polling this endpoint repeatedly. Webhooks provide a more efficient and real-time alternative by automatically delivering the processed result in JSON format to a specified URL as soon as the document is ready, saving you bandwidth and reducing latency. You can set your webhook URL through the user dashboard at https://www.handwritingocr.com/settings/documents
Endpoint
GET /api/v3/documents/{id}[.{format}]
Headers
Key | Value | Required | Notes |
---|---|---|---|
Authorization | Bearer your-api-token | Yes | |
Accept | application/json | Yes | |
Path Parameters
Name | Type | Required | Notes |
---|---|---|---|
id | string | Yes | The document's unique identifier, e.g. abcde12345. |
format | string | No | Output format. Varies by action: valid values are txt, docx, xlsx, csv, and json. |
Response Codes
Code | Explanation |
---|---|
200 | Success - Returns document status (without format) or file (with format). |
202 | Accepted - Document is still being processed. |
400 | Bad Request - Invalid format for action type. |
401 | Unauthorized - Invalid or missing API token. |
403 | Forbidden - No permission to access document. |
404 | Not Found - Document not found. |
500 | Server Error - Error preparing file for download. |
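A small sketch of building the download URL and reading the two success codes follows. The format-per-action mapping is taken from the Download Results step of the Basic Process; the `BASE_URL` host and the helper names are assumptions.

```python
from typing import Optional

BASE_URL = "https://www.handwritingocr.com"  # assumption: host is not given with the endpoint

# Formats listed per action in the Basic Process section.
FORMATS_BY_ACTION = {
    "transcribe": {"txt", "docx", "json"},
    "tables": {"xlsx", "json"},
    "extractor": {"xlsx", "csv", "json"},
}


def result_url(doc_id: str, action: str, fmt: Optional[str] = None) -> str:
    """Build the status URL (no format) or the download URL (with format)."""
    if action not in FORMATS_BY_ACTION:
        raise ValueError(f"unknown action: {action}")
    url = f"{BASE_URL}/api/v3/documents/{doc_id}"
    if fmt is None:
        return url
    if fmt not in FORMATS_BY_ACTION[action]:
        # The API would answer 400 for a format that does not match the action.
        raise ValueError(f"format {fmt} is not available for action {action}")
    return f"{url}.{fmt}"


def is_ready(status_code: int) -> bool:
    """Return True for 200 (result available) and False for 202 (still processing)."""
    if status_code == 200:
        return True
    if status_code == 202:
        return False
    raise RuntimeError(f"unexpected response code: {status_code}")
```

Checking the format locally avoids spending a request on a call that would be rejected, which matters under a shared rate limit.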
Delete document
Permanently delete a document and its associated files. This action cannot be undone.
Endpoint
DELETE /api/v3/documents/{id}
Headers
Key | Value | Required | Notes |
---|---|---|---|
Authorization | Bearer your-api-token | Yes | |
Accept | application/json | Yes | |
Path Parameters
Name | Type | Required | Notes |
---|---|---|---|
id | string | Yes | The document's unique identifier. |
Response Codes
Code | Explanation |
---|---|
204 | Success - Document deleted. |
401 | Unauthorized - Invalid or missing API token. |
403 | Forbidden - No permission to delete document. |
404 | Not Found - Document not found. |
500 | Server Error - Error deleting document. |
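A deletion call might look like the sketch below. The `BASE_URL` host is an assumption, the helper names are hypothetical, and treating 404 as "nothing left to delete" is a design choice of this example rather than documented behaviour.

```python
import urllib.error
import urllib.request

BASE_URL = "https://www.handwritingocr.com"  # assumption: host is not given with the endpoint


def build_delete_request(token: str, doc_id: str) -> urllib.request.Request:
    """Build (but do not send) the DELETE request for one document."""
    return urllib.request.Request(
        f"{BASE_URL}/api/v3/documents/{doc_id}",
        method="DELETE",
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )


def delete_document(token: str, doc_id: str, opener=urllib.request.urlopen) -> bool:
    """Delete a document. Return True on 204 and False if it was not found (404).

    Other error codes (401, 403, 500) are re-raised as urllib.error.HTTPError.
    """
    try:
        with opener(build_delete_request(token, doc_id)) as response:
            return response.status == 204
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise
```

Because deletion cannot be undone, a caller may want to download any needed results before calling `delete_document`.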