API documentation

The Handwriting OCR API provides a simple, reliable way to extract text and data from documents and images. Using state-of-the-art OCR technology, it can process handwritten text, printed documents, and structured data like tables. The API is RESTful, uses JSON for response data, and requires authentication via API tokens.

This is an earlier version of our API. For new users, we recommend you use the latest version of our API.

Key Features

  • Handwriting recognition and text extraction
  • Table structure detection and data extraction
  • Support for PDF and common image formats (JPG, PNG, TIFF, etc.)
  • Multiple export formats (TXT, DOCX, JSON, CSV, XLSX)

Basic Process

  1. Upload Document. Start by uploading your document with a specified action:
    • transcribe: Extract all text from the document
    • tables: Extract data from tables
    • extractor: Extract structured data using a Custom Extractor
  2. Check Status. After upload, your document enters the processing queue. Check its status using the document ID returned in step 1.
  3. Download Results. Once processing is complete, download the results in your preferred format:
    • Transcription: TXT, DOCX, or JSON
    • Tables: XLSX or JSON
    • Extractor: XLSX, CSV, or JSON
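
The pairing of actions and export formats in step 3 can be captured in a small lookup, which lets a client reject an unsupported combination before making a request. The Python sketch below is illustrative only: VALID_FORMATS and check_format are hypothetical helper names, not part of the API, and the sketch assumes the lowercase action names from step 1 map onto the result types in step 3.

```python
# Export formats offered for each action, as listed in step 3 above.
VALID_FORMATS = {
    "transcribe": {"txt", "docx", "json"},
    "tables": {"xlsx", "json"},
    "extractor": {"xlsx", "csv", "json"},
}


def check_format(action: str, fmt: str) -> str:
    """Return the normalized format, or raise ValueError if the action does not offer it."""
    if action not in VALID_FORMATS:
        raise ValueError(f"unknown action: {action!r}")
    # Accept "TXT", ".txt", or "txt" (a convenience of this sketch).
    normalized = fmt.lower().lstrip(".")
    if normalized not in VALID_FORMATS[action]:
        allowed = ", ".join(sorted(VALID_FORMATS[action]))
        raise ValueError(f"{action} results are available as: {allowed}")
    return normalized
```

For example, check_format("transcribe", "TXT") returns "txt", while check_format("tables", "csv") raises ValueError because table results are offered only as XLSX or JSON.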

Getting Started

  • Create an account at handwritingocr.com
  • Generate an API token in the dashboard
  • For Custom Extractors, create and test an Extractor first
  • Test the API with a sample document
  • Monitor results in the dashboard

Support

For technical support or questions, contact support@handwritingocr.com

List documents

Retrieves a paginated list of documents belonging to the authenticated user. Documents are sorted by creation date in descending order.

Endpoint

GET https://www.handwritingocr.com/api/v2/documents

Headers

Key            Value                  Required  Notes
Authorization  Bearer your-api-token  Yes
Accept         application/json       Yes

Query Parameters

Name      Type     Required  Notes
per_page  integer  No        Number of items per page. Default is 50. Maximum is 200.
page      integer  No        The page number for pagination. Default is 1.

Response Codes

Code Explanation
200 Success - Returns list of documents.
401 Unauthorized - Invalid or missing API token.
422 Validation Error - Invalid parameters.

Request

curl -X GET "https://www.handwritingocr.com/api/v2/documents?page=1&per_page=100" \
     -H "Authorization: Bearer your-api-token" \
     -H "Accept: application/json"

Response

{
    "current_page": 1,
    "data": [
        {
            "document_id": "xyz789",
            "status": "processed",
            "created_at": "2024-03-15T14:30:00Z",
            "updated_at": "2024-03-15T14:35:00Z",
            "automatically_deleted_at": "2024-03-22T14:30:00Z",
            "page_count": 3,
            "original_file_name": "business_report.pdf",
            "action": "transcribe"
        },
        {
            "document_id": "abc123",
            "status": "queued",
            "created_at": "2024-03-15T14:25:00Z",
            "updated_at": "2024-03-15T14:25:00Z",
            "automatically_deleted_at": "2024-03-22T14:25:00Z",
            "page_count": 1,
            "original_file_name": "receipt.jpg",
            "action": "tables"
        }
    ],
    "first_page_url": "https://www.handwritingocr.com/api/v2/documents?page=1",
    "from": 1,
    "last_page": 5,
    "last_page_url": "https://www.handwritingocr.com/api/v2/documents?page=5",
    "next_page_url": "https://www.handwritingocr.com/api/v2/documents?page=2",
    "path": "https://www.handwritingocr.com/api/v2/documents",
    "per_page": 50,
    "prev_page_url": null,
    "to": 50,
    "total": 243
}
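
To walk through every page of results, a client can request pages in order until it reaches last_page. The following Python sketch uses only the standard library. The helper names (list_url, fetch_json, iter_documents) are hypothetical, and the lower bound of 1 on per_page is an assumption, since only the maximum is documented.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator

BASE_URL = "https://www.handwritingocr.com/api/v2/documents"


def list_url(page: int = 1, per_page: int = 50) -> str:
    """Build the list URL, enforcing the documented per_page maximum of 200."""
    if page < 1:
        raise ValueError("page must be 1 or greater")
    if not 1 <= per_page <= 200:  # lower bound of 1 is an assumption
        raise ValueError("per_page must be between 1 and 200")
    query = urllib.parse.urlencode({"page": page, "per_page": per_page})
    return f"{BASE_URL}?{query}"


def fetch_json(url: str, token: str) -> dict:
    """Perform an authenticated GET and decode the JSON body."""
    request = urllib.request.Request(url, headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    })
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_documents(fetch: Callable[[str], dict], per_page: int = 50) -> Iterator[dict]:
    """Yield every document by requesting pages 1..last_page in order."""
    page = 1
    while True:
        payload = fetch(list_url(page, per_page))
        yield from payload["data"]
        if page >= payload["last_page"]:
            break
        page += 1
```

A typical call would be iter_documents(lambda url: fetch_json(url, "your-api-token")). The sketch rebuilds each URL from the page number instead of following next_page_url, because the sample next_page_url above does not carry the per_page value.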

Upload document

Upload a new document for processing. Supports PDF files and various image formats. The API will automatically check the page count of the submitted document against your credit balance before queueing for processing.

Endpoint

POST https://www.handwritingocr.com/api/v2/documents

Headers

Key            Value                  Required  Notes
Authorization  Bearer your-api-token  Yes
Accept         application/json       Yes
Content-Type   multipart/form-data    Yes

Form Parameters

Name          Type     Required  Notes
action        string   Yes       The processing action to perform. Options are transcribe, tables, extractor.
file          file     Yes       The document to process. Valid file types are PDF, JPG, PNG, TIFF, HEIC, GIF. Maximum file size is 20 MB.
delete_after  integer  No        Seconds until auto-deletion. Overrides the auto-deletion period set in your user settings. Minimum is 300 seconds. Maximum is 1209600 seconds (14 days).
extractor_id  string   No        A 10-character alphanumeric string, e.g. Ks08XVPyMd. Create and test an extractor in the dashboard to get the extractor ID. Required when action is extractor.
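
The constraints on these parameters can be checked on the client before any bytes are uploaded. The Python sketch below is one possible way to do that. upload_fields is a hypothetical helper, and reading the 20 MB limit as 20,000,000 bytes is an assumption.

```python
import re
from typing import Dict, Optional

ACTIONS = ("transcribe", "tables", "extractor")
MAX_FILE_BYTES = 20 * 1000 * 1000  # assumption: the 20 MB limit read as 20,000,000 bytes
MIN_DELETE_AFTER = 300
MAX_DELETE_AFTER = 1_209_600  # 14 days in seconds


def upload_fields(action: str, file_size: int,
                  delete_after: Optional[int] = None,
                  extractor_id: Optional[str] = None) -> Dict[str, str]:
    """Validate upload parameters and return the text form fields to send."""
    if action not in ACTIONS:
        raise ValueError(f"action must be one of: {', '.join(ACTIONS)}")
    if file_size > MAX_FILE_BYTES:
        raise ValueError("file exceeds the maximum upload size")
    fields = {"action": action}
    if delete_after is not None:
        if not MIN_DELETE_AFTER <= delete_after <= MAX_DELETE_AFTER:
            raise ValueError("delete_after must be between 300 and 1209600 seconds")
        fields["delete_after"] = str(delete_after)
    if action == "extractor" and extractor_id is None:
        raise ValueError("extractor_id is required when action is extractor")
    if extractor_id is not None:
        # Documented shape: a 10-character alphanumeric string.
        if not re.fullmatch(r"[A-Za-z0-9]{10}", extractor_id):
            raise ValueError("extractor_id must be 10 alphanumeric characters")
        fields["extractor_id"] = extractor_id
    return fields
```

For example, upload_fields("transcribe", 1024, delete_after=604800) returns {"action": "transcribe", "delete_after": "604800"}. The file itself is sent separately as the file part of the multipart body.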

Response Codes

Code Explanation
201 Success - Document created and queued for processing
400 Bad Request - Missing required fields.
401 Unauthorized - Invalid or missing API token.
403 Forbidden - Insufficient page credits.
415 Unsupported Media Type.
422 Validation Error - Invalid parameters.
429 Too many requests - Rate limited.
500 Server Error - File storage or processing failed.
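
A client will usually want to separate failures worth retrying from those that need a changed request. The Python sketch below maps the codes above to messages and computes an exponential backoff delay. The choice to retry only 429 and 500, and the backoff parameters, are assumptions of this sketch. The API does not document a retry policy here.

```python
from typing import Optional

# Upload response codes from the table above, keyed by HTTP status.
UPLOAD_ERRORS = {
    400: "Bad Request: missing required fields",
    401: "Unauthorized: invalid or missing API token",
    403: "Forbidden: insufficient page credits",
    415: "Unsupported Media Type",
    422: "Validation Error: invalid parameters",
    429: "Too many requests: rate limited",
    500: "Server Error: file storage or processing failed",
}

# Assumption: only rate limiting and server errors are retried unchanged.
RETRYABLE = {429, 500}


def describe_upload_status(status: int) -> str:
    """Return a human-readable explanation for an upload response code."""
    if status == 201:
        return "Success: document created and queued for processing"
    return UPLOAD_ERRORS.get(status, f"Unexpected status {status}")


def retry_delay(status: int, attempt: int,
                base: float = 1.0, cap: float = 60.0) -> Optional[float]:
    """Seconds to wait before retry number `attempt` (0-based), or None to give up."""
    if status not in RETRYABLE:
        return None
    # Double the delay on each attempt, capped at `cap` seconds.
    return min(cap, base * (2 ** attempt))
```

With the defaults, a rate-limited request waits 1, 2, 4, 8 seconds and so on up to 60 seconds, while a 403 or 422 returns None so the caller can fix the request or top up credits instead of retrying.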

Request

curl -X POST "https://www.handwritingocr.com/api/v2/documents" \
     -H "Authorization: Bearer your-api-token" \
     -H "Accept: application/json" \
     -F "file=@/path/to/document.pdf" \
     -F "action=transcribe" \
     -F "delete_after=604800"

Response

{
    "id": "abc123",
    "status": "queued"
}
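
The same upload can be made from Python without third-party packages by encoding the multipart/form-data body by hand. This is a minimal sketch under stated assumptions: encode_multipart and build_upload_request are hypothetical helpers, and the file part's content type is guessed from the file name.

```python
import mimetypes
import urllib.request
import uuid
from typing import Dict, Tuple

UPLOAD_URL = "https://www.handwritingocr.com/api/v2/documents"


def encode_multipart(fields: Dict[str, object], filename: str,
                     file_bytes: bytes) -> Tuple[bytes, str]:
    """Encode text fields plus one file part named "file" as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            str(value).encode(),
        ]
    # Guess the MIME type from the file name, falling back to a generic type.
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        f"Content-Type: {mime}".encode(),
        b"",
        file_bytes,
        f"--{boundary}--".encode(),
        b"",
    ]
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"


def build_upload_request(token: str, filename: str, file_bytes: bytes,
                         action: str, **extra_fields) -> urllib.request.Request:
    """Build (but do not send) the POST request for a document upload."""
    body, content_type = encode_multipart(
        {"action": action, **extra_fields}, filename, file_bytes)
    return urllib.request.Request(UPLOAD_URL, data=body, method="POST", headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
        "Content-Type": content_type,
    })
```

Passing the returned request to urllib.request.urlopen sends it. On a 201 response, the JSON body carries the document id used by the other endpoints. Optional fields such as delete_after=604800 can be supplied as keyword arguments.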

Download result

Retrieve the status of a document or download the processed results. The format extension is optional. Without it, the endpoint returns a JSON response; with it, the endpoint downloads the processed document in the specified format.

Image thumbnail URLs are provided for each page. Requests for these images must be authenticated with your API token.

Webhooks

We strongly encourage using a webhook instead of polling this endpoint repeatedly. A webhook delivers the processed result in JSON format to a URL you specify as soon as the document is ready, which saves bandwidth and reduces latency compared with polling. You can set your webhook URL through the user dashboard.
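
A webhook receiver can be very small. The Python sketch below uses the standard library's http.server and makes several assumptions that are not confirmed here: that results arrive as an HTTP POST with a JSON object body, and that the payload includes an id field like the status response does. WebhookHandler, parse_webhook_body, and serve are hypothetical names.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_webhook_body(raw: bytes) -> dict:
    """Decode a webhook body; raises ValueError on invalid JSON or a non-object."""
    payload = json.loads(raw.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    return payload


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):  # assumption: results are delivered via HTTP POST
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = parse_webhook_body(self.rfile.read(length))
        except ValueError:
            self.send_response(400)
            self.end_headers()
            return
        # The payload fields are an assumption, so read them defensively.
        print("received result for document", payload.get("id"))
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling serve() starts the receiver on port 8000. In practice the URL registered in the dashboard must be publicly reachable, so a sketch like this would normally sit behind a reverse proxy or tunnel.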

Endpoint

GET https://www.handwritingocr.com/api/v2/documents/{id}[.{format}]

Headers

Key            Value                  Required  Notes
Authorization  Bearer your-api-token  Yes
Accept         application/json       Yes

Path Parameters

Name    Type    Required  Notes
id      string  Yes       The document's unique identifier, for example abcde12345.
format  string  No        Output format. Valid values depend on the action: txt, docx, xlsx, csv, or json.
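
Because the format is an optional extension on the path rather than a query parameter, it helps to build the URL in one place. The following Python sketch shows one way to do so. document_url and build_download_request are hypothetical helpers, and percent-encoding the id is a defensive choice of this sketch.

```python
import urllib.parse
import urllib.request
from typing import Optional

BASE_URL = "https://www.handwritingocr.com/api/v2/documents"
FORMATS = {"txt", "docx", "xlsx", "csv", "json"}


def document_url(doc_id: str, fmt: Optional[str] = None) -> str:
    """Return the status URL, or the download URL when a format is given."""
    if not doc_id:
        raise ValueError("doc_id must not be empty")
    url = f"{BASE_URL}/{urllib.parse.quote(doc_id, safe='')}"
    if fmt is not None:
        if fmt not in FORMATS:
            raise ValueError(f"format must be one of: {', '.join(sorted(FORMATS))}")
        url += f".{fmt}"
    return url


def build_download_request(token: str, doc_id: str,
                           fmt: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the authenticated GET request."""
    return urllib.request.Request(document_url(doc_id, fmt), headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    })
```

For example, document_url("abc123") yields the JSON status URL, and document_url("abc123", "txt") yields the plain-text download URL. Whether a given format is valid still depends on the document's action, which this sketch does not check.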

Response Codes

Code Explanation
200 Success - Returns the document status or the processed result.
202 Accepted - Document is still being processed.
400 Bad Request - Invalid format for action type.
401 Unauthorized - Invalid or missing API token.
403 Forbidden - No permission to access document.
404 Not found - Document not found.
429 Too many requests - Rate limited.
500 Server Error - Error preparing file for download.

Request

curl -X GET "https://www.handwritingocr.com/api/v2/documents/abc123.txt" \
     -H "Authorization: Bearer your-api-token" \
     -H "Accept: application/json" \
     --output document.txt

Response

{
    "id": "abc123",
    "status": "processed",
    "action": "transcribe",
    "created_at": "2024-03-15T14:30:00Z",
    "updated_at": "2024-03-15T14:35:00Z"
}
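
Where a webhook is not an option, polling should be bounded and spaced out. The Python sketch below treats 202 as still processing and 200 as ready, following the response codes above. wait_until_processed is a hypothetical helper, the attempt count and delay are arbitrary defaults, and fetch is a caller-supplied function that returns the HTTP status code and the decoded JSON body.

```python
import time
from typing import Callable, Tuple


def wait_until_processed(fetch: Callable[[str], Tuple[int, dict]], doc_id: str,
                         attempts: int = 30, delay: float = 5.0,
                         sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the document is ready, then return the final JSON body."""
    for attempt in range(attempts):
        status_code, body = fetch(doc_id)
        if status_code == 200:
            return body
        if status_code != 202:
            # Any other code (401, 403, 404, 429, 500) is left to the caller.
            raise RuntimeError(f"unexpected status {status_code}")
        if attempt < attempts - 1:
            sleep(delay)  # wait before asking again
    raise TimeoutError(f"document {doc_id} not ready after {attempts} attempts")
```

Injecting fetch and sleep keeps the loop independent of any particular HTTP client and makes it easy to exercise without network access.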

Delete document

Permanently delete a document and its associated files. This action cannot be undone.

Endpoint

DELETE https://www.handwritingocr.com/api/v2/documents/{id}

Headers

Key            Value                  Required  Notes
Authorization  Bearer your-api-token  Yes
Accept         application/json       Yes

Path Parameters

Name  Type    Required  Notes
id    string  Yes       The document's unique identifier.

Response Codes

Code Explanation
204 Success - Document deleted.
401 Unauthorized - Invalid or missing API token.
403 Forbidden - No permission to delete document.
404 Not Found - Document not found.
500 Server Error - Error deleting document.

Request

curl -X DELETE "https://www.handwritingocr.com/api/v2/documents/abc123" \
     -H "Authorization: Bearer your-api-token" \
     -H "Accept: application/json"
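
An equivalent request in Python might look like the following sketch, which uses only the standard library. build_delete_request and delete_document are hypothetical helpers, and treating 404 as "already gone" rather than an error is a design choice of this sketch.

```python
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "https://www.handwritingocr.com/api/v2/documents"


def build_delete_request(token: str, doc_id: str) -> urllib.request.Request:
    """Build (but do not send) the authenticated DELETE request."""
    url = f"{BASE_URL}/{urllib.parse.quote(doc_id, safe='')}"
    return urllib.request.Request(url, method="DELETE", headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    })


def delete_document(token: str, doc_id: str) -> bool:
    """Return True on 204, False if the document was not found; re-raise other errors."""
    try:
        with urllib.request.urlopen(build_delete_request(token, doc_id)) as response:
            return response.status == 204
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        raise
```

Since deletion cannot be undone, callers may want to confirm the document id against the list endpoint before invoking delete_document.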