Processing & accuracy

How fast pages are processed, scan quality recommendations, multilingual handling, and what affects accuracy.

Can I process very large batches of documents at once?

Yes. There is no cap on batch size, provided each uploaded file is within the 20 MB file size limit. The limit applies to file size rather than page count, although very large single files are best split into smaller ones. For extremely large projects, you can upload multiple files, use the API to automate batch submissions, or use our managed processing service if you prefer to offload the entire workflow. If you want to confirm accuracy for a specific type of document, use your free trial credits to test a representative sample.
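Before submitting a large batch, it can help to check every file against the size limit locally. The following is a minimal sketch, not part of the product: the function name is hypothetical, and the limit is read conservatively as 20,000,000 bytes because the exact byte definition of "20 MB" is not stated here.

```python
from pathlib import Path

# Assumption: "20 MB" read conservatively as decimal megabytes.
MAX_UPLOAD_BYTES = 20 * 1000 * 1000


def split_by_size_limit(paths, limit=MAX_UPLOAD_BYTES):
    """Sort files into (within_limit, too_large) by their on-disk size."""
    within_limit, too_large = [], []
    for path in map(Path, paths):
        if path.stat().st_size <= limit:
            within_limit.append(path)
        else:
            too_large.append(path)
    return within_limit, too_large
```

Files returned in the second list would need to be split or re-exported at a smaller size before upload.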

Do you use AI in the handwriting recognition process?

Yes. Our handwriting models use machine learning to improve accuracy on difficult, messy, or varied handwriting.

Does Handwriting OCR preprocess my images automatically?

Yes. The system automatically handles exposure correction, de-warping, straightening, contrast optimization, and other preprocessing steps. In most cases, you do not need to modify your files beforehand. However, if a page is extremely faint or distorted, a cleaner scan may still improve accuracy. You can test a few representative pages with your included trial credits to see whether your current scan quality is sufficient.

Does parallel processing affect the order of pages?

No. While pages are processed in parallel internally, the system preserves the correct page order in downloaded results. If a document appears out of order, it usually indicates that the original PDF or image set was incorrectly ordered or contained corrupted pages.
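To see how parallel work and stable ordering can coexist, here is a small illustrative sketch using Python's standard library. It is not the service's implementation: `transcribe_page` is a hypothetical stand-in that sleeps instead of doing real OCR, and later pages are made to finish first on purpose.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def transcribe_page(page):
    """Hypothetical stand-in for per-page work: sleep, then return text."""
    number, delay = page
    time.sleep(delay)
    return f"text of page {number}"


def transcribe_in_order(pages, workers=4):
    # Executor.map runs the calls concurrently but yields results
    # in the order of the inputs, not in completion order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transcribe_page, pages))


# Page 3 finishes first, yet the output stays in page order.
pages = [(1, 0.03), (2, 0.02), (3, 0.01)]
print(transcribe_in_order(pages))
# ['text of page 1', 'text of page 2', 'text of page 3']
```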

Does scan quality or resolution affect processing reliability?

Yes. Higher-quality scans produce significantly better results, especially for handwriting. We recommend scanning at 300 DPI or higher, ensuring strong contrast, and avoiding shadows or reflections. Low-resolution photos, skewed pages, or pages with uneven lighting may lead to dropped characters or errors. When in doubt, test a representative sample using the free trial credits available on new accounts.

How fast does Handwriting OCR process documents?

Most documents finish in 15–20 seconds end-to-end. Each individual page takes roughly 5–10 seconds to transcribe, but because pages run in parallel a 10-page PDF doesn't take 10× as long as a single page — it finishes in roughly the time of the slowest page plus a small overhead.

Occasionally a complex or very large document can take longer — up to about 10 minutes in extreme cases — but that's rare. If a document fails or stalls, the credits for unprocessed pages are refunded automatically.
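The "slowest page plus overhead" reasoning can be written out as simple arithmetic. This is an idealised sketch that assumes every page runs at the same time; the 5-second overhead and the per-page timings are made-up illustration values, not measured figures.

```python
def parallel_seconds(page_seconds, overhead_seconds=5.0):
    """Idealised parallel estimate: slowest page plus a fixed overhead.

    overhead_seconds is an assumed value chosen for illustration only.
    """
    if not page_seconds:
        return 0.0
    return max(page_seconds) + overhead_seconds


def sequential_seconds(page_seconds):
    """What the same pages would take if processed one after another."""
    return float(sum(page_seconds))


# Ten pages, each within the 5-10 second range quoted above.
timings = [6, 7, 5, 10, 8, 9, 6, 7, 5, 8]
print(parallel_seconds(timings))    # 15.0
print(sequential_seconds(timings))  # 71.0
```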

How is crossed-out text handled in transcriptions?

By default, crossed-out text is transcribed as written. The OCR engine reads the underlying characters and includes them in the output — it doesn't try to guess that a line was meant to be deleted.

This is intentional: in many real-world documents (legal contracts, lab notebooks, edited drafts), the crossed-out content is still meaningful and shouldn't be silently discarded.

If you want crossed-out text removed

  • Edit in the dashboard — open the transcription in the dashboard and remove the offending lines manually before exporting.
  • Use a custom extractor — for forms with a known structure, define extractors that target only the fields you need; crossed-out marginalia won't be picked up.
  • Pre-process the image — if the strikethroughs are heavy enough, scanning at higher contrast and using image editing to mask the deleted regions before upload can also work, though that's usually more effort than editing afterwards.

If your workflow requires automatic crossed-out detection at scale, contact us — it's an area we can scope as part of a custom integration.

What file size or technical limits should I be aware of?

Individual uploads must be 20 MB or smaller. PDF files with very high-resolution images or large multi-page scans may need to be split to avoid hitting this limit. Uploads that exceed internal size thresholds may fail or produce incomplete results. Breaking large projects into smaller files generally provides smoother processing.
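If you need to split a large scan, one way to plan the split is to group consecutive pages so that each group stays under the limit. The sketch below only does the planning; it assumes you already know an approximate size per page (the sum of page sizes only approximates the size of the resulting file), and the function name and the 20,000,000-byte reading of the limit are assumptions.

```python
def plan_chunks(page_sizes, limit=20 * 1000 * 1000):
    """Group consecutive pages into chunks whose total size fits the limit.

    page_sizes: approximate size of each page in bytes, in page order.
    Returns a list of (first_page, last_page) pairs, 1-based and inclusive.
    """
    chunks = []
    start, total = 1, 0
    for number, size in enumerate(page_sizes, start=1):
        if size > limit:
            raise ValueError(f"page {number} alone exceeds the limit")
        if total + size > limit:
            # Close the current chunk and start a new one at this page.
            chunks.append((start, number - 1))
            start, total = number, 0
        total += size
    if page_sizes:
        chunks.append((start, len(page_sizes)))
    return chunks


# Five pages of roughly 8 MB each fit two to a file.
mb = 1000 * 1000
print(plan_chunks([8 * mb] * 5))  # [(1, 2), (3, 4), (5, 5)]
```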

What happens to my data after processing?

Your documents and results are stored securely for a short retention period to allow download and review. You can shorten this window or delete files manually at any time.

What scanning settings produce the best OCR results?

The scan itself usually has more impact on accuracy than anything else. Here's what to set on your scanner.

Resolution (DPI)

  • 300 DPI is ideal for most handwriting — clear enough for the model, small enough to upload quickly.
  • 600 DPI is worth using for old or faded handwriting, very small writing, or fine cursive.
  • Below 200 DPI, accuracy drops noticeably; avoid it.

We rescale uploads to 2000 px on the longest side before processing, so very high-resolution scans don't add accuracy beyond a point — they just make uploads slower.

One document per page

Don't combine multiple documents onto a single page (e.g. four receipts on one A4 sheet, or two photos in one image). Because we rescale to 2000 px on the longest side, packed pages reduce the effective resolution for each individual document. Upload each as its own page or its own file.
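The effect of packing can be estimated with a little arithmetic. The sketch below assumes the rescale is a plain proportional downscale to 2000 px on the longest side and that smaller images are left at their original size; the function name and the receipt and sheet dimensions are illustrative assumptions.

```python
def pixels_along(item_long_in, page_long_in, scan_dpi=300, max_long_px=2000):
    """Pixels left along an item's long side after the page is rescaled.

    item_long_in: long side of the individual document, in inches.
    page_long_in: long side of the scanned page that contains it, in inches.
    """
    page_px = scan_dpi * page_long_in
    # Assumption: pages already under the cap are not scaled up.
    scale = min(1.0, max_long_px / page_px)
    return item_long_in * scan_dpi * scale


# A 5.5-inch receipt scanned on its own at 300 DPI keeps every pixel.
print(round(pixels_along(5.5, 5.5)))    # 1650

# The same receipt packed onto an A4 sheet (11.69 in long side).
print(round(pixels_along(5.5, 11.69)))  # 941
```

Under these assumptions the packed receipt ends up with a little over half the pixels along its length, which is why uploading each document separately tends to read better.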

Compression

Lower compression is better. Heavily compressed scans introduce artifacts that look like ink to the model.

  • For PDFs: choose "high quality" or equivalent rather than "small file size".
  • For images: prefer PNG (lossless) over heavily compressed JPEG.

File format

PDF and TIFF perform equally well — choose whichever fits your workflow. PDF is easiest if your scanner produces multi-page output natively.

Colour vs greyscale

Greyscale is usually fine and produces smaller files. Colour can help in specific cases:

  • Faded or light marks (red or blue ink, pencil) sometimes read better in colour than in greyscale.
  • Heavily coloured paper or carbon copies.

Contrast and brightness

Don't pre-process contrast yourself — our preprocessing handles it automatically. Just check the scan is clearly legible to your eye: if you struggle to read it, the model will too.

For phone photos

If you're not using a scanner:

  • Lay the page flat — curl and creases distort the text.
  • Use even lighting — no harsh shadows from one side.
  • Hold the camera directly above the page, not at an angle.
  • Frame the whole page with a small margin, not just the text area.

A bright, flat photo taken straight on at 12 megapixels produces results comparable to a scanner at 300 DPI.
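As a rough sanity check on that comparison, you can work out how many pixels a 300 DPI scan of a page contains. This is plain arithmetic with standard paper dimensions; the function name is hypothetical, and the result says nothing about focus, lighting, or lens quality, which also affect a photo.

```python
def scan_megapixels(width_in, height_in, dpi=300):
    """Total pixels in a scan of the given page size, in megapixels."""
    return (width_in * dpi) * (height_in * dpi) / 1_000_000


# A4 is about 8.27 x 11.69 inches; US Letter is 8.5 x 11 inches.
print(round(scan_megapixels(8.27, 11.69), 1))  # 8.7
print(round(scan_megapixels(8.5, 11.0), 1))    # 8.4
```

A 12-megapixel photo therefore has some headroom for the margin around the page and for the mismatch between the camera's aspect ratio and the paper's.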

Why did my upload fail or get stuck in processing?

Upload failures typically come from oversized files, unsupported formats, poor internet connectivity, or documents containing corrupted pages. Processing failures can happen if the document has extremely low contrast, heavy artifacts, or incompatible layouts. Try re-exporting or rescanning the file, or splitting it into smaller sections. Testing a few representative pages with your free trial credits can help confirm that the document is compatible with the system.

Why did only part of my document process or appear in the results?

Partial results can occur when the source file is unusually large, contains corrupted pages, or includes sections that exceed internal processing limits. Very large books or scans stitched together into a single oversized PDF are the most common cause. Splitting the document into smaller batches usually resolves this. If accuracy is a concern, upload a few representative pages using the free trial credits provided with new accounts to confirm expected performance.

Why do multilingual or mixed-layout documents sometimes process incorrectly?

Documents that mix languages, combine handwriting with printed text, or contain tables, stamps, or irregular layout elements can confuse the model and may lead to unpredictable results. If your document includes mixed elements, upload a representative sample first using your free trial credits to determine how well the system handles your specific layout.