PK Systems

OCR — Image to Text Extractor

Drop an image and pull the text out of it (English, Portuguese, or Spanish) without uploading it anywhere.

Drop an image here or click to pick one. Supported formats: PNG, JPG, WebP, BMP (max 12 MB).

Recognition runs locally — your image never leaves the browser.

What this tool does

Optical Character Recognition (OCR) turns text inside an image (a screenshot, a photographed contract, a whiteboard snap, a scanned receipt, a book page) back into selectable, copyable, searchable text. Drop the image, pick the language, and the recognized text appears in seconds, ready to paste into your document or notes.

The image and the recognized text never leave your device: there is no upload, no copy of your file held on a third-party server, and no logging. That guarantee matters because the documents people most often run through OCR are exactly the ones they should not paste into a random online tool: IDs, passports, contracts, medical forms, payslips, tax letters, screenshots of internal apps.

Pick the language that matches your image (English, Portuguese, or Spanish); recognition accuracy drops sharply when the wrong model is used. The output is editable in place, so you can correct classic OCR confusions (0 vs O, 1 vs l vs I, m vs rn) before copying or downloading. Optionally, enable per-word confidence so each word is tagged with how certain the engine is, which is handy for quickly spotting the parts of a low-quality scan that still need a human eye.
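
To make the per-word confidence idea concrete, here is a minimal Python sketch of how tagged words could be turned into a "check these" view. It is an illustration only, not this tool's code: the 0 to 100 confidence scale, the threshold of 80, and the `[?word]` marker are all assumptions.

```python
from typing import List, Tuple


def flag_uncertain_words(words: List[Tuple[str, float]], threshold: float = 80.0) -> str:
    """Join recognized words, wrapping low-confidence ones in [?...].

    `words` holds (text, confidence) pairs. The 0-100 scale and the default
    threshold are assumptions made for this sketch.
    """
    parts = []
    for text, confidence in words:
        if confidence < threshold:
            parts.append(f"[?{text}]")  # mark for a human to review
        else:
            parts.append(text)
    return " ".join(parts)


# Hypothetical engine output for a three-word line.
sample = [("Invoice", 96.0), ("t0tal", 61.5), ("due", 92.3)]
print(flag_uncertain_words(sample))  # prints: Invoice [?t0tal] due
```

Only the flagged words need a second look against the original image; everything else can usually be skimmed.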

How to use it

  1. Drop the image — Screenshots and clean scans work best. Photos of documents work too if the lighting is even and the camera held straight.
  2. Pick the language — Match the language of the text in the image. Each model is downloaded once and cached. Mismatched models give nonsense.
  3. Extract — Click Extract text. First run downloads the engine and the language model — subsequent runs of the same language are fast.
  4. Edit, copy, download — The output box is editable. Fix any errors, then copy or download as a .txt file.

How OCR works (in 200 words)

Modern OCR works in five steps. First, the image is binarized: turned into pure black-and-white so the engine can tell ink from background regardless of paper color or shadow. Second, connected pixels are grouped into shapes, then into words and lines following the natural reading flow of the page. Third, each word is segmented into individual character candidates. Fourth, those candidates are fed through a neural network trained specifically on the chosen language, which is why picking the right language matters so much: the same letterform can be the most likely match for one letter in English and for a different letter entirely in Portuguese or Spanish. Fifth, a language model looks at the whole word in context and picks the most plausible reading from a dictionary of common forms; that is what lets a misread like ofice be silently corrected to office. The per-word confidence score is the engine's own self-reported certainty for each word: very high scores are almost always correct, and low scores are where you should glance at the original.
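
The first step, binarization, is easy to sketch. The example below uses Otsu's method, a common way to pick a black/white cutoff automatically; the text above does not say which method the engine uses, so treat Otsu as an assumption chosen for illustration. The image is represented as plain rows of grayscale values (0 = black, 255 = white), which is also an assumption.

```python
from typing import List


def otsu_threshold(pixels: List[int]) -> int:
    """Return the gray level that best separates dark from light pixels."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_dark, sum_dark = 0, 0
    for level in range(256):
        weight_dark += histogram[level]
        if weight_dark == 0:
            continue  # no pixels this dark yet
        weight_light = total - weight_dark
        if weight_light == 0:
            break  # every pixel is already on the dark side
        sum_dark += level * histogram[level]
        mean_dark = sum_dark / weight_dark
        mean_light = (sum_all - sum_dark) / weight_light
        # Between-class variance: large when the two groups are far apart.
        between = weight_dark * weight_light * (mean_dark - mean_light) ** 2
        if between > best_variance:
            best_variance, best_threshold = between, level
    return best_threshold


def binarize(image: List[List[int]]) -> List[List[int]]:
    """Map each pixel to 1 (ink) or 0 (background)."""
    cutoff = otsu_threshold([value for row in image for value in row])
    return [[1 if value <= cutoff else 0 for value in row] for row in image]


# A tiny 2x3 "scan": a dark vertical stroke on light, slightly uneven paper.
page = [[200, 30, 210],
        [190, 25, 205]]
print(binarize(page))  # prints: [[0, 1, 0], [0, 1, 0]]
```

Because the cutoff is computed from the image itself, the same code copes with white paper, yellowed paper, or a gray screenshot background without a hand-tuned constant.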

What works well, what doesn't

Great: clean PDF screenshots, well-lit scans of typed pages, screen captures of articles, printed book pages photographed straight on.
OK: photographed printed pages with even lighting, slightly skewed scans (under 5°), receipts in good shape, signage shot at moderate angles.
Poor: handwriting (the engine is trained on print, not cursive), heavily rotated or warped pages, low-light photos, very compressed JPEGs full of noise, decorative or stylized fonts, very small text (under about 10 pixels tall).

For tough images, increase the resolution before running OCR: sharp, well-lit pixels matter much more than file size, and a 1500-pixel-wide crop usually beats a blurry 4K original.
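
Two of the limits above are numeric (text about 10 pixels tall, skew of about 5°), so they can be expressed as a quick pre-check. The sketch below is hypothetical: it assumes you already have an estimate of text height and skew from somewhere else, and the function name and messages are invented for this example.

```python
from typing import List

MIN_TEXT_HEIGHT_PX = 10   # "very small text" limit from the list above
MAX_SKEW_DEGREES = 5      # "slightly skewed" limit from the list above


def triage(text_height_px: float, skew_degrees: float) -> List[str]:
    """Return the reasons an image is likely to OCR poorly (empty list = fine)."""
    issues = []
    if text_height_px < MIN_TEXT_HEIGHT_PX:
        issues.append("text under about 10 px tall")
    if abs(skew_degrees) >= MAX_SKEW_DEGREES:
        issues.append("skew of 5 degrees or more")
    return issues


print(triage(text_height_px=8, skew_degrees=2))    # prints: ['text under about 10 px tall']
print(triage(text_height_px=24, skew_degrees=-7))  # prints: ['skew of 5 degrees or more']
print(triage(text_height_px=24, skew_degrees=0))   # prints: []
```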

Frequently asked questions

Is my image uploaded?
No. The OCR runs entirely on your device. Your image and the recognized text never leave the browser, never travel to our servers, and are not stored, indexed, logged, or shared. The only network calls are the one-time downloads of the recognition engine and the language model on first use, after which the page works even if you go offline.
Why is the first run slow?
The first time you run OCR for a given language, the browser downloads that language's model (~10 MB); the very first run also downloads the recognition engine. Subsequent runs reuse the cached files and start instantly.
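
The download-once behaviour is ordinary caching. Here is a minimal in-memory sketch in Python, assuming a hypothetical `download` callable and made-up language codes (`eng`, `por`); the real page relies on the browser's own cache, which this does not model.

```python
from typing import Callable, Dict


class ModelCache:
    """Fetch each language model once, then serve it from memory."""

    def __init__(self, download: Callable[[str], bytes]) -> None:
        self._download = download          # hypothetical fetch function
        self._models: Dict[str, bytes] = {}
        self.download_count = 0            # how many real downloads happened

    def get(self, language: str) -> bytes:
        if language not in self._models:   # slow path: first use of this language
            self._models[language] = self._download(language)
            self.download_count += 1
        return self._models[language]      # fast path on every later call


def fake_download(language: str) -> bytes:
    # Stand-in for a ~10 MB network fetch.
    return f"model-{language}".encode()


cache = ModelCache(fake_download)
cache.get("eng")
cache.get("eng")   # served from the cache, no second download
cache.get("por")
print(cache.download_count)  # prints: 2
```
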
Can I OCR handwriting?
The recognition model is trained mostly on printed text. Neat block letters sometimes work; cursive or messy handwriting will give garbage. Handwriting OCR is a much harder problem — open-source browser-based engines do not handle it reliably yet, and we'd rather give you no result than a wrong one.
Why does it confuse 0/O and l/1?
Those characters are nearly identical in many fonts; even humans need context to tell them apart. The engine uses a language model to bias the choice, but it sometimes guesses wrong. Enabling per-word confidence highlights where to double-check.
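
To show what "using context" can mean, here is a deliberately crude stand-in in Python: if a token is mostly digits, lookalike letters are read as digits, and the other way round. A real engine uses a statistical language model rather than this majority vote, and the lookalike tables below are assumptions for the example.

```python
# Lookalike characters and what they most likely are in each context.
AS_DIGIT = {"O": "0", "o": "0", "l": "1", "I": "1"}
AS_LETTER = {"0": "O", "1": "l"}


def disambiguate(token: str) -> str:
    """Resolve 0/O and 1/l/I confusions by majority vote within one token."""
    digits = sum(ch.isdigit() for ch in token)
    letters = sum(ch.isalpha() for ch in token)
    if digits > letters:      # looks like a number: read lookalikes as digits
        return "".join(AS_DIGIT.get(ch, ch) for ch in token)
    if letters > digits:      # looks like a word: read lookalikes as letters
        return "".join(AS_LETTER.get(ch, ch) for ch in token)
    return token              # a tie gives no context, so leave it alone


print(disambiguate("2O24"))   # prints: 2024
print(disambiguate("he1lo"))  # prints: hello
print(disambiguate("1l"))     # prints: 1l
```

The last case is the honest one: with no surrounding context there is nothing to vote on, which is exactly when a human needs to look.
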
Can I add more languages?
Not directly. We currently offer the three languages with the broadest demand on this site (English, Portuguese, Spanish), since each language model is a ~10 MB download. If you need another language, get in touch and we will look at adding it.
The result is mostly correct but full of small typos.
Some recognition mistakes are normal on imperfect images. The output textbox is editable for exactly that reason — fix the obvious errors, then copy or download. For long documents, paste into your editor and run a spell-check pass to mop up the rest.
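
If you would rather script that spell-check pass, Python's standard `difflib` module can suggest the nearest known word for each unknown one. This is a rough sketch, not a real spell-checker: the vocabulary is a tiny made-up list, and the similarity cutoff of 0.8 is an assumption you may need to tune.

```python
import difflib
import re
from typing import Dict, Iterable


def suggest_corrections(text: str, vocabulary: Iterable[str], cutoff: float = 0.8) -> Dict[str, str]:
    """Map each unknown word in `text` to its closest vocabulary entry, if any."""
    known = sorted({word.lower() for word in vocabulary})
    suggestions: Dict[str, str] = {}
    # [^\W\d_]+ matches runs of letters, including accented ones (á, ç, ñ).
    for word in re.findall(r"[^\W\d_]+", text):
        lower = word.lower()
        if lower in known or lower in suggestions:
            continue
        matches = difflib.get_close_matches(lower, known, n=1, cutoff=cutoff)
        if matches:
            suggestions[lower] = matches[0]
    return suggestions


# Hypothetical vocabulary; a real pass would load a full word list.
vocabulary = ["the", "office", "is", "open"]
print(suggest_corrections("The ofice is open", vocabulary))  # prints: {'ofice': 'office'}
```

The function only suggests; it never rewrites the text, so a wrong guess cannot silently replace a correct but unusual word such as a name.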