The graphics/tesseract/tesseract port

tesseract-5.3.4 – OCR Engine developed at HP Labs

Description

The Tesseract OCR engine was one of the top three engines in the 1995
UNLV Accuracy test. Between 1995 and 2006 it had little work done on it,
but it is probably one of the most accurate open source OCR engines
available. It reads a binary, grey or color image and outputs text. A
built-in TIFF reader handles uncompressed TIFF images; libtiff can be
added to read compressed images.
WWW: https://github.com/tesseract-ocr/tesseract

Readme

+-----------------------------------------------------------------------
| Running tesseract on OpenBSD
+-----------------------------------------------------------------------

Before running Tesseract on text in a language other than English,
install the corresponding language pack, e.g.:
    # pkg_add tesseract-fra

Here is a quick HOWTO for optical character recognition using:
    scanimage(1) -- from the sane-backends package
    unpaper -- from the unpaper package
    convert(1) -- from the ImageMagick package
    tesseract

$ scanimage --mode gray --resolution 300 > scan.pnm
$ unpaper -b 0.5 -w 0.8 -l single scan.pnm scan1.pnm
$ convert scan1.pnm scan.tif
$ tesseract scan.tif scan
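
The four steps above can be wrapped in a small shell function. This is
only a sketch: the function name ocr_page, the RUN variable and the
"-clean" file naming are illustrative and not part of the port. It
assumes the same scanimage and unpaper options shown above and uses
tesseract's -l option to select the language, which requires the
matching language pack to be installed.

```shell
#!/bin/sh
# Sketch: scan -> clean -> convert -> OCR for a single page.
# ocr_page, RUN and the file naming are hypothetical helpers.

# Set RUN=echo to print the commands instead of executing them.
RUN=${RUN:-}

ocr_page() {
    base=$1          # output base name, e.g. "scan"
    lang=${2:-eng}   # Tesseract language code, defaults to English

    # scanimage writes the image to stdout, so it needs a redirect.
    if [ -z "$RUN" ]; then
        scanimage --mode gray --resolution 300 > "$base.pnm" || return 1
    else
        $RUN scanimage --mode gray --resolution 300 ">" "$base.pnm"
    fi

    # Clean up the scan, then convert it to TIFF for Tesseract.
    $RUN unpaper -b 0.5 -w 0.8 -l single "$base.pnm" "$base-clean.pnm" || return 1
    $RUN convert "$base-clean.pnm" "$base.tif" || return 1

    # Tesseract appends .txt to the output base name itself.
    $RUN tesseract "$base.tif" "$base" -l "$lang" || return 1
}
```

For example, "ocr_page scan fra" would scan one page and leave the
recognized French text in scan.txt, while "RUN=echo ocr_page scan fra"
only prints the commands it would run.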

Maintainer

The OpenBSD ports mailing-list

Only for arches

aarch64 alpha amd64 arm hppa i386 mips64 mips64el powerpc powerpc64 riscv64 sparc64

Categories

graphics textproc

Library dependencies

Build dependencies

Run dependencies

Reverse dependencies
