OCR, ICR and OMR recognition systems. Recognition of a set of segments. Post-processing of recognition results

Entering primary documents: digitization (image processing, document capture)
When an enterprise is computerized and its accounting automated, a large volume of text and graphic information has to be entered. Optical text recognition programs make it possible to digitize text information. Modern software and hardware systems automate the entry of large volumes of information using network scanners and parallel text recognition on several computers simultaneously.

OCR: purpose and recognition
Most optical character recognition (OCR) programs work with a raster image received through a fax modem, scanner, digital camera or other device. The purpose of an OCR system is to analyze raster information (a scanned symbol) and assign the corresponding character to each fragment of the image. After recognition is complete, an OCR system must be able to preserve the formatting of the source document: place paragraph breaks correctly, keep tables, graphics, etc. Modern recognition programs support the widely used text, graphic and spreadsheet formats, and some also support formats such as HTML and PDF.

Stream input
To enter large volumes, documents are scanned continuously on special industrial document scanners. Processing in such systems is semi-automatic and highly productive. Stream scanning is optimal for creating an electronic archive of a large volume of similar information (accounting documentation, reports, conclusions, scientific works, and so on). It is used for digitizing accounting and financial documents, contracts, legal documents, archival documents, library catalogs, etc.

Image-processing tools are used for automatic data entry into information systems from any type of document (identity, accounting, legal, etc.), to create electronic archives in which the necessary documents can be found quickly, to process large amounts of data (population census, unified state exam, etc.), and to convert scanned documents, images and PDF files into editable formats. Introducing modern stream-input tools makes it possible to reduce the cost of processing documents by more than 50%, increase the speed of input into information systems by 3-10 times, improve the convenience and quality of working with data (a high level of security for confidential data, fewer errors caused by the human factor during data entry), and optimize business processes by automating routine data entry and freeing up employees' time for core tasks. The average payback period ranges from three months to one year.

The main consumers of Image-processing in the world are large organizations (a little more than half of the market in monetary terms), medium-sized enterprises account for about a third, and the rest is small business.

Document recognition, document content analysis and data extraction are currently carried out using the following text recognition systems, which differ in cost, quality and speed:

  • OCR (Optical Character Recognition) is a technology for optical recognition of printed characters, i.e. converting a scanned image of printed characters into their text representation;
  • ICR (Intelligent Character Recognition) - recognition of individual block letters written by hand;
  • OMR (Optical Mark Recognition) - recognition of marks (usually squares or circles marked with a cross or a tick);
  • stylized digits - recognition of handwritten digits written according to a template, as on postal envelopes.

Over the years, recognition technology companies have tried to create acronyms to distinguish between OCR, ICR, OMR and technologies for effectively reading many types and styles of handwriting, including cursive.

Optical Character Recognition (OCR) technology examines scanned images of printed text and converts them into electronic text data. Although the most advanced systems can recognize almost all fonts, they work only with printed text and reject handwriting. Printed letters sit evenly on the page, which allows OCR to read one character at a time. When all characters in a word are recognized, the word is compared with a list of possible options for final approval of the result. Any text that is not perfect will challenge even the most advanced OCR system, so processing accuracy drops significantly for low-quality images. For example, when characters are broken up because of poor image quality, or several characters merge because of a blurred or dark background between them, recognition accuracy can decrease by as much as 20%.
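The step of comparing a recognized word with a list of possible options can be illustrated with a minimal sketch. It assumes that the recognizer returns several candidate characters with confidence values for each position; the lexicon, the candidate lists and the function name `best_word` are hypothetical and serve only as an illustration, not as the method of any particular OCR product.

```python
from itertools import product

# Hypothetical word list used for the final check.
LEXICON = {"clean", "clear", "dean"}

def best_word(candidates, lexicon=LEXICON):
    """Pick the most confident combination of characters that is a known word.

    `candidates` holds, for every character position, a list of
    (character, confidence) pairs produced by the recognizer.
    Returns None when no combination is found in the lexicon.
    """
    best, best_score = None, 0.0
    for combo in product(*candidates):
        word = "".join(char for char, _ in combo)
        score = 1.0
        for _, confidence in combo:
            score *= confidence
        if word in lexicon and score > best_score:
            best, best_score = word, score
    return best

# The top-ranked characters spell "cleam", which is not a word,
# so the lexicon check selects the next best reading.
candidates = [
    [("c", 0.9), ("e", 0.1)],
    [("l", 0.6), ("i", 0.4)],
    [("e", 0.9)],
    [("a", 0.8), ("o", 0.2)],
    [("m", 0.6), ("n", 0.4)],
]
print(best_word(candidates))
```

Enumerating every combination is practical only for short words and few candidates per position; it is used here purely to keep the idea visible.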

Intelligent Character Recognition (ICR) technology is mainly used to recognize handwritten text in block letters. ICR is capable of recognizing single characters written by hand.

The task of recognizing human handwriting is much more complex than recognizing simple printed texts, since no two people have the same handwriting. Factors such as mood, environment, stress - all this together changes handwriting, forcing a person to write characters differently each time. Like OCR, ICR performs recognition character by character and begins by separating words into their constituent components. Therefore, when performing ICR recognition, it is important that the letters are not written carelessly or joined together.

ICR is a more reliable tool for processing handwritten text than OCR. However, dictionaries are applied after the recognition process, not during it. Therefore, if the correct guess was not made during character segmentation and recognition, checking against a dictionary may not improve the result, and accuracy drops significantly.

Parascript ICR technology takes into account that handwriting elements have a dynamic structure. Handwriting, reduced to its basic elements, is essentially the movements produced by the writing instrument. Some elements embody the essence of all handwriting styles. For example, slope characterizes the trajectory of handwriting. Parascript calls this slope an XR element. It can be found in all letters. Combined, XR elements essentially form the shape of all the letters.

Parascript ICR technology focuses on the structure of the written word. Similar to how people search for meaning to read words that have partially rearranged letters (yuo spa lkiley raed tihs wthiuot a pborlem), Parascript ICR achieves similar recognition based on a contextual approach. By processing results during the recognition process, Parascript ICR produces highly accurate answers, which in turn give a higher level of recognition than results that are checked only at the end of the process.

OMR (Optical Mark Recognition) is the recognition of marks. Typically, the marks are crosses or ticks placed in squares or circles (checkboxes).
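One simple way to decide whether a checkbox is marked is to measure how much of its area is dark. The sketch below assumes the box has already been located and binarized (1 = dark pixel, 0 = light pixel); the function names and the 0.2 threshold are illustrative assumptions, not values taken from any OMR product.

```python
def fill_ratio(cell):
    """Fraction of dark pixels (1 = dark, 0 = light) in a checkbox region."""
    total = sum(len(row) for row in cell)
    dark = sum(sum(row) for row in cell)
    return dark / total if total else 0.0

def is_marked(cell, threshold=0.2):
    """Treat the box as marked when enough of its interior is dark.

    The 0.2 threshold is an illustrative assumption, not a standard value.
    """
    return fill_ratio(cell) >= threshold

empty_box = [[0, 0, 0, 0, 0] for _ in range(5)]
crossed_box = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
]
print(is_marked(empty_box), is_marked(crossed_box))
```

In practice the threshold would have to be tuned on real forms, since printed box borders and scanning noise also contribute dark pixels.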

Text recognition systems, or OCR (Optical Character Recognition) systems, are designed for automatically entering documents into a computer. This could be a page of a book, a magazine, a dictionary, some kind of document - anything that has already been printed and needs to be converted back into electronic form.

OCR systems recognize text and its various elements (pictures, tables) from an electronic image. The image is usually obtained by scanning a document and, less often, by photographing it. The image is processed by the OCR program's algorithm: areas of text, images and tables are identified, and noise is separated from the necessary data.

At the next stage, each character is compared with a special dictionary of characters, and if a match is found, then this character is considered recognized. As a result, you get a set of recognized characters, that is, the text you are looking for.
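Comparing a character image with a dictionary of characters can be sketched as simple template matching. The example assumes tiny 3x3 binary glyphs and a hypothetical three-letter template set; real systems use far richer features, so this only illustrates the idea of "find the closest known character or report no match".

```python
# Hypothetical "dictionary of characters": 3x3 binary templates.
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}

def similarity(a, b):
    """Share of pixels that agree between two equally sized glyphs."""
    pairs = [(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(x == y for x, y in pairs) / len(pairs)

def recognize(glyph, min_score=0.8):
    """Return the best matching character, or None if nothing is close enough."""
    best_char, best_score = None, 0.0
    for char, template in TEMPLATES.items():
        score = similarity(glyph, template)
        if score > best_score:
            best_char, best_score = char, score
    return best_char if best_score >= min_score else None

noisy_t = ["111", "010", "011"]  # a "T" with one stray pixel
blob = ["101", "101", "101"]     # resembles none of the templates
print(recognize(noisy_t), recognize(blob))
```

The `min_score` cutoff (an assumed value) is what lets the sketch leave a fragment unrecognized instead of forcing a wrong answer.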

Modern OCR systems are quite complex software solutions. The text can be noisy, distorted or dirty, and the program must take this into account and handle such situations correctly. In addition, modern OCR systems can produce an electronic copy of a printed document while preserving formatting, styles, text sizes, font types, etc.

Description of the OCR procedure

1. Pre-processing of the image.

2. Recognition of objects of higher levels.

3. Character recognition

4. Structuring hypotheses. Vocabulary check.

5. Synthesis of an electronic document.

Most optical character recognition (OCR) programs work with a raster image received via a fax modem, scanner, digital camera or other device. In the first step, OCR must break the page into blocks of text based on left and right alignment and the presence of multiple columns. Each block is then split into lines. Despite its apparent simplicity, this is not a trivial task, since in practice some distortion of the page image, or of page fragments at folds, is inevitable. Even a slight tilt causes the left edge of one line to be lower than the right edge of the next, especially when the line spacing is small. As a result, it becomes a problem to determine the line to which a given image fragment belongs. For example, for the letters j, Й, ё, even a slight tilt makes it difficult to determine which line the upper (detached) part of the character belongs to (in some cases it can be mistaken for a comma or a period).
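Splitting a block into lines is often illustrated with a horizontal projection: rows that contain ink are grouped into runs separated by blank rows. The sketch below assumes an ideal, untilted binary image (1 = black, 0 = white); the function name `split_into_lines` is hypothetical.

```python
def split_into_lines(page):
    """Return (first_row, last_row) pairs for each run of rows containing ink.

    `page` is a list of rows; each row is a list of 0 (white) / 1 (black).
    """
    lines, start = [], None
    for y, row in enumerate(page):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                      # a new text line begins
        elif not has_ink and start is not None:
            lines.append((start, y - 1))   # a blank row ends the line
            start = None
    if start is not None:                  # the last line touches the bottom edge
        lines.append((start, len(page) - 1))
    return lines

page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
print(split_into_lines(page))
```

This naive approach shows why tilt is a problem: on a skewed page the blank rows between lines disappear, and a detached dot or diacritic can form a run of its own.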

The lines are then divided into continuous image areas, which typically correspond to individual letters. The recognition algorithm forms hypotheses about which characters these areas correspond to, and then a final choice is made for each character, so that the page is reconstructed as text characters and, as a rule, in the appropriate format. OCR systems can achieve recognition accuracy of over 99.9% for clean images set in regular fonts. At first glance this accuracy seems ideal, but the error rate is still discouraging: if there are approximately 1500 characters on a page, then even with a 99.9% success rate there are one or two errors per page. In such cases, dictionary checking comes to the rescue: if a word is not in the system's dictionary, special rules are used to try to find a similar one. This still does not correct 100% of errors, so human control of the results is required.
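The arithmetic behind "one or two errors per page" and the idea of dictionary checking can be sketched as follows. The word list and the 0.8 similarity cutoff are assumptions for illustration, and the standard-library `difflib.get_close_matches` stands in for the "special rules" mentioned above; it is not the method of any specific OCR system.

```python
import difflib

def expected_errors(chars_per_page, accuracy):
    """Average number of misrecognized characters on one page."""
    return chars_per_page * (1 - accuracy)

# Hypothetical system dictionary.
DICTIONARY = ["recognition", "character", "document", "scanner"]

def correct_word(word, dictionary=DICTIONARY, cutoff=0.8):
    """Replace a word missing from the dictionary with its closest entry.

    If nothing is similar enough, the word is returned unchanged and
    would have to be reviewed by a person.
    """
    if word in dictionary:
        return word
    matches = difflib.get_close_matches(word, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else word

print(expected_errors(1500, 0.999))   # about 1.5 errors per page
print(correct_word("rec0gnition"))    # the digit 0 was read instead of "o"
```

A word that is wrong but still present in the dictionary passes this check untouched, which is one reason human control of the results remains necessary.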

54. Microsoft Word is a powerful word processor (a word processor performs more complex operations than a text editor, such as word wrapping and formatting), designed to handle all text-processing tasks.

It is currently the most widespread word processor and is part of the integrated Microsoft Office package. Its main purpose is creating and editing text documents. It has wide capabilities. The program is convenient for working with large documents thanks to its tools for creating indexes, tables of contents, headers and footers, hierarchical headings, etc.

1. The word processor's text capabilities extend from typing to spell checking, inserting graphics in *.pcx or *.bmp format and sound modules in *.wav format, and printing text. Other capabilities include placing graphic objects, tables, diagrams and hyperlinks in a document; automating document processing; using styles, lists and Word fields; creating macros; preparing text for publication (creating a table of contents, alphabetical index, footnotes, notes); joint work on a text by several users; generating documents by merging; using templates, etc. It works with many fonts from any of the 21 languages of the world. Text layouts and templates are available. Word can search for a specified fragment of text, replace it with another fragment, delete it, or copy it to the internal buffer. A bookmark in the text lets you quickly jump to the bookmarked place. Word allows you to include databases in your text. You can set a password. Word lets you open many windows so that you can work on multiple texts at the same time. Microsoft Word (often MS Word, WinWord or simply Word) is a word processor designed for creating, viewing and editing text documents, with local application of the simplest forms of table-matrix algorithms. It is produced by Microsoft as part of the Microsoft Office suite. The first version was written by Richard Brodie for IBM PCs running DOS in 1983.

A text editor is a program designed for creating and processing texts.

Four groups of editors:

1. Editors for printing text.

2. Word processors to create compound documents, i.e. documents consisting of texts, tables, figures, graphs.

3. Programs for text layout (in typography)

4. Editors for creating scientific texts

Operating modes of the Word editor:

· Normal mode - used for typing text.

· Page layout mode.

· Document structure (outline) mode - a system for breaking a document into parts. Designed for working with large texts that have a number of headings and subheadings.

· Web document mode.

Entering and editing text:

1. Do not type a space at the beginning of a sentence. A space counts as a character.

2. Do not press Enter to move to a new line within a paragraph, but be sure to press Enter to start a new paragraph.

3. Do not put a space before the characters . , : ! ? but do put a space after them.

4. First select the text, and only then perform an operation on it.
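The punctuation spacing rule from this list can be checked and repaired mechanically. The following is a minimal sketch using regular expressions; the function name `fix_spacing` is hypothetical, and the patterns are a simplification that deliberately leaves digits after a period alone so that decimals such as 3.14 are not broken.

```python
import re

# Whitespace placed before one of the characters . , : ! ?
SPACE_BEFORE_PUNCT = re.compile(r"\s+([.,:!?])")
# One of those characters followed directly by a letter or other symbol
# (not by whitespace, more punctuation, or a digit).
MISSING_SPACE_AFTER = re.compile(r"([.,:!?])(?=[^\s.,:!?\d])")

def fix_spacing(text):
    """Remove spaces before punctuation and add the missing space after it."""
    text = SPACE_BEFORE_PUNCT.sub(r"\1", text)
    text = MISSING_SPACE_AFTER.sub(r"\1 ", text)
    return text

print(fix_spacing("Hello ,world !How are you ?"))
```

The sketch does not understand abbreviations or web addresses (for example, it would turn "e.g." into "e. g."), so it is suitable only as an illustration of the rule.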

Document formatting includes:

1. Page formatting

2. Paragraph formatting

3. Character formatting

4. Table formatting

5. Picture formatting.

Creating a document.

The MS Word text editor offers two methods for creating a new document:

1. Based on a ready-made template

2. Based on an existing document.

The second method is more advanced, but the first is methodologically more correct. When creating a document based on an existing one, open the existing document, save it under a new name, then select all of its contents and delete them. The result is a blank document that has its own name and keeps all the settings of the source document.

Word includes a wide range of automation tools that make common tasks easier. Most of them were present in one form or another in previous versions of the editor, but the possibilities of automation are now much wider. These tools include:

Auto-replace, which automatically corrects typical typing mistakes;

Auto-fill (or auto-text), with the help of which you can automatically continue entering a word or fragment of text after entering the first few letters (the editor now has a certain database of such blanks from the very beginning);

Automatic spell checking, which covers spelling and grammar. The user can disable either type of check, or run the check only after the whole document has been entered;

Automatic creation and preview of styles;

Auto-format as you type, designed to automatically format a document directly as you type or after it is completed;

An assistant designed to automatically provide advice and reference information you may need as you complete a task.

For example, if the Assistant decides that you are about to start creating a letter, it will offer to launch the Letter Wizard.

Word has tools that make it easier to work with tables, borders, and shading:

Using the mouse, you can draw tables of various shapes (individual table cells can have any width and height). The border of a table cell, row, or column can be easily removed, which has the same effect as merging cells. In Word, you can merge any adjacent cells both horizontally and vertically;

The contents of table cells can be aligned to the top or bottom, or to the middle of the cell. Text inside cells can be positioned vertically (rotated 90 degrees);

Word includes more than 150 different types of borders that help decorate any document and give it a professional look;

Word offers a set of graphic tools with which you can enrich and decorate text and drawings by adding volume, shadows, texture and transparent fills, and auto-shapes.

The Microsoft Office graphics editor provides a large set of drawing tools. To decorate text and drawings, it offers more than 100 customizable auto-shapes, 4 types of fills (multi-color gradient, patterned, transparent and patterned), as well as shadows and volume.

List of Microsoft Word capabilities

Text editing is carried out using the following functions:

§ selecting, copying and pasting the desired piece of text;

§ inserting non-text objects into a Microsoft Word document (for example, graphic images, spreadsheets and charts, sound, video, etc.);

§ inserting page numbers, dates and times, footnotes, special characters, etc. into the document;

§ finding, moving and replacing a word, line, section, page, etc.;

§ undoing or redoing the last action performed on the text;

§ advanced document formatting options. Unlike WordPad, Word supports justification on both edges and multi-column layout;

§ use of styles for quick document formatting.

In addition to the listed features, the program offers a certain set of service functions, such as:

§ spelling and grammar checking, including background checking - as you enter text;

§ selection of synonyms for words (menu item “Thesaurus”);

§ hyphenation in the text of the document;

§ determination of document statistical data (number of characters, words, lines, paragraphs, pages);

§ work with macros and document templates.
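The document statistics mentioned in this list (characters, words, lines, paragraphs) are easy to approximate for plain text. The sketch below is an assumption-laden illustration rather than Word's actual algorithm: it treats blank lines as paragraph separators and counts physical lines, whereas a word processor counts lines and pages according to the page layout.

```python
def document_statistics(text):
    """Count characters, words, non-empty lines and paragraphs in plain text.

    Paragraphs are assumed to be separated by blank lines.
    """
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    lines = [line for line in text.splitlines() if line.strip()]
    return {
        "characters": len(text),
        "characters_no_spaces": sum(not c.isspace() for c in text),
        "words": len(text.split()),
        "lines": len(lines),
        "paragraphs": len(paragraphs),
    }

sample = "First paragraph,\nsecond line.\n\nSecond paragraph."
print(document_statistics(sample))
```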

The program also has a large set of functions for working with tables and graphics, a comprehensive help system, and much more.



Optical Character Recognition (OCR) systems are designed to automatically enter printed documents into a computer.

FineReader is an omnifont optical text recognition system. This means that it can recognize texts typed in almost any font without prior training. A distinctive feature of FineReader is high recognition accuracy and low sensitivity to printing defects, achieved through the use of “holistic targeted adaptive recognition” technology.

The process of entering a document into a computer can be divided into two stages:

1. Scanning. At the first stage, the scanner plays the role of the “eye” of your computer: it “views” the image and transmits it to the computer. In this case, the resulting image is nothing more than a set of black, white or colored dots, a picture that cannot be edited in any text editor.

2. Recognition. Image processing by OCR system.

Let's look at the second step in more detail.

Image processing by the FineReader system includes analysis of the graphic image transmitted by the scanner and recognition of each character. The processes of analyzing the page layout (determining recognition areas, tables, pictures, highlighting lines and individual characters in the text) and image recognition are closely related: the block search algorithm uses information about the recognized text for a more accurate analysis of the page.

As already mentioned, image recognition is carried out on the basis of “holistic targeted adaptive recognition” technology.

Integrity - an object is described as a whole using significant elements and relationships between them.

Focus - recognition is built as a process of putting forward and purposefully testing hypotheses.

Adaptability - the ability of the OCR system to self-learn.

In accordance with these three principles, the system first puts forward a hypothesis about the recognition object (a symbol, part of a symbol, or several glued symbols), and then confirms or disproves it, trying to sequentially detect all structural elements and the relationships connecting them. Each structural element contains parts that are significant for human perception: segments, arcs, rings and points.

Following the principle of adaptability, the program “adjusts” itself, using the positive experience gained from the first confidently recognized symbols. Targeted search and consideration of context make it possible to recognize torn and distorted images, making the system resistant to possible writing defects.

As a result of your work, recognized text will appear in the FineReader window, which you can edit and save in the format most convenient for you.

New features of ABBYY FineReader 7.0

Recognition accuracy

Recognition accuracy has been improved by 25%. Documents with complex layouts are analyzed and recognized better, in particular those containing sections of text on a colored background or a background consisting of small dots, and documents with complex tables, including tables with white dividers and tables with colored cells.

The new version adds specialized dictionaries for English and German, including the most commonly used legal and medical terms. This makes it possible to reach a qualitatively new level in recognizing legal and medical documents.

XML format support and integration with Microsoft Office

FineReader has a new save format: Microsoft Word XML. Users of the new version, Microsoft Office 2003, can now work with documents recognized by FineReader and take full advantage of the XML format.

Integration of FineReader with Microsoft Word 2003 lets you combine the powerful capabilities of these two applications for processing recognized text. You can check and edit recognition results using familiar Word tools while comparing the text transferred to Word with the original image: the FineReader Zoom window opens directly in the Word window.

New features will make your work more convenient. When creating a Word document, you can call FineReader, recognize the text and insert it into the place of the document where the cursor is located, that is, you can easily collect information from different paper sources or PDF files in one document. Recognition results can now be sent via e-mail as an attachment in any of the supported save formats.

Improved FineReader performance with PDF documents

The quality of PDF file recognition has improved significantly. Most PDF documents contain text in addition to the page image. FineReader 7.0 can extract this text and use it to check the results and improve the quality of recognition.

Now you can edit recognized PDF documents in the FineReader editor window: the changes made will be saved in any of the PDF file saving modes supported in the program.

The format of PDF files created by FineReader is optimized for publishing them on the Internet - the user will be able to view the contents of the first pages while the rest of the document is downloaded.

New saving options

A new format for saving recognition results, Microsoft PowerPoint, lets you quickly create new presentations or edit existing ones.

When saving to Microsoft Word, the size of the resulting file has been reduced, the formatting of documents with different delimiters has been improved, and new options for saving pictures have appeared.

Improved display of complex layout elements when saving to HTML, for example, wrapping text around non-rectangular pictures. In addition, the size of the HTML file has been reduced, which is very important for publishing documents on the Internet.

Ease of use

Updated intuitive user interface. It has become more convenient to work with professional settings. Editing toolbars have been moved to the window where recognition results are displayed. Convenient tools for managing FineReader windows have appeared: for example, you can set a convenient magnification level in each window.

The updated practical guide to improving recognition quality helps a novice user get started quickly, and lets a more experienced user configure the program for the best results with any type of document.

Professional Opportunities

Features that were previously available only in the Corporate Edition are now available in FineReader Professional Edition:

Improved barcode recognition; recognition of the PDF-417 two-dimensional barcode is supported.

Image splitting tool. With it you can divide images into areas and save each area as a separate page of the package. This makes it convenient to recognize multiple business cards scanned together, books, or printouts of PowerPoint presentation slides.

Morphological search. Any package created in FineReader can be used as a small database with full-text morphological search. Among all the recognized pages of the package, you can find those pages that contain the specified words in all their grammatical forms (for 34 languages with dictionary support).

Support for Intel processors with Hyper-Threading technology. Using this technology can significantly increase performance, which is especially important when a large number of documents must be recognized.

FineReader 7.0 also introduces other professional features:

Double-sided scanning. When you scan a document printed on both sides using a scanner that supports this option, you get images of each side as two separate pages of the package. If you only need to scan one side of a document, you can disable this option.

Opening graphic files in JPEG 2000 format and saving in this format are supported.

Network capabilities of FineReader Corporate Edition

All the details of installing and using FineReader Corporate Edition in a corporate network are described in the System Administrator's Guide, which you can find in the Administrator's Guide subfolder of the server folder where FineReader was installed.

Major improvements compared to the previous version:

Support for basic methods of automatic installation from a server to workstations. FineReader Corporate Edition supports all the main methods of automatic installation on a local network: using Active Directory, Microsoft Systems Management Server or using the command line.

Working with multifunction devices, including networked ones. Multifunction devices that combine the functions of a scanner, printer, copier and fax are becoming increasingly popular. It is no longer necessary to give each employee a separate scanner: one powerful device shared by all users in the organization is enough. FineReader can work with such devices, whether connected to a workstation or to the network. Special program settings allow the user to automatically open scanned images from anywhere on the local network or from an FTP server and recognize them.

Various volume licensing models. In addition to licensing based on the number of concurrent users, other licensing methods have also become available. You can choose the option that best suits your needs.

License Manager - a tool for managing licenses on the network. FineReader Corporate Edition now includes a convenient license management utility (License Manager). It helps track the use of FineReader on workstations, reserve licenses for workstations, and add new licenses.

General characteristics and functionality of the program Adobe PhotoShop

PhotoShop is a program for professional designers and everyone involved in processing graphic images. It lets you process and correct images entered into a computer from external sources (scanner, digital camera or digital video camera), i.e. it works with raster (digitized) graphics.

PhotoShop has many ready-made add-ons designed to create special effects, as well as very precise tools for manual image adjustment.

The main characteristics of PhotoShop are:

1. The ability to create a multi-layered image, where each layer can be edited separately and moved relative to other layers. The final image can be saved either in “multi-layer” form (PSD format), or with all the layers merged into one and converted into one of the standard formats (JPG, GIF, etc.).

2. A wide range of possibilities for working with color: working with different color modes (for example, you can view and edit a picture in both RGB and CMYK mode); tools for fine color adjustment (the parameters of each color can be adjusted separately).

3. Integrated vector editing capabilities.

4. The presence of several dozen tools for drawing and cutting out image contours, as well as professional tools for selecting and editing individual areas of the image.

5. Rich possibilities for combining images and working with textures.

6. The presence of many different filters and special effects (from simple ones, allowing you to adjust the sharpness of the image, to very exotic ones, allowing you to create 3-dimensional volumetric objects from two-dimensional photos, simulate the effects of explosions, cigarette smoke, etc.), the ability to connect additional plugins.

7. Support for the file formats of several dozen graphics programs, plus its own file format common to the IBM PC and Mac platforms.

8. Availability of tools for working with text, the ability to add text to any part of the image (on top of the picture), change the shape of the text, etc.

9. Multi-step undo of changes (using the special “History” panel).

Any scanned information is a graphic file (a picture). Therefore, scanned text cannot be edited without first being converted into text format. This conversion can be done using optical character recognition (OCR) systems.

To obtain an electronic (editable) copy of a printed document, the OCR program must perform a number of operations, including the following:

1. Segmentation - the “picture” received from the scanner is divided into segments (text is separated from graphics, table cells are split into separate pieces, etc.).

2. Recognition - text is converted from graphic form into regular text form.

3. Spell checking and editing - the built-in spell checker checks and corrects the output of the recognition system (doubtful words and characters are highlighted in color, and the user is informed about “uncertainly recognized characters”).

4. Saving - writing the recognized document to a file in the required format for further editing in the appropriate program.

In most OCR systems the above operations can be performed either automatically (using a wizard) or manually (step by step).

Modern OCR systems recognize texts typed in various fonts; work correctly with texts containing words in several languages; recognize tables and figures; allow you to save the result in a text or table format file, etc.

Examples of OCR systems include CuneiForm from Cognitive and FineReader from ABBYY Software.

The FineReader OCR system is released in different versions (Sprint, Home Edition, Professional Edition, Corporate Edition, Office). All of them, from the simplest to the most powerful, have a very user-friendly interface and, depending on the edition, a number of advantages that distinguish them from similar programs.

For example, FineReader Professional Edition (FineReader Pro) has the following functionality:

§ supports almost two hundred languages (including ancient languages and popular programming languages);

§ recognizes graphics, tables, documents on forms, etc.;

§ completely preserves all the features of formatting documents and their graphic design;

§ for texts that use decorative fonts or contain special symbols (for example, mathematical ones), a “Recognition with training” mode is provided, which creates a reference set of the characters found in the text for later use during recognition.
