Recognition of checks. Methods for withdrawing funds from applications. Receipt search using adaptive binarization with a high threshold

Ivan Ozhiganov April 7, 2016

Text recognition under varied conditions has long been, and remains, a relevant task. Automating the recognition of documents and credit cards, or recognizing a sign on a billboard and translating it into another language: all of this saves time on collecting and processing data. With the development of convolutional neural networks and of methods for training them, the quality of text recognition is steadily improving.

We once again became convinced of the effectiveness of using convolutional neural networks while working on a project to recognize cash receipts. The object of the study was cash receipts from a number of Russian retail outlets, with text in Cyrillic and Latin. At the same time, the developed system can be easily adapted to recognize cash receipts from other countries, with text in other languages. Let's look at the project in detail to show the principle of operation of the resulting solution.

The goal of the project is to develop an application with a client-server architecture that recognizes cash receipts and extracts the necessary semantic information from them.

Project overview

The task of recognizing receipts consists of several stages:

1. Preprocessing
   1.1. Search for a receipt in the image
   1.2. Binarization
2. Select text
3. Recognition
4. Extracting the necessary semantic component of the receipt

Implementation

1. Preprocessing

The task of preprocessing is the following: rotate the image so that the lines of the receipt are located as horizontally as possible, find the receipt in the image and binarize it.

1.1. Rotate the image and search for a receipt in it

We tried the following methods to find the receipt:

  • Adaptive binarization with high threshold
  • Convolutional neural network
  • Classifier with Haar features

Receipt search using adaptive binarization with a high threshold

Fig. 1. Original image of the receipt

At this stage, the task was to find a region in the image that contains the entire receipt and a minimum amount of background.

To simplify the search, the image is first rotated so that the text lines are as close to horizontal as possible (Fig. 2). The rotation algorithm chooses the angle that maximizes the variance of the row-wise sums of brightness; the maximum is reached when the lines are horizontal.

Fig. 2. Rotating the receipt
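The rotation criterion can be illustrated with a minimal sketch. It is not the project's code: instead of rotating the raster, it rotates the coordinates of ink pixels and scores each candidate angle by how concentrated the row histogram becomes. The function names, the angle range, and the step are assumptions made for illustration.

```python
import math

def row_sum_score(points, angle_deg):
    """Score of the row histogram after rotating the ink pixels by angle_deg."""
    a = math.radians(angle_deg)
    sin_a, cos_a = math.sin(a), math.cos(a)
    rows = {}
    for x, y in points:
        r = round(x * sin_a + y * cos_a)  # row index of the rotated pixel
        rows[r] = rows.get(r, 0) + 1
    # The pixel count is constant, so for a fixed number of rows the variance
    # of the row sums grows with the sum of their squares.
    return sum(c * c for c in rows.values())

def estimate_deskew_angle(points, max_angle=15, step=1):
    """Return the rotation (in degrees) that makes the text lines most horizontal."""
    n = int(round(2 * max_angle / step)) + 1
    candidates = [-max_angle + i * step for i in range(n)]
    return max(candidates, key=lambda ang: row_sum_score(points, ang))
```

A coarse search like this could be followed by a finer search around the best angle; the step size trades accuracy for speed.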

To search for a receipt, we used the adaptive_threshold function from the scikit-image library. With a high threshold, this adaptive binarization leaves pixels white in high-gradient regions and turns more uniform regions black. As a result, against a sufficiently uniform background only a small number of white pixels remain, and we look for the rectangle that bounds them. The resulting rectangle (Fig. 3) includes the area with the receipt and a minimum amount of background.

Fig. 3. Found area with the receipt
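The idea can be sketched in plain Python on a small grayscale image stored as a list of rows. This is an illustration under assumptions, not the project's code: the threshold is taken as the local mean plus an offset, and the names local_mean_threshold and bounding_box are hypothetical. In practice a library routine would be used; recent scikit-image releases expose local thresholding as threshold_local.

```python
def local_mean_threshold(img, block, offset):
    """Mark a pixel white (True) when it exceeds the mean of its
    block x block neighbourhood by more than `offset`."""
    h, w = len(img), len(img[0])
    # Integral image: integ[y][x] is the sum of img[0:y][0:x].
    integ = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            integ[y + 1][x + 1] = integ[y][x + 1] + row_sum
    half = block // 2
    mask = []
    for y in range(h):
        y0, y1 = max(0, y - half), min(h, y + half + 1)
        row = []
        for x in range(w):
            x0, x1 = max(0, x - half), min(w, x + half + 1)
            total = integ[y1][x1] - integ[y0][x1] - integ[y1][x0] + integ[y0][x0]
            mean = total / ((y1 - y0) * (x1 - x0))
            row.append(img[y][x] > mean + offset)
        mask.append(row)
    return mask

def bounding_box(mask):
    """Smallest rectangle (x0, y0, x1, y1), inclusive, around all white pixels."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not ys:
        return None
    xs = [x for row in mask for x, v in enumerate(row) if v]
    return min(xs), ys[0], max(xs), ys[-1]
```

With a large offset, uniform areas (both the background and the blank interior of the receipt) stay black, while edges and text stay white, so the bounding box hugs the receipt.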

Receipt search using convolutional neural network

We decided to look for the key points of the receipt using a convolutional neural network, as we had done in an earlier project. The corners of the receipt were chosen as key points. This method turned out to be quite good, but its quality was inferior to that of adaptive binarization with a high threshold.

The convolutional neural network did not show the best result because it learned to predict the coordinates of the corners only relative to the text it found. The location of the text relative to the corners differs from receipt to receipt, so the accuracy of the resulting model is limited.

The results of the network are shown below:

Fig. 4. Examples of a convolutional neural network finding the corners of a receipt

Search for a receipt using a cascade classifier with Haar features

As an alternative, we tried a cascade classifier with Haar features. After spending about a week on training and on tuning the receipt detection parameters, we still did not get a decent result. The convolutional neural network performed better.

Examples of the operation of the cascade classifier with Haar features:

Fig. 5. Positive results of the cascade classifier with Haar features

Fig. 6. False negatives and false positives of the cascade classifier with Haar features

1.2. Binarization

For binarization, we use the same adaptive_threshold function, with a window large enough to contain both text and background (Fig. 7).

Fig. 7. Binarization of a receipt

2. Select text

2.1. Selecting text using the connected component method

The first stage of text selection is the search for connected components. We implemented it with the findContours function from OpenCV. Most connected components are indeed characters, but some are noise left over after binarization; we remove them with filters on minimum and maximum area. For composite characters (:, И, =) we apply an algorithm that merges their connected components. The characters are then combined into words using a nearest-neighbor search: for each character, several nearest neighbors are found, and the most suitable candidates for joining on the right and on the left are selected from them. The algorithm is repeated until no characters remain outside of words (Fig. 8).

Fig. 8. Search for connected components and formation of words (each word is highlighted in one color)

Fig. 9. Formation of lines (each line is highlighted in one color)

The disadvantage of this algorithm is that it cannot correctly handle words with merged or broken letters.
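The filtering and word-forming steps can be sketched as follows. This is a simplified illustration, not the project's implementation: it only links each character to its nearest right-hand neighbor on the same line, and the box format (x, y, w, h), the function names, and the thresholds are assumptions.

```python
def filter_by_area(boxes, min_area, max_area):
    """Drop boxes that are too small (noise) or too large to be characters."""
    return [b for b in boxes if min_area <= b[2] * b[3] <= max_area]

def group_into_words(boxes, max_gap_ratio=0.6):
    """Chain character boxes (x, y, w, h) into words, left to right."""
    def right_neighbour(i):
        x, y, w, h = boxes[i]
        best, best_gap = None, None
        for j, (x2, y2, w2, h2) in enumerate(boxes):
            if j == i or x2 <= x:
                continue
            overlap = min(y + h, y2 + h2) - max(y, y2)   # vertical overlap
            gap = x2 - (x + w)                            # horizontal gap
            same_line = overlap > 0.5 * min(h, h2)
            if same_line and gap <= max_gap_ratio * h and (best is None or gap < best_gap):
                best, best_gap = j, gap
        return best

    next_of = [right_neighbour(i) for i in range(len(boxes))]
    has_left = {j for j in next_of if j is not None}
    words, seen = [], set()
    for i in sorted(range(len(boxes)), key=lambda k: boxes[k][0]):
        if i in has_left or i in seen:
            continue  # not the first character of a word
        word = []
        while i is not None and i not in seen:
            seen.add(i)
            word.append(boxes[i])
            i = next_of[i]
        words.append(word)
    return words
```

The gap threshold is expressed relative to character height so that the same setting works for receipts photographed at different scales.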

2.2. Selecting text using a grid

We noticed that almost all receipts use monospaced text. This means that a grid can be drawn on the receipt so that the grid lines pass between the characters:

Fig. 10. Grid example

An algorithm that finds the receipt grid automatically simplifies further recognition: a neural network is applied to each grid cell, so every character is recognized, there are no problems with merged or broken characters, and the number of consecutive spaces in a line is determined exactly.

To find such a grid, we tried the following algorithm. First, connected components are found in the binarized image:

Fig. 11. Example of searching for connected components

Then we take the lower left corners of these green rectangles and get a set of points given by two coordinates. To determine the distortion, we decided to use the following two-dimensional periodic function:

The graph of this function looks like this:

Fig. 12. Graph of the function

The idea of the grid extraction method is to search for nonlinear geometric distortions of the point coordinates such that the points fall on the peaks of the function. In other words, the problem is reduced to maximizing the sum of the values of this function over all points, with the distortion as the variable being optimized.

The geometric distortion was parameterized using RectBivariateSpline from SciPy (scipy.interpolate), and the optimization was carried out with SciPy's minimize function (scipy.optimize).

Fig. 13. Example of a correctly found grid

Fig. 14. Example of an incorrectly found grid

We abandoned this method because it has significant disadvantages: it is unstable and slow.

3. Text recognition

3.1. Recognition of text found using the connected component method

Text recognition is performed using a convolutional neural network trained on fonts cut from receipts. At the output of the network we have a probability for each letter, and we keep the first few options whose probabilities add up to a value close to 1 (99%). Next we consider all the words that can be composed from the retained letters and check them against a dictionary. This improves recognition accuracy by eliminating errors among similar characters (Z and E).

Unfortunately, this method works reliably only when the letters are neither broken nor merged.
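The candidate selection and dictionary check described above can be sketched like this. The function names and the scoring of dictionary hits by the product of character probabilities are assumptions for illustration; the article does not say how ties between several dictionary words are resolved.

```python
from itertools import product

def top_candidates(probs, mass=0.99):
    """Keep the most likely characters until their total probability reaches `mass`."""
    chosen, total = [], 0.0
    for ch, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append((ch, p))
        total += p
        if total >= mass:
            break
    return chosen

def best_dictionary_word(char_probs, dictionary, mass=0.99):
    """char_probs: one {character: probability} dict per position in the word."""
    options = [top_candidates(p, mass) for p in char_probs]
    best_word, best_score = None, 0.0
    for combo in product(*options):
        word = "".join(ch for ch, _ in combo)
        if word in dictionary:
            score = 1.0
            for _, p in combo:
                score *= p  # assumed ranking: product of character probabilities
            if score > best_score:
                best_word, best_score = word, score
    return best_word
```

The number of combinations grows multiplicatively with the number of ambiguous positions, so the 99% cutoff also serves to keep the search small.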

3.2. Whole word recognition

Recognizing the entire word is necessary in difficult cases, when the letters are broken or merged. We solved this problem in two ways:

  • using a recurrent neural network of the LSTM type;
  • using uniform segmentation.

LSTM

For complex cases, we decided to use an LSTM-type neural network to recognize the entire word, drawing on the research described in the articles “Reading Scene Text in Deep Convolutional Sequences” and “Can we build language-independent OCR using LSTM networks?”. For this purpose we used the OCRopus library.

Using monospaced fonts, we prepared a synthetic training set (Fig. 15).

Fig. 15. Examples from the synthetic training set

Having trained the network, we tested it on a validation set. The test results showed that the network had trained well. Then we tested it on real receipts. Below are the results:

The trained neural network worked well on simple examples, which we could already recognize successfully in another way. It could not cope with complex examples.

We decided to add various distortions to the training set in order to bring it closer to the words obtained from receipts (Fig. 16).

Fig. 16. Examples from the synthetic training set with distortions

To keep the network from overfitting, we stopped the training, prepared a new dataset, and continued training the network on it. The result of the training was as follows:

The resulting neural network recognized difficult words better but began to recognize simple words worse. This model did not satisfy us because it was not stable.

We assume that with a single font and little distortion, such a network would work much better.

Uniform segmentation

We came up with the idea of dividing the word into characters evenly, since the font on receipts is monospaced. To do this, we need to know the width of a character in the word. For each receipt, the mode of the character width is estimated. If the distribution of character widths is bimodal (Fig. 17), two modes are selected and the character width is determined separately for each line.

Fig. 17. Example of a bimodal distribution of character widths in a receipt

Once we have the approximate character width for a given line, we divide the width of the word by the character width to get the approximate number of letters. We then split the word evenly into that number of letters, and also into one letter more and one letter fewer:

Fig. 18. The process of finding the optimal segmentation

And we choose the best of these splits:

Fig. 19. Optimal segmentation
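The arithmetic of uniform segmentation can be sketched as follows. The article does not state how the best split is chosen, so the sketch takes the scoring function as a caller-supplied argument; all function names are hypothetical.

```python
from collections import Counter

def mode_width(widths):
    """Most frequent character width among the connected components."""
    return Counter(widths).most_common(1)[0][0]

def candidate_splits(word_left, word_width, char_width):
    """Cut positions for n - 1, n and n + 1 equal cells, where n is the
    estimated number of letters in the word."""
    n = max(1, round(word_width / char_width))
    splits = {}
    for k in (n - 1, n, n + 1):
        if k < 1:
            continue
        cell = word_width / k
        splits[k] = [word_left + i * cell for i in range(k + 1)]
    return splits

def best_split(splits, score):
    """Pick the split with the highest score; `score` maps a list of cut
    positions to a number and is supplied by the caller (an assumption)."""
    return max(splits.values(), key=score)
```

One plausible score is the total confidence of the character classifier over the resulting cells, so that the split whose cells look most like real letters wins.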

The accuracy of such segmentation is very high.

Fig. 20. Example of the algorithm working correctly

But sometimes we observed that the algorithm did not work entirely correctly:

Fig. 21. Example of the algorithm working incorrectly

After segmentation, each fragment is sent to a convolutional neural network and recognized.

4. Extracting the necessary semantic component of the receipt

Purchases are found in a receipt using regular expressions. All receipts share one feature: the purchase price is written in the format XX.XX, where X is a digit. This makes it possible to extract the purchase lines. The TIN is found as a sequence of 10 digits and verified by its checksum. The cardholder name is searched for in the NAME/SURNAME format.

Fig. 22. Results of extracting the necessary semantic component of a receipt
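A minimal sketch of this extraction step is shown below. The regular expressions are assumptions rather than the project's patterns: the price pattern allows any number of digits before the decimal point, and the name pattern covers only Latin capitals. The checksum is the standard control digit rule for a 10-digit Russian TIN (INN).

```python
import re

PRICE_RE = re.compile(r"\b\d+\.\d{2}\b")      # e.g. 89.90
TIN_RE = re.compile(r"\b\d{10}\b")            # 10-digit candidate
NAME_RE = re.compile(r"\b[A-Z]+/[A-Z]+\b")    # NAME/SURNAME

def tin10_is_valid(tin):
    """Control digit: weighted sum of the first nine digits, mod 11, mod 10."""
    weights = (2, 4, 10, 3, 5, 9, 4, 6, 8)
    control = sum(w * int(d) for w, d in zip(weights, tin)) % 11 % 10
    return control == int(tin[9])

def extract(lines):
    """Return (purchase lines, valid TINs, cardholder names) from recognized text."""
    purchases = [ln for ln in lines if PRICE_RE.search(ln)]
    tins = [m for ln in lines for m in TIN_RE.findall(ln) if tin10_is_valid(m)]
    names = [m for ln in lines for m in NAME_RE.findall(ln)]
    return purchases, tins, names
```

The checksum filter matters because a receipt contains many other digit runs (register numbers, barcodes) that a bare 10-digit pattern would also match.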

Conclusion

The task of recognizing cash receipts turned out to be less simple than it seemed at first glance. While searching for a solution, we encountered a large number of subtasks, each fully or partially related to the others. Complex algorithms such as LSTM recurrent neural networks are often perceived as a universal tool. In reality, such methods take a lot of time to master and are not always useful.

Work on the project continues. We are improving the quality of each stage of recognition and optimizing individual algorithms. At the moment the system recognizes good-quality receipts, those without merged or broken letters, with high accuracy. Receipts with merged or broken letters are recognized somewhat less well.

Every day we all go to grocery stores and spend our money. The most thrifty people catch promotions and look in electronic catalogs to find out where and what is cheaper.

But now they have another opportunity to get cashback. All you have to do is scan special codes, which since 2017 must be on cash receipts.

Making money by scanning QR codes from receipts on Android is not a scam, and reviews from people on the Internet are the best proof of that. Of course, the amounts returned are not large, but for frequent shoppers or for those who work in a store, it is still useful.

Now we will explain how it works and show you exactly what to do.

What is a QR code on a cash receipt?

In 2017, due to changes in federal law, some entrepreneurs and large companies were forced to start working with online cash registers.

The new rules require not only the use of modern equipment but also the transfer of cash receipts to customers through electronic means of communication.

At the buyer's request, the seller is obliged to provide him with a copy of the receipt in electronic format. This can be done via SMS (for example, Yandex.Taxi does this), but in regular stores no one knows the client’s phone number.

Therefore, in addition to a regular check, a QR code is applied to the paper, containing an electronic copy of the check.

You can simply walk into any supermarket and grab an armful of receipts to scan at home. There is no need to look for receipts with promotions: check any of them, and if they contain promotional items, you get a refund.

The QR code can be scanned with any mobile application to obtain the same data that is printed on the paper. It is just encoded information.

Recently, applications began to return part of the money for scanning it, although not for every product, only for those that participate in a special promotion.

Applications for making money by scanning QR codes

Applications that previously just offered a convenient way to monitor store promotions have actively joined the new trend. Every day there are more of them, but not all work correctly. Based on reviews, to make money on QR codes it is better to use:

Each application has its advantages, but you are better off downloading them all at once and as you read this article you will understand why. The most important thing is that all programs have special functionality installed for scanning QR codes from receipts of such popular stores as:

This is far from a complete list, so scan the codes from absolutely all receipts in order not to miss any reward. Just keep in mind that QR codes are valid for only a day from the moment the receipt is issued. It is therefore better to take a few minutes at the end of each day to enter the data into the application.

Instructions: how to scan QR from a receipt and earn money?

In general, using QR code scanners is not difficult, but some beginners have never done it before and need some guidance. We have prepared a clear example using the Qrooto app:

  1. After downloading and launching the application, you will immediately see a list of products for which money is refunded. The example below shows that you can get 3 rubles if the receipt contains Prostokvashino kefir or 5 rubles for Lipton tea. The balance is also displayed here; in Qrooto it is shown in points (10 points = 1 ruble):

  2. Through the menu, you will first need to register; after that, “Profile” will appear in place of this item. The same menu contains all the necessary sections and the receipt scanning function:

  3. Now point the camera at the QR code so that the application can scan it (there is no need to press anything). If everything is fine, you will see a corresponding notification, and within 24 hours you will receive a reward in the form of points:

  4. If you make money with the Qrooto app, you can withdraw the money (points) to your phone number, bank card or Yandex.Money. Everything is simple here: you accumulate the minimum amount (shown in the image), choose a method, and enter the details. The money arrives within 3 days at most:

The application works, everything is in Russian, and there is no annoying advertising or other drawbacks. Even a child can use it; the main thing is not to forget to keep your receipts.

Some sellers already collect receipts after work and check them using apps, collecting good money.

Questions about cashback from QR codes

Because this type of income has not yet become widely popular, many questions arise. We have already shown which applications to download and how to use them and withdraw money, and we have also collected users' frequently asked questions from forums:

  1. How many receipts can I scan?

You are allowed to check up to 10 QR codes per day, with no more than 3 from one store. This limitation is not a problem, because there are different applications and you can install them on the smartphones of your family members.

  2. I scanned the receipt, but the money didn’t arrive?

This happens if the information has not been received by the Federal Tax Service. The official website checks the compliance of cash receipts. You can check this yourself.

  3. What to do if the QR code is incorrect?

On some receipts, the QR code contains not a copy of the receipt but other information (for example, a coupon). Such data is not processed.

  4. How much can you earn from QR codes?

It is impossible to name an exact amount, because it all depends on the number of receipts and on the applications used. According to some data, average users manage to get back about 1,000 rubles per month.

  5. How else can you make money with apps?

Some apps have an affiliate program. For example, Inshopper pays 50 rubles to the invited friend and the same amount to the user who invited them, after the first purchase.

If you still have any questions regarding the applications, leave them in the comments. You can actually save money with QR codes, but not that much money. Although, it all depends on what you buy and how often.

Is making money from QR codes a scam? Reviews from real people!

Although this is a relatively new kind of earnings, reviews about scanning QR codes are actively being posted. People try to save on everything, and here practically nothing needs to be done. Here is a screenshot of statistics from the author of one positive review:

The Runet is quickly flooded with examples like this. All this suggests that making money from QR codes is not a scam.

The only thing everyone talks about is the small amount of cashback. Honestly, complaining about some bonuses received when buying milk or bread is simply stupid.

Earning money from QR codes is already available on Android and iOS and is gaining momentum. Even if it does not bring ordinary families huge profits, it will definitely save money. After all, many people go to a neighboring store to buy eggs that are 2 rubles cheaper.

Follow promotions and buy suitable products to save even more.


Essential expenses include food, personal care products and more. We all spend money on various goods, but few people realize that some of it can be returned. There are now many cashback services, and among them the applications that scan QR codes from offline store receipts stand out.

Scanning receipts through mobile applications can hardly be called a profitable way to earn money; it is rather an opportunity to save on everyday purchases. Even for bread, kefir or cookies you have already paid for, you can get a few rubles back, and no special effort is required.

Cashback via QR code in the receipt

In 2017, entrepreneurs in the Russian Federation were required to provide receipts in electronic form. To solve this problem, they began not only to use new equipment but also to print QR codes on receipts. What are they for? You can scan them at any time and see the same data that the receipt itself contains:

This is needed for checking the company's reporting, and it is also useful to the buyer, who can restore the proof of payment for the goods at any time. Another use for this code has been invented: participation in promotions. A number of large stores have launched a reward system for their customers. These include:

And this is not a complete list. Under special conditions they return part of the money spent simply for scanning the QR code on the receipt. The receipt only has to contain goods that participate in the promotion.

Special promotions in receipt scanning apps

Many applications have been created that help not only keep track of promotions in stores, but also scan QR codes. Below we will present a list of them, but now I would like to show a couple of examples:

Here is a list of products from one application that are participating in a promotion. As you can see, for purchasing some of them you can get up to 200 rubles back (and this is not the largest amount). The benefits are obvious, and when you click on one of the products you can see the detailed terms of the promotion:

In this case, you are invited not only to buy the product but also to leave a review about it. The conditions always differ; additional activity (beyond scanning the QR code) is rarely required. As you can see, such cashback can completely cover the cost of the product. This is not a joke: making money from QR codes is not a scam, and everything works.

How to scan a QR code of a receipt (best applications)

There are more and more mobile programs for saving money. In addition to a built-in QR code scanner, they publish selections of promotional (discounted) products, so you will save by choosing the best deals in your city. Install the best applications on your smartphones:

  1. InShopper - available for Android and iOS. This is one of the first applications in which you will receive a bonus of 50 rubles for registering using my link (just register on the website, not in the application). For the first uploaded receipt you will receive another 20 rubles, and for the second, 5 rubles. Payments are available from 300 rubles to mobile operators, to Yandex.Money, or in the form of gift certificates from well-known stores.
  2. Edadeal - an application for all mobile operating systems, although on outdated Android versions there is no QR code scanning function. To use the program, you need to connect a social network account or a Yandex account (the search engine owns 10% of the shares of this project). Payouts are made without commission to Yandex.Money or a phone. If you paid by card, the receipt must be scanned within 24 hours; if in cash, within 30 minutes.
  3. Qrooto - this application is recommended in many reviews about making money by scanning QR codes. You can withdraw money from it even to a bank card, and the minimum is only 10 rubles (100 points). You are given 99 points for registration, and receipts are checked within 24 hours. A quick registration is required. Up to 10 receipts are checked per day, with no more than 3 from one store.
  4. Together Cheaper - when using this receipt QR code scanner, you additionally need to check the barcode. This application has a much larger list of stores; in addition to offline supermarkets, there is cashback from Aliexpress, Ebay and over 1000 other companies (even the purchase of airline tickets earns a refund). Payments are made to your phone, card or Yandex wallet.

Interestingly, when several programs are used in parallel, you can sometimes get cashback from the same receipts. Don't be too lazy to scan QR codes with all the applications; below we explain why it is still worth downloading all the scanners.

An example of making money on QR codes

To show you how it works and exactly what you need to do, we have downloaded the InShopper app and are ready to walk you through the interface. Programs are downloaded from the official stores (Google Play and the App Store):

  1. On first launch, we immediately see a list of products, but first of all go to the “Profile” section:

  2. Registration is the same almost everywhere. Enter your phone number and receive a verification code via SMS:

  3. After authorization, the balance is shown in the profile, and buttons for withdrawing funds and inviting friends appear. The transaction history is also displayed:

  4. Now you can tap the central “Scan Code” button and point the camera at the QR code on the receipt. You don't need to press anything; just try to catch the code in the window, and the program will automatically choose the moment to take the picture:

  5. A new entry immediately appears in the transaction history. There you can monitor the processing of the receipts you have added:

  6. The final step is the withdrawal of funds. Everything is simple here: you choose a method and enter your payment details. As a rule, the money arrives within 24 hours:

In other applications, the interface is different, as are some of the conditions. But in general, they work according to the same scheme. You upload the code through the scanner, wait a while, and get cashback.

How much can you save with QR codes on store receipts?

I have come across articles with the headings “Earn 40,000 rubles with QR codes.” Their authors are openly disingenuous, because an ordinary buyer is unlikely to be able to save more than 1,000 rubles on one application per month. Reviews about scanning receipts vary; users often post their transaction history:

As a rule, they do not show large sums. However, it all depends on how active you are as a buyer and how closely you follow the products included in promotions. If you take advantage of all the offers, the cashback will be much greater.

Even for this seemingly simple way of earning, there are certain recommendations that allow you to get more cashback. It is unlikely that you will be able to recover all your expenses, but the tricks are worth knowing:

  • use all the devices in your home to install applications; this lifts the restriction on the maximum number of receipts you can upload;
  • download all the applications; this will also allow you to add more QR codes. In addition, the same codes sometimes work in different programs;
  • receipts with QR codes lie around in bundles in stores; just take some of these “pieces of paper” and scan them at home;
  • it is not necessary to choose only receipts with promotional items. Scan all QR codes; the programs themselves determine whether there is anything suitable in them;
  • find among your friends sellers from stores whose products are participating in promotions and collect cashback together;
  • use affiliate programs more actively and invite more people. In the article we showed the most effective methods.

There are often reviews claiming that making money by scanning QR codes from receipts is a scam. Their authors simply do not know who pays the money and for what. The money comes not from stores but from manufacturers, who in this way reward their customers and increase sales.

Even if you receive just one ruble for each loaf of bread and buy one every day, you will earn 30 rubles a month. Add to that cashback on milk, kefir, baked goods and other groceries. At a minimum, you can easily pay for your Internet or mobile service. And the most active users manage to make good money from this.

The App Store now offers a large number of applications for all kinds of tasks, from measuring your pulse and buying clothes online to calling a taxi and keeping track of your finances. Finance trackers are, as a rule, similar to each other and differ only in their interface, but today's application, Alzex Finance, stands out from the rest with one interesting feature, which is why many in the editorial office have started using it for financial accounting.

The application is ideal for maintaining a personal and family budget. The interface is so logical and well thought out that you can easily track any movement of money, down to the amount spent on a loaf of bread in a nearby store. Dividing expenses and income into categories lets you see where and in what amounts money is spent and who added a new transaction (it is possible to add a family member or counterparty).


You can add a transaction in a couple of clicks - enter a description, debit account, amount and category. It will immediately appear in the list of expenses (or income, if we are talking, for example, about wages), will be visible in analytics and will be reflected in the balance of the selected account in the “Accounts” tab.


So far, the description is similar to many other expense accounting applications, so let's move straight to what sets Alzex Finance apart. First of all, this is synchronization between different devices (the same iPhone, for example) and a computer (a desktop version is available). This allows several people to maintain the same database of expenses and income and synchronize their changes. Moreover, unlike other similar services, the database is stored locally and the program works without the Internet. This is especially relevant abroad, where every megabyte spent in roaming can be “golden”.

But one of the main functions in the application appeared relatively recently with an update. Alzex Finance has implemented the function of importing receipts, that is, the developers have eliminated the need to manually enter transactions. We scanned the receipt and the expense was added!

How does it work?

From July 1, 2017, all stores were required to print a QR code on receipts so that it could be scanned to obtain an electronic receipt. Scanning is carried out with the “Cash Receipt Verification” program from the Federal Tax Service of Russia. After the first launch, the application offers to recognize a receipt, and a few seconds later its digital copy is already on your iPhone.


To export a receipt to a financial accounting application, select the “Receive statement” section, then choose the time interval and the JSON data format. After the receipt loads, you will see strange symbols, but don't be alarmed: tap the “Export” button, select Alzex Finance from the list, and the receipt is added!


The program automatically sends imported receipts to the “Unconfirmed Transactions” section and selects a category for the product based on previously entered transactions. So, for example, all transactions with the word “gasoline” will be included in the fuel category, and all transactions with the word “Milk” will be included in the “Products” category.
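The behavior described above, suggesting a category from words seen in earlier transactions, can be illustrated with a small sketch. This is not Alzex Finance's code; the matching rule (first known word wins, case-insensitive) and the function names are assumptions.

```python
def build_keyword_index(confirmed):
    """confirmed: list of (description, category) pairs entered earlier."""
    index = {}
    for description, category in confirmed:
        for word in description.lower().split():
            index.setdefault(word, category)  # the first category seen for a word wins
    return index

def suggest_category(description, index, default=None):
    """Return the category of the first known word, or `default` if none match."""
    for word in description.lower().split():
        if word in index:
            return index[word]
    return default

history = [("Gasoline AI-95", "Fuel"), ("Milk 3.2%", "Products")]
index = build_keyword_index(history)
print(suggest_category("GASOLINE 40 l", index))  # Fuel
```

Imported items with no known words get no suggestion, which fits the idea of leaving them in an unconfirmed list for the user to review.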


FinPix is not just a unique scanner of cash receipts in Russian, but an application for complete accounting of household finances.

Features:

  • Recognition of cash receipts in Russian. The application recognizes individual items in a receipt and extracts the product name, price, quantity, discount and cost;
  • Recognition of banking SMS messages. If some SMS messages from banks are not recognized, you can send us examples from the application and we will add support for the new formats;
  • Accounting for income, expenses, transfers between accounts, as well as debts, loans, deposits, and currency exchanges based on them;
  • Maintaining accounts, tracking their balances;
  • Maintaining several separate budgets within the application;
  • Using categories and subcategories to classify expenses. For convenience, you can configure which categories individual products belong to, then FinPix will automatically determine the category for familiar products;
  • Recording of transactions in any ISO 4217 currency, with automatic conversion of values into different currencies at the rate on the date of the expense (the necessary rates are downloaded from the website of the Central Bank of the Russian Federation);
  • Analysis of the structure and dynamics of income and expenses on the corresponding diagrams;
  • Viewing the transaction log in the form of a list with the ability to filter by categories and periods, accounts and sources of income, text search for transactions;
  • Export of the main application data to an xlsx file and import of data in the same format, which can be used for creating backup copies and for exchanging data when several family members manage the family budget together;
  • The data in the exported xlsx file is adapted for analysis in office applications; in particular, a pivot table of expenses by category and period is created in the file right away;
  • FinPix is a free application; the only paid functionality in it is including the names of goods from receipts when exporting to xlsx (this does not apply to expenses that were entered into the application manually or imported).

Note: Recognizing a receipt from photographs may take a long time, up to several minutes. The processing time depends on the quality of the receipt itself, the quality of the photographs, and the performance of the device. While receipt recognition is running, you can continue working with the application, including taking photographs of other receipts. In this case, the application will recognize the receipts one after another. The FinPix application itself does not send receipts anywhere: recognition takes place directly on your phone or tablet, so the data about your expenses remains yours alone.

Start using FinPix for home accounting, scan your receipts, and you will learn the structure of your expenses without much effort. Along the way, you will build your own database of prices for goods in different stores, keep a record of which goods you bought, how many and where, and be able to track your personal inflation. Whether you are doing repairs or going on vacation, save detailed expenses so that you can plan them accurately next time.
You can download the FinPix accounting and receipt scanner application for Android from the link below.