
Have you ever wondered what would happen if there were no identity documents?

ID cards are essential to validate our data and our own identity. If they did not exist, we would not have the right to identity, nor would we be able to prove who we are or identify ourselves when carrying out some procedures.

Their importance is considerable in such procedures as they are public, official, personal, and non-transferable papers. Despite their relevance, these documents have not always been the same, and like many others, Spanish ID has undergone different changes to adapt to the new era.

Among these changes, we find the introduction of the OCR system as a mechanism for reading data from identity documents. You may not know anything about OCR technology. That is why we have gathered everything you need to know about this technology in this post: how it works, its applications, and how Mobbeel makes use of it through MobbScan.

 

What is OCR technology and what does its acronym mean?

The acronym OCR stands for Optical Character Recognition. Before explaining what optical character recognition is, it helps to break the acronym down letter by letter:

O: Optical

People use their eyes and brain to recognise images and read documents. Computers, in contrast, use a scanner or camera to capture documents and images, and treat them as a simple grid of pixels.

C: Characters

Characters are units of information that correspond to symbols or graphemes. In other words, they are the compositions of pixels, curves, and lines that form the written digits and the letters that we use in the alphabet.

R: Recognition

Character recognition takes place after the optical scanner digitises the image.

When the characters have been scanned, the OCR software proceeds to identify the letters and digits in the image and converts them into words.

You may already have an idea of what OCR is, but there is a little more to it. OCR is a technique for recognising characters in written texts and images and transcribing them into digital format.

However, the system must have learned and internalised the characters that it has to recognise in advance for this recognition to occur. In other words, this system analyses documents and images in different media and formats and recognises characters in them that match the information it has stored.

 

How does the OCR system work?

The character recognition system has two stages:

  1. Digital image processing, which modifies the input image to remove anything that could hinder character recognition. It involves thresholding (converting the image to binary), cleaning and noise removal, and morphological transformations that improve the shape of the strokes, especially for handwriting recognition.
  2. Classification, which applies recognition techniques to each character. Approaches range from very simple ones, based on comparison using geometric or statistical methods, to more complex ones that use the latest machine learning techniques.
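The two stages can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the 3x5 glyph templates, the fixed threshold of 128 and the function names are invented for the example, noise removal and morphological steps are left out, and classification uses the simplest comparison approach (counting mismatching pixels against stored templates) rather than machine learning.

```python
# Toy 3x5 glyph templates; 1 = ink, 0 = background (illustrative only).
TEMPLATES = {
    "0": ["111", "101", "101", "101", "111"],
    "1": ["010", "110", "010", "010", "111"],
    "7": ["111", "001", "010", "010", "010"],
}


def binarise(gray, threshold=128):
    """Stage 1: turn a grayscale image (0-255) into ink/background bits."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def hamming(bits, template):
    """Count the pixels where the glyph and the template disagree."""
    return sum(
        b != int(t)
        for row_bits, row_t in zip(bits, template)
        for b, t in zip(row_bits, row_t)
    )


def classify(bits):
    """Stage 2: pick the stored template closest to the binarised glyph."""
    return min(TEMPLATES, key=lambda ch: hamming(bits, TEMPLATES[ch]))


# A grayscale "7": dark pixels are ink, and one pixel is corrupted.
noisy_seven = [
    [10, 20, 30],
    [250, 240, 15],
    [245, 25, 250],
    [240, 30, 90],   # the 90 is a smudge that survives thresholding
    [250, 20, 245],
]
recognised = classify(binarise(noisy_seven))
```

Even with one corrupted pixel, the glyph is still closer to the stored "7" than to any other template, which is why comparison-based classification tolerates a small amount of noise.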

 

OCR applications to digitalise identity documents

Your company can use OCR technology for different purposes, particularly data extraction and verification. Here are the most representative use cases when it comes to identity documents:

Digitisation of identity documents

Many companies run campaigns to update the ID cards they hold for their clients. OCR simplifies the digitisation process: documents are scanned through the web and validated, and their information is extracted quickly and efficiently, saving time and effort.

Age verification

Minors are not allowed to access online betting sites. Online gaming operators have to check that participants are over eighteen years old by verifying and validating the identity of users during registration. To do so, the user's ID card is scanned and its data is extracted using OCR.
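Once OCR has extracted the date of birth, the age check itself is simple arithmetic. The sketch below assumes the date arrives as six digits in YYMMDD form, as identity documents encode it in their machine-readable zone. The function names and the rule used to guess the century are assumptions made for this example, not part of any standard or product.

```python
from datetime import date


def parse_birth_date(yymmdd, today):
    """Turn a YYMMDD birth date into a date object.

    Assumption: a two-digit year later than the current one belongs
    to the previous century, since YYMMDD does not store the century.
    """
    yy, mm, dd = int(yymmdd[0:2]), int(yymmdd[2:4]), int(yymmdd[4:6])
    century = 2000 if yy <= today.year % 100 else 1900
    return date(century + yy, mm, dd)


def age_on(birth, today):
    """Full years elapsed between birth and today."""
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    return today.year - birth.year - (0 if had_birthday else 1)


def is_adult(yymmdd, today, minimum_age=18):
    """True when the holder has reached minimum_age on the given day."""
    return age_on(parse_birth_date(yymmdd, today), today) >= minimum_age
```

Passing the reference date explicitly, instead of calling `date.today()` inside the function, keeps the check reproducible and easy to test.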

Extraction of meta-information from an ID card automatically

Given a scanned document or an image of a valid identity document, OCR can extract every information field it contains, along with the photo.

Some of our clients send scanned identity documents to the MobbScan API, which crops the image of the ID card and extracts the information from the document by OCR.

 

What kinds of documents can be verified automatically through OCR?

According to ICAO Doc 9303, there are three standard document types whose data is encoded so that an OCR system can read it.

Size 1 Travel Document (TD1)

TD1 is mainly used for identity cards. Space on this document is limited, so the machine-readable zone (MRZ) is placed on the back. For this reason, both the front and the back of the document must be captured in order to extract the important information and validate this kind of document. The MRZ of a TD1 has three lines of thirty characters each. In addition, each country can add custom content to this area.

Size 2 Travel Document (TD2)

The TD2 document is larger than the TD1 (105 × 74 mm, against the credit-card-sized TD1). Its MRZ is on the front, so scanning that side alone is enough to verify it. The MRZ of a TD2 spans two lines of thirty-six characters each.

Size 3 Travel Document (TD3)

The TD3 document is the format used in most passports. Its MRZ sits at the bottom of the data page, which enables solutions such as MobbScan to speed up passport control and data extraction, since only one page needs to be processed. The MRZ has two lines of forty-four characters each.
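Because the three formats differ in the number of MRZ lines and in line length, the document type can be inferred from the shape of the OCR output alone. The sketch below uses the line lengths specified in ICAO Doc 9303 (30, 36 and 44 characters). The function and table names are hypothetical.

```python
# (number of lines, characters per line) -> ICAO Doc 9303 format
MRZ_LAYOUTS = {
    (3, 30): "TD1",
    (2, 36): "TD2",
    (2, 44): "TD3",
}


def detect_mrz_format(lines):
    """Return 'TD1', 'TD2' or 'TD3', or None if the shape is unknown."""
    lengths = {len(line) for line in lines}
    if len(lengths) != 1:
        return None  # no lines, or lines of differing length
    return MRZ_LAYOUTS.get((len(lines), lengths.pop()))
```

A result of `None` is a useful early signal that the OCR step dropped or merged characters, before any field is parsed.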

 

How does MobbScan extract personal information through OCR?

MobbScan extracts all the data that an identification document collects through optical character scanning to optimise identity validation.

Mobbeel’s advanced technology scans the document, detecting and reading the information in the machine-readable zone or MRZ. After this, it is decoded and converted into user-readable information.
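To illustrate what decoding an MRZ into user-readable information involves, here is a minimal sketch for the two-line TD3 (passport) layout. It is not MobbScan's implementation: the function name and the returned dictionary keys are assumptions for this example, and the field positions follow the TD3 layout in ICAO Doc 9303. The sample lines are the fictional "Utopia" specimen commonly used to illustrate the standard.

```python
def parse_td3(line1, line2):
    """Split the two 44-character lines of a TD3 MRZ into named fields."""
    if len(line1) != 44 or len(line2) != 44:
        raise ValueError("a TD3 MRZ has two lines of 44 characters")
    # "<<" separates the surname from the given names; "<" pads and joins.
    surname, _, given = line1[5:].partition("<<")
    return {
        "document_type": line1[0:2].replace("<", ""),
        "issuing_country": line1[2:5],
        "surname": surname.replace("<", " ").strip(),
        "given_names": given.replace("<", " ").strip(),
        "document_number": line2[0:9].replace("<", ""),
        "nationality": line2[10:13],
        "date_of_birth": line2[13:19],  # YYMMDD
        "sex": line2[20],
        "expiry_date": line2[21:27],    # YYMMDD
    }


# Specimen data, padded with "<" to the full line length.
line1 = "P<UTOERIKSSON<<ANNA<MARIA".ljust(44, "<")
line2 = "L898902C36UTO7408122F1204159ZE184226B<<<<<10"
fields = parse_td3(line1, line2)
```

The positions skipped between fields (for example index 9 and index 19 of the second line) hold the control digits that guard the preceding field.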

The digital scanning can be carried out in two ways depending on the needs and demands of the client:

  • MRZ-only scanning extracts just the information included in the machine-readable zone of the ID card or passport being processed. The MRZ contains all the basic data of a person (name, date of birth, expiration date, issuing country, document number, etc.) and also includes several check digits that validate that the extracted data is correct and has not been manipulated. For this to work, the document must comply with the international standard ICAO 9303.
  • A full scan of the official identification document allows additional information to be extracted, such as the address and the issuing office. This type of scanning also allows us to verify that the data on both sides of the document match.
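The control (check) digits in the MRZ follow a simple rule defined in ICAO Doc 9303: each character is mapped to a number (digits keep their value, A to Z become 10 to 35, and the filler `<` counts as 0), multiplied by the repeating weights 7, 3, 1, and the sum is taken modulo 10. The sketch below implements that rule. The function names are chosen for this example only.

```python
DIGITS = "0123456789"
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def mrz_char_value(ch):
    """Numeric value of one MRZ character under the ICAO 9303 rule."""
    if ch in DIGITS:
        return int(ch)
    if ch in LETTERS:
        return LETTERS.index(ch) + 10
    if ch == "<":
        return 0
    raise ValueError(f"character not allowed in an MRZ: {ch!r}")


def mrz_check_digit(field):
    """Weighted sum with repeating weights 7, 3, 1, taken modulo 10."""
    weights = (7, 3, 1)
    total = sum(
        mrz_char_value(ch) * weights[i % 3] for i, ch in enumerate(field)
    )
    return total % 10


def field_is_valid(field, check_char):
    """Compare the computed check digit with the one read from the MRZ."""
    return check_char in DIGITS and mrz_check_digit(field) == int(check_char)
```

For the birth date `740812`, the weighted sum is 49 + 12 + 0 + 56 + 3 + 2 = 122, so the check digit is 2. If OCR misreads a single digit and returns `740813`, the sum becomes 123 and the check fails.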

Mobbeel's OCR technology works reliably and accurately to comply with Know Your Customer regulations, both with documents that comply with ICAO 9303 (travel documents) and with documents that do not follow this standard, such as the EU driving licence.

 

If you want to know more about our OCR technology (MobbScan), feel free to contact us through our contact form.