INDEX TERMS—Automatic license plate recognition (ALPR), automatic number plate recognition (ANPR), car plate recognition (CPR), optical character recognition (OCR) for cars.
ABSTRACT—Automatic license plate recognition (ALPR) is the extraction of vehicle license plate information from an image or a sequence of images. The extracted information can be used with or without a database in many applications, such as electronic payment systems (toll payment, parking fee payment), and freeway and arterial monitoring systems for traffic surveillance. The ALPR uses either a color, black and white, or infrared camera to take images. The quality of the acquired images is a major factor in the success of the ALPR. ALPR as a real-life application has to quickly and successfully process license plates under different environmental conditions, such as indoors, outdoors, day or night time. It should also be generalized to process license plates from different nations, provinces, or states. These plates usually contain different colors, are written in different languages, and use different fonts; some plates may have a single color background and others have background images. The license plates can be partially occluded by dirt, lighting, and towing accessories on the car. In this paper, we present a comprehensive review of the state-of-the-art techniques for ALPR. We categorize different ALPR techniques according to the features they used for each stage, and compare them in terms of pros, cons, recognition accuracy, and processing speed. Future forecasts of ALPR are given at the end.
I. INTRODUCTION
Manuscript received May 21, 2011; revised February 21, 2012; accepted April 6, 2012. Date of publication June 8, 2012; date of current version February 1, 2013. This work was supported in part by the Natural Sciences and Engineering Research Council of Canada and Alberta Innovates Technology Futures. This paper was recommended by Associate Editor Q. Tian.
S. Du and M. Ibrahim are with IntelliView Technologies, Inc., Calgary, AB T2E 2N4, Canada (e-mail: du@intelliview.ca; ibrahim@intelliview.ca).
M. Shehata is with the Department of Electrical and Computer Engineering, Faculty of Engineering, Benha University, Cairo 11241, Egypt (e-mail: mohamed.shehata@ieee.org).
W. Badawy is with the Department of Computer Engineering, College of Computer and Information System, Umm Al-Qura University, Makkah 21955, Saudi Arabia (e-mail: badawy@badawy.ca).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TCSVT.2012.2203741
Fig. 1. (a) Standard Alberta license plate. (b) Vanity Alberta license plate.
AUTOMATIC license plate recognition (ALPR) plays an important role in numerous real-life applications, such as automatic toll collection, traffic law enforcement, parking lot access control, and road traffic monitoring [1]–[4]. ALPR recognizes a vehicle's license plate number from an image or images taken by either a color, black and white, or infrared camera. It is accomplished by combining many techniques, such as object detection, image processing, and pattern recognition. ALPR is also known as automatic vehicle identification, car plate recognition, automatic number plate recognition, and optical character recognition (OCR) for cars. The variations of the plate types or environments cause challenges in the detection and recognition of license plates. They are summarized as follows.
1) Plate variations:
a) location: plates exist in different locations of an image;
b) quantity: an image may contain no or many plates;
c) size: plates may have different sizes due to the camera distance and the zoom factor;
d) color: plates may have various characters and background colors due to different plate types or capturing devices;
e) font: plates of different nations may be written in different fonts and languages;
f) standard versus vanity: for example, the standard license plate in Alberta, Canada, has three or, since 2010, four letters on the left and three numbers on the right, as shown in Fig. 1(a). Vanity (or customized) license plates may have any number of characters without any regulations, as shown in Fig. 1(b);
g) occlusion: plates may be obscured by dirt;
h) inclination: plates may be tilted;
i) other: in addition to characters, a plate may contain frames and screws.
2) Environment variations:
a) illumination: input images may have different types of illumination, mainly due to environmental lighting and vehicle headlights;
b) background: the image background may contain patterns similar to plates, such as numbers stamped on a vehicle, bumper with vertical patterns, and textured floors.
1051-8215/$31.00 © 2012 IEEE
312 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 23, NO. 2, FEBRUARY 2013
Fig. 2. Four stages of an ALPR system.
The ALPR system that extracts a license plate number from a given image can be composed of four stages [5]. The first stage is to acquire the car image using a camera. The parameters of the camera, such as the type of camera, camera resolution, shutter speed, orientation, and light, have to be considered. The second stage is to extract the license plate from the image based on some features, such as the boundary, the color, or the existence of the characters. The third stage is to segment the license plate and extract the characters by projecting their color information, labeling them, or matching their positions with templates. The final stage is to recognize the extracted characters by template matching or using classifiers, such as neural networks and fuzzy classifiers. Fig. 2 shows the structure of the ALPR process. The performance of an ALPR system relies on the robustness of each individual stage.
The purpose of this paper is to provide researchers with a systematic survey of existing ALPR research by categorizing existing methods according to the features they use, analyzing the pros and cons of these features, comparing them in terms of recognition performance and processing speed, and raising open issues for future research.
The remainder of this paper is organized as follows. In Section II, license plate extraction methods are classified with a detailed review. Section III demonstrates character segmentation methods and Section IV discusses character recognition methods. At the beginning of each section, we define the problem and its levels of difficulties, and then classify the existing algorithms with our discussion. In Section V, we summarize this paper and discuss areas for future research.
II. LICENSE PLATE EXTRACTION
The license plate extraction stage influences the accuracy of an ALPR system. The input to this stage is a car image, and the output is a portion of the image containing the potential license plate. The license plate can exist anywhere in the image. Instead of processing every pixel in the image, which increases the processing time, the license plate can be distinguished by its features, and therefore the system processes only the pixels that have these features. The features are derived from the license plate format and the characters constituting it. License plate color is one of the features since some jurisdictions (i.e., countries, states, or provinces) have certain colors for their license plates. The rectangular shape of the license plate boundary is another feature that is used to extract the license plate. The color change between the characters and the license plate background, known as the texture, is used to extract the license plate region from the image. The existence of the characters can be used as a feature to identify the region of the license plate. Two or more features can be combined to identify the license plate.
In the following, we categorize the existing license plate extraction methods based on the features they used.
A. License Plate Extraction Using Boundary/Edge Information
Since the license plate normally has a rectangular shape with a known aspect ratio, it can be extracted by finding all possible rectangles in the image. Edge detection methods are commonly used to find these rectangles [8]–[11].
In [5], [9], and [12]–[15], the Sobel filter is used to detect edges. Due to the color transition between the license plate and the car body, the boundary of the license plate is represented by edges in the image. The edges are two horizontal lines when performing horizontal edge detection, two vertical lines when performing vertical edge detection, and a complete rectangle when performing both at the same time.
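As an illustration of the edge responses described above, the following is a minimal sketch of 3 × 3 Sobel filtering in plain Python. It is not the implementation of any cited work; the toy image, the helper name `filter3x3`, and the choice to leave border pixels at zero are assumptions made for this example.

```python
# Sobel kernels: SOBEL_X responds to vertical edges, SOBEL_Y to horizontal ones.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def filter3x3(img, kernel):
    """Correlate a grayscale image (list of rows) with a 3x3 kernel.
    Border pixels are left at 0 (an assumption of this sketch)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(kernel[j][i] * img[y + j - 1][x + i - 1]
                            for j in range(3) for i in range(3))
    return out

# Toy image: dark car body (0) with a bright 3x4 "plate" (9) in the middle.
img = [[0] * 8 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 6):
        img[y][x] = 9

gx = filter3x3(img, SOBEL_X)  # strong response on the left and right plate borders
gy = filter3x3(img, SOBEL_Y)  # strong response on the top and bottom plate borders
```

On the middle row of the toy plate, `gx` is nonzero only near the two vertical borders while `gy` is zero, which matches the "two vertical lines" behavior described in the text.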
In [7], the license plate rectangle is detected by using the geometric attribute for locating lines forming a rectangle.
Candidate regions are generated in [5], [9], [12], and [16] by matching between vertical edges only. The magnitude of the vertical edges on the license plate is considered a robust extraction feature, while using the horizontal edges only can result in errors due to the car bumper [10]. In [5], the vertical edges are matched to obtain some candidate rectangles. Rectangles that have the same aspect ratio as the license plate are considered as candidates. This method yielded a result of 96.2% on images under various illumination conditions. According to [9], if the vertical edges are extracted and the background edges are removed, the plate area can easily be extracted from the edge image. The detection rate in 1165 images was around 100%. The total processing time of one 384 × 288 image is 47.9 ms.
In [17], a new and fast vertical edge detection algorithm (VEDA) was proposed for license plate extraction. VEDA was reported to be about seven to nine times faster than the Sobel operator.
The block-based method is also presented in the literature. In [18], blocks with high edge magnitudes are identified as possible license plate areas. Since block processing does not depend on the edges of the license plate boundary, it can be applied to an image with an unclear license plate boundary. The accuracy on 180 pairs of images is 92.5%.
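The block-based idea can be sketched as follows: the edge-magnitude image is tiled into blocks, and blocks whose summed magnitude exceeds a threshold become plate candidates. This is only an illustrative sketch, not the procedure of [18]; the block size, the threshold value, and the function name `block_edge_energy` are assumptions.

```python
def block_edge_energy(edge_mag, block):
    """Sum edge magnitudes inside non-overlapping block x block tiles.
    Returns {(top_row, left_col): energy}."""
    h, w = len(edge_mag), len(edge_mag[0])
    energy = {}
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            energy[(by, bx)] = sum(edge_mag[y][x]
                                   for y in range(by, by + block)
                                   for x in range(bx, bx + block))
    return energy

# Toy edge-magnitude image: the lower-right area is rich in edges.
edge_mag = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 5, 7, 6, 8],
    [0, 0, 6, 5, 7, 6],
]
energy = block_edge_energy(edge_mag, block=2)
threshold = 10  # hypothetical value chosen for this toy example
candidates = sorted(pos for pos, e in energy.items() if e > threshold)
```

Because the decision is made per block, no closed plate boundary is required, which is the property the text attributes to block processing.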
In [19], a license plate recognition-based strategy for checking the inspection status of motorcycles was proposed. Experiments yielded recognition rates of 95.7% and 93.9% on roadside and inspection station test images, respectively. It takes 654 ms on an ultramobile personal computer and about 293 ms on a PC to recognize a license plate.
DU et al.: ALPR: STATE-OF-THE-ART REVIEW
Boundary-based extraction that uses Hough transform (HT) was described in [13]. It detects straight lines in the image to locate the license plate. The Hough transform has the advantage of detecting straight lines with up to 30° inclination [20]. However, the Hough transform is a time and memory consuming process. In [21], a boundary line-based method combining the HT and contour algorithm is presented. It achieved extraction results of 98.8%.
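To make the voting step of the Hough transform concrete, the following minimal sketch accumulates votes in (θ, ρ) space for a set of edge pixels, using the normal form ρ = x cos θ + y sin θ. It is a generic textbook formulation rather than the method of [13] or [21]; the 1° angular resolution, the rounding of ρ to integers, and the function name `hough_lines` are assumptions.

```python
import math

def hough_lines(points, n_theta=180):
    """Vote in (theta, rho) space for each edge pixel (x, y).
    Returns {(theta_in_degrees, rho): votes}."""
    acc = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.radians(t)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            acc[(t, rho)] = acc.get((t, rho), 0) + 1
    return acc

# Edge pixels of a horizontal line y = 5 (e.g., the top border of a plate).
points = [(x, 5) for x in range(100)]
acc = hough_lines(points)
best = max(acc, key=acc.get)  # accumulator cell with the most votes
```

Every edge pixel votes once per angle, so the work grows with the number of edge pixels times the number of angles, and the accumulator must hold every visited cell. This is one way to see why the text calls the transform time and memory consuming.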
The generalized symmetry transform (GST) is used to extract the license plate in [22]. After getting edges, the image is scanned in the selective directions to detect corners. The GST is then used to detect similarity between these corners and to form license plate regions.
Edge-based methods are simple and fast. However, they require the continuity of the edges [23]. When combined with morphological steps that eliminate unwanted edges, the extraction rate is relatively high. In [8], a hybrid method based on the edge statistics and morphology was proposed. The accuracy of locating 9786 vehicle license plates is 99.6%.
B. License Plate Extraction Using Global Image Information
Connected component analysis (CCA) is an important technique in binary image processing [4], [24]–[26]. It scans a binary image and labels its pixels into components based on pixel connectivity. Spatial measurements, such as area and aspect ratio, are commonly used for license plate extraction [27], [28]. Reference [28] applied CCA on low resolution video. The correct extraction rate and false alarms are 96.62% and 1.77%, respectively, by using more than 4 h of video.
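A minimal sketch of CCA followed by a spatial filter is given below. It labels 4-connected foreground pixels with a breadth-first search and keeps components whose area and aspect ratio look plate-like. The connectivity choice, the aspect-ratio range, and the minimum area are assumptions for this example and are not taken from [27] or [28].

```python
from collections import deque

def connected_components(binary):
    """Label 4-connected foreground pixels of a 0/1 image.
    Returns one dict per component with bounding box and area."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for y0 in range(h):
        for x0 in range(w):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            ys, xs = [], []
            while queue:
                y, x = queue.popleft()
                ys.append(y)
                xs.append(x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            comps.append({"x": min(xs), "y": min(ys),
                          "w": max(xs) - min(xs) + 1,
                          "h": max(ys) - min(ys) + 1,
                          "area": len(ys)})
    return comps

binary = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
comps = connected_components(binary)
# Hypothetical plate criteria: width/height between 2 and 5, area of at least 8 pixels.
plates = [c for c in comps if 2.0 <= c["w"] / c["h"] <= 5.0 and c["area"] >= 8]
```

In the toy image the tall, thin component is rejected by the aspect-ratio test and only the wide rectangle survives as a candidate.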
In [29], a contour detection algorithm is applied on the binary image to detect connected objects. The connected objects that have the same geometrical features as the plate are chosen to be candidates. This algorithm can fail in the case of bad quality images, which results in distorted contours.
In [30], 2-D cross correlation is used to find license plates. The 2-D cross correlation with a prestored license plate template is performed through the entire image to locate the most likely license plate area. Extracting license plates using correlation with a template is independent of the license plate position in the image. However, the 2-D cross correlation is time consuming. It is of the order of n^4 for n × n pixels [14].
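The template search can be sketched as an exhaustive sliding-window correlation. The sketch below is illustrative only and is not the procedure of [30]; subtracting the template mean (so that flat regions score zero) and the toy template are assumptions of this example.

```python
def cross_correlate(image, template):
    """Slide a mean-subtracted template over the image.
    Returns (best_score, (row, col)) for the top-left corner of the best match."""
    h, w = len(template), len(template[0])
    mean_t = sum(map(sum, template)) / (h * w)
    zero_mean = [[v - mean_t for v in row] for row in template]
    best_score, best_pos = None, None
    for r in range(len(image) - h + 1):
        for c in range(len(image[0]) - w + 1):
            score = sum(image[r + j][c + i] * zero_mean[j][i]
                        for j in range(h) for i in range(w))
            if best_score is None or score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score, best_pos

# Toy template and an image that contains it at row 1, column 2.
template = [[9, 0, 9], [9, 0, 9]]
image = [[0] * 7 for _ in range(4)]
for j in range(2):
    for i in range(3):
        image[1 + j][2 + i] = template[j][i]

score, pos = cross_correlate(image, template)
```

The four nested loops (two over image positions, two over template pixels) make the cost visible: when the template side grows in proportion to the image side n, the number of multiplications grows on the order of n^4, consistent with the figure quoted from [14].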
C. License Plate Extraction Using Texture Features
This kind of method depends on the presence of characters in the license plate, which results in a significant change in the grey-scale level between the character color and the license plate background color. It also results in a high edge density area due to color transition. Different techniques are used in [31]–[39].
In [31] and [39], scan-line techniques are used. The change of the grey-scale level results in a number of peaks in the scan line. This number equals the number of the characters.
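A minimal sketch of the scan-line idea is shown below: one image row is thresholded and the number of separate dark runs along it is counted. This is an assumed simplification, not the procedure of [31] or [39]; it presumes dark characters on a light background and a fixed threshold, and in practice a single character can cross the scan line with more than one stroke.

```python
def count_dark_runs(row, threshold):
    """Count maximal runs of pixels darker than `threshold` along one scan line."""
    runs, inside = 0, False
    for v in row:
        dark = v < threshold
        if dark and not inside:
            runs += 1          # a new dark stroke starts here
        inside = dark
    return runs

# Toy scan line crossing three dark strokes on a bright plate background.
plate_row = [200, 200, 30, 30, 200, 40, 200, 200, 35, 35, 35, 200]
body_row = [120] * 12          # uniform car body, no strokes

n_plate = count_dark_runs(plate_row, threshold=128)
n_body = count_dark_runs(body_row, threshold=100)
```

Rows with several evenly sized runs are candidates for the plate region, while uniform rows produce none.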
In [40], vector quantization (VQ) is used to locate the text in the image. The VQ representation can give some hints about the contents of image regions, as higher contrast and more details are mapped by smaller blocks. The experimental results showed a 98% detection rate and a processing time of 200 ms using images of different quality.
In [41], the sliding concentric windows (SCW) method was proposed. In this method, license plates are viewed as irregularities in the texture of the image. Therefore, the abrupt changes in the local characteristics are the potential license plate. In [42], a license plate detection method based on sliding concentric windows and histogram was proposed.
Image transformations are also widely used in license plate extraction. Gabor filters are one of the major tools for texture analysis [43]. This technique has the advantage of analyzing texture in unlimited orientations and scales. The result in [44] is 98% when applied to images acquired in a fixed and specific angle. However, this method is time-consuming.
In [32], spatial frequency is identified by using discrete Fourier transform (DFT) because it produces harmonics that are detected in the spectrum analysis. The DFT is used in a row-wise fashion to detect the horizontal position of the plate and in a column-wise fashion to detect the vertical position.
In [36], the wavelet transform (WT)-based method is used for the extraction of license plates. In WT, there are four subbands. The subimage HL describes the vertical edge information and LH describes the horizontal one. The maximum change in horizontal edges is determined by scanning the LH image and is identified by a reference line. Vertical edges are projected horizontally below this line to determine the position based on the maximum projection. In [45], the HL subband is used to search the features of license plate and then to verify the features by checking if in the LH subband there exists a horizontal line around the feature or not. The execution time of license plate localization is less than 0.2 s with an accuracy of 97.33%.
In [46]–[48], adaptive boosting (AdaBoost) is combined with Haar-like features to obtain cascade classifiers for license plate extraction. The Haar-like features are commonly used for object detection. Using the Haar-like features makes the classifier invariant to the brightness, color, size, and position of license plates. In [46], the cascade classifiers use global statistics, known as gradient density, in the first layer and then Haar-like features. The detection rate in [46] reached 93.5%. AdaBoost is also used in [49]. The method presented a detection rate of 99% using images of different formats, size, and under various lighting conditions.
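For readers unfamiliar with Haar-like features, the sketch below computes a two-rectangle feature (left half minus right half of a window) using an integral image, which makes every rectangle sum a constant-time lookup. It illustrates the feature type only and is not the cascade of [46]–[48]; the function names and the toy image are assumptions.

```python
def integral_image(img):
    """ii[y][x] = sum of img over rows < y and columns < x (one row/column of padding)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, y, x, h, w):
    """Sum of the h x w rectangle whose top-left corner is (y, x)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def haar_left_right(ii, y, x, h, w):
    """Two-rectangle Haar-like feature: left half minus right half (w must be even)."""
    half = w // 2
    return rect_sum(ii, y, x, h, half) - rect_sum(ii, y, x + half, h, half)

img = [[1, 1, 5, 5],
       [1, 1, 5, 5]]
feature = haar_left_right(integral_image(img), 0, 0, 2, 4)

# Adding a constant brightness offset leaves the feature unchanged,
# because the offset cancels in the difference of equal-sized rectangles.
brighter = [[v + 10 for v in row] for row in img]
feature_brighter = haar_left_right(integral_image(brighter), 0, 0, 2, 4)
```

The cancellation of a uniform offset is one simple reason such features tolerate brightness changes; invariance to size and position comes from scanning windows at several scales and locations, which this sketch does not show.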
All the methods based on texture have the advantage of detecting the license plate even if its boundary is deformed. However, these methods are computationally complex, especially when there are many edges, as in the case of a complex background or under different illumination conditions.
D. License Plate Extraction Using Color Features
Since some countries have specific colors for their license plates, some reported work involves the extraction of license plates by locating their colors in the image.
The basic idea is that the color combination of a plate and characters is unique, and this combination occurs almost only in a plate region [50]. According to the specific formats of Chinese license plates, Shi et al. [50] proposed that all the pixels in the input image are classified using the hue, lightness, and saturation (HLS) color model into 13 categories.
In [51], a neural network is used to classify the color of each pixel after converting the RGB image into HLS. The neural network outputs, green, red, and white, are the license plate colors in Korea. Pixels of the same license plate color are projected vertically and horizontally to determine the highest color density region, which is taken as the license plate region.
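The projection step can be sketched as follows, starting from a 0/1 mask that marks pixels already classified as having the plate color. The classifier itself is not shown, and the rule used to pick the dense span (counts of at least 75% of the peak) is an assumption of this example rather than the criterion of [51].

```python
def projections(mask):
    """Row sums (horizontal projection) and column sums (vertical projection)."""
    rows = [sum(r) for r in mask]
    cols = [sum(c) for c in zip(*mask)]
    return rows, cols

def dense_span(profile, fraction=0.75):
    """First and last index whose count reaches `fraction` of the peak, or None."""
    cutoff = fraction * max(profile)
    hits = [i for i, v in enumerate(profile) if v > 0 and v >= cutoff]
    return (hits[0], hits[-1]) if hits else None

# 1 = pixel classified as plate color; the isolated 1 in the last row is noise.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [1, 0, 0, 0, 0, 0],
]
rows, cols = projections(mask)
row_span = dense_span(rows)   # vertical extent of the candidate plate
col_span = dense_span(cols)   # horizontal extent of the candidate plate
```

The sketch also hints at the weakness of projection methods: a large region of the same color elsewhere in the image, such as the car body, would raise the same profiles.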
In [52], since only four colors (white, black, red, and green) are utilized in the license plates, the color edge detector focuses only on three kinds of edges (i.e., black–white, red–white, and green–white edges). In the experiment, 1088 images taken from various scenes and under different conditions are employed. The license plate localization rate is 97.9%.
Genetic algorithm (GA) is used in [53] and [54] as a search method for identifying the license plate color. In [54], from training pictures with different lighting conditions, a GA is used to determine the upper and lower thresholds for the plate color. The relation between the average brightness and these thresholds is described through a special function. For any input picture the average brightness is determined first, and then from this function the lower and upper thresholds are obtained. Any pixel with a value between these thresholds is labeled. If the connectivity of the labeled pixels is rectangular with the same aspect ratio of the license plate, the region is considered as the plate region.
In [55], Gaussian weighted histogram intersection is used to detect the license plate by matching its color. To overcome the various illumination conditions that affect the color level, the conventional histogram intersection (HI) is modified by using a Gaussian function. The weight that describes the contribution of a set of similar colors is used to match a predefined color.
The collocation of license plate color and characters color is used in [56] to generate an edge image. The image is scanned horizontally and if any pixel that has a value within the license plate color range is found, the color value of its horizontal neighbors is checked. If two or more neighbors are within the same character color range, this pixel is identified as an edge pixel in a new edge image. All edges in the new image are analyzed to find candidate license plate regions.
In [57] and [58], color images are segmented by the mean shift algorithm into candidate regions and subsequently classified as a plate or not. The detection rate of 97.6% was obtained. In [59], a fast mean shift method was proposed.
To deal with the problem of illumination variation associated with the color-based method, [60] proposed a fuzzy logic based method. The hue, saturation, and value (HSV) color space is employed. Three components of the HSV are first mapped to fuzzy sets according to different membership functions. The fuzzy classification function is then described by the fusion of three weighted membership degrees.
Reference [61] proposed a new approach for vehicle license plate localization using a color barycenters hexagon model that is less sensitive to brightness.
Extracting a license plate using color information has the advantage of detecting inclined and deformed plates. However, it also has several difficulties. Defining the pixel color using the RGB value is very difficult, especially in different illumination conditions. The HLS, which is used as an alternative color model, is very sensitive to noise. Methods that use color projection suffer from wrong detection, especially when some parts of the image have the same license plate color such as the car body.
In [62], the HSI color model is adopted to select statistical threshold for detecting candidate regions. This method can detect candidate regions when vehicle bodies and license plates have similar color. The mean and standard deviation of hue are used to detect green and yellow license plate pixels. Those of saturation and intensity are used to detect green, yellow, and white license plate pixels from vehicle images.
E. License Plate Extraction Using Character Features
License plate extraction methods based on locating its characters have also been proposed. These methods examine the image for the presence of characters. If the characters are found, their region is extracted as the license plate region.
In [63], instead of using properties of the license plate directly, the algorithm tries to find all character-like regions in the image. This is achieved by using a region-based approach. Regions are enumerated and classified using a neural network. If a linear combination of character-like regions is found, the presence of a whole license plate is assumed.
The approach used in [64] is to horizontally scan the image, looking for repeating contrast changes on a scale of 15 pixels or more. It assumes that the contrast between the characters and the background is sufficiently good and there are at least three to four characters whose minimum vertical size is 15 pixels. A differential gradient edge detection approach is made and 99% accuracy was achieved in outdoor conditions.
In [65], binary objects that have the same aspect ratio as characters and more than 30 pixels are labeled. The Hough transform is applied on the upper side of these labeled objects to detect straight lines. The same happens on the lower part of these connected objects. If two straight lines are parallel within a certain range and the number of the connected objects between them is similar to the characters, the area between them is considered as the license plate area.
In [66], the characters are extracted using scale-space analysis. The method extracts large-size blob-type figures that consist of smaller line-type figures as character candidates.
In [67], the character region is first recognized by identifying the character width and the difference between the background and the character region. The license plate is then extracted by finding the inter-character distance in the plate region. This method yielded an extraction rate of 99.5%.
In [68], an initial set of possible character regions is obtained by the first stage classifier and then passed to the second stage classifier to reject noncharacter regions. Thirty-six AdaBoost classifiers serve as the first stage classifier. In the second stage, a support vector machine (SVM) trained on scale-invariant feature transform (SIFT) descriptors is employed. In [69], maximally stable extremal regions are used to obtain a set of character regions. Highly unlikely regions are removed with a simplistic heuristic-based filter. The remaining regions with sufficient positively classified SIFT keypoints are retained as likely license plate regions.
These methods, which extract characters from the binary image to define the license plate region, are time consuming because they process all binary objects. Moreover, these methods produce errors when there is other text in the image.

TABLE I
PROS AND CONS OF EACH CLASS OF LICENSE PLATE EXTRACTION METHODS
| Methods | Rationale | Pros | Cons | References |
| Using boundary features | The boundary of the license plate is rectangular. | Simplest, fast, and straightforward. | Hard to apply to complex images since they are too sensitive to unwanted edges. | [5], [8]–[16] |
| Using global image features | Find a connected object whose dimension is like a license plate. | Straightforward, independent of the license plate position. | May generate broken objects. | [27]–[30] |
| Using texture features | Frequent color transition on the license plate. | Able to detect even if the boundary is deformed. | Computationally complex when there are many edges. | [31], [39]–[41] |
| Using color features | Specific color on the license plate. | Able to detect inclined and deformed license plates. | RGB is limited by illumination conditions; HLS is sensitive to noise. | [50]–[52] |
| Using character features | There must be characters on the license plate. | Robust to rotation. | Time consuming (processing all binary objects); produces detection errors when there is other text in the image. | [63], [64] |
| Using two or more features | Combining features is more effective. | More reliable. | Computationally complex. | [70]–[72], [74], [81] |
F. License Plate Extraction Combining Two or More Features
In order to effectively detect the license plate, many methods search two or more features of the license plate. The extraction methods in this case are called hybrid extraction methods [47].
Color feature and texture feature are combined in [70]–[74]. In [70], fuzzy rules are used to extract texture feature and yellow colors. The yellow color values, obtained from sample images, are used to train the fuzzy classifier of the color feature. The fuzzy classifier of the texture is trained based on the color change between characters and license plate background. For any input image, each pixel is classified if it belongs to the license plate based on the generated fuzzy rules. In [71], two neural networks are used to detect texture feature and color feature. One is trained for color detection and the other is trained for texture detection using the number of edges inside the plate area. The outputs of both neural networks are combined to find candidate regions. In [72], only one neural network is used to scan the image by using H × W window, similar to the license plate size, and to detect color and edges inside this window to decide if it is a candidate. In [73], the neural network is used to scan the HLS image horizontally using a 1 × M window where M is approximately the license plate width, and vertically using an N × 1 window where N is the license plate height. The hue value for each pixel is used to represent the color information and the intensity is to represent the texture information. The output of both the vertical and the horizontal scan is combined to find candidate regions. Time-delay neural network (TDNN) is implemented in [74] to extract plates. Two TDNNs are used for analyzing color and texture of the license plate by examining small windows of vertical and horizontal cross sections of the image.
In [75], the edge and the color information are combined to extract the plate. High edge density areas are considered as plate if their pixel values are the same as the license plate.
In [80], the statistical and the spatial information of the license plate is extracted using the covariance matrix. The single covariance matrix extracted from a region has enough information to match the region in different views. A neural network trained on the covariance matrix of license plate and nonlicense plate regions is used to detect the license plate.
In [81], the rectangle shape feature, the texture feature, and the color feature are combined to extract the license plate. 1176 images that were taken from various scenes and conditions are used. The success rate is 97.3%.
In [43], the raster scan video is used as input with low memory utilization. Gabor filter, threshold, and connected component labeling are used to obtain plate region.
In [75], wavelet transform is used to detect edges of the image. After the edge detection, morphological operations are used to analyze the shape and structure of the image and to strengthen that structure in order to locate the license plate.
In [76], the HL subband feature of the 2-D discrete wavelet transform (DWT) is applied twice to significantly highlight the vertical edges of license plates and suppress the background noise. Then, promising candidates of license plates are extracted by first-order local recursive Otsu's segmentation and orthogonal projection histogram analysis. The most probable candidate is selected by edge density verification and an aspect ratio constraint.
In [77], the license plate is detected using local structure patterns computed from the modified census transform. Then, two-part postprocessing is used to minimize false positive rates. One is the position-based method that uses the positional relation between a license plate and a possible false positive with similar local structure patterns, such as headlights or radiators. The other is the color-based method that uses the known color information of license plates.
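As background for the local structure patterns mentioned above, the sketch below computes a modified census transform code for one pixel: each of the nine pixels in a 3 × 3 neighborhood is compared with the neighborhood mean, and the nine comparison bits are packed into an integer. This is the commonly described form of the transform and is an assumption here; the exact variant and bit ordering used in [77] are not specified in this survey.

```python
def modified_census(img, y, x):
    """9-bit code for the 3x3 neighborhood centered at (y, x).
    Bit = 1 where the pixel is brighter than the neighborhood mean."""
    patch = [img[y + j][x + i] for j in (-1, 0, 1) for i in (-1, 0, 1)]
    total = sum(patch)
    code = 0
    for v in patch:
        # 9 * v > total is the integer form of v > mean, avoiding float comparison.
        code = (code << 1) | (1 if 9 * v > total else 0)
    return code

# A vertical edge: bright right column next to a dark area.
edge = [[0, 0, 9],
        [0, 0, 9],
        [0, 0, 9]]
flat = [[7, 7, 7],
        [7, 7, 7],
        [7, 7, 7]]

code_edge = modified_census(edge, 1, 1)    # bits 001 001 001
code_flat = modified_census(flat, 1, 1)    # no pixel exceeds the mean
code_bright = modified_census([[v + 50 for v in r] for r in edge], 1, 1)
```

Because each pixel is compared only with its local mean, adding a constant brightness offset leaves the code unchanged, so the code describes local structure rather than absolute intensity.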
Reference [78] proposed a method using wavelet analysis and improved HLS color decomposition and Hough line detection.
G. Discussion
In this section, we described existing license plate extraction methods and classified them based on the features they used. In Table I, we summarize them and discuss the pros and cons of each class of methods.
In Table IV, we highlight some typical ALPR systems presented in the literature. The techniques used in the main procedures are summarized. The performances of license plate extraction using different methods are shown.
In the literature, experimentation setups are normally restricted to well-defined conditions, e.g., vehicle position and illumination. To overcome the problem of varying illumination, infrared (IR) units have been used. This method emerged from the nature of the license plate surface (retroreflective material) and has been already tested in the literature [63], [75], [82], [83]. In [75], a detection rate of 99.3% was achieved for 2483 images of Iranian vehicles captured using IR illumination units. IR cameras are also used in some commercial systems. An ALPR system [84] from Motorola and PIPS Technology acts as a silent partner in the vehicle, constantly scanning license plates of passed vehicles. When a vehicle of interest is passed, the system can alert the officer and record the time and GPS coordinates. The IBM Haifa Research Laboratory [85] developed an LPR engine for the Stockholm road-charging project. Nedap [86] automatic vehicle identification and vehicle access control applications claim that when installed properly an approximate 98% accuracy typically can be achieved. Geovision [87] license plate recognition system uses advanced neural networks technology to capture vehicle license plates. The system can reach up to 99% recognition success with high recognition speed (< 0.2 s). In [82], Naito et al. studied the ALPR problem from the viewpoint of the sensor system. The authors claimed that the existing dynamic range of a conventional CCD video camera is insufficient for ALPR purposes. Therefore, the sensor system is upgraded to a double dynamic range using two CCDs and a prism that splits an incident ray into two lights of different intensities. In testing, the input image is binarized using Otsu's method [88] and the character regions are extracted exploiting the focal length of the sensor to estimate the character size. Recognition rates are over 99% for conventional plates and over 97% for highly inclined plates from −40° to 40°.
Regarding the camera-to-car distance, as reported in [4], license plate height should be at least 20–25 pixels to facilitate the character segmentation and recognition.
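Since Otsu's method [88] is used for binarization in [82], a minimal sketch of it may be helpful. The method selects the grey level that maximizes the between-class variance of the two resulting pixel classes. The code below is a generic formulation on a flat list of 8-bit values, not the implementation of [82]; the toy pixel values are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.
    Pixels <= t form one class, pixels > t the other."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * count for i, count in enumerate(hist))
    w0 = sum0 = 0                      # pixel count and intensity sum of the lower class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue                   # one class is empty, variance undefined
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t

# Toy plate: four dark character pixels and four bright background pixels.
pixels = [10, 12, 11, 13, 200, 205, 198, 202]
t = otsu_threshold(pixels)
binary = [1 if p > t else 0 for p in pixels]
```

Any threshold between the two clusters separates them equally well here; the sketch returns the lowest such value because it only updates on a strict improvement.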
III. LICENSE PLATE SEGMENTATION
The isolated license plate is then segmented to extract the characters for recognition. An extracted license plate from the previous stage may have some problems, such as tilt and nonuniform brightness. The segmentation algorithms should overcome all of these problems in a preprocessing step.
In [51] and [89], the bilinear transformation is used to map the tilted extracted license plate to a straight rectangle.
In [90], a least-squares method is used to treat horizontal tilt and vertical tilt in license plate images.
In [91], following the Karhunen–Loeve (K-L) transform, the coordinates of the characters are arranged into a 2-D covariance matrix. The eigenvector and the rotation angle α are computed in turn, and horizontal tilt correction is then performed on the image. For vertical tilt correction, three methods are put forward to compute the vertical tilt angle θ: the K-L transform, line fitting based on K-means clustering, and line fitting based on least squares.
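The covariance-based estimate of the horizontal tilt angle can be illustrated with a minimal sketch. It assumes the character pixels are given as a list of (x, y) coordinates and uses the closed-form orientation of the principal eigenvector of a 2 × 2 covariance matrix. The function names `kl_tilt_angle` and `correct_tilt` are hypothetical, and this is not the implementation of [91].

```python
import math

def kl_tilt_angle(points):
    """Angle (radians) of the principal axis of a set of (x, y) points.

    The principal axis is the eigenvector of the 2-D covariance matrix
    with the largest eigenvalue; for a 2x2 symmetric matrix its
    orientation has the closed form used in the return statement.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)

def correct_tilt(points):
    """Rotate the points about their centroid so the principal axis is horizontal."""
    angle = kl_tilt_angle(points)
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [
        (mean_x + cos_a * (x - mean_x) + sin_a * (y - mean_y),
         mean_y - sin_a * (x - mean_x) + cos_a * (y - mean_y))
        for x, y in points
    ]
```

For points lying on a line of slope 0.5, the estimated angle equals atan(0.5), and after correction all points share the same y coordinate.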
In [92], a line fitting method based on least-squares fitting with perpendicular offsets was introduced for correcting license plate tilt in the horizontal direction. Tilt correction in the vertical direction was proposed by minimizing the variance of the coordinates of the projection points. Character segmentation is performed after horizontal correction, and character points are projected along the vertical direction after a shear transform.
Choosing an inappropriate threshold for the binarization of the extracted license plate results in joined characters. These characters make the segmentation very difficult [90]. License plates with a surrounding frame are also difficult to segment since, after binarization, some characters may be joined with the frame [93]. Enhancing the image quality before binarization helps in choosing the appropriate threshold [93]. Techniques commonly used to enhance the license plate image are noise removal, histogram equalization, and contrast enhancement. In [93], a system was proposed to conduct gradient analysis on the whole image to detect the license plate, and then the detected license plate is enhanced by grey-level transformation. A method to enhance only the characters and to reduce the noise was proposed in [94]. The size of the characters is considered to be approximately 20% of the license plate size. First, the grey-scale level is scaled to 0–100, then the largest 20% of the pixels are multiplied by 2.55. Only characters are enhanced while noise pixels are reduced. Since binarization with one global threshold cannot always produce acceptable results, adaptive local binarization methods are normally used. In [95], local thresholding is used for each pixel. The threshold is computed by subtracting a constant c from the mean grey level in an m × n window centered at the pixel. In [96], the threshold is given by the Niblack binarization formula to vary the threshold over the image, based on the local mean and the standard deviation.
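The two local thresholding rules can be written as T = mean − c and, for Niblack, T = mean + k · std, both computed in a window around each pixel. The following is a minimal sketch, not the code of [95] or [96]. It assumes a square window clipped at the image border, dark characters on a light background (a pixel darker than T becomes foreground 1), and illustrative values for c and k. All function names are hypothetical.

```python
import math

def local_stats(img, r, c, half):
    """Mean and standard deviation of the window centered at (r, c),
    clipped at the image borders. The window side is 2 * half + 1."""
    rows, cols = len(img), len(img[0])
    vals = [img[i][j]
            for i in range(max(0, r - half), min(rows, r + half + 1))
            for j in range(max(0, c - half), min(cols, c + half + 1))]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    return mean, math.sqrt(var)

def mean_minus_c(c=5):
    """Threshold rule: local mean minus a constant c."""
    return lambda mean, std: mean - c

def niblack(k=-0.2):
    """Niblack threshold rule: local mean plus k times local std."""
    return lambda mean, std: mean + k * std

def binarize(img, threshold_fn, half=1):
    """Mark a pixel as foreground (1) when it is darker than its local threshold."""
    out = []
    for r in range(len(img)):
        out_row = []
        for c in range(len(img[0])):
            mean, std = local_stats(img, r, c, half)
            out_row.append(1 if img[r][c] < threshold_fn(mean, std) else 0)
        out.append(out_row)
    return out
```

Passing the rule as a function keeps the window scan identical for both methods, so only the threshold formula differs.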
In the following, we categorize the existing license plate segmentation methods based on the features they used.
A. License Plate Segmentation Using Pixel Connectivity
Segmentation is performed in [12], [30], [52], and [97]–[99] by labeling the connected pixels in the binary license plate image. The labeled regions are analyzed, and those which have the same size and aspect ratio as the characters are considered to be license plate characters. This method fails to extract all the characters when there are joined or broken characters.
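A minimal sketch of this labeling approach follows. It assumes a binary image stored as a list of rows with 1 for character pixels, uses 4-connectivity, and filters regions by height and width-to-height ratio. The function names and the filter bounds are hypothetical and would be tuned per plate format.

```python
from collections import deque

def connected_components(binary):
    """Bounding boxes (top, left, bottom, right), inclusive, of the
    4-connected foreground regions, found by breadth-first search."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue = deque([(r, c)])
                top, left, bottom, right = r, c, r, c
                while queue:
                    y, x = queue.popleft()
                    top, bottom = min(top, y), max(bottom, y)
                    left, right = min(left, x), max(right, x)
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return boxes

def filter_character_boxes(boxes, min_h, max_h, min_aspect, max_aspect):
    """Keep boxes whose height and width/height ratio look like a character,
    sorted from left to right."""
    kept = []
    for top, left, bottom, right in boxes:
        h = bottom - top + 1
        w = right - left + 1
        if min_h <= h <= max_h and min_aspect <= w / h <= max_aspect:
            kept.append((top, left, bottom, right))
    return sorted(kept, key=lambda box: box[1])
```

Two touching characters would be returned as a single wide box and rejected by the aspect filter, which is the failure mode noted above.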
B. License Plate Segmentation Using Projection Profiles
Since characters and license plate backgrounds have different colors, they have opposite binary values in the binary image. Therefore, some proposed methods, as in [15], [21], [24], [32], [50], [51], [74], and [100]–[104], project the binary extracted license plate vertically to determine the starting and ending positions of the characters, and then project the extracted characters horizontally to extract each character alone. In [15], along with noise removal and character sequence analysis, vertical projection is used to extract the characters.
DU et al.: ALPR: STATE-OF-THE-ART REVIEW
TABLE II
PROS AND CONS OF EACH CLASS OF LICENSE PLATE SEGMENTATION METHODS
Methods | Pros | Cons
Using pixel connectivity [12], [30] | Simple and straightforward, robust to the license plate rotation. | Fails to extract all the characters when there are joined or broken characters.
Using projection profiles [21], [24], [51], [101] | Independent of character positions, be able to deal with some rotation. | Noise affects the projection value, requires prior knowledge of the number of license plate characters.
Using prior knowledge of characters [6], [14], [105], [106] | Simple. | Limited by the prior knowledge, any change may result in errors.
Using character contours [107], [108] | Can get exact character boundaries. | Slow and may generate incomplete or distorted contour.
Using combined features [111], [112] | More reliable. | Computationally complex.
By examining more than 30 000 images, the method in [15] reached an accuracy rate of 99.2% with a processing time of 10–20 ms. In [51] and [101], character color information is used in the projection instead of the binary license plate. A review of the literature shows that the method that exploits vertical and horizontal projections of the pixels is the most common and the simplest one.
The pro of the projection method is that the extraction of characters is independent of their positions. The license plate can be slightly rotated. However, it depends on the image quality. Any noise affects the projection value. Moreover, it requires prior knowledge of the number of plate characters.
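The projection approach can be sketched as follows. This is a minimal illustration, assuming a binary plate stored as a list of rows with 1 for character pixels, and it is not the method of any single cited paper. The function names and the `min_count` parameter are hypothetical.

```python
def vertical_projection(binary):
    """Number of character pixels in each column."""
    return [sum(col) for col in zip(*binary)]

def split_on_gaps(profile, min_count=1):
    """(start, end) index pairs, end exclusive, of the runs where the
    profile is at least min_count."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= min_count and start is None:
            start = i
        elif value < min_count and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs

def segment_characters(binary):
    """Boxes (top, left, bottom, right), ends exclusive: columns come from
    the vertical projection, rows from a per-character horizontal projection."""
    boxes = []
    for left, right in split_on_gaps(vertical_projection(binary)):
        strip = [row[left:right] for row in binary]
        row_runs = split_on_gaps([sum(row) for row in strip])
        boxes.append((row_runs[0][0], left, row_runs[-1][1], right))
    return boxes
```

A single noise pixel in a gap column raises that column's count above zero and merges two neighboring runs, which illustrates why noise affects the projection value.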
C. License Plate Segmentation Using Prior Knowledge of Characters
Prior knowledge of characters can help the segmentation of the license plate. In [14], the binary image is scanned by a horizontal line to find the starting and ending positions of the characters. When the ratio of character pixels to background pixels in this line exceeds a certain threshold after being lower than this threshold, this is considered to be the starting position of the characters. The opposite is done to find the ending position of the characters.
In [6], the extracted license plate is resized into a known template size. In this template, all character positions are known. After resizing, the same positions are extracted to be the characters. This method has the advantage of simplicity. However, in the case of any shift in the extracted license plate, the extraction results in background instead of characters.
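The template-based cut can be sketched in a few lines. The example assumes nearest-neighbor resizing and a list of character slots given in template coordinates. The names `resize_nearest` and `cut_by_template` and the slot layout are hypothetical and are not taken from [6].

```python
def resize_nearest(img, out_h, out_w):
    """Nearest-neighbor resize of a 2-D list to out_h x out_w."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def cut_by_template(plate, template_size, slots):
    """Resize the plate to the template size, then crop each slot.

    template_size is (height, width); slots is a list of
    (top, left, bottom, right) boxes with exclusive ends.
    """
    height, width = template_size
    resized = resize_nearest(plate, height, width)
    return [[row[left:right] for row in resized[top:bottom]]
            for top, left, bottom, right in slots]
```

Because the slots are fixed, a plate that was cropped a few pixels off shifts every character out of its slot, which is the weakness described above.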
In [105], the proposed approach provides a solution for vehicle license plates that are severely degraded. Color collocation is used to locate the license plate in the image. The dimensions of each character are used to segment the character. The layout of the Chinese license plate is used to construct a classifier for recognition.
The license plates in Taiwan all have the same color distribution [106], i.e., black characters on a white background. If the license plate is scanned with a horizontal line, the number of black to white (or white to black) transitions is at least 6 and at most 14. The Hough transform is used to correct the rotation problem, a hybrid binarization technique is used to segment the characters in dirty license plates, and a feedback self-learning procedure is employed to adjust the parameters. In the experiment, 332 different images, captured under various illuminations and at different distances, are used. The overall location and segmentation rates are 97.1% and 96.4%, respectively.
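The transition-count constraint can be checked per scan line as sketched below. The sketch assumes a binary image stored as a list of rows and counts every change between neighboring pixels in either direction, which is one possible reading of the 6 to 14 bound reported for [106]. The function names are hypothetical.

```python
def count_transitions(row):
    """Number of value changes between neighboring pixels along a scan line."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)

def rows_crossing_characters(binary, low=6, high=14):
    """Indices of the rows whose transition count lies in [low, high]."""
    return [i for i, row in enumerate(binary)
            if low <= count_transitions(row) <= high]
```

Rows that fall outside the bounds, such as uniform background rows, are unlikely to cross the character band and can be discarded before segmentation.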
D. License Plate Segmentation Using Character Contours
Contour modeling is also employed for character segmentation. In [108], a shape-driven active contour model is established, which utilizes a variational fast marching algorithm. The system works in two steps. First, the rough location of each character is found by an ordinary fast marching technique [109] combined with a gradient-dependent and curvature-dependent speed function [110]. Then, the exact boundaries are obtained by a special fast marching method.
E. License Plate Segmentation Using Combined Features
In order to efficiently segment the license plate, two or more features of the characters can be used. In [111], an adaptive morphology-based segmentation approach for seriously degraded plate images was proposed. An algorithm based on the histogram detects fragments and merges these fragments. A morphological thickening algorithm [113] locates reference lines for separating the overlapped characters. A morphological thinning algorithm [114] and the segmentation cost calculation determine the baseline for segmenting the connected characters. For 1189 degraded images, the entire character content is correctly segmented in 1005 of them.
In [115], a method was described for segmenting the main numeric characters on a license plate by introducing dynamic programming (DP). The proposed method functions very rapidly by applying the bottom-up approach of the DP algorithm, and also robustly by minimizing the use of environment-dependent features such as color and edges. The success rate for detection of the four main numbers is 97.14%.
F. Discussion
In this section, we described existing license plate segmentation methods and classified them based on the features they used. In Table II, we summarize them and discuss the pros and cons of each class of methods.
In Table IV, we highlight some typical ALPR systems presented in the literature. The techniques used in the main procedures are summarized. The performances of license plate segmentation using different methods are shown.
318 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 23, NO. 2, FEBRUARY 2013
IV. CHARACTER RECOGNITION
The extracted characters are then recognized and the output is the license plate number. Character recognition in ALPR systems may encounter some difficulties. Due to the camera zoom factor, the extracted characters do not have the same size and the same thickness [30], [93]. Resizing the characters to one size before recognition helps overcome this problem. The characters' font is not always the same, since different countries' license plates use different fonts. Extracted characters may have some noise or they may be broken [30]. The extracted characters may also be tilted [30].
In the following, we categorize the existing character recognition methods based on the features they used.
A. Character Recognition Using Raw Data
Template matching is a simple and straightforward method in recognition [5], [101]. The similarity between a character and the templates is measured. The template that is the most similar to the character is recognized as the target. Most template matching methods use binary images because the grey-scale is changed due to any change in the lighting [90].
Template matching is performed in [5], [12], [30], [51], [93], and [116] after resizing the extracted character to the same size. Several similarity measuring techniques are defined in the literature. Some of them are the Mahalanobis distance and the Bayes decision technique [30], the Jaccard value [51], the Hausdorff distance [116], and the Hamming distance [5].
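Two of these measures are simple enough to sketch directly. The example assumes binary character images of equal size stored as lists of rows, with 1 for character pixels, and a dictionary of labeled templates. The function names are hypothetical and the code is an illustration rather than the procedure of [5] or [51].

```python
def hamming_distance(a, b):
    """Number of pixel positions where two equally sized binary images differ."""
    return sum(1 for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b) if pa != pb)

def jaccard_similarity(a, b):
    """Shared character pixels divided by character pixels present in either image."""
    pairs = [(pa, pb) for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b)]
    union = sum(1 for pa, pb in pairs if pa or pb)
    inter = sum(1 for pa, pb in pairs if pa and pb)
    return inter / union if union else 1.0

def classify_by_hamming(char, templates):
    """Label of the template with the smallest Hamming distance to char."""
    return min(templates, key=lambda label: hamming_distance(char, templates[label]))
```

The Hamming distance counts background disagreements as well, whereas the Jaccard value looks only at character pixels, so the two can rank templates differently for thin strokes.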
Character recognition in [93] and [117] uses normalized cross correlation to match the extracted characters with the templates. Each template scans the character column by column to calculate the normalized cross correlation. The template with the maximum value is the most similar one.
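The normalized cross correlation score itself can be sketched as below. This is a simplified illustration that compares a character with templates of the same size at a single alignment. The column-by-column scanning described above is omitted, and the function names are hypothetical.

```python
import math

def ncc(a, b):
    """Zero-mean normalized cross correlation of two equally sized images.

    Returns a value in [-1, 1], or 0.0 when either image is constant.
    """
    flat_a = [p for row in a for p in row]
    flat_b = [p for row in b for p in row]
    mean_a = sum(flat_a) / len(flat_a)
    mean_b = sum(flat_b) / len(flat_b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(flat_a, flat_b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in flat_a)
                    * sum((y - mean_b) ** 2 for y in flat_b))
    return num / den if den else 0.0

def best_template(char, templates):
    """Label of the template with the highest correlation to char."""
    return max(templates, key=lambda label: ncc(char, templates[label]))
```

Subtracting the means and dividing by the norms makes the score insensitive to a uniform change in brightness or contrast, which raw pixel differences are not.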
Template matching is useful for recognizing single-font, nonrotated, nonbroken, and fixed-size characters. If a character differs from the template due to any font change, rotation, or noise, template matching produces incorrect recognition [90]. In [82], the problem of recognizing tilted characters is solved by storing several templates of the same character with different inclination angles.
B. Character Recognition Using Extracted Features
Since not all character pixels have the same importance in distinguishing the character, a feature extraction technique that extracts some features from the character is a good alternative to the grey-level template matching technique [101]. It reduces the processing time relative to template matching because not all pixels are involved. It also overcomes template matching problems if the features are strong enough to distinguish characters under any distortion [90]. The extracted features form a feature vector, which is compared with the prestored feature vectors to measure the similarity.
In [101] and [119], the feature vector is generated by projecting the binary character horizontally and vertically. In [119], each projection is quantized into four levels. In [102], the feature vector is generated from the Hotelling transform of each character. The Hotelling transform is very sensitive to the segmentation result. In [120], the feature vector is generated by dividing the binary character into blocks of 3 × 3 pixels. Then, the number of black pixels in each block is counted. In [97], the feature vector is generated by dividing the binary character after a thinning operation into 3 × 3 blocks and counting the number of elements that have 0°, 45°, 90°, and 135° inclination. In [121], the character is scanned along a central axis. This central axis is the connection between the upper bound horizontal central moment and the lower bound horizontal central moment. Then the number of transitions from character to background and the spacing between them form a feature vector for each character. This method is invariant to the rotation of the character because the same feature vector is generated. In [122], the feature vector is generated by sampling the character contour all around. The resulting waveform is quantized into the feature vector. This method recognizes multifont and multisize characters since the contour of the character is not affected by any font or size change. In [123], the Gabor filter is used for feature extraction. The character edges whose orientation has the same angle as the filter will have the maximum response to the filter. This can be used to form a feature vector for each character. In [124], Kirsch edge detection is applied on the character image in different directions to extract features. Using Kirsch edge detection for feature extraction and recognition achieved better results than other edge detection methods, such as Prewitt, Frei Chen, and Wallis [125]. In [126], the feature vector is extracted from the binary character image by performing a thinning operation and then converting the direction of the character strokes into one code. In [127], the pixels' grey-scale values of 11 subblocks are fed as the features into a neural network classifier.
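As an illustration of a block-based feature vector, the following sketch counts the character pixels in each non-overlapping 3 × 3 block and assigns the label of the nearest stored vector. It assumes character pixels are stored as 1 and uses squared Euclidean distance, which is an assumption of this example and not a detail reported for [120]. The function names are hypothetical.

```python
def block_counts(binary, block=3):
    """Feature vector of character-pixel counts per non-overlapping
    block x block tile, listed row by row. Tiles at the right and bottom
    edges are clipped when the image size is not a multiple of block."""
    rows, cols = len(binary), len(binary[0])
    features = []
    for top in range(0, rows, block):
        for left in range(0, cols, block):
            features.append(sum(binary[r][c]
                                for r in range(top, min(top + block, rows))
                                for c in range(left, min(left + block, cols))))
    return features

def nearest_label(feature, stored):
    """Label of the stored feature vector closest to feature
    (squared Euclidean distance)."""
    return min(stored, key=lambda label: sum((f - g) ** 2
                                             for f, g in zip(feature, stored[label])))
```

A 6 × 6 character is reduced from 36 pixels to 4 features, which shows how such features shorten the comparison step.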
In [128], a scene is processed by visiting nonoverlapping 5 × 5 blocks, processing the surrounding image data to extract "spread" edge features based on the research conducted in [129], and classifying this subimage according to the coarse-to-fine search strategy described in [130]. In [49], three character features are used: contour-crossing counts, directional counts, and peripheral background area. The classification is realized by a support vector machine. In [52], the topological features of characters (the number of holes, endpoints, three-way nodes, and four-way nodes) are used. These features are invariant to spatial transformations.
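One of the topological features, the number of holes, can be computed as sketched below. The sketch treats a hole as a 4-connected background region that does not touch the image border, which is an assumption of this example and not necessarily the definition used in [52]. The function name `count_holes` is hypothetical.

```python
def count_holes(binary):
    """Number of background regions (value 0, 4-connected) that are fully
    enclosed by character pixels, i.e., that do not reach the image border."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    holes = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] == 0 and not seen[r][c]:
                seen[r][c] = True
                stack = [(r, c)]
                touches_border = False
                while stack:
                    y, x = stack.pop()
                    if y in (0, rows - 1) or x in (0, cols - 1):
                        touches_border = True
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] == 0 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                if not touches_border:
                    holes += 1
    return holes
```

The count does not change when the character is shifted or scaled, as long as the strokes stay closed, which is consistent with the invariance noted above.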
After feature extraction, many classifiers can be used to recognize characters, such as ANN [127], SVM [74], and HMM [95]. Some researchers integrate two kinds of classification schemes [131], [132], multistage classification schemes [133], or a "parallel" combination of multiple classifiers [134], [135].
C. Discussion
In this section, we described existing character recognition methods and classified them based on the features they used. In Table III, we summarize them and discuss the pros and cons of each class of methods.
In Table IV, we highlight some typical ALPR systems presented in the literature. The techniques used in the main procedures are summarized. The performances of character
TABLE III
PROS AND CONS OF EACH CLASS OF CHARACTER RECOGNITION METHODS
Methods | Pros | Cons
Using pixel values: template matching [5], [93], [117] | Simple and straightforward. | Processing nonimportant pixels and slow, vulnerable to any font change, rotation, noise and thickness change.
Using pixel values: several templates for each character | Be able to recognize tilted characters. | More processing time.
Using extracted features: horizontal and vertical projections [101], [119]; Hotelling transform [102]; the number of black pixels in each 3 × 3 pixels block [120]; count the number of elements that have certain degrees inclination [97]; the number of transitions from character to background and spacing between them [121]; sampling the character contour all around [122]; Gabor filter [123]; Kirsch edge detection [124]; convert the direction of the character strokes into one code [126]; pixels' values of 11 subblocks [127]; nonoverlapping 5 × 5 blocks [128]; contour-crossing counts (CCs), directional counts (DCs), and peripheral background area (PBA) [49]; topological features of characters including the number of holes, endpoints, three-way nodes, and four-way nodes [52] | Be able to extract salient features, robust to any distortion, fast recognition since the number of features is smaller than that of the pixels. | Feature extraction takes time, nonrobust features will degrade the recognition.
segmentation using different methods are shown, when available, with processing speed.
Some characters are similar in their shape, such as (B-8), (O-0), (I-1), (A-4), (C-G), (D-O), and (K-X). These characters confuse the character recognizer, especially when they are distorted. Dealing with this ambiguity problem should attract more attention than regular OCR in future research.
V. SUMMARY, FUTURE DIRECTIONS, AND CONCLUSION
A. Summary
In general, an ALPR system consists of four processing stages. In the image acquisition stage, some points have to be considered when choosing the ALPR system camera, such as the camera resolution and the shutter speed. In the license plate extraction stage, the license plate is extracted based on some features such as the color, the boundary, or the existence of the characters. In the license plate segmentation stage, the characters are extracted by projecting their color information, by labeling them, or by matching their positions with a template. Finally, the characters are recognized in the character recognition stage by template matching, or by classifiers such as neural networks and fuzzy classifiers. Automatic license plate recognition is quite challenging due to the different license plate formats and the varying environmental conditions. Numerous ALPR techniques have been proposed in recent years. Table IV highlights the performance of some typical ALPR systems as presented in the literature. Issues such as the main processing procedure, experimental database, processing time, and recognition rate are provided. However, the authors of [4] pointed out that it is inappropriate to explicitly declare which methods demonstrate the highest performance, since there is no uniform way to evaluate the methods. Therefore, in [4], Anagnostopoulos et al. provided researchers with a common test set to facilitate systematic performance assessment.
B. Current Trends and Future Directions
Although significant progress has been made in ALPR techniques in the last few decades, there is still a lot of work to be done, since a robust system should work effectively under a variety of environmental conditions and plate conditions.
An effective ALPR system should be able to deal with multistyle plates, e.g., plates from different nations with different fonts and different syntax. Little existing research has addressed this issue, and the work that does still has some constraints. In [127], four critic