Cheng-Lin Liu: National Laboratory of Pattern Recognition, Institute of Automation of Chinese Academy of Sciences, Beijing 100190, People's Republic of China;
Jian Yang: School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, People's Republic of China
Under the Creative Commons Attribution License
Open Access funded by Chongqing University of Technology
Pattern Recognition (PR) is a key ability of intelligent machines and an important field in intelligence technology. Pattern recognition enables machines to perceive and interact with the environment, humans and other machines. The research topics of pattern recognition include pattern classification (statistical and structural methods, neural networks, kernel methods, etc.), clustering, feature extraction and selection, visual object detection and recognition, video analysis, and applications in various domains. Both the methods and the applications have progressed tremendously in recent years, particularly with the benefit of deep learning methods (deep neural networks), which learn discriminative features by cascading many layers of transformation.
Research works in pattern recognition have been widely published in major journals and conferences, such as IEEE Trans. PAMI, Pattern Recognition and ICPR. In 2011, a new conference, the Asian Conference on Pattern Recognition (ACPR), to be held every two years, was started in the Asia-Pacific region. The 4th ACPR (ACPR 2017) was held in Nanjing (November 26–29, 2017). Following the conference, authors of selected ACPR 2017 papers were invited to submit extended versions to this Special Issue in CAAI Trans. Intelligence Technology. The six accepted papers that form this Special Issue present contributions in image processing, saliency map estimation, medical image analysis, Web image classification, and license plate classification and recognition.
In ‘Robust Optimisation Algorithm for the Measurement Matrix in Compressed Sensing’, Zhou et al. propose a measurement matrix optimisation method to improve the efficiency and construction of compressed sensing. This is achieved by considering the energy concentration characteristic of natural images in the sparse domain. The proposed measurement matrix, named the Hadamard-Diagonal Matrix (HDM), is based on the Hadamard matrix and maximises the energy conservation in the sparse domain. In addition, an effective optimisation strategy is adopted to reduce the mutual coherence and thereby improve reconstruction quality. Experimental results show that HDM performs better than other popular measurement matrices, and that the optimisation algorithm improves the performance not only of HDM but also of other popular measurement matrices.
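Mutual coherence is commonly defined as the largest absolute normalised inner product between two distinct columns of a matrix. In compressed sensing it is usually evaluated on the product of the measurement matrix and the sparsifying basis. The following is a minimal sketch of that definition only, not of HDM or of the authors' optimisation algorithm. The function names `sylvester_hadamard` and `mutual_coherence` are illustrative assumptions, and the row-truncated Hadamard matrix stands in for a generic measurement matrix.

```python
import math


def sylvester_hadamard(order):
    """Build an order x order Hadamard matrix by Sylvester's construction.

    `order` must be a power of two. This is a generic Hadamard matrix,
    not the HDM described in the paper.
    """
    if order < 1 or order & (order - 1):
        raise ValueError("order must be a power of two")
    h = [[1]]
    while len(h) < order:
        # H_2n = [[H_n, H_n], [H_n, -H_n]]
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def mutual_coherence(matrix):
    """Largest absolute normalised inner product between distinct columns."""
    cols = list(zip(*matrix))
    norms = [math.sqrt(sum(v * v for v in col)) for col in cols]
    worst = 0.0
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            dot = sum(a * b for a, b in zip(cols[i], cols[j]))
            worst = max(worst, abs(dot) / (norms[i] * norms[j]))
    return worst


if __name__ == "__main__":
    h4 = sylvester_hadamard(4)
    print(mutual_coherence(h4))      # full matrix: columns are orthogonal
    print(mutual_coherence(h4[:3]))  # keeping 3 of 4 rows raises coherence
```

Keeping fewer rows than columns, as a measurement matrix must, makes the columns non-orthogonal, which is why a lower coherence is generally associated with better reconstruction.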
In ‘Influence of Image Classification Accuracy on Saliency Map Estimation’, Oyama and Yamanaka present their research on the relationship between image classification accuracy and the performance of saliency map estimation using convolutional neural networks (CNNs). First, the authors show that there is a strong correlation between image classification accuracy and saliency map estimation accuracy. They also investigate an effective CNN architecture based on multi-scale images and upsampling layers to refine the saliency-map resolution. In the reported experiments, the proposed model achieves state-of-the-art accuracy on the PASCAL-S, OSIE, and MIT1003 datasets. In the MIT Saliency Benchmark, their model achieves the best performance in some metrics.
In ‘Solution to Overcome the Sparsity Issue of Annotated Data in Medical Domain’, Pujitha et al. propose a method using crowdsourcing and synthetic image generation for training deep neural network (DNN)-based lesion detection in computer-aided diagnosis. To overcome the noisy nature of crowdsourced annotations, the authors assign a reliability factor to crowd subjects based on their performance and experience, and require region-of-interest markings rather than pixel-level markings from the crowd. A generative adversarial network-based solution is proposed to generate synthetic images with lesions to control the overall severity level of the disease. The authors then evaluate the effectiveness of the crowdsourced annotations and synthetic images by training the DNN with data drawn from a heterogeneous mixture of annotations. Experimental results obtained for hard exudate detection from color fundus images show that training with processed crowdsourced data or synthetic images is effective, with detection sensitivity improved by 25% and 27%, respectively, over training with expert markings alone.
In ‘Fast Genre Classification of Web Images Using Global and Local Features’, Liu et al. propose a fast method for image type classification. The motivation of the work is to alleviate the computation cost of processing the huge number of images on the Web. Once the type of generation source has been identified, an image can be sent to different recognisers that extract information for different purposes. The authors classify images into four classes: natural scene images, born-digital images, scanned paper documents, and camera-captured paper documents. The algorithm consists of two classification stages, which extract global features at low complexity and local features at high complexity, respectively. Images that are assigned low confidence by the first-stage classifier are processed by the second stage. The features are extracted to reflect the characteristics and differences of the four types of images. Different strategies for fusing the classifiers of the two stages are also considered. To evaluate the algorithm, the authors built a database containing more than 55,000 images from various sources. They obtained an overall classification accuracy of 98.4% with a processing speed of over 27 FPS on CPU.
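The two-stage idea described above can be sketched as a confidence-gated cascade: a cheap classifier handles most inputs, and only low-confidence inputs are passed to the expensive one. This is a minimal illustration under stated assumptions, not the authors' implementation. The function name `cascade_classify`, the `(label, confidence)` return convention, and the threshold value are all hypothetical, and the sketch simply replaces the first-stage decision with the second-stage one rather than using any of the fusion strategies the paper considers.

```python
from typing import Any, Callable, Tuple

# Hypothetical convention: a stage returns (class label, confidence in [0, 1]).
Prediction = Tuple[str, float]
Stage = Callable[[Any], Prediction]


def cascade_classify(
    image: Any,
    fast_stage: Stage,
    slow_stage: Stage,
    threshold: float = 0.9,  # assumed value; would be tuned on validation data
) -> Tuple[str, float, int]:
    """Classify with the cheap stage first; fall back to the costly stage.

    Returns (label, confidence, stage_used), where stage_used is 1 or 2.
    """
    label, confidence = fast_stage(image)
    if confidence >= threshold:
        # Confident enough: skip the expensive local-feature stage.
        return label, confidence, 1
    # Low confidence: let the second stage decide.
    label, confidence = slow_stage(image)
    return label, confidence, 2
```

The speed benefit of such a cascade depends on how many images the first stage can settle on its own, so the threshold trades accuracy against throughput.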
In ‘CNN-RNN based Method for License Plate Recognition’, Shivakumara et al. report the classification of license plates into different types before recognition, so as to overcome the large variability of plate background and layout. For plate recognition, the authors combine a CNN with a recurrent neural network, the BLSTM (Bi-Directional Long Short-Term Memory). For classification, they propose a method called Dense Cluster based Voting (DCV), which separates foreground and background for successful classification of plates with different backgrounds. Experimental results on live data given by MIMOS (a company funded by the Malaysian Government) and on the standard UCSD dataset show the promise of the proposed method. The comparison between plate recognition with classification and that without classification shows the benefit of classification before recognition.
In ‘Symmetry Features for License Plate Classification’, Raghunandan et al. present new symmetry features based on stroke width for classifying license plate images into different types (private, taxi, cursive, non-text), such that an appropriate Optical Character Recognition (OCR) method can be chosen to enhance recognition performance. The proposed method explores Gradient Vector Flow (GVF) for defining symmetry features, namely, GVF opposite direction, stroke width distance, and stroke pixel direction. Stroke pixels in the Canny and Sobel edge images that satisfy the above symmetry features are called local candidate stroke pixels, and the pixels common to both sets of local candidates are called global candidate stroke pixels. Spatial distributions of stroke pixels in local and global symmetry are explored by generating a weighted proximity matrix to extract statistical features, which are fed to an SVM classifier for classification. Experimental results on large datasets show the effectiveness of the classification and its usefulness in improving plate recognition performance.