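Pattern recognition is essential to all automation, information processing, and retrieval applications. Co-authored by two leading experts in the field, this book gives a comprehensive, engineering-oriented account of pattern recognition applications, covering topics from image analysis to speech recognition and communications. It includes up-to-date material on neural networks and emphasizes recent advances, including independent component analysis and support vector machines. The book is a world-renowned work; after more than ten years of development it has become the most comprehensive reference in the field and has been adopted as a textbook by many universities worldwide. Besides being suited to teaching, it can also serve as a reference for engineers and technical professionals.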
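Key features of this book:
The latest feature generation techniques, including features based on wavelets, wavelet packets, and fractals, as well as independent component analysis.
New chapters on support vector machines and deformable template matching, and an appendix on constrained optimization.
Feature selection techniques.
Design of linear and nonlinear classifiers, including Bayesian classifiers, multilayer perceptrons, decision trees, and RBF networks.
Context-dependent classification, including dynamic programming and hidden Markov modeling techniques.
Clustering algorithms, covering both recent developments and classical approaches, with techniques such as fuzzy, genetic, and annealing algorithms.
Various applications, including image analysis, character recognition, medical diagnosis, speech recognition, and channel equalization.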
This book is the outgrowth of our teaching advanced undergraduate and graduate courses over the past 20 years. These courses have been taught to different audiences, including students in electrical and electronics engineering, computer engineering, computer science and informatics, as well as to an interdisciplinary audience of a graduate course on automation. This experience led us to make the book as self-contained as possible and to address students with different backgrounds. As prerequisite knowledge, the reader requires only basic calculus, elementary linear algebra, and some probability theory basics. A number of mathematical tools, such as probability and statistics as well as constrained optimization, needed by various chapters, are treated in four Appendices. The book is designed to serve as a text for advanced undergraduate and graduate students, and it can be used for either a one- or a two-semester course. Furthermore, it is intended to be used as a self-study and reference book for research and for the practicing scientist/engineer. This latter audience was also our second incentive for writing this book, due to the involvement of our group in a number of projects related to pattern recognition.
The philosophy of the book is to present various pattern recognition tasks in a unified way, including image analysis, speech processing, and communication applications. Despite their differences, these areas do share common features and their study can only benefit from a unified approach. Each chapter of the book starts with the basics and moves progressively to more advanced topics and reviews up-to-date techniques. A number of problems and computer exercises are given at the end of each chapter and a solutions manual is available from the publisher.
Furthermore, a number of demonstrations based on MATLAB are available via the web at the book's site, http://www.di.uoa.gr/~stpatrec.
Our intention is to update the site regularly with more and/or improved versions of these demonstrations. Suggestions are always welcome. Also at this web site, a page will be available for typos, which are unavoidable, despite frequent careful reading. The authors would appreciate readers notifying them about any typos found.
This book would not have been written without the constant support and help from a number of colleagues and students throughout the years. We are especially indebted to Prof. K. Berberidis, Dr. E. Kofidis, Prof. A. Liavas, Dr. A. Rontogiannis, Dr. A. Pikrakis, Dr. Gezerlis and Dr. K. Georgoulakis. The constant support provided by Dr. I. Kopsinis from the early stages up to the final stage, with those long nights, has been invaluable. The book improved a great deal after the careful reading and the serious comments and suggestions of Prof. G. Moustakides, Prof. V. Digalakis, Prof. T. Adali, Prof. M. Zervakis, Prof. D. Cavouras, Prof. A. Böhm, Prof. G. Glentis, Prof. E. Koutsoupias, Prof. V. Zissimopoulos, Prof. A. Likas, Dr. A. Vassiliou, Dr. N. Vassilas, Dr. V. Drakopoulos, Dr. S. Hatzispyros. We are greatly indebted to these colleagues for their time and their constructive criticisms. Our collaboration and friendship with Prof. N. Kalouptsidis have been a source of constant inspiration for all these years. We are both deeply indebted to him.
Last but not least, K. Koutroumbas would like to thank Sophia for her tolerance and support and S. Theodoridis would like to thank Despina, Eva, and Eleni, his joyful and supportive "harem."
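(Greece) Sergios Theodoridis, Konstantinos Koutroumbas. Sergios Theodoridis: Sergios Theodoridis is a professor in the Department of Informatics at the University of Athens, Greece. He received a bachelor's degree in physics from the University of Athens in 1973, and master's and doctoral degrees in signal processing and communications from the University of Birmingham, UK, in 1975 and 1978 respectively. His main research interests are adaptive signal processing, communications, and pattern recognition. He served as chairman of Parallel Architectures and Languages Europe (PARLE-95) and as general chairman of the European Signal Processing Conference (EUSIPCO-98), and is a member of the editorial board of the journal Signal Processing.
Konstantinos Koutroumbas: Konstantinos Koutroumbas works at the Institute for Space Applications of the National Observatory of Athens, Greece, and is an internationally known expert.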
Preface
CHAPTER 1 INTRODUCTION
1.1 Is Pattern Recognition Important?
1.2 Features, Feature Vectors, and Classifiers
1.3 Supervised Versus Unsupervised Pattern Recognition
1.4 Outline of the Book
CHAPTER 2 CLASSIFIERS BASED ON BAYES DECISION THEORY
2.1 Introduction
2.2 Bayes Decision Theory
2.3 Discriminant Functions and Decision Surfaces
2.4 Bayesian Classification for Normal Distributions
2.5 Estimation of Unknown Probability Density Functions
2.5.1 Maximum Likelihood Parameter Estimation
2.5.2 Maximum a Posteriori Probability Estimation
2.5.3 Bayesian Inference
2.5.4 Maximum Entropy Estimation
2.5.5 Mixture Models
2.5.6 Nonparametric Estimation
2.6 The Nearest Neighbor Rule
CHAPTER 3 LINEAR CLASSIFIERS
3.1 Introduction
3.2 Linear Discriminant Functions and Decision Hyperplanes
3.3 The Perceptron Algorithm
3.4 Least Squares Methods
3.4.1 Mean Square Error Estimation
3.4.2 Stochastic Approximation and the LMS Algorithm
3.4.3 Sum of Error Squares Estimation
3.5 Mean Square Estimation Revisited
3.5.1 Mean Square Error Regression
3.5.2 MSE Estimates Posterior Class Probabilities
3.5.3 The Bias-Variance Dilemma
3.6 Support Vector Machines
3.6.1 Separable Classes
3.6.2 Nonseparable Classes
CHAPTER 4 NONLINEAR CLASSIFIERS
4.1 Introduction
4.2 The XOR Problem
4.3 The Two-Layer Perceptron
4.3.1 Classification Capabilities of the Two-Layer Perceptron
4.4 Three-Layer Perceptrons
4.5 Algorithms Based on Exact Classification of the Training Set
4.6 The Backpropagation Algorithm
4.7 Variations on the Backpropagation Theme
4.8 The Cost Function Choice
4.9 Choice of the Network Size
4.10 A Simulation Example
4.11 Networks With Weight Sharing
4.12 Generalized Linear Classifiers
4.13 Capacity of the l-Dimensional Space in Linear Dichotomies
4.14 Polynomial Classifiers
4.15 Radial Basis Function Networks
4.16 Universal Approximators
4.17 Support Vector Machines: The Nonlinear Case
4.18 Decision Trees
4.18.1 Set of Questions
4.18.2 Splitting Criterion
4.18.3 Stop-Splitting Rule
4.18.4 Class Assignment Rule
4.19 Discussion
CHAPTER 5 FEATURE SELECTION
5.1 Introduction
5.2 Preprocessing
5.2.1 Outlier Removal
5.2.2 Data Normalization
5.2.3 Missing Data
5.3 Feature Selection Based on Statistical Hypothesis Testing
5.3.1 Hypothesis Testing Basics
5.3.2 Application of the t-Test in Feature Selection
5.4 The Receiver Operating Characteristics (ROC) Curve
5.5 Class Separability Measures
5.5.1 Divergence
5.5.2 Chernoff Bound and Bhattacharyya Distance
5.5.3 Scatter Matrices
5.6 Feature Subset Selection
5.6.1 Scalar Feature Selection
5.6.2 Feature Vector Selection
5.7 Optimal Feature Generation
5.8 Neural Networks and Feature Generation/Selection
5.9 A Hint on the Vapnik-Chervonenkis Learning Theory
CHAPTER 6 FEATURE GENERATION I: LINEAR TRANSFORMS
6.1 Introduction
6.2 Basis Vectors and Images
6.3 The Karhunen-Loeve Transform
6.4 The Singular Value Decomposition
6.5 Independent Component Analysis
6.5.1 ICA Based on Second- and Fourth-Order Cumulants
6.5.2 ICA Based on Mutual Information
6.5.3 An ICA Simulation Example
6.6 The Discrete Fourier Transform (DFT)
6.6.1 One-Dimensional DFT
6.6.2 Two-Dimensional DFT
6.7 The Discrete Cosine and Sine Transforms
6.8 The Hadamard Transform
6.9 The Haar Transform
6.10 The Haar Expansion Revisited
6.11 Discrete Time Wavelet Transform (DTWT)
6.12 The Multiresolution Interpretation
6.13 Wavelet Packets
6.14 A Look at Two-Dimensional Generalizations
6.15 Applications
CHAPTER 7 FEATURE GENERATION II
7.1 Introduction
7.2 Regional Features
7.2.1 Features for Texture Characterization
7.2.2 Local Linear Transforms for Texture Feature Extraction
7.2.3 Moments
7.2.4 Parametric Models
7.3 Features for Shape and Size Characterization
7.3.1 Fourier Features
7.3.2 Chain Codes
7.3.3 Moment-Based Features
7.3.4 Geometric Features
7.4 A Glimpse at Fractals
7.4.1 Self-Similarity and Fractal Dimension
7.4.2 Fractional Brownian Motion
CHAPTER 8 TEMPLATE MATCHING
8.1 Introduction
8.2 Measures Based on Optimal Path Searching Techniques
8.2.1 Bellman's Optimality Principle and Dynamic Programming
8.2.2 The Edit Distance
8.2.3 Dynamic Time Warping in Speech Recognition
8.3 Measures Based on Correlations
8.4 Deformable Template Models
CHAPTER 9 CONTEXT-DEPENDENT CLASSIFICATION
9.1 Introduction
9.2 The Bayes Classifier
9.3 Markov Chain Models
9.4 The Viterbi Algorithm
9.5 Channel Equalization
9.6 Hidden Markov Models
9.7 Training Markov Models via Neural Networks
9.8 A Discussion of Markov Random Fields
CHAPTER 10 SYSTEM EVALUATION
10.1 Introduction
10.2 Error Counting Approach
10.3 Exploiting the Finite Size of the Data Set
10.4 A Case Study From Medical Imaging
CHAPTER 11 CLUSTERING: BASIC CONCEPTS
11.1 Introduction
11.1.1 Applications of Cluster Analysis
11.1.2 Types of Features
11.1.3 Definitions of Clustering
11.2 Proximity Measures
11.2.1 Definitions
11.2.2 Proximity Measures between Two Points
11.2.3 Proximity Functions between a Point and a Set
11.2.4 Proximity Functions between Two Sets
CHAPTER 12 CLUSTERING ALGORITHMS I: SEQUENTIAL ALGORITHMS
12.1 Introduction
12.1.1 Number of Possible Clusterings
12.2 Categories of Clustering Algorithms
12.3 Sequential Clustering Algorithms
12.3.1 Estimation of the Number of Clusters
12.4 A Modification of BSAS
12.5 A Two-Threshold Sequential Scheme
12.6 Refinement Stages
12.7 Neural Network Implementation
12.7.1 Description of the Architecture
12.7.2 Implementation of the BSAS Algorithm
CHAPTER 13 CLUSTERING ALGORITHMS II: HIERARCHICAL ALGORITHMS
13.1 Introduction
13.2 Agglomerative Algorithms
13.2.1 Definition of Some Useful Quantities
13.2.2 Agglomerative Algorithms Based on Matrix Theory
13.2.3 Monotonicity and Crossover
13.2.4 Implementational Issues
13.2.5 Agglomerative Algorithms Based on Graph Theory
13.2.6 Ties in the Proximity Matrix
13.3 The Cophenetic Matrix
13.4 Divisive Algorithms
13.5 Choice of the Best Number of Clusters
CHAPTER 14 CLUSTERING ALGORITHMS III: SCHEMES BASED ON FUNCTION OPTIMIZATION
14.1 Introduction
14.2 Mixture Decomposition Schemes
14.2.1 Compact and Hyperellipsoidal Clusters
14.2.2 A Geometrical Interpretation
14.3 Fuzzy Clustering Algorithms
14.3.1 Point Representatives
14.3.2 Quadric Surfaces as Representatives
14.3.3 Hyperplane Representatives
14.3.4 Combining Quadric and Hyperplane Representatives
14.3.5 A Geometrical Interpretation
14.3.6 Convergence Aspects of the Fuzzy Clustering Algorithms
14.3.7 Alternating Cluster Estimation
14.4 Possibilistic Clustering
14.4.1 The Mode-Seeking Property
14.4.2 An Alternative Possibilistic Scheme
14.5 Hard Clustering Algorithms
14.5.1 The Isodata or k-Means or c-Means Algorithm
14.6 Vector Quantization
CHAPTER 15 CLUSTERING ALGORITHMS IV
15.1 Introduction
15.2 Clustering Algorithms Based on Graph Theory
15.2.1 Minimum Spanning Tree Algorithms
15.2.2 Algorithms Based on Regions of Influence
15.2.3 Algorithms Based on Directed Trees
15.3 Competitive Learning Algorithms
15.3.1 Basic Competitive Learning Algorithm
15.3.2 Leaky Learning Algorithm
15.3.3 Conscientious Competitive Learning Algorithms
15.3.4 Competitive Learning-Like Algorithms Associated with Cost Functions
15.3.5 Self-Organizing Maps
15.3.6 Supervised Learning Vector Quantization
15.4 Branch and Bound Clustering Algorithms
15.5 Binary Morphology Clustering Algorithms (BMCAs)
15.5.1 Discretization
15.5.2 Morphological Operations
15.5.3 Determination of the Clusters in a Discrete Binary Set
15.5.4 Assignment of Feature Vectors to Clusters
15.5.5 The Algorithmic Scheme
15.6 Boundary Detection Algorithms
15.7 Valley-Seeking Clustering Algorithms
15.8 Clustering Via Cost Optimization (Revisited)
15.8.1 Simulated Annealing
15.8.2 Deterministic Annealing
15.9 Clustering Using Genetic Algorithms
15.10 Other Clustering Algorithms
CHAPTER 16 CLUSTER VALIDITY
16.1 Introduction
16.2 Hypothesis Testing Revisited
16.3 Hypothesis Testing in Cluster Validity
16.3.1 External Criteria
16.3.2 Internal Criteria
16.4 Relative Criteria
16.4.1 Hard Clustering
16.4.2 Fuzzy Clustering
16.5 Validity of Individual Clusters
16.5.1 External Criteria
16.5.2 Internal Criteria
16.6 Clustering Tendency
16.6.1 Tests for Spatial Randomness
Appendix A Hints from Probability and Statistics
Appendix B Linear Algebra Basics
Appendix C Cost Function Optimization
Appendix D Basic Definitions from Linear Systems Theory
Index