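Abstract
This paper proposes a face detection method based on a self-organized hidden Markov model. A hidden Markov model is trained on multi-view face samples to obtain initial estimates of its parameters. On this basis, the weak connections between states are pruned, and the network self-organizes into a multi-path left-right (MPLR) model. The parameters are then re-estimated with the EM algorithm, yielding the state graph of the hidden Markov model. In the detection stage, the decision is made by finding the optimal state sequence and the maximum likelihood. Compared with the pseudo two-dimensional hidden Markov model, the method has the advantage of detecting faces in multiple views rather than only vertical frontal views. Experimental results demonstrate the effectiveness of the method.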
This paper presents a face detection method based on a self-organized hidden Markov model (Self-Organized HMM). The HMM is first trained on multi-view face samples, which strengthens the informative features and their connections while the weak ones remain at a low level. After this initial parameter estimation, the weak connections between states are clipped, and the network self-organizes into a multi-path left-right (MPLR) model. The EM algorithm is then used to re-estimate the HMM parameters. During detection, the decision is made from the optimal state sequence and the maximum likelihood. The model exploits the important structures of different views and facial features. Each path corresponds to one distinct view; it preserves the top-to-bottom structure of the face through super states, and the left-to-right structure through the states inside each super state. To handle in-plane face rotation, face regularization transforms rotated faces into vertical samples. A dynamic color model, initialized from a built-in model trained off-line, adapts to different races and lighting conditions and reduces the computation needed to search for face candidates. During training, the multi-view paths are self-organized into one unified framework. Within this framework, the frontal vertical view has the longest state chain, the side view has the shortest, and the other views fall in between. The state chains share some states and connections, and the number shared depends on the panning angle between the views. Compared with the pseudo-2D HMM, the self-organized multi-path left-right hidden Markov model can detect multi-view faces efficiently and is not limited to vertical frontal views. Experimental results confirm the effectiveness of the algorithm: it detects frontal views at a rate of 95.3% and side views at a rate of 92.5% while keeping the false alarm rate at 5%.
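The pruning and decoding steps named in the abstract can be illustrated with a small sketch. It is not the authors' implementation. The threshold value, the discrete observation symbols, the toy matrices, and the function names `prune_transitions` and `viterbi` are all assumptions made for illustration. The paper re-estimates the parameters with EM after pruning; the sketch only renormalizes each row, which is a simpler stand-in.

```python
import math


def prune_transitions(A, threshold):
    """Zero out transitions weaker than `threshold`, then renormalize each row.

    Assumption: a fixed threshold decides which connections count as weak.
    """
    pruned = []
    for row in A:
        kept = [p if p >= threshold else 0.0 for p in row]
        total = sum(kept)
        if total == 0.0:
            raise ValueError("threshold removes every transition out of a state")
        pruned.append([p / total for p in kept])
    return pruned


def viterbi(pi, A, B, obs):
    """Return (best_state_path, log_probability) for a discrete-observation HMM."""
    def log(x):
        return math.log(x) if x > 0.0 else float("-inf")

    n = len(pi)
    # Log-score of the best path ending in each state after the first symbol.
    delta = [log(pi[s]) + log(B[s][obs[0]]) for s in range(n)]
    back = []  # back[t][j] = best predecessor of state j at step t + 1
    for o in obs[1:]:
        step, new_delta = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: delta[i] + log(A[i][j]))
            step.append(best_i)
            new_delta.append(delta[best_i] + log(A[best_i][j]) + log(B[j][o]))
        back.append(step)
        delta = new_delta
    # Trace the best path backwards from the best final state.
    last = max(range(n), key=lambda s: delta[s])
    path = [last]
    for step in reversed(back):
        path.append(step[path[-1]])
    path.reverse()
    return path, delta[last]


# Toy 3-state model (hypothetical numbers): weak backward links get pruned,
# leaving a left-right structure.
A = [[0.60, 0.35, 0.05],
     [0.02, 0.58, 0.40],
     [0.00, 0.03, 0.97]]
pi = [1.0, 0.0, 0.0]
B = [[0.9, 0.1],
     [0.5, 0.5],
     [0.1, 0.9]]

pruned = prune_transitions(A, threshold=0.1)
path, log_prob = viterbi(pi, pruned, B, obs=[0, 0, 1, 1])
```

In a detector built along these lines, `log_prob` would be compared against a decision threshold, and the path through the states would indicate which view the candidate window matched. Both uses are assumptions about how the decision is made; the abstract does not spell them out.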
Source
《计算机学报》
EI
CSCD
PKU Core Journals (北大核心)
2002, No. 11, pp. 1165-1169 (5 pages)
Chinese Journal of Computers
Funding
This work was supported by a Major Program of the National Natural Science Foundation of China (60072029).