Abstract: To investigate the robustness of face recognition algorithms under complicated variations in illumination, facial expression, and posture, the advantages and disadvantages of seven typical algorithms for extracting global and local features are studied through experiments on the Olivetti Research Laboratory database and three other databases (subsets for illumination, expression, and posture constructed by selecting images from several existing face databases). Based on these experimental results, two face recognition schemes built on the decision fusion of two-dimensional linear discriminant analysis (2DLDA) and local binary patterns (LBP) are proposed in this paper to improve recognition rates. In addition, the face is partitioned non-uniformly for its LBP histograms to further improve performance. The experimental results demonstrate the complementarity of the two kinds of features, 2DLDA and LBP, and verify the effectiveness of the proposed fusion algorithms.
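As a rough illustration of the local-feature side of this scheme only (the 2DLDA component and the decision-fusion rule are not detailed in the abstract), the sketch below computes basic 8-neighbour LBP codes and concatenates histograms over a non-uniform grid of regions. The function names, region boundaries, and per-region normalisation are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def lbp_image(gray):
    """Compute the basic 8-neighbour LBP code for each interior pixel.

    `gray` is a 2-D uint8 array; border pixels are dropped for simplicity.
    """
    c = gray[1:-1, 1:-1].astype(np.int32)
    # Neighbours in clockwise order starting from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = gray[1 + dy:gray.shape[0] - 1 + dy,
                     1 + dx:gray.shape[1] - 1 + dx].astype(np.int32)
        # Set bit if the neighbour is at least as bright as the centre pixel.
        codes |= (neigh >= c).astype(np.int32) << bit
    return codes.astype(np.uint8)

def region_histograms(codes, row_bounds, col_bounds):
    """Concatenate LBP histograms over a (possibly non-uniform) grid.

    `row_bounds` / `col_bounds` are interior split indices, so unequal region
    sizes (e.g. finer cells around the eyes) are allowed -- a hypothetical
    stand-in for the paper's non-uniform partitioning.
    """
    feats = []
    rows = [0] + list(row_bounds) + [codes.shape[0]]
    cols = [0] + list(col_bounds) + [codes.shape[1]]
    for r0, r1 in zip(rows[:-1], rows[1:]):
        for c0, c1 in zip(cols[:-1], cols[1:]):
            hist, _ = np.histogram(codes[r0:r1, c0:c1],
                                   bins=256, range=(0, 256))
            feats.append(hist / max(hist.sum(), 1))  # normalise each region
    return np.concatenate(feats)
```

A resulting feature vector could then be compared across faces (e.g. with a chi-square or histogram-intersection distance) and combined with a 2DLDA-based score at the decision level, as the abstract describes in outline.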
Abstract: Recent years have witnessed the rapid spread of multi-modality microblogs, such as Twitter and Sina Weibo posts composed of images, text, and emoticons. Visual sentiment prediction on such microblog-based social media has attracted ever-increasing research attention, with broad application prospects. In this paper, we give a systematic review of recent advances and cutting-edge techniques for visual sentiment analysis, providing a detailed comparison and experimental evaluation of state-of-the-art methods. We further discuss future trends and potential directions for visual sentiment prediction.
Abstract: Towards the end of 2012, artificial intelligence (AI) scientists first figured out how to impart "vision" to neural networks. Later, they also mastered how to enable neural networks to mimic human reasoning, hearing, speaking, and writing. Although AI has become comparable or even superior to humans at specific tasks, it still lacks the "flexibility" of the human brain, which can apply skills learned in one situation to another. Taking cues from how children grow, we consider the following question: if senses and language can be combined, so that AI collects and processes information at a level closer to humans, will it be able to develop an understanding of the world? The answer is yes. "Multi-modal" systems, which acquire human senses and language simultaneously, produce significantly stronger AI and make it easier for AI to adapt to new situations and solve new problems. Such algorithms can therefore be applied to more complex problems, or embedded in robots that communicate and collaborate with humans in daily life. In September 2020, researchers from the Allen Institute for AI (AI2) created a model that could generate images from captions, demonstrating the algorithm's ability to associate words with visual information. In November, scientists from the University of North Carolina at Chapel Hill developed a method of incorporating images into existing language models, which significantly enhanced the models' ability to comprehend text. Early in 2021, OpenAI extended GPT-3 and released two visual language models: one associates the objects in an image with the words in its description, and the other generates a digital image from a combination of concepts it has learned. In the long run, the progress made by multi-modal systems will help break through the limits of AI. It will not only unlock new AI applications but also make these applications safer and more reliable. More sophisticated multi-modal systems will also aid the development of more advanced robot assistants. Ultimately, multi-modal systems may prove to be the first AI that we can trust.