Identifying human actions and interactions finds its use in manyareas, such as security, surveillance, assisted living, patient monitoring, rehabilitation,sports, and e-learning. This wide range of applications has at...Identifying human actions and interactions finds its use in manyareas, such as security, surveillance, assisted living, patient monitoring, rehabilitation,sports, and e-learning. This wide range of applications has attractedmany researchers to this field. Inspired by the existing recognition systems,this paper proposes a new and efficient human-object interaction recognition(HOIR) model which is based on modeling human pose and scene featureinformation. There are different aspects involved in an interaction, includingthe humans, the objects, the various body parts of the human, and the backgroundscene. Themain objectives of this research include critically examiningthe importance of all these elements in determining the interaction, estimatinghuman pose through image foresting transform (IFT), and detecting the performedinteractions based on an optimizedmulti-feature vector. The proposedmethodology has six main phases. The first phase involves preprocessing theimages. During preprocessing stages, the videos are converted into imageframes. Then their contrast is adjusted, and noise is removed. In the secondphase, the human-object pair is detected and extracted from each image frame.The third phase involves the identification of key body parts of the detectedhumans using IFT. The fourth phase relates to three different kinds of featureextraction techniques. Then these features are combined and optimized duringthe fifth phase. The optimized vector is used to classify the interactions in thelast phase. TheMSRDaily Activity 3D dataset has been used to test this modeland to prove its efficiency. The proposed system obtains an average accuracyof 91.7% on this dataset.展开更多
Natural scene recognition has important significance and value in the fields of image retrieval,autonomous navigation,human-computer interaction and industrial automation.Firstly,the natural scene image non-text conte...Natural scene recognition has important significance and value in the fields of image retrieval,autonomous navigation,human-computer interaction and industrial automation.Firstly,the natural scene image non-text content takes up relatively high proportion;secondly,the natural scene images have a cluttered background and complex lighting conditions,angle,font and color.Therefore,how to extract text extreme regions efficiently from complex and varied natural scene images plays an important role in natural scene image text recognition.In this paper,a Text extremum region Extraction algorithm based on Joint-Channels(TEJC)is proposed.On the one hand,it can solve the problem that the maximum stable extremum region(MSER)algorithm is only suitable for gray images and difficult to process color images.On the other hand,it solves the problem that the MSER algorithm has high complexity and low accuracy when extracting the most stable extreme region.In this paper,the proposed algorithm is tested and evaluated on the ICDAR data set.The experimental results show that the method has superiority.展开更多
基金This research was supported by the MSIT(Ministry of Science and ICT),Korea,under the ITRC(Information Technology Research Center)support program(IITP-2023-2018-0-01426)supervised by the IITP(Institute for Information&Communications Technology Planning&Evaluation)This work has also been supported by PrincessNourah bint Abdulrahman UniversityResearchers Supporting Project Number(PNURSP2022R239),Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.Alsothis work was partially supported by the Taif University Researchers Supporting Project Number(TURSP-2020/115),Taif University,Taif,Saudi Arabia.
文摘Identifying human actions and interactions finds its use in manyareas, such as security, surveillance, assisted living, patient monitoring, rehabilitation,sports, and e-learning. This wide range of applications has attractedmany researchers to this field. Inspired by the existing recognition systems,this paper proposes a new and efficient human-object interaction recognition(HOIR) model which is based on modeling human pose and scene featureinformation. There are different aspects involved in an interaction, includingthe humans, the objects, the various body parts of the human, and the backgroundscene. Themain objectives of this research include critically examiningthe importance of all these elements in determining the interaction, estimatinghuman pose through image foresting transform (IFT), and detecting the performedinteractions based on an optimizedmulti-feature vector. The proposedmethodology has six main phases. The first phase involves preprocessing theimages. During preprocessing stages, the videos are converted into imageframes. Then their contrast is adjusted, and noise is removed. In the secondphase, the human-object pair is detected and extracted from each image frame.The third phase involves the identification of key body parts of the detectedhumans using IFT. The fourth phase relates to three different kinds of featureextraction techniques. Then these features are combined and optimized duringthe fifth phase. The optimized vector is used to classify the interactions in thelast phase. TheMSRDaily Activity 3D dataset has been used to test this modeland to prove its efficiency. The proposed system obtains an average accuracyof 91.7% on this dataset.
基金This work is supported by State Grid Shandong Electric Power Company Science and Technology Project Funding under Grant Nos.520613180002,62061318C002the Fundamental Research Funds for the Central Universities(Grant No.HIT.NSRIF.201714)+1 种基金Weihai Science and Technology Development Program(2016DX GJMS15)Key Research and Development Program in Shandong Provincial(2017GGX90103).
文摘Natural scene recognition has important significance and value in the fields of image retrieval,autonomous navigation,human-computer interaction and industrial automation.Firstly,the natural scene image non-text content takes up relatively high proportion;secondly,the natural scene images have a cluttered background and complex lighting conditions,angle,font and color.Therefore,how to extract text extreme regions efficiently from complex and varied natural scene images plays an important role in natural scene image text recognition.In this paper,a Text extremum region Extraction algorithm based on Joint-Channels(TEJC)is proposed.On the one hand,it can solve the problem that the maximum stable extremum region(MSER)algorithm is only suitable for gray images and difficult to process color images.On the other hand,it solves the problem that the MSER algorithm has high complexity and low accuracy when extracting the most stable extreme region.In this paper,the proposed algorithm is tested and evaluated on the ICDAR data set.The experimental results show that the method has superiority.