In sports, virtual spaces are sometimes utilized to enhance performance or user experience. In this study, we conducted a frequency analysis, semantic network analysis, and topic modeling using 134 abstracts obtained through keyword searches focusing on “sport(s)” in combination with “metaverse”, “augmented reality”, “virtual reality”, “lifelogging”, and “mixed reality”. First, the top 20 words were extracted through frequency analysis, and seven representative words were then selected for each metaverse type. The analysis revealed key themes such as “user(s)”, “game(s)”, “technolog(y/ies)”, “experience(d)”, “physical”, “training”, and “video”, with variations in intensity depending on the type of metaverse. Second, the relationships between the words were reconfirmed using semantic networks based on the seven selected words. Finally, topic modeling was conducted to uncover themes specific to each type of metaverse. We also found that “performance/scoring” was a prominent word across all types of metaverse. This suggests that, beyond the enjoyment sports provide, users of all kinds (both general users and athletes) are likely to use the metaverse to achieve positive outcomes and success. The importance of “performance/scoring” in sports may seem obvious; however, it also offers significant insights for practitioners when combined with metaverse-related keywords. Ultimately, this study has managerial implications for enhancing the performance of specialized users in the sports industry.
To further enhance the efficiency of search engines in searching, indexing, and locating information in the deep web, latent semantic analysis (LSA) offers a simple and effective approach. Through latent semantic analysis of the attributes in the query interfaces and the unique entrances of deep web sites, the hidden semantic structure can be retrieved and a degree of dimension reduction achieved. Using this semantic structure information, the contents of a site can be inferred and the similarity measures among deep web sites revised. Experimental results show that latent semantic analysis revises and improves the semantic understanding of deep web query forms, overcoming the shortcomings of keyword-based methods. The approach can be used to effectively find the site most similar to any given site and to obtain a list of sites conforming to user-specified restrictions.
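The LSA pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the term-by-site matrix, the attribute names, and the rank k=2 are all hypothetical.

```python
import numpy as np

# Hypothetical matrix of query-interface attributes (rows) against deep
# web sites (columns): two book-search sites and one flight-search site.
terms = ["title", "author", "price", "isbn", "flight", "airline"]
A = np.array([
    [2, 1, 0],   # title
    [1, 2, 0],   # author
    [1, 1, 1],   # price (shared across domains)
    [1, 1, 0],   # isbn
    [0, 0, 2],   # flight
    [0, 0, 2],   # airline
], dtype=float)

# Truncated SVD projects sites into a low-rank latent semantic space.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
site_vecs = (np.diag(s[:k]) @ Vt[:k]).T   # one latent vector per site

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Revised similarity: the two book sites land close together in latent
# space, while the flight site stays distant despite sharing "price".
sim_books = cosine(site_vecs[0], site_vecs[1])
sim_cross = cosine(site_vecs[0], site_vecs[2])
```

Given any site, ranking the others by this cosine yields the "most similar site" search the abstract describes.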
In recent years, the multimedia annotation problem has been attracting significant research attention in the multimedia and computer vision areas, especially automatic image annotation, whose purpose is to provide an efficient and effective searching environment for users to query their images more easily. In this paper, a semi-supervised learning based probabilistic latent semantic analysis (PLSA) model for automatic image annotation is presented. Since it is often hard to obtain or create labeled images in large quantities while unlabeled ones are easier to collect, a transductive support vector machine (TSVM) is exploited to enhance the quality of the training image data. Furthermore, because image features with different magnitudes yield different annotation performance, a Gaussian normalization method is utilized to normalize the features extracted from effective image regions segmented by the normalized cuts algorithm, so as to preserve the intrinsic content of images as completely as possible. Finally, a PLSA model with asymmetric modalities is constructed based on the expectation maximization (EM) algorithm to predict a candidate set of annotations with confidence scores. Extensive experiments on the general-purpose Corel5k dataset demonstrate that the proposed model significantly improves on the performance of traditional PLSA for automatic image annotation.
Semantic video analysis plays an important role in the field of machine intelligence and pattern recognition. In this paper, based on the Hidden Markov Model (HMM), a semantic recognition framework for compressed videos is proposed to analyze video events according to six low-level features. After a detailed analysis of video events, the pattern of global motion and five foreground features (the foreground being the principal part of a video) are employed as the observations of the Hidden Markov Model to classify events in videos. Applications of the proposed framework to several video event detection tasks demonstrate its promise for semantic video analysis.
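HMM-based event classification of the kind described above rests on standard decoding algorithms such as Viterbi. The sketch below is a hypothetical toy model (two made-up events "A"/"B" emitting three motion levels), not the paper's six-feature framework:

```python
def viterbi(obs, states, start, trans, emit):
    """Most likely hidden state sequence for an observation sequence."""
    V = [{s: start[s] * emit[s][obs[0]] for s in states}]
    back = []
    for o in obs[1:]:
        col, ptr = {}, {}
        for s in states:
            # Best predecessor state for s at this time step.
            prev = max(states, key=lambda p: V[-1][p] * trans[p][s])
            col[s] = V[-1][prev] * trans[prev][s] * emit[s][o]
            ptr[s] = prev
        V.append(col)
        back.append(ptr)
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

# Hypothetical two-event model observing coarse motion levels.
states = ["A", "B"]
start = {"A": 0.6, "B": 0.4}
trans = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit = {"A": {"low": 0.1, "mid": 0.4, "high": 0.5},
        "B": {"low": 0.6, "mid": 0.3, "high": 0.1}}
path = viterbi(["low", "mid", "high"], states, start, trans, emit)
```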
Because of everyone's involvement in social networks, social networks are full of massive multimedia data, and events are released and disseminated through social networks in the form of multi-modal and multi-attribute heterogeneous data. There has been considerable research on social network search. Considering the spatio-temporal features of messages and the social relationships among users, we summarize an overall social network search framework from the perspective of semantics, based on existing research. For social network search, the acquisition and representation of spatio-temporal data is the foundation, the semantic analysis and modeling of social network cross-media big data is an important component, deep semantic learning of social networks is the key research field, and the indexing and ranking mechanism is an indispensable part. This paper reviews the current studies in these fields and presents the main challenges of social network search. Finally, we give an outlook on the prospects and future work of social network search.
A global view of firewall policy conflicts is important for administrators optimizing a policy, yet appropriate global conflict analysis for firewall policies has been lacking; existing methods focus on local conflict detection. In this paper we study a global conflict detection algorithm. We present a semantic model that captures more complete classifications of the policy using the knowledge concept in rough set theory. Based on this model, we present a formal model of global conflict and represent it with an OBDD (Ordered Binary Decision Diagram). We then develop the GFPCDA (Global Firewall Policy Conflict Detection Algorithm) to detect global conflicts. In experiments, we evaluated the usability of our semantic model by eliminating the false positives and false negatives that an incomplete policy semantic model causes in a classical algorithm, and compared that algorithm with GFPCDA. The results show that GFPCDA detects conflicts more precisely and independently, and has better performance.
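The conflicts such an algorithm targets arise when rules with different actions match overlapping packet spaces. The sketch below shows only the basic pairwise overlap test (the local detection the paper improves upon, not its global OBDD-based analysis); the field names and rules are hypothetical:

```python
# A rule matches ranges of source address and destination port; two
# rules conflict when their actions differ and their match spaces
# intersect on every field.
def ranges_overlap(a, b):
    return a[0] <= b[1] and b[0] <= a[1]

def conflicts(r1, r2):
    return (r1["action"] != r2["action"]
            and ranges_overlap(r1["src"], r2["src"])
            and ranges_overlap(r1["dport"], r2["dport"]))

# Hypothetical policy: rule 1's accept space partly overlaps rule 0's deny.
rules = [
    {"src": (0, 100),   "dport": (80, 80),  "action": "deny"},
    {"src": (50, 200),  "dport": (80, 443), "action": "accept"},
    {"src": (300, 400), "dport": (22, 22),  "action": "accept"},
]
found = [(i, j) for i in range(len(rules)) for j in range(i + 1, len(rules))
         if conflicts(rules[i], rules[j])]
```

A global analysis additionally has to account for rule order and the combined effect of many rules, which is where the OBDD representation comes in.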
Social media platforms provide new value for markets and research companies. This article explores the use of social media data to enhance customer value propositions. The case study involves a company that develops wearable Internet of Things (IoT) devices and services for stress management. Netnography and semantic annotation for recognizing and categorizing the context of tweets are conducted to gain a better understanding of users' stress management practices. The aim is to analyze the tweets about stress management practices and to identify the context from the tweets. Thereafter, we map the tweets on pleasure and arousal to elicit customer insights. We analyzed a case study of a marketing strategy on the Twitter platform. Participants in the marketing campaign shared photos and texts about their stress management practices. Machine learning techniques were used to evaluate and estimate the emotions and contexts of the tweets posted by the campaign participants. The computational semantic analysis of the tweets was compared to the text analysis of the tweets. The content analysis of tweet images alone resulted in 96% accuracy in detecting tweet context, while that of the textual content of tweets yielded an accuracy of 91%. Semantic tagging by Ontotext detected the correct tweet context with an accuracy of 50%.
Current research on metaphor analysis is generally knowledge-based and corpus-based, which calls for methods of automatic feature extraction and weight calculation. Combining natural language processing (NLP), latent semantic analysis (LSA), and the Pearson correlation coefficient, this paper proposes a metaphor analysis method for extracting the content words from both literal and metaphorical corpora, calculating the correlation degree, and analyzing their relationships. The value of the proposed method was demonstrated through a case study using a corpus with the keyword “飞翔 (fly)”. Compared with the Pearson correlation coefficient alone, the experiment shows that LSA produces better results with greater significance in correlation degree. It was also found that the number of common words appearing in both the literal and metaphorical word bags decreased with the correlation degree. The case study also revealed that more nouns appear in the literal corpus, while more adjectives and adverbs appear in the metaphorical corpus. The proposed method will help NLP researchers develop the step-by-step calculation tools required for accurate quantitative analysis.
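The Pearson correlation step can be sketched as follows — a minimal illustration of correlating a content word's frequency profile across the literal and metaphorical corpora (the count vectors are hypothetical, and the paper's LSA stage is not reproduced here):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical co-occurrence counts of five content words with the
# keyword in the literal corpus vs. the metaphorical corpus.
literal = [4, 1, 0, 3, 2]
metaphor = [3, 2, 1, 3, 1]
degree = pearson(literal, metaphor)  # correlation degree between corpora
```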
A document layout can be more informative than merely a document's visual and structural appearance. Thus, document layout analysis (DLA) is considered a necessary prerequisite for advanced processing and detailed document image analysis to be further used in several applications and for different objectives. This research extends the traditional approaches of DLA and introduces the concept of semantic document layout analysis (SDLA) by proposing a novel framework for semantic layout analysis and characterization of handwritten manuscripts. The proposed SDLA approach enables the derivation of implicit information and semantic characteristics, which can be effectively utilized in dozens of practical applications for various purposes, in a way bridging the semantic gap and providing more understandable high-level document image analysis and more invariant characterization via absolute and relative labeling. The approach is validated and evaluated on a large dataset of Arabic handwritten manuscripts comprising complex layouts. The experimental work shows promising results in terms of accurate and effective semantic characteristic-based clustering and retrieval of handwritten manuscripts. It also indicates the expected efficacy of using the capabilities of the proposed approach in automating and facilitating many functional, real-life tasks such as effort estimation and pricing of transcription or typing of such complex manuscripts.
The main focus of the article is the semantic analysis and genesis of the words that, to a certain extent, create the lexical base of the modern Azerbaijani language and belong to the root system of the language. The goal is to restore words which have gone through deformation and flexion over thousands of years to their initial forms. The concept of stem cells in genetics is used as an analogy, because the author believes that languages are living organisms too, with words and elements functioning as stem cells. Thus, the principal idea is that the linguistic units and words entering the organic system of a language are derivations of the aforementioned linguistic stem cells. The stem words and concepts, the original elements of a language, are determined first, and all the following analyses are built upon them. Such studies also involve a wide range of comparativist investigations; examples from the Ancient Greek and Latin languages are used as objects of comparison. The discovery of such words will yield not only linguistic information but also objective historical information on different aspects, which can be considered one of the main reasons this kind of study is significant.
Among Chinese color words, two exhibit a notably high degree of category shift and cross-grade conversion: "white" and "black". These two words generally rank in the top three of the human color-perception system, and we believe that the most typical basic color terms are the most likely to undergo category shift and cross-grade conversion. Owing to differences in history, culture, and other aspects of cognition, however, their ordering across grammatical categories differs between languages. On this basis, this paper takes a cognitive semantic analysis of the basic color terms "black" and "white" in English and Chinese as its research topic, in the hope that an in-depth study of this aspect will prove helpful.
Software testing is a critical phase, partly because of misconceptions arising from ambiguities in the requirements during specification, which affect the testing process; it is therefore difficult to identify all faults in software. As requirements change continuously, irrelevancy and redundancy increase during testing. These challenges reduce fault detection capability, creating a need to improve the testing process based on changes in the requirements specification. In this research, we have developed a model to resolve testing challenges through requirement prioritization and prediction in an agile-based environment. The research objective is to identify the most relevant and meaningful requirements through semantic analysis for correct change analysis. We then compute the similarity of requirements through case-based reasoning, which predicts the requirements for reuse, restricted to error-based requirements. Afterward, the Apriori algorithm maps out requirement frequency to select relevant test cases, based on frequently reused (or not reused) test cases, to increase the fault detection rate. The proposed model was evaluated through experiments. The results showed that requirement redundancy and irrelevancy improved due to semantic analysis, which correctly predicted the requirements, increasing the fault detection rate and resulting in high user satisfaction. The predicted requirements are mapped into test cases, increasing the fault detection rate after changes to achieve higher user satisfaction. The model improves the redundancy and irrelevancy of requirements by more than 90% compared with other clustering methods and the analytic hierarchy process, achieving an 80% fault detection rate at an earlier stage. Hence, it provides guidelines for practitioners and researchers in the modern era. In the future, we will provide a working prototype of this model as a proof of concept.
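The frequency-mapping step can be sketched with a simple frequent itemset count in the spirit of Apriori. Everything here is hypothetical — the requirement IDs, the reuse history, and the support threshold are made up for illustration:

```python
from collections import Counter
from itertools import combinations

# Hypothetical history: each sprint reused a set of requirement IDs.
sprints = [
    {"R1", "R2", "R3"},
    {"R1", "R2"},
    {"R2", "R3"},
    {"R1", "R2", "R4"},
]
min_support = 2  # must appear in at least 2 sprints

# First Apriori pass: frequent single requirements, then frequent pairs
# built only from frequent singles (the Apriori pruning property).
singles = Counter(r for s in sprints for r in s)
frequent = {r for r, c in singles.items() if c >= min_support}
pairs = Counter(
    p for s in sprints
    for p in combinations(sorted(r for r in s if r in frequent), 2)
)
frequent_pairs = {p for p, c in pairs.items() if c >= min_support}

# Test cases tied to frequently reused requirements get priority.
priority = sorted(frequent, key=lambda r: -singles[r])
```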
This study introduces the Orbit Weighting Scheme (OWS), a novel approach aimed at enhancing the precision and efficiency of Vector Space information retrieval (IR) models, which have traditionally relied on weighting schemes like tf-idf and BM25. These conventional methods often struggle to capture document relevance accurately, leading to inefficiencies in both retrieval performance and index size management. OWS proposes a dynamic weighting mechanism that evaluates the significance of terms based on their orbital position within the vector space, emphasizing term relationships and distribution patterns overlooked by existing models. Our research focuses on evaluating OWS's impact on model accuracy using Information Retrieval metrics like Recall, Precision, Interpolated Average Precision (IAP), and Mean Average Precision (MAP). Additionally, we assess OWS's effectiveness in reducing the inverted index size, which is crucial for model efficiency. We compare OWS-based retrieval models against others using different schemes, including tf-idf variations and BM25Delta. Results reveal OWS's superiority, achieving 54% Recall and 81% MAP, and a notable 38% reduction in the inverted index size. This highlights OWS's potential for optimizing retrieval processes and underscores the need for further research in this underrepresented area to fully leverage OWS's capabilities in information retrieval methodologies.
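OWS itself is the paper's contribution and is not reproduced here, but the tf-idf baseline it is compared against can be sketched in a few lines (the corpus and query are hypothetical):

```python
import math
from collections import Counter

docs = [
    "orbit weighting for vector space retrieval",
    "classic tf idf weighting for retrieval",
    "cooking pasta at home",
]
tokenized = [d.split() for d in docs]
N = len(docs)
# Document frequency: number of documents containing each term.
df = Counter(t for doc in tokenized for t in set(doc))

def tfidf_vector(tokens):
    tf = Counter(tokens)
    return {t: (tf[t] / len(tokens)) * math.log(N / df[t]) for t in tf}

def score(query, vec):
    # Simple additive tf-idf score of the query terms against a document.
    return sum(vec.get(t, 0.0) for t in query.split())

vectors = [tfidf_vector(doc) for doc in tokenized]
ranking = sorted(range(N), key=lambda i: -score("tf idf retrieval", vectors[i]))
```

Any alternative scheme (BM25, or OWS) slots in by replacing `tfidf_vector` while keeping the same ranking loop.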
Urban sustainability assessment is an effective method for objectively presenting the current state of sustainable urban development and diagnosing sustainability-related issues. As the global community intensifies its efforts to implement the sustainable development goals (SDGs), the demand for assessing progress in urban sustainable development has increased. This has led to the emergence of numerous indicator systems with varying scales and themes, published by different entities. Cities participating in these evaluations often encounter difficulties in matching indicators, or the absence of certain indicators. In this context, urban decision makers and planners urgently need to identify substitute indicators that express the semantic meaning of the original indicators while considering the availability of indicators for participating cities. Hence, this study explores the substitution relationships between indicators and constructs a collection of substitute indicators to serve as a reference for sustainable urban development assessment. Specifically, building on a review of international and Chinese indicators related to urban sustainability assessment, this study employs natural semantic analysis methods based on the Word2Vec model and the cosine similarity algorithm to calculate the similarity between indicators related to sustainable urban development. The results show that the Skip-gram algorithm with a word vector dimensionality of 600 performs best in calculating the similarity between sustainable urban development assessment indicators. The findings provide valuable insights into selecting substitute indicators for future sustainable urban development assessment, particularly in China.
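Training Word2Vec is beyond a short sketch, but the substitution step — ranking candidate indicators by cosine similarity of their embedding vectors — looks like this. The indicator names and the 4-dimensional vectors are made up (the study uses real Word2Vec vectors of dimensionality 600):

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical embeddings for indicator names.
embeddings = {
    "PM2.5 annual mean": [0.9, 0.1, 0.0, 0.2],
    "Air quality index": [0.8, 0.2, 0.1, 0.3],
    "Green space per capita": [0.1, 0.9, 0.3, 0.0],
}

def best_substitute(target, available):
    """Pick the available indicator semantically closest to the target."""
    return max(available, key=lambda name: cosine(embeddings[target], embeddings[name]))

choice = best_substitute("PM2.5 annual mean",
                         ["Air quality index", "Green space per capita"])
```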
In order to solve the problem that current search engines provide query-oriented rather than user-oriented searches, an improper orientation that leaves search engines unable to meet the personalized requirements of users, a novel method based on probabilistic latent semantic analysis (PLSA) is proposed to convert query-oriented web search into user-oriented web search. First, a user profile represented as a vector of the user's topics of interest is created by analyzing the user's click-through data based on PLSA; then the user's queries are mapped into categories based on the user's preferences; and finally the result list is re-ranked according to the user's interests using a newly proposed method named user-oriented PageRank (UOPR). Experiments on real-life datasets show that the user-oriented search system adopting PLSA takes considerable account of user preferences and better satisfies a user's personalized information needs.
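The re-ranking idea can be sketched as a convex combination of a query-oriented base score with user-interest affinity. This is a simplified stand-in for UOPR, not the paper's formulation: the topic vectors, the mixing weight, and the scores are all hypothetical, and the PLSA training that would produce them is not shown.

```python
# Each result carries a query-oriented base score and a topic
# distribution (as PLSA would provide); the user profile is an
# interest vector over the same topics. All numbers are illustrative.
results = [
    {"url": "a", "base": 0.9, "topics": [0.1, 0.9]},
    {"url": "b", "base": 0.8, "topics": [0.9, 0.1]},
]
user_profile = [0.95, 0.05]  # this user strongly prefers topic 0
lam = 0.5                    # weight between base score and interest match

def affinity(topics, profile):
    return sum(t * p for t, p in zip(topics, profile))

def uo_score(r):
    return lam * r["base"] + (1 - lam) * affinity(r["topics"], user_profile)

# Result "b" overtakes "a" because it matches the user's interests.
reranked = sorted(results, key=uo_score, reverse=True)
```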
Automatic Chinese text summarization for dialogue style is a relatively new research area. In this paper, Latent Semantic Analysis (LSA) is first used to extract semantic knowledge from a given document and identify all question paragraphs; an automatic text segmentation approach analogous to TextTiling is then exploited to improve the precision of correlating question paragraphs and answer paragraphs; and finally some "important" sentences are extracted from the generic content and the question-answer pairs to generate a complete summary. Experimental results showed that our approach is highly efficient and significantly improves the coherence of the summary without compromising informativeness.
Android has been dominating the smartphone market for more than a decade and has managed to capture 87.8% of the market share. Such popularity has drawn the attention of cybercriminals and malware developers. Malicious applications can steal sensitive information such as contacts, read personal messages, record calls, send messages to premium-rate numbers, cause financial loss, gain access to the gallery, and access the user's geographic location. Numerous surveys on Android security have primarily focused on types of malware attack, their propagation, and techniques to mitigate them. To the best of our knowledge, the Android malware literature has never been explored using information modelling techniques, nor have contemporary research trends in Android malware research been promulgated from a semantic point of view. This paper identifies the intellectual core of the Android malware literature using Latent Semantic Analysis (LSA). An extensive corpus of 843 articles on Android malware and security, published during 2009–2019, was processed using LSA. Subsequently, the truncated Singular Value Decomposition (SVD) technique was used for dimensionality reduction. Machine learning methods were then deployed to effectively segregate prominent topic solutions with minimal bias. Based on the observed term and document loading matrix values, five core research areas and twenty research trends were identified. Further, potential future research directions are detailed to offer a quick reference for information scientists. The study concludes that Android security is crucial for pervasive Android devices. Static analysis is the most widely investigated core area within Android security research and is expected to remain a trend in the near future. Research trends indicate the need for a faster yet effective model to detect Android applications involved in obfuscation, financial attacks, and stealing user information.
A novel method based on an interval temporal syntactic model is proposed to recognize human activities in video streams. The method is composed of two parts: feature extraction and activity recognition. A trajectory shape descriptor, speeded-up robust features (SURF), and histograms of optical flow (HOF) are proposed to represent human activities, providing more exhaustive information about their shape, structure, and motion. In the recognition process, a probabilistic latent semantic analysis (PLSA) model is first used to recognize simple activities. Then an interval temporal syntactic model, which combines the syntactic model with interval algebra to model the temporal dependencies of activities explicitly, is introduced to recognize complex activities with temporal relationships. Experimental results show the effectiveness of the proposed method in comparison with other state-of-the-art methods on public databases for the recognition of complex activities.
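The interval-algebra component can be sketched with a classifier for a few of Allen's temporal relations between two activity intervals. Only a subset of the thirteen relations is shown, and the intervals are hypothetical:

```python
def allen_relation(a, b):
    """Classify a subset of Allen's interval relations between (start, end) pairs."""
    s1, e1 = a
    s2, e2 = b
    if e1 < s2:
        return "before"    # a ends strictly before b starts
    if e1 == s2:
        return "meets"     # a ends exactly where b starts
    if s1 < s2 < e1 < e2:
        return "overlaps"  # a starts first, they overlap, b ends last
    if s2 < s1 and e1 < e2:
        return "during"    # a lies strictly inside b
    if s1 == s2 and e1 == e2:
        return "equal"
    return "other"         # remaining Allen relations, not distinguished here

# Hypothetical sub-activities of a complex activity, as time intervals.
walk = (0, 4)
wave = (2, 6)
rest = (7, 9)
```

Composing such relations over the sub-activity intervals is what lets a syntactic model express constraints like "waving overlaps walking, then resting follows".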
This paper presents a new method for refining image annotation by integrating probabilistic latent semantic analysis (PLSA) with a conditional random field (CRF). First, a PLSA model with asymmetric modalities is constructed to predict a candidate set of annotations with confidence scores; the semantic relationships among the candidate annotations are then modeled by leveraging a conditional random field. In the CRF, the confidence scores generated by the PLSA model and the Flickr distance between pairwise candidate annotations are considered as local evidences and contextual potentials, respectively. The novelty of our method lies mainly in two aspects: exploiting PLSA to predict a candidate set of annotations with confidence scores, and using a CRF to further explore the semantic context among candidate annotations for precise image annotation. To demonstrate the effectiveness of the proposed method, an experiment was conducted on the standard Corel dataset, and its results compare favorably with several state-of-the-art approaches.
A novel image auto-annotation method is presented based on a probabilistic latent semantic analysis (PLSA) model and multiple Markov random fields (MRF). A PLSA model with asymmetric modalities is first constructed to estimate the joint probability between images and semantic concepts; then a subgraph is extracted to serve as the corresponding structure of the Markov random fields, and inference over it is performed by iterated conditional modes so as to obtain the final annotation for the image. The novelty of our method lies mainly in two aspects: exploiting PLSA to estimate the joint probability between images and semantic concepts, and using multiple MRFs to further explore the semantic context among keywords for accurate image annotation. To demonstrate the effectiveness of this approach, an experiment on the Corel5k dataset was conducted, and its results compare favorably with current state-of-the-art approaches.
Funding: Supported in part by the National Natural Science Foundation of China (No. 60572045), the Ministry of Education of China Ph.D. Program Foundation (No. 20050698033), and a Cooperation Project (2005.7-2007.6) with Microsoft Research Asia.
Abstract: Semantic video analysis plays an important role in machine intelligence and pattern recognition. In this paper, a semantic recognition framework for compressed videos, based on the Hidden Markov Model (HMM), is proposed to analyze video events according to six low-level features. After a detailed analysis of video events, the global motion pattern and five foreground features (the principal parts of videos) are employed as the observations of the Hidden Markov Model to classify events in videos. Applications of the proposed framework to several video event detection tasks demonstrate its promise for semantic video analysis.
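HMM-based event classification of this kind typically scores an observation sequence under one model per event class and picks the most likely class. The sketch below uses the scaled forward algorithm with two hypothetical event models ("pan" and "zoom") and quantized observation symbols; the actual feature quantization and model parameters in the paper are not specified in the abstract:

```python
import numpy as np

def forward_loglik(pi, A, B, obs):
    """Log-likelihood of a discrete observation sequence under an HMM
    with start probabilities pi, transition matrix A, and emission
    matrix B, computed with the scaled forward algorithm."""
    alpha = pi * B[:, obs[0]]
    logp = np.log(alpha.sum())
    alpha = alpha / alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        c = alpha.sum()
        logp += np.log(c)
        alpha = alpha / c
    return logp

# Two hypothetical event models sharing transition and emission
# structure but starting in different states; symbols 0/1 stand for
# quantized low-level feature observations.
A_tr = np.array([[0.8, 0.2], [0.2, 0.8]])
B_em = np.array([[0.9, 0.1], [0.1, 0.9]])
models = {
    "pan":  (np.array([0.9, 0.1]), A_tr, B_em),
    "zoom": (np.array([0.1, 0.9]), A_tr, B_em),
}

def classify(obs):
    """Pick the event whose HMM assigns the highest likelihood."""
    return max(models, key=lambda m: forward_loglik(*models[m], obs))
```

With this setup, a sequence of mostly 0-symbols is attributed to "pan" and mostly 1-symbols to "zoom".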
Abstract: Because of the broad participation in social networks, they are full of massive multimedia data, and events are released and disseminated through social networks in the form of multi-modal and multi-attribute heterogeneous data. There has been extensive research on social network search. Considering the spatio-temporal features of messages and the social relationships among users, we summarize an overall social network search framework from a semantic perspective based on existing research. For social network search, the acquisition and representation of spatio-temporal data is the foundation; semantic analysis and modeling of cross-media big data in social networks is an important component; deep semantic learning of social networks is the key research field; and the indexing and ranking mechanism is an indispensable part. This paper reviews current studies in these fields and then presents the main challenges of social network search. Finally, we give an outlook on the prospects for and future work in social network search.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61170295 and 61370065, the Project of National Ministry under Grant No. A2120110006, the Co-Funding Project of the Beijing Municipal Education Commission under Grant No. JD100060630, and the Beijing Education Committee General Program under Grant No. KM201211232010.
Abstract: A global view of firewall policy conflicts is important for administrators to optimize a policy. Appropriate global conflict analysis for firewall policies has been lacking; existing methods focus on local conflict detection. In this paper we study a global conflict detection algorithm. We present a semantic model that captures more complete classifications of the policy using the knowledge-concept formalism of rough sets. Based on this model, we present a formal model of global conflicts and represent it with an OBDD (Ordered Binary Decision Diagram). We then develop the GFPCDA (Global Firewall Policy Conflict Detection Algorithm) to detect global conflicts. In experiments, we evaluate the usability of our semantic model by eliminating the false positives and false negatives that an incomplete policy semantic model causes in a classical algorithm, and we compare that algorithm with GFPCDA. The results show that GFPCDA detects conflicts more precisely and independently, and has better performance.
Funding: This work was supported by Taif University Researchers Supporting Project number (TURSP-2020/292), Taif University, Taif, Saudi Arabia, and funded by the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University through the Fast-Track Research Funding Program.
Abstract: Social media platforms provide new value for markets and research companies. This article explores the use of social media data to enhance customer value propositions. The case study involves a company that develops wearable Internet of Things (IoT) devices and services for stress management. Netnography and semantic annotation for recognizing and categorizing the context of tweets are conducted to gain a better understanding of users' stress management practices. The aim is to analyze tweets about stress management practices, identify their context, and then map the tweets onto pleasure and arousal dimensions to elicit customer insights. We analyzed a case study of a marketing strategy on the Twitter platform, in which campaign participants shared photos and texts about their stress management practices. Machine learning techniques were used to estimate the emotions and contexts of the tweets posted by the campaign participants, and the computational semantic analysis of the tweets was compared to a text analysis of the tweets. Content analysis of the tweet images alone detected tweet context with 96% accuracy, analysis of the textual content of tweets yielded 91% accuracy, and semantic tagging by Ontotext detected the correct tweet context with 50% accuracy.
Funding: Fundamental Research Funds for the Central Universities of the Ministry of Education of China (No. 19D111201).
Abstract: Current research on metaphor analysis is generally knowledge-based and corpus-based, which calls for methods of automatic feature extraction and weight calculation. Combining natural language processing (NLP), latent semantic analysis (LSA), and the Pearson correlation coefficient, this paper proposes a metaphor analysis method that extracts content words from both literal and metaphorical corpora, calculates their degree of correlation, and analyzes their relationships. The value of the proposed method was demonstrated through a case study using a corpus built around the keyword "飞翔 (fly)". Compared with using the Pearson correlation coefficient alone, the experiment shows that LSA produces better results with greater significance in the correlation degree. It was also found that the number of words common to the literal and metaphorical word bags decreased with the correlation degree, and that more nouns appear in the literal corpus while more adjectives and adverbs appear in the metaphorical corpus. The proposed method will help NLP researchers develop step-by-step calculation tools for accurate quantitative analysis.
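The Pearson correlation step is the simplest ingredient of the pipeline above. A minimal sketch, using invented word-frequency vectors (the real vectors would come from content-word extraction over the literal and metaphorical corpora):

```python
import numpy as np

# Hypothetical frequencies of the same five content words in a literal
# corpus and a metaphorical corpus built around a keyword such as "fly".
literal      = np.array([12, 30, 5, 8, 21], dtype=float)
metaphorical = np.array([10, 25, 9, 6, 18], dtype=float)

def pearson(x, y):
    """Pearson correlation coefficient between two frequency vectors."""
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

r = pearson(literal, metaphorical)
```

In the paper's method this coefficient is compared against the correlation degree computed in LSA's latent space; the abstract reports that the LSA variant yields more significant correlations.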
Funding: This research was supported and funded by the KAU Scientific Endowment, King Abdulaziz University, Jeddah, Saudi Arabia.
Abstract: A document layout can be more informative than merely the document's visual and structural appearance. Thus, document layout analysis (DLA) is considered a necessary prerequisite for advanced processing and detailed document image analysis to be further used in several applications with different objectives. This research extends traditional approaches to DLA and introduces the concept of semantic document layout analysis (SDLA) by proposing a novel framework for semantic layout analysis and characterization of handwritten manuscripts. The proposed SDLA approach enables the derivation of implicit information and semantic characteristics that can be effectively utilized in dozens of practical applications for various purposes, bridging the semantic gap and providing more understandable high-level document image analysis and more invariant characterization via absolute and relative labeling. The approach is validated and evaluated on a large dataset of Arabic handwritten manuscripts comprising complex layouts. The experimental work shows promising results in terms of accurate and effective semantic characteristic-based clustering and retrieval of handwritten manuscripts. It also indicates the expected efficacy of the proposed approach in automating and facilitating many practical, real-life tasks such as effort estimation and pricing for the transcription or typing of such complex manuscripts.
Abstract: The main focus of this article is the semantic analysis and genesis of the words that form, to a certain extent, the lexical base of the modern Azerbaijani language and belong to the language's system of roots. The goal is to restore words that have undergone deformation and flexion over thousands of years to their initial forms. The concept of stem cells in genetics is used as an analogy, because the author believes that languages are living organisms too and have words and elements functioning as stem cells. The principal idea is that the linguistic units and words entering the organic system of a language derive from these linguistic stem cells. The stem words and concepts, the original elements of a language, are determined first, and all subsequent analyses are built upon them. Such studies also involve a wide range of comparativist investigation; examples from Ancient Greek and Latin are used as objects of comparison. The discovery of such words yields not only linguistic information but also objective historical information on different aspects, which is one of the main reasons this kind of study is significant.
Abstract: Among Chinese color words, two show a particularly high degree of category shift and cross-grade conversion: "white" and "black". These two terms generally occupy the top positions in the human color perception system. We argue that the most typical basic color terms are the most likely to undergo such category shifts, but that, owing to differences in history, culture, and other aspects of cognition, their cross-grammatical-category ordering differs between languages. On this basis, this article takes a cognitive semantic analysis of the basic color terms "black" and "white" in English and Chinese as its research topic, in the hope that this in-depth study will benefit further work in this area.
Abstract: Software testing is a critical phase, and misconceptions about ambiguities in the requirements during specification affect the testing process, making it difficult to identify all faults in software. As requirements change continuously, irrelevancy and redundancy increase during testing. These challenges reduce fault detection capability and create a need to improve the testing process based on changes in the requirements specification. In this research, we developed a model to resolve testing challenges through requirement prioritization and prediction in an agile environment. The objective is to identify the most relevant and meaningful requirements through semantic analysis for correct change analysis, and then to compute the similarity of requirements through case-based reasoning, which predicts requirements for reuse, restricted to error-prone requirements. Afterward, the apriori algorithm maps requirement frequency to select relevant test cases, based on frequently reused versus unused test cases, to increase the fault detection rate. The proposed model was evaluated experimentally. The results showed that semantic analysis reduced requirement redundancy and irrelevancy and correctly predicted the requirements, increasing the fault detection rate and yielding high user satisfaction. The predicted requirements are mapped into test cases after changes, further increasing the fault detection rate. The model improves requirement redundancy and irrelevancy by more than 90% compared with other clustering methods and the analytical hierarchy process, achieving an 80% fault detection rate at an earlier stage, and thus provides guidelines for practitioners and researchers. In future work we will provide a working prototype of the model as a proof of concept.
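The apriori step above amounts to mining requirement sets that frequently co-occur across test cases, so that test cases covering frequent combinations can be prioritized. A minimal frequent-itemset sketch with invented requirement names and transactions (the paper's actual data and support thresholds are not given in the abstract):

```python
from itertools import combinations

# Hypothetical history: each entry is the set of requirements a past
# test case exercised.
transactions = [
    {"login", "payment"},
    {"login", "profile"},
    {"login", "payment", "profile"},
    {"payment"},
]

def apriori_frequent(transactions, min_support=2, max_size=2):
    """Return itemsets of up to max_size requirements that occur
    together in at least min_support transactions, apriori-style:
    only frequent k-itemsets are joined into (k+1)-candidates."""
    frequent = {}
    candidates = list({frozenset([i]) for t in transactions for i in t})
    for size in range(1, max_size + 1):
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        kept = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(kept)
        candidates = list({a | b for a, b in combinations(kept, 2)
                           if len(a | b) == size + 1})
    return frequent

freq = apriori_frequent(transactions)
```

Test cases covering itemsets with high support (here, for example, the {login, payment} pair) would be selected first when the related requirements change.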
Abstract: This study introduces the Orbit Weighting Scheme (OWS), a novel approach aimed at enhancing the precision and efficiency of vector space information retrieval (IR) models, which have traditionally relied on weighting schemes like tf-idf and BM25. These conventional methods often struggle to capture document relevance accurately, leading to inefficiencies in both retrieval performance and index size management. OWS proposes a dynamic weighting mechanism that evaluates the significance of terms based on their orbital position within the vector space, emphasizing term relationships and distribution patterns overlooked by existing models. Our research evaluates OWS's impact on model accuracy using information retrieval metrics such as Recall, Precision, Interpolated Average Precision (IAP), and Mean Average Precision (MAP). Additionally, we assess OWS's effectiveness in reducing the inverted index size, which is crucial for model efficiency. We compare OWS-based retrieval models against models using other schemes, including tf-idf variations and BM25Delta. The results reveal OWS's superiority, achieving 54% Recall, 81% MAP, and a notable 38% reduction in inverted index size. This highlights OWS's potential for optimizing retrieval processes and underscores the need for further research in this underexplored area to fully leverage OWS's capabilities in information retrieval methodologies.
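The evaluation metrics named above (AP and MAP) are standard and easy to compute from ranked result lists. A self-contained sketch with a toy example (the OWS weighting itself is not specified in the abstract, so only the metrics are shown):

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked result list against a set of
    relevant document ids: precision is accumulated at each hit."""
    hits, score = 0, 0.0
    for i, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            score += hits / i
    return score / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """MAP over a list of (ranked_list, relevant_set) query results."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Toy query: relevant docs ranked 1st and 3rd -> AP = (1/1 + 2/3) / 2.
ap = average_precision(["d1", "d2", "d3"], {"d1", "d3"})
```

Comparing weighting schemes then reduces to computing MAP over the same query set for each scheme's rankings.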
Funding: Supported by the National Key Research and Development Program of China under the theme "Key technologies for urban sustainable development evaluation and decision-making support" (Grant No. 2022YFC3802900) and the Guangxi Key Research and Development Program (Grant No. Guike AB21220057).
Abstract: Urban sustainability assessment is an effective method for objectively presenting the current state of sustainable urban development and diagnosing sustainability-related issues. As the global community intensifies its efforts to implement the Sustainable Development Goals (SDGs), the demand for assessing progress in urban sustainable development has increased. This has led to the emergence of numerous indicator systems with varying scales and themes published by different entities. Cities participating in these evaluations often have difficulty matching indicators, or certain indicators are simply unavailable. In this context, urban decision makers and planners urgently need to identify substitute indicators that express the semantic meaning of the original indicators while remaining available to the participating cities. Hence, this study explores substitution relationships between indicators and constructs a collection of substitute indicators to serve as a reference for sustainable urban development assessment. Specifically, building on a review of international and Chinese indicators related to urban sustainability assessment, this study employs natural-language semantic analysis based on the Word2Vec model and the cosine similarity algorithm to calculate the similarity between indicators related to sustainable urban development. The results show that the Skip-gram algorithm with a word vector dimensionality of 600 performs best at calculating the similarity between sustainable urban development assessment indicators. The findings provide valuable insights into selecting substitute indicators for future sustainable urban development assessment, particularly in China.
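The indicator-matching step can be sketched as cosine similarity between averaged word embeddings of indicator names. The vectors below are tiny invented 4-dimensional stand-ins for what a trained Skip-gram model would supply (the study uses 600 dimensions), and the indicator names are hypothetical:

```python
import numpy as np

# Hypothetical embeddings standing in for trained Skip-gram vectors.
word_vec = {
    "air":       np.array([0.90, 0.10, 0.00, 0.20]),
    "quality":   np.array([0.80, 0.20, 0.10, 0.10]),
    "pollution": np.array([0.85, 0.15, 0.05, 0.20]),
    "green":     np.array([0.10, 0.90, 0.30, 0.00]),
    "space":     np.array([0.20, 0.80, 0.40, 0.10]),
}

def indicator_vector(words):
    """Represent an indicator as the mean of its tokens' vectors."""
    return np.mean([word_vec[w] for w in words], axis=0)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

a = indicator_vector(["air", "quality"])
b = indicator_vector(["air", "pollution"])
c = indicator_vector(["green", "space"])
# An "air quality" indicator lands closer to "air pollution" than to
# "green space", making the former the candidate substitute.
```

In the full pipeline the embeddings would come from a model such as gensim's Word2Vec trained with the Skip-gram objective; only the similarity computation is shown here.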
Funding: The National Natural Science Foundation of China (Nos. 60573090 and 60673139).
Abstract: Current search engines provide query-oriented rather than user-oriented search, and this improper orientation prevents them from meeting users' personalized requirements. To solve this problem, a novel method based on probabilistic latent semantic analysis (PLSA) is proposed to convert query-oriented web search into user-oriented web search. First, a user profile represented as a vector of the user's topics of interest is created by analyzing the user's click-through data with PLSA; then the user's queries are mapped into categories based on the user's preferences; and finally the result list is re-ranked according to the user's interests using a newly proposed method named user-oriented PageRank (UOPR). Experiments on real-life datasets show that the user-oriented search system adopting PLSA takes considerable account of user preferences and better satisfies a user's personalized information needs.
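One plausible shape for interest-based re-ranking is a blend of a global link score with the similarity between each document's topic vector and the user profile. The abstract does not define UOPR's exact formula, so the linear blend, the `alpha` parameter, and the toy data below are all assumptions for illustration:

```python
import numpy as np

def rerank(pagerank, doc_topics, user_profile, alpha=0.5):
    """Re-rank documents by blending a global PageRank score with the
    cosine similarity between each document's topic distribution and
    the user's topics-of-interest vector. Returns document indices in
    descending score order."""
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    scores = [alpha * pr + (1 - alpha) * cos(t, user_profile)
              for pr, t in zip(pagerank, doc_topics)]
    return sorted(range(len(scores)), key=lambda i: -scores[i])

# Toy case: doc 0 has the higher PageRank, but doc 1 matches the
# user's interests, so it is promoted to the top.
order = rerank([0.5, 0.3],
               [np.array([1.0, 0.0]), np.array([0.0, 1.0])],
               np.array([0.0, 1.0]))
```

Raising `alpha` shifts the ranking back toward the query-oriented global order; lowering it personalizes more aggressively.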
Funding: Project (No. 2002AA119050) supported by the National Hi-Tech Research and Development Program (863) of China.
Abstract: Automatic Chinese text summarization for dialogue-style documents is a relatively new research area. In this paper, Latent Semantic Analysis (LSA) is first used to extract semantic knowledge from a given document and identify all question paragraphs; an automatic text segmentation approach analogous to TextTiling is then exploited to improve the precision of correlating question paragraphs with answer paragraphs; and finally some "important" sentences are extracted from the generic content and the question-answer pairs to generate a complete summary. Experimental results showed that our approach is highly efficient and significantly improves the coherence of the summary without compromising informativeness.
Funding: National Research Foundation of Korea grant funded by the Korean Government (Ministry of Science and ICT), NRF-2020R1A2B5B02002478, through Dr. Kyung-sup Kwak.
Abstract: Android has dominated the smartphone market for more than a decade and has captured 87.8% of the market share. This popularity has drawn the attention of cybercriminals and malware developers. Malicious applications can steal sensitive information such as contacts, read personal messages, record calls, send messages to premium-rate numbers, cause financial loss, access the gallery, and obtain the user's geographic location. Numerous surveys on Android security have focused primarily on types of malware attack, their propagation, and techniques to mitigate them. To the best of our knowledge, the Android malware literature has never been explored using information modelling techniques, nor have contemporary research trends in Android malware research been surveyed from a semantic point of view. This paper identifies the intellectual core of the Android malware literature using Latent Semantic Analysis (LSA). An extensive corpus of 843 articles on Android malware and security, published during 2009-2019, was processed using LSA. Subsequently, the truncated Singular Value Decomposition (SVD) technique was used for dimensionality reduction, and machine learning methods were deployed to segregate prominent topic solutions with minimal bias. Based on the observed term and document loading matrix values, five core research areas and twenty research trends were identified, and potential future research directions are detailed to offer a quick reference for information scientists. The study concludes that Android security is crucial for pervasive Android devices: static analysis is the most widely investigated core area within Android security research and is expected to remain in trend in the near future, and the research trends indicate the need for a faster yet effective model to detect Android applications that perform obfuscation, financial attacks, and user-information theft.
Funding: Project (50808025) supported by the National Natural Science Foundation of China and Project (20090162110057) supported by the Doctoral Fund of the Ministry of Education, China.
Abstract: A novel method based on an interval temporal syntactic model is proposed to recognize human activities in video streams. The method is composed of two parts: feature extraction and activity recognition. A trajectory shape descriptor, speeded up robust features (SURF), and histograms of optical flow (HOF) are proposed to represent human activities, providing more exhaustive information about their shape, structure, and motion. In the recognition process, a probabilistic latent semantic analysis model (PLSA) first recognizes sample activities. Then an interval temporal syntactic model, which combines the syntactic model with interval algebra to model the temporal dependencies of activities explicitly, is introduced to recognize complex activities with temporal relationships. Experimental results show the effectiveness of the proposed method in comparison with other state-of-the-art methods on public databases for the recognition of complex activities.
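The interval algebra referenced above is Allen's calculus of temporal relations between intervals. A minimal classifier covering a subset of the thirteen Allen relations, which is the kind of primitive an interval temporal syntactic model composes to order sub-activities (this is an illustrative sketch, not the paper's implementation):

```python
def interval_relation(a, b):
    """Classify the temporal relation between intervals a = (start, end)
    and b = (start, end). Covers a subset of Allen's 13 relations;
    'starts'/'finishes' are folded into 'during' for brevity."""
    (s1, e1), (s2, e2) = a, b
    if e1 < s2:
        return "before"
    if e1 == s2:
        return "meets"
    if s1 == s2 and e1 == e2:
        return "equal"
    if s1 < s2 and s2 < e1 < e2:
        return "overlaps"
    if s1 >= s2 and e1 <= e2:
        return "during"
    if s1 <= s2 and e1 >= e2:
        return "contains"
    return "other"
```

A grammar over sub-activities can then require, for example, that a "reach" interval *meets* or *overlaps* the "grasp" interval before the pair is parsed as a "pick up" activity.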
Funding: Supported by the National Basic Research Priorities Programme (No. 2013CB329502), the National High Technology Research and Development Programme of China (No. 2012AA011003), the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2014JQ2-6036), and the Science and Technology R&D Program of Baoji City (No. 203020013, 2013R2-2).
Abstract: This paper presents a new method for refining image annotation by integrating probabilistic latent semantic analysis (PLSA) with a conditional random field (CRF). First a PLSA model with asymmetric modalities is constructed to predict a candidate set of annotations with confidence scores; the semantic relationships among the candidate annotations are then modeled with a conditional random field, in which the confidence scores generated by the PLSA model and the Flickr distance between pairwise candidate annotations are treated as local evidence and contextual potentials, respectively. The novelty of our method mainly lies in two aspects: exploiting PLSA to predict a candidate set of annotations with confidence scores, and using a CRF to further explore the semantic context among candidate annotations for precise image annotation. To demonstrate the effectiveness of the proposed method, an experiment is conducted on the standard Corel dataset, and its results compare favorably with several state-of-the-art approaches.
Funding: Supported by the National Basic Research Priorities Program (No. 2013CB329502), the National High-tech R&D Program of China (No. 2012AA011003), the National Natural Science Foundation of China (Nos. 61035003, 61072085, 60933004, 60903141), and the National Science and Technology Support Program of China (No. 2012BA107B02).
Abstract: A novel image auto-annotation method is presented based on a probabilistic latent semantic analysis (PLSA) model and multiple Markov random fields (MRF). A PLSA model with asymmetric modalities is first constructed to estimate the joint probability between images and semantic concepts; a subgraph is then extracted to serve as the structure of a Markov random field, and inference over it is performed by iterated conditional modes to obtain the final annotation for the image. The novelty of our method mainly lies in two aspects: exploiting PLSA to estimate the joint probability between images and semantic concepts, and using multiple MRFs to further explore the semantic context among keywords for accurate image annotation. To demonstrate the effectiveness of this approach, an experiment on the Corel5k dataset is conducted, and its results compare favorably with current state-of-the-art approaches.
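Iterated conditional modes (ICM), the inference procedure named above, greedily re-labels each node with the value that maximizes its local score given its neighbors' current labels, repeating until nothing changes. A toy sketch with invented keywords, PLSA-style confidences, and a made-up keyword affinity table (not the paper's actual potentials):

```python
# Hypothetical setup: two annotation slots for one image, three
# candidate keywords, unary scores standing in for PLSA confidences,
# pairwise scores standing in for keyword co-occurrence affinity.
labels = ["sky", "sea", "grass"]
unary = {"sky": 0.6, "sea": 0.5, "grass": 0.2}
affinity = {("sky", "sea"): 0.9, ("sky", "grass"): 0.3,
            ("sea", "grass"): 0.2}

def pair(a, b):
    """Symmetric affinity lookup; unknown pairs score 0."""
    return affinity.get((a, b), affinity.get((b, a), 0.0))

def icm(nodes, init, iters=10):
    """Iterated conditional modes: greedily re-label each node to the
    keyword maximizing its unary score plus affinity with the current
    labels of all other nodes, until convergence."""
    state = dict(init)
    for _ in range(iters):
        changed = False
        for n in nodes:
            others = [state[m] for m in nodes if m != n]
            best = max(labels,
                       key=lambda l: unary[l] + sum(pair(l, o) for o in others))
            if best != state[n]:
                state[n], changed = best, True
        if not changed:
            break
    return state

result = icm(["slot1", "slot2"], {"slot1": "grass", "slot2": "grass"})
```

Starting from a poor initialization, the affinity between "sky" and "sea" pulls the two slots toward that mutually supporting pair, which is exactly the semantic-context effect the MRF layer adds on top of per-keyword PLSA scores.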