Abstract: The user’s intent to seek online information has been an active area of research in user profiling. User profiling considers user characteristics, behaviors, activities, and preferences to sketch user intentions, interests, and motivations. Determining user characteristics can help capture implicit and explicit preferences and intentions for effective user-centric and customized content presentation. The user’s complete online experience in seeking information is a blend of activities such as searching for information, verifying it, and sharing it on social platforms. However, a combination of multiple behaviors in profiling users has yet to be considered. This research takes a novel approach and explores user intent types based on multidimensional online behavior in information acquisition. It examines information search, verification, and dissemination behavior and uses machine learning to identify diverse types of users based on their online engagement. The research proposes a generic user profile template that explains user characteristics based on internet experience and uses it as ground truth for data annotation. User feedback on online behavior and practices is collected through a survey. The participants include both males and females from different occupation sectors and of different ages. The collected data is subjected to feature engineering, and the significant features are presented to unsupervised machine learning methods to identify user intent classes or profiles and their characteristics. Different techniques are evaluated, and the K-Means clustering method successfully generates five user groups with different user characteristics, with an average silhouette of 0.36 and a distortion score of 1136. Feature averages are computed to identify the characteristics of each user intent type. The user intent classes are then further generalized to create a user intent template with an Inter-Rater Reliability of 75%. This research successfully extracts different user types based on their preferences in online content, platforms, criteria, and frequency. The study also validates the proposed template on user feedback data through an Inter-Rater Agreement process using an external human rater.
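The two clustering metrics quoted in the abstract above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the toy points, the caller-supplied starting centroids, and the function names are all assumptions for illustration, and "distortion" is taken here to mean the sum of squared distances from each point to its assigned centroid, which the abstract does not define.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points]

def kmeans(points, init_centroids, max_iters=50):
    """Lloyd's algorithm from caller-supplied starting centroids (an assumption)."""
    centroids = [tuple(c) for c in init_centroids]
    for _ in range(max_iters):
        labels = assign(points, centroids)
        updated = []
        for c, old in enumerate(centroids):
            members = [p for p, l in zip(points, labels) if l == c]
            # An empty cluster keeps its previous centroid.
            updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else old)
        if updated == centroids:
            break
        centroids = updated
    return assign(points, centroids), centroids

def distortion(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))

def silhouette(points, labels):
    """Average silhouette coefficient; needs at least two non-empty clusters."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical toy data: two well-separated groups of 2-D feature vectors.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels, centroids = kmeans(points, [points[0], points[3]])
print(labels, round(distortion(points, labels, centroids), 4),
      round(silhouette(points, labels), 4))
```

A silhouette near 1 indicates compact, well-separated clusters, while values near 0 indicate overlapping ones, so the reported 0.36 suggests moderately separated user groups. Distortion depends on the scale and number of features, so the reported 1136 is only meaningful when compared across candidate cluster counts on the same data.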
Funding: This work is a phased research result of the Supreme People’s Procuratorate’s procuratorial theory research program “Research on the Governance Problems of the Crime of Aiding Information Network Criminal Activities” (Project Approval Number GJ2023D28).
Abstract: With the development of information technology, the online retrieval of remote electronic data has become an important method for investigative agencies to collect evidence. In the current normative documents, the online retrieval of electronic data is positioned as a new type of arbitrary investigative measure. However, study of its actual operation has found that the online retrieval of electronic data does not fully comply with the characteristics of arbitrary investigative measures. The root causes are its inaccurately defined nature due to analogy errors, an emphasis on the authenticity of electronic data at the cost of rights protection, the insufficient effectiveness of normative documents to break through the boundaries of law, and superficial inconsistencies arising from mechanical comparison with the nature of existing investigative measures. The nature of online electronic data retrieval should be defined according to different circumstances. The retrieval of electronic data disclosed on the Internet is an arbitrary investigative measure, and following procedural specifications should be sufficient. When investigators conceal their true identities and enter the cyberspace of the suspected crime through a registered account to extract dynamic electronic data on criminal activities, this is essentially a covert investigation in cyberspace, and they should follow the normative requirements for covert investigations. The retrieval of dynamic electronic data from private spaces is a technical investigative measure and should be implemented in accordance with technical investigative procedures. Retrieval of remote “non-public electronic data involving privacy” is a mandatory investigative measure and is essentially a search in virtual space. Therefore, procedural specifications should be set in accordance with the standards for searches.
Funding: Supported by the National Natural Science Foundation of China (61972300, 61672401, 61373045, and 61902288) and the Pre-Research Project of the “Thirteenth Five-Year-Plan” of China (315***10101 and 315**0102).
Abstract: Personalized search utilizes user preferences to optimize search results, and most existing studies obtain user preferences by analyzing user behaviors in search engines that provide click-through data. However, the behavioral data are noisy because users often click some irrelevant documents while looking for the information they need, and the new-user cold start issue is a serious problem that greatly reduces the performance of personalized search. This paper attempts to utilize online social network data to obtain user preferences that can be used to personalize search results: it mines knowledge of user interests, user influence, and user relationships from online social networks and uses this knowledge to optimize the results returned by search engines. The proposed model is based on a holonic multiagent system that improves the adaptability and scalability of the model. The experimental results show that utilizing online social network data to implement personalized search is feasible and that online social network data are significant for personalized search.
Funding: The work reported in this paper has been supported by the UK-Jiangsu 20-20 World Class University Initiative programme.
Abstract: Online social media networks are gaining attention worldwide, with an increasing number of people relying on them to connect, communicate, and share pertinent information about daily events. Event detection now increasingly leverages online social networks to highlight events happening around the world via the Internet of People. In this paper, a novel Event Detection model based on Scoring and Word Embedding (ED-SWE) is proposed for discovering key events from a large volume of data streams of tweets and for generating an event summary using keywords and top-k tweets. The proposed ED-SWE model can distill high-quality tweets, reduce the negative impact of spam, and identify latent events in the data streams automatically. Moreover, a word embedding algorithm is used to learn a real-valued vector representation for a predefined fixed-size vocabulary from a corpus of Twitter data. To further improve the performance of the Expectation-Maximization (EM) iteration algorithm, a novel initialization method based on the authority values of the tweets is also proposed to detect live events efficiently and precisely. Finally, a novel automatic identification method based on the cosine measure is used to evaluate whether a given topic can form a live event. Experiments conducted on a real-world dataset demonstrate that the ED-SWE model exhibits better efficiency and accuracy than several state-of-the-art event detection models.
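The abstract above names the cosine measure but does not say which vectors are compared or what threshold is applied. The following sketch therefore shows only the measure itself, on two hypothetical keyword-weight dictionaries; the function name, the sparse dictionary representation, and the example weights are assumptions for illustration, not part of ED-SWE.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as {term: weight}."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention: an all-zero vector is similar to nothing
    return dot / (norm_u * norm_v)

# Hypothetical keyword weights for a candidate topic and for one tweet.
topic = {"earthquake": 0.9, "rescue": 0.4, "city": 0.2}
tweet = {"earthquake": 0.7, "city": 0.5, "traffic": 0.1}
print(round(cosine_similarity(topic, tweet), 4))
```

Because the measure depends only on the angle between the vectors, it ignores differences in overall magnitude, which is one reason it is commonly used to compare texts of different lengths.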
Abstract: With the rapid development of higher education, more and more people are awarded doctoral or master’s degrees, resulting in a considerable increase in doctoral dissertations and master’s theses. The application of information technology makes it possible to digitize these dissertations, which has contributed greatly to the construction of digital libraries. This paper discusses the characteristics of digitized dissertations and the corresponding problems, such as property rights, security, and sharing. It also gives a general introduction to the specific methods adopted by the library of UESTC and its distribution of digitized dissertation resources.
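Abstract: As exchanges between the mainland and the Taiwan region grow closer and more frequent, strengthening exchange and mutual learning in cross-strait terminology research has become especially important. This article gives a detailed introduction to and a comprehensive review of the management structure, historical development, and existing achievements of terminology work in the Taiwan region; the results of cross-strait cooperation in jointly compiling terminology reference works; the “乐词网” online terminology search and resource platform; and the jointly built “中华语文知识库” (Chinese Language Knowledge Base) and other corpora. It performs a topic-sampling analysis of terminology-related research from the Taiwan region in the Web of Science (WOS) Core Collection database and visualizes the results with the bibliometric tool VOSviewer, revealing the development trends and hot topics in terminology-related research published by scholars from the Taiwan region in international core journals. The aim is to provide cross-strait terminology researchers and language enthusiasts with materials and avenues for research and study, to facilitate communication and cooperation between scholars on both sides, to identify directions for future collaborative effort, and to offer useful reference and support for cross-strait terminology development and the formulation of science and technology development strategies.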