Abstract: One of the critical hurdles, and breakthroughs, in Natural Language Processing (NLP) over the last two decades has been the development of text representation techniques that address the so-called curse of dimensionality. The problem plagues NLP in general because the feature set for learning starts as a function of the size of the language in question, typically upwards of hundreds of thousands of terms. As such, much of the research and development in NLP over this period has gone into finding and optimizing solutions to this problem, which is effectively a problem of feature selection. This survey paper looks at the development of these techniques, which leverage a variety of statistical methods resting on linguistic theories advanced in the middle of the last century, namely the distributional hypothesis, which suggests that words found in similar contexts generally have similar meanings. We examine some of the most popular of these techniques from a mathematical as well as a data structure perspective, from Latent Semantic Analysis to Vector Space Models to their more modern variants, typically referred to as word embeddings. In reviewing algorithms such as Word2Vec, GloVe, ELMo and BERT, we also explore the idea of semantic spaces more generally, beyond their applicability to NLP.
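The distributional hypothesis mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the survey itself: the toy corpus, the window size of 2, and the function names `cooccurrence_vectors` and `cosine` are all assumptions made for illustration. Each word is represented by counts of its neighbouring words, and two words are compared by the cosine of their count vectors.

```python
from collections import Counter, defaultdict
import math


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus: "cat" and "dog" share contexts, "market" mostly does not.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the ball",
    "stocks fell on the market",
]
vecs = cooccurrence_vectors(corpus)
sim_cat_dog = cosine(vecs["cat"], vecs["dog"])
sim_cat_market = cosine(vecs["cat"], vecs["market"])
print(f"cat~dog: {sim_cat_dog:.3f}  cat~market: {sim_cat_market:.3f}")
```

In this toy setting "cat" and "dog" score higher than "cat" and "market" because they occur in similar contexts. The raw count vectors are as wide as the vocabulary, which is the dimensionality problem that methods such as Latent Semantic Analysis and word embeddings are meant to reduce.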
Abstract: The goal of zero-shot recognition is to classify classes the model has never seen before, which requires building a bridge between seen and unseen classes through a semantic embedding space. Learning this semantic embedding space therefore plays an important role in zero-shot recognition. In existing works, the semantic embedding space is mainly formed from user-defined attribute vectors. However, the discriminative information contained in user-defined attribute vectors is limited. In this paper, we propose to automatically learn an extra latent attribute space to produce a more generalized and discriminative semantic embedding space. To prevent the bias problem, both the user-defined attribute vectors and the latent attribute space are optimized by adversarial learning with auto-encoders. We also propose to reconstruct semantic patterns produced by explanatory graphs, which makes the semantic embedding space more sensitive to useful semantic information and less sensitive to useless information. The proposed method is evaluated on the AwA2 and CUB datasets, and the results show that it achieves superior performance.
Abstract: This study introduces the Orbit Weighting Scheme (OWS), a novel approach aimed at enhancing the precision and efficiency of Vector Space information retrieval (IR) models, which have traditionally relied on weighting schemes like tf-idf and BM25. These conventional methods often struggle to capture document relevance accurately, leading to inefficiencies in both retrieval performance and index size management. OWS proposes a dynamic weighting mechanism that evaluates the significance of terms based on their orbital position within the vector space, emphasizing term relationships and distribution patterns overlooked by existing models. Our research focuses on evaluating OWS's impact on model accuracy using Information Retrieval metrics such as Recall, Precision, Interpolated Average Precision (IAP), and Mean Average Precision (MAP). Additionally, we assess OWS's effectiveness in reducing the inverted index size, which is crucial for model efficiency. We compare OWS-based retrieval models against others using different schemes, including tf-idf variations and BM25Delta. Results reveal OWS's superiority, achieving 54% Recall and 81% MAP, and a notable 38% reduction in the inverted index size. This highlights OWS's potential in optimizing retrieval processes and underscores the need for further research in this underrepresented area to fully leverage OWS's capabilities in information retrieval methodologies.
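The evaluation metrics named in this abstract can be made concrete with a short sketch. It shows the standard definitions of Precision, Recall, and (non-interpolated) Mean Average Precision, and it does not implement OWS, whose weighting formula the abstract does not give. The document IDs, relevance judgements, and function names are hypothetical.

```python
def precision_recall(ranked, relevant):
    """Precision and recall of a ranked result list against a set of relevant doc IDs."""
    retrieved_relevant = sum(1 for doc in ranked if doc in relevant)
    precision = retrieved_relevant / len(ranked) if ranked else 0.0
    recall = retrieved_relevant / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision(ranked, relevant):
    """Mean of precision@k over the ranks k where a relevant document appears,
    divided by the total number of relevant documents."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Average of per-query average precision; `runs` is a list of (ranked, relevant) pairs."""
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


# Two hypothetical queries with their ranked results and relevance judgements.
runs = [
    (["d1", "d4", "d2", "d7"], {"d1", "d2", "d3"}),
    (["d5", "d6", "d8"], {"d6"}),
]
p, r = precision_recall(*runs[0])
map_score = mean_average_precision(runs)
print(f"precision={p:.2f} recall={r:.2f} MAP={map_score:.4f}")
```

Recall and MAP measure different things: Recall only asks how many relevant documents were retrieved, while MAP also rewards ranking them early.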
Funding: Supported by the National Natural Science Foundation of China (No. 51875250), a Project of Shandong Province Higher Educational Youth Innovation Science and Technology Program, China (No. 2019KJB018), and a Project of the "20 Regulations for New Universities" Funding Program of Jinan, China (No. 202228116).
Abstract: The large manipulator outside the space cabin is a multi-degree-of-freedom actuator for space operations. To realize automatic control and flexible operation of the space manipulator, a novel spoke-structure piezoelectric six-dimensional force/torque sensor with redundancy ability, high stiffness and good decoupling performance is proposed. Based on the deformation coordination relationship, the redundancy measurement mechanism is revealed. Mathematical models of the sensor with and without branch fault are established. A finite element model is established to verify the feasibility of the structure and the redundancy measuring principle of the sensor. Based on the theoretical and simulation analyses, a prototype of the sensor is developed, and static and dynamic calibration experiments are carried out. The actual output voltage signal of the six-dimensional force/torque sensor is collected to establish the equation between the standard applied input load and the actual output voltage signal. Using an ant-colony-optimized BP algorithm, the performance indexes of the sensor with and without branch fault are analyzed. The experimental results show that the spoke piezoelectric six-dimensional force/torque sensor with the eight-point support structure has good accuracy and reliability. It also has a strong decoupling characteristic that can effectively shield the coupling between dimensions. The nonlinear errors and maximum interference errors of the decoupled data with and without branch faults are less than 1% and 2%, respectively. The natural frequency of the six-dimensional force sensor can reach 2856.45 Hz, indicating good dynamic characteristics. The research lays a theoretical and experimental foundation for the design, development and application of new six-dimensional force/torque sensors with redundancy. It is also expected to significantly improve the research level in this field and to provide a strong guarantee for the smooth implementation of force feedback control in the space station manipulator project.
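Abstract: Against the backdrop of intensifying global urbanization and environmental pressure, quantifying the urban street greening general structure (USGGS) is an important prerequisite for strengthening urban carbon sinks and mitigating the urban heat island effect in response to global climate change. By quantifying and analyzing the USGGS of different cities, this study explores its relationship with the urban built environment. An improved DeepLabV3+ neural network model is used to perform semantic segmentation on panoramic street view images of Tianjin, Hangzhou, and Shenzhen; USGGS is quantified in combination with fine-grained data, and a robust regression model is used to analyze the relationship between USGGS and points of interest (POI) representing urban functional attributes. The results show that the USGGS of Tianjin consists mainly of single-tree and tree-shrub structures and is closely related to POIs with commercial and daily-life attributes, whereas Hangzhou and Shenzhen exhibit diversified USGGS, including herbaceous plants, that is more strongly associated with POIs of leisure and cultural facilities. The quantification, analysis, and comparison of USGGS across the three cities lay a data foundation for urban green infrastructure planning and management, and the street-view-image-based analysis of USGGS also offers a new perspective for urban carbon sink calculation and urban thermal environment research.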