Abstract: The processing of nonlinear data has been one of the hot topics in the surveying and mapping field in recent years. As a result, many linear and nonlinear methods have been developed. However, methods for processing generalized nonlinear surveying and mapping data, especially data of different types that include random or nonrandom unknown parameters, have received little attention. This paper presents a new algorithm model for processing nonlinear, dynamic, multiple-period, multiple-accuracy data derived from deformation monitoring networks.
Funding: This work has been supported by the European Commission through the H2020 project Finest Twins (grant No. 856602).
Abstract: Predicting comfort levels in cities is challenging because of the many metrics involved in the assessment. To overcome these challenges, much research is being done in the computing community to develop methods capable of generating outdoor comfort data. Machine Learning (ML) provides many opportunities to discover patterns in large datasets such as urban data. This paper proposes a data-driven approach to building a predictive and data-generative model for assessing outdoor thermal comfort. The model benefits from the results of a study that analyses Computational Fluid Dynamics (CFD) urban simulations to determine thermal and wind comfort in Tallinn, Estonia. The ML model is classification-based and opaque. The results were evaluated with several metrics and show that the approach allows the implementation of a data-generative ML model that produces reliable outdoor comfort data for use by urban stakeholders, planners, and researchers.
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 61133012, 61273321) and the National 863 Leading Technology Research Project (2012AA011102). Special thanks to Wanxiang Che, Yanyan Zhao, Wei He, Fikadu Gemechu, Yuhang Guo, Zhenghua Li, Meishan Zhang, and the anonymous reviewers for their insightful comments and suggestions.
Abstract: Annotating named entity recognition (NER) training corpora is a costly but necessary process for supervised NER approaches. This paper presents a general framework for generating large-scale NER training data from parallel corpora. In our method, we first run a high-performance NER system on one side of a bilingual corpus. Then, we project the named entity (NE) labels to the other side according to the word-level alignments. Finally, we propose several strategies for selecting high-quality auto-labeled NER training data. We apply our approach to Chinese NER using an English-Chinese parallel corpus. Experimental results show that our approach can collect high-quality labeled data and can help improve Chinese NER.
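The label projection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the BIO tag scheme, the `(source_index, target_index)` alignment format, the function name `project_labels`, and the rule that the first aligned entity label wins are all assumptions made for illustration. The abstract does not specify these details, and the paper's selection strategies for high-quality data are not reproduced here.

```python
from typing import List, Optional, Tuple


def project_labels(src_labels: List[str],
                   alignments: List[Tuple[int, int]],
                   tgt_len: int) -> List[str]:
    """Project BIO entity labels from source tokens to target tokens.

    Assumptions (hypothetical, not from the paper):
    - src_labels uses BIO tags such as "B-PER", "I-PER", "O";
    - alignments is a list of (source_index, target_index) pairs;
    - when several source tokens align to one target token,
      the first entity label encountered is kept.
    """
    # Step 1: copy the entity type (without the B-/I- prefix) across alignments.
    tgt_types: List[Optional[str]] = [None] * tgt_len
    for s, t in alignments:
        label = src_labels[s]
        if label != "O" and tgt_types[t] is None:
            tgt_types[t] = label.split("-", 1)[1]

    # Step 2: rebuild BIO prefixes from contiguity on the target side.
    # Note: two adjacent entities of the same type are merged by this rule.
    tgt_labels: List[str] = []
    prev: Optional[str] = None
    for typ in tgt_types:
        if typ is None:
            tgt_labels.append("O")
        elif typ == prev:
            tgt_labels.append("I-" + typ)
        else:
            tgt_labels.append("B-" + typ)
        prev = typ
    return tgt_labels


# Illustrative English-Chinese pair (invented example, not from the paper).
src_tokens = ["Barack", "Obama", "visited", "Beijing"]
src_labels = ["B-PER", "I-PER", "O", "B-LOC"]
tgt_tokens = ["奥巴马", "访问", "了", "北京"]
alignments = [(0, 0), (1, 0), (2, 1), (3, 3)]

projected = project_labels(src_labels, alignments, len(tgt_tokens))
print(list(zip(tgt_tokens, projected)))
```

In this example both "Barack" and "Obama" align to the single target token "奥巴马", so it receives one `B-PER` tag, and the unaligned token "了" stays `O`. A real pipeline would follow this with a filtering step, for instance discarding sentence pairs in which some source entity tokens have no alignment, although that particular filter is only one possible choice.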