Purpose: Many science, technology and innovation (STI) resources carry several different labels. To assign such labels automatically to a given instance, many approaches with good performance on benchmark datasets have been proposed in the literature for the multi-label classification task, and several open-source tools implementing these approaches have also been developed. However, the characteristics of real-world multi-label patent and publication datasets are not completely in line with those of the benchmarks. The main purpose of this paper is therefore to comprehensively evaluate seven multi-label classification methods on real-world datasets. Research limitations: The three real-world datasets differ in statement, data quality, and purpose. Additionally, open-source tools designed for multi-label classification have intrinsic differences in their approaches to data processing and feature selection, which in turn affect the performance of a multi-label classification method. In future work, we will enhance experimental precision and reinforce the validity of the conclusions through more rigorous control of variables and expanded parameter settings. Practical implications: The Macro F1 and Micro F1 scores observed on real-world datasets typically fall short of those achieved on benchmark datasets, underscoring the complexity of real-world multi-label classification tasks. Approaches leveraging deep learning offer promising solutions by accommodating the hierarchical relationships and interdependencies among labels. With ongoing enhancements in deep learning algorithms and large-scale models, the efficacy of multi-label classification is expected to improve significantly, reaching a level of practical utility in the foreseeable future. Originality/value: (1) Seven multi-label classification methods are comprehensively compared on three real-world datasets. (2) The TextCNN and TextRCNN models perform better on small-scale datasets with a more complex hierarchical label structure and a more balanced document-label distribution. (3) The MLkNN method works better on the larger-scale dataset with a more unbalanced document-label distribution.
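The Macro F1 and Micro F1 scores cited in the abstract above are computed from per-label confusion counts. A minimal pure-Python sketch, using a made-up toy prediction matrix (the data are illustrative only), shows how the two averages differ:

```python
# Toy multi-label evaluation: rows are documents, columns are labels
# (binary indicator matrices; the values are invented for illustration).
y_true = [[1, 0], [1, 1], [0, 1]]
y_pred = [[1, 0], [1, 0], [0, 1]]

def f1(tp, fp, fn):
    """F1 from confusion counts; 0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

n_labels = len(y_true[0])
counts = []  # (tp, fp, fn) per label
for j in range(n_labels):
    tp = sum(t[j] and p[j] for t, p in zip(y_true, y_pred))
    fp = sum((not t[j]) and p[j] for t, p in zip(y_true, y_pred))
    fn = sum(t[j] and (not p[j]) for t, p in zip(y_true, y_pred))
    counts.append((tp, fp, fn))

# Macro F1: average of per-label F1 scores (each label weighted equally).
macro_f1 = sum(f1(*c) for c in counts) / n_labels
# Micro F1: F1 over the pooled counts (frequent labels dominate).
micro_f1 = f1(*map(sum, zip(*counts)))
print(round(macro_f1, 4), round(micro_f1, 4))  # 0.8333 0.8571
```

On unbalanced corpora such as the larger real-world dataset described above, Micro F1 is dominated by the frequent labels, which is one reason the two scores can diverge.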
This article analyzes the performance and use of Support Vector Machines (SVMs) for the critical task of forest fire detection using image datasets. With the increasing threat of forest fires to ecosystems and human settlements, the need for rapid and accurate detection systems is of utmost importance. SVMs, renowned for their strong classification capabilities, are proficient at recognizing fire-related patterns within images. By training on labeled data, SVMs learn to identify distinctive attributes of fire, such as flames, smoke, or alterations in the visual characteristics of the forest area. The article thoroughly examines the use of SVMs, covering crucial elements such as data preprocessing, feature extraction, and model training, and rigorously evaluates accuracy, efficiency, and practical applicability. The knowledge gained from this study aids the development of efficient forest fire detection systems, enabling prompt responses and improving disaster management. Moreover, the correlation between SVM accuracy and the difficulties presented by high-dimensional datasets is carefully investigated and demonstrated through a revealing case study. The relationship between accuracy scores and the different resolutions used for resizing the training datasets is also discussed. These studies yield a definitive overview of the difficulties faced and the areas requiring further improvement and focus.
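The SVM idea behind such detectors can be sketched as a linear classifier trained by subgradient descent on the regularized hinge loss. The two features per image here, "redness" and "smoke texture", and all data points are invented purely for illustration; a real detector would extract far richer features:

```python
# Minimal linear SVM: full-batch subgradient descent on hinge loss with
# an L2 penalty. Features and data points are hypothetical stand-ins for
# image descriptors (e.g. mean flame-like redness, smoke texture score).
fire    = [(0.90, 0.80), (0.80, 0.90), (0.85, 0.70)]  # label +1
no_fire = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.30)]  # label -1
data = [(x, +1) for x in fire] + [(x, -1) for x in no_fire]

w, b = [0.0, 0.0], 0.0
lam, lr = 0.01, 0.1
for _ in range(2000):
    gw, gb = [lam * w[0], lam * w[1]], 0.0
    for (x1, x2), y in data:
        if y * (w[0] * x1 + w[1] * x2 + b) < 1:  # margin violated
            gw[0] -= y * x1 / len(data)
            gw[1] -= y * x2 / len(data)
            gb -= y / len(data)
    w = [w[0] - lr * gw[0], w[1] - lr * gw[1]]
    b -= lr * gb

score = lambda x: w[0] * x[0] + w[1] * x[1] + b
# Every fire image should now score above every non-fire image.
print(min(score(x) for x in fire) > max(score(x) for x in no_fire))
```

In practice one would use a library SVM with kernel support rather than this hand-rolled solver; the sketch only makes the margin-maximizing mechanics concrete.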
This paper proposes a method to generate semi-experimental biomedical datasets based on full-wave simulation software. System noise such as antenna port coupling is fully considered in the proposed datasets, making them more realistic than purely synthetic ones. Datasets containing different shapes are constructed based on the relative permittivities of human tissues. A back-propagation scheme is then used to obtain rough reconstructions, which are fed into a U-net convolutional neural network (CNN) to recover high-resolution images. Numerical results show that a network trained on datasets generated by the proposed method obtains satisfying reconstructions and is promising for real-time biomedical imaging.
Phishing attacks pose a significant security threat by masquerading as trustworthy entities to steal sensitive information, a problem that persists despite user awareness. This study addresses the pressing issue of phishing attacks on websites and assesses the performance of three prominent Machine Learning (ML) models, Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM), using authentic datasets sourced from the Kaggle and Mendeley repositories. Extensive experimentation and analysis reveal that the CNN model achieves the best accuracy, 98%, while LSTM shows the lowest, 96%. These findings underscore the potential of ML techniques in enhancing phishing detection systems and bolstering cybersecurity measures against evolving phishing tactics, offering a promising avenue for safeguarding sensitive information and online security.
Recently, automotive intrusion detection systems (IDSs) have emerged as promising defenses against attacks on in-vehicle networks (IVNs). However, the effectiveness of IDSs relies heavily on the quality of the datasets used for training and evaluation. Despite the availability of several datasets for automotive IDSs, there has been no comprehensive analysis assessing them. This paper addresses the need for dataset assessment in the context of automotive IDSs. It proposes qualitative and quantitative metrics, independent of any specific automotive IDS, to evaluate dataset quality; these metrics consider aspects such as dataset description, collection environment, and attack complexity. Eight commonly used datasets for automotive IDSs are evaluated using the proposed metrics. The evaluation reveals biases in the datasets, particularly limited contexts and a lack of diversity. It also highlights that the attacks in the datasets were mostly injected without considering normal behaviors, which poses challenges for training and evaluating machine-learning-based IDSs. The paper emphasizes the importance of addressing these limitations to improve the performance and adaptability of automotive IDSs; the proposed metrics can serve as guidelines for researchers and practitioners in selecting and constructing high-quality datasets for automotive security applications. Finally, the paper presents the requirements for high-quality datasets, including representativeness, diversity, and balance.
The CRA-Interim trial production of a global atmospheric reanalysis covering the 10 years from 2007 to 2016 was carried out by the China Meteorological Administration in 2017. The structural characteristics of a horizontal shear line over the Tibetan Plateau (TPHSL), objectively identified from the CRA-Interim datasets, are examined and compared with analysis results from the European Centre for Medium-Range Weather Forecasts reanalysis (ERA-Interim). The case occurred at 1800 UTC on July 5, 2016. The results show that both the ERA-Interim and CRA-Interim datasets reveal the circulation background and the dynamic and thermal structure of the TPHSL, with broadly similar features. The middle and high latitudes at 500 hPa are characterized by a "two troughs and two ridges" circulation pattern, and at 200 hPa the TPHSL is located in the northeast quadrant of the South Asian High Pressure (SAHP). The TPHSL lies in the positive vorticity zone and passes through the positive vorticity center corresponding to the ascending motion. Near the TPHSL, the contours of pseudo-equivalent potential temperature (θse) tend to be dense, with a high-value center on the south side of the TPHSL. The TPHSL extends up to 460 hPa and tilts northward with height. The positive vorticity zone near the TPHSL likewise tilts northward with height, the ascending motion near the TPHSL extends to 300 hPa, and the atmospheric layer above the TPHSL is stable. However, the intensities of the TPHSL's structural characteristics differ between the two datasets: the geopotential height, vertical velocity, vorticity, and divergence fields are relatively stronger in the CRA-Interim datasets. In addition, the vertical profiles of the dynamic and water vapor thermal physical quantities of the two datasets are consistent in the eastern and western parts of the TPHSL. In summary, the reliable and usable CRA-Interim datasets perform well in analyzing the structural characteristics of a horizontal shear line over the Tibetan Plateau.
Based on C-LSAT2.0, using high- and low-frequency component reconstruction methods combined with observation constraint masking, a reconstructed C-LSAT2.0 with 756 ensemble members covering the 1850s to 2018 has been developed. These ensemble versions have been merged with the ERSSTv5 ensemble dataset, and an upgraded version of the CMST-Interim dataset with 5°×5° resolution has been developed. The CMST-Interim dataset significantly improves the coverage of global surface temperature data: after reconstruction, coverage before 1950 increased from 78%-81% of the original CMST to 81%-89%, and total coverage after 1955 reached about 93%, including more than 98% in the Northern Hemisphere and 81%-89% in the Southern Hemisphere. The reconstruction ensemble experiments with different parameters provide a good basis for a more systematic uncertainty assessment of C-LSAT2.0 and CMST-Interim. In comparison with the original CMST, the global mean surface temperatures are estimated to be cooler in the second half of the 19th century and warmer during the 21st century, showing that the global warming trend is further amplified. The global warming trends are updated from 0.085±0.004℃ (10 yr)^(-1) and 0.128±0.006℃ (10 yr)^(-1) to 0.089±0.004℃ (10 yr)^(-1) and 0.137±0.007℃ (10 yr)^(-1), respectively, since the start and the second half of the 20th century.
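Warming trends such as those quoted above are linear least-squares slopes of an annual mean series, reported per decade. A sketch with synthetic, exactly linear data standing in for the real anomalies (the series and slope are invented to match the abstract's units, not taken from CMST-Interim):

```python
# Least-squares trend of a temperature anomaly series, per decade.
# Synthetic data: an exact 0.0137 degC/yr ramp, so the fitted trend
# is 0.137 degC per decade by construction.
years = list(range(1951, 2019))
temps = [0.0137 * (y - 1951) for y in years]  # synthetic anomalies

n = len(years)
ym = sum(years) / n
tm = sum(temps) / n
slope = sum((y - ym) * (t - tm) for y, t in zip(years, temps)) \
        / sum((y - ym) ** 2 for y in years)
trend_per_decade = 10 * slope
print(round(trend_per_decade, 3))  # 0.137
```

Real trend estimates also carry an uncertainty (the ±0.004℃ terms above), which comes from the standard error of the fitted slope and the ensemble spread.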
In this study, the upper ocean heat content (OHC) variations in the South China Sea (SCS) during 1993-2006 were investigated by examining ocean temperatures in seven datasets: World Ocean Atlas 2009 (WOA09) (climatology), the Ishii datasets, the Ocean General Circulation Model for the Earth Simulator (OFES), the Simple Ocean Data Assimilation system (SODA), the Global Ocean Data Assimilation System (GODAS), the China Oceanic ReAnalysis system (CORA), and an ocean reanalysis dataset for the joining area of Asia and the Indian-Pacific Ocean (AIPO1.0). Among these datasets, two are independent of any numerical model, four rely on data assimilation, and one was generated without any data assimilation. The annual cycles revealed by the seven datasets are similar, but the interannual variations differ. Vertical structures of temperature along the 18°N, 12.75°N, and 120°E sections were compared with data collected during open cruises in 1998 and 2005-2008. The results indicate that Ishii, OFES, CORA, and AIPO1.0 are more consistent with the observations. Through systematic comparisons, we found that each dataset has its own shortcomings and advantages in representing the upper OHC in the SCS.
Version 4 (v4) of the Extended Reconstructed Sea Surface Temperature (ERSST) dataset is compared with its predecessor, the widely used version 3b (v3b). The essential upgrades applied in v4 lead to remarkable differences in the characteristics of the sea surface temperature (SST) anomaly (SSTa) in both the temporal and spatial domains. First, the largest discrepancy in global mean SSTa values, around the 1940s, is due to ship-observation corrections made to reconcile observations from buckets and engine-intake thermometers. Second, differences in global and regional mean SSTa values between v4 and v3b exhibit a downward trend (around -0.032℃ per decade) before the 1940s, an upward trend (around 0.014℃ per decade) during 1950-2015, and an interdecadal oscillation with one peak around the 1980s and two troughs during the 1960s and 2000s. This does not derive from treatments of the polar or other data-void regions, since the difference in the SSTa does not share those features. Third, the spatial pattern of ENSO-related variability in v4 exhibits a wider but weaker cold tongue in the tropical Pacific compared with v3b, which could be attributed to differences in gap-filling assumptions, since the latter features satellite observations whereas the former features in situ ones. This intercomparison confirms that the structural uncertainty arising from underlying assumptions in the treatment of diverse SST observations, even within the same SST product family, is the main source of significant SST differences in the temporal domain. Why this uncertainty introduces artificial decadal oscillations remains unknown.
Anomaly-based approaches in network intrusion detection suffer in evaluation, comparison, and deployment because of the scarcity of adequate publicly available network trace datasets; the datasets that are public are either outdated or generated in controlled environments. Given the ubiquity of cloud computing in commercial and government internet services, there is a need to assess the impacts of network attacks on cloud data centers. To the best of our knowledge, there is no publicly available dataset that captures normal and anomalous network traces in the interactions between cloud users and cloud data centers. In this paper, we present an experimental platform designed to represent a practical interaction between cloud users and cloud services, and we collect the network traces resulting from this interaction to conduct anomaly detection. We use the Amazon Web Services (AWS) platform for our experiments.
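Anomaly detection over such traces often starts from a simple statistical baseline before any learned model is applied. A hypothetical sketch: flag a flow whose request rate deviates from the mean of normal training traffic by more than k standard deviations. The feature, rates, and threshold are all invented for illustration, not taken from the paper's AWS traces:

```python
import math

# Baseline anomaly detector: flag flows whose request rate is more than
# k standard deviations from the mean of normal training traffic.
# Rates (requests/s) and the threshold k are illustrative only.
normal_rates = [10.0, 12.0, 11.0, 9.0, 10.5, 11.5, 9.5, 10.5]
mean = sum(normal_rates) / len(normal_rates)
std = math.sqrt(sum((r - mean) ** 2 for r in normal_rates) / len(normal_rates))

def is_anomalous(rate, k=3.0):
    """True when the observed rate lies outside the k-sigma band."""
    return abs(rate - mean) > k * std

print(is_anomalous(10.8), is_anomalous(95.0))  # False True
```

Real anomaly detectors use many features per flow and tolerate non-Gaussian traffic, but the thresholding principle is the same.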
In the past few decades, meteorological datasets derived from remote sensing have been used by various researchers and managers in agricultural and water resources management. According to the literature, such datasets are not more accurate than synoptic stations, but their advantages, such as spatial coverage, temporal coverage, accessibility, and free use, make them attractive, and sometimes they can be used instead of synoptic stations. In this study, we used four meteorological datasets, the Climatic Research Unit gridded Time Series (CRU TS), the Global Precipitation Climatology Centre (GPCC), the Agricultural National Aeronautics and Space Administration Modern-Era Retrospective Analysis for Research and Applications (AgMERRA), and the Agricultural Climate Forecast System Reanalysis (AgCFSR), to estimate climate variables (precipitation, maximum temperature, and minimum temperature) and crop variables (reference evapotranspiration, irrigation requirement, biomass, and yield of maize) in Qazvin Province of Iran during 1980-2009. First, data were gathered from the four meteorological datasets and the synoptic station in the province, and the climate variables were calculated. Then, after using the AquaCrop model to calculate the crop variables, we compared the results of the synoptic station and the meteorological datasets. All four datasets showed strong performance in estimating climate variables. AgMERRA and AgCFSR gave more accurate estimates of precipitation and maximum temperature, although their normalized root mean square error was inferior to CRU for minimum temperature. All four were also very efficient in estimating the biomass and yield of maize in the province. For reference evapotranspiration and irrigation requirement, CRU TS and GPCC were more efficient than AgMERRA and AgCFSR, but for the estimation of biomass and yield, all four meteorological datasets were reliable. Overall, GPCC and AgCFSR were the two best datasets in this study. This study suggests the use of meteorological datasets in water resource and agricultural management to monitor past changes and estimate recent trends.
In this study, through experimental research and an investigation of large datasets of durability parameters in ocean engineering, the values, ranges, and distribution types of the durability parameters employed for durability design in ocean engineering in northern China were confirmed. Based on a modified theoretical model of chloride diffusion and reliability theory, the service lives of concrete structures exposed to the splash, tidal, and underwater zones were calculated. Concrete mix proportions meeting the requirement of a service life of 100 or 120 years were designed, and a cover thickness requirement was proposed. In addition, the effects of different time-varying relationships of the boundary condition (Cs) and diffusion coefficient (Df) on the service life were compared; the results showed that the time-varying relationships used in this study (i.e., Cs continuously increases and then remains stable, and Df continuously decreases and then remains stable) are beneficial for the durability design of concrete structures in the marine environment.
The differences in the climatology of extratropical transition (ET) of western North Pacific tropical cyclones (TCs) were investigated in this study using the TC best-track datasets of the China Meteorological Administration (CMA), the Japan Meteorological Agency (JMA), and the Joint Typhoon Warning Center (JTWC). The results show that the ET identification, ET completion time, and post-ET duration reported in the JTWC dataset differ greatly from those in the CMA and JMA datasets during 2004-2010. The key differences between the CMA and JMA datasets from 1951 to 2010, however, are the ET identification and the post-ET duration, owing to the inconsistent objective ET criteria used by the centers. Further analysis indicates that the annual ET percentage of CMA was lower than that of JMA and exhibited an interannual decreasing trend, whereas that of JMA showed no trend. Western North Pacific ET events occurred mainly from June to November. The latitude of ET occurrence shifted northward from February to August, followed by a southward shift, and most ET events were observed between 35°N and 45°N. From a regional perspective, TCs tended to undergo ET over Japan and the ocean east of it. TCs that experienced the ET process at higher latitudes were generally more intense at the ET completion time, and TCs completing ET overland or offshore were weaker than those finishing ET over the ocean. Most TCs weakened 24 h before the completion of ET; in contrast, 21% (27%) of the TCs showed an intensification process during the post-ET period based on the CMA (JMA) dataset. The results presented in this study indicate that consistent ET determination criteria are needed to reduce the uncertainty involved in ET identification among the centers.
Prediction of tunneling-induced ground settlements is an essential task, particularly for tunneling in urban settings, where settlements must be limited within a tolerable threshold to avoid damage to aboveground structures. Machine learning (ML) methods are becoming popular in many fields, including tunneling and underground excavation, as a powerful learning and prediction technique. However, the datasets collected from a tunneling project are usually small from the perspective of applying ML methods. Can ML algorithms effectively predict tunneling-induced ground settlements when the available datasets are small? In this study, seven ML methods are used to predict tunneling-induced ground settlement from 14 contributing factors measured before or during tunnel excavation: multiple linear regression (MLR), decision tree (DT), random forest (RF), gradient boosting (GB), support vector regression (SVR), back-propagation neural network (BPNN), and permutation importance-based BPNN (PI-BPNN). All methods except BPNN and PI-BPNN are shallow-structure ML methods. The effectiveness of these seven approaches on small datasets is evaluated through model accuracy and stability: accuracy is measured by the coefficient of determination (R2) on the training and testing datasets, and stability indicates how robust the predictive performance is. The quantile error (QE) criterion is also introduced to assess predictive performance while accounting for underpredictions and overpredictions. Our study reveals that the RF algorithm outperforms all the other models, with the highest prediction accuracy (0.9) and stability (3.02×10^(-27)). Deep-structure ML models do not perform well on small datasets, with relatively low accuracy (0.59) and stability (5.76). The PI-BPNN architecture, proposed and designed for small datasets, shows better performance than the typical BPNN. Six important contributing factors to ground settlements are identified: tunnel depth, the distance between the tunnel face and surface monitoring points (DTM), weighted average soil compressibility modulus (ACM), grouting pressure, penetration rate, and thrust force.
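The coefficient of determination used above to score each model is simple to compute directly. A pure-Python sketch with toy settlement values (the numbers are illustrative, not from the project data):

```python
# Coefficient of determination R^2 between measured and predicted
# ground settlements (toy values, e.g. in mm, for illustration only).
measured  = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.9]

mean_m = sum(measured) / len(measured)
ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))  # residual SS
ss_tot = sum((m - mean_m) ** 2 for m in measured)                # total SS
r2 = 1 - ss_res / ss_tot
print(round(r2, 3))  # 0.986
```

R2 near 1 means the model explains almost all the variance in the measurements; the quantile error criterion mentioned above additionally distinguishes whether the residuals are under- or overpredictions.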
Data-driven algorithms for predicting mechanical properties from small datasets are evaluated in a case study on gear steel hardenability. The limitations of current data-driven algorithms and empirical models are identified, the challenges in analysing small datasets are discussed, and a solution is proposed for handling small datasets with multiple variables. Gaussian methods in combination with novel predictive algorithms are utilized to overcome the challenges in analysing gear steel hardenability data and to gain insight into alloying element interactions and structure homogeneity. The resulting fundamental knowledge, integrated with machine learning, is shown to be superior to empirical equations in predicting hardenability. Metallurgical property relationships between chemistry, sample size, and hardness are predicted via two optimized machine learning algorithms: neural networks (NNs) and extreme gradient boosting (XGBoost). A comparison is drawn between all algorithms, evaluating their performance on small datasets. The results reveal that XGBoost has the highest potential for predicting hardenability from small datasets with class imbalance and large inhomogeneity issues.
Hydro-climatological study is difficult in most developing countries owing to the paucity of monitoring stations. Gridded climatological data provide an opportunity to extrapolate climate to areas without monitoring stations, based on their ability to replicate the spatio-temporal distribution and variability of observed datasets. Simple correlation and error analyses are not enough to predict the variability and distribution of precipitation and temperature. In this study, the coefficient of correlation (R2), root mean square error (RMSE), mean bias error (MBE), and mean wet and dry spell lengths were used to evaluate the performance of three widely used daily gridded precipitation and maximum/minimum temperature datasets, from the Climatic Research Unit (CRU), the Princeton University Global Meteorological Forcing (PGF), and the Climate Forecast System Reanalysis (CFSR), available over the Niger Delta part of Nigeria. The Standardised Precipitation Index (SPI) was used to assess the confidence in using gridded precipitation products for water resource management. Results of the correlation, error, and spell-length analyses revealed that the CRU and PGF datasets performed much better than the CFSR datasets. SPI values also indicate a good association between station and CRU precipitation products. The CFSR datasets, in comparison with the other products, overestimated or underestimated the SPI in many years, indicating weak predictive accuracy; hence they are not reliable for water resource management in the study area. CRU data products, however, performed much better in most of the statistical assessments conducted. The methods used in this study are therefore useful for the assessment of various gridded datasets in a range of hydrological and climatic applications.
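The error statistics used in such comparisons are straightforward to compute per station. A sketch with invented station and co-located grid values (units, e.g. mm/day, are hypothetical):

```python
import math

# RMSE and mean bias error (MBE) between station observations and a
# co-located gridded product (values invented for illustration).
station = [10.0, 0.0, 5.0, 20.0]
gridded = [12.0, 1.0, 4.0, 20.0]

errors = [g - s for g, s in zip(gridded, station)]
mbe = sum(errors) / len(errors)  # positive -> the grid is biased high
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
print(round(mbe, 2), round(rmse, 3))  # 0.5 1.225
```

MBE exposes a systematic wet or dry bias that RMSE alone hides, which is why the study reports both alongside spell-length and SPI diagnostics.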
Generative AI models for music and the arts in general are increasingly complex and hard to understand. The field of explainable AI (XAI) seeks to make complex and opaque AI models such as neural networks more understandable to people. One approach to making generative AI models more understandable is to impose a small number of semantically meaningful attributes on them. This paper contributes a systematic examination of the impact that different combinations of variational auto-encoder models (measureVAE and adversarialVAE), configurations of the latent space (from 4 to 256 latent dimensions), and training datasets (Irish folk, Turkish folk, classical, and pop) have on music generation performance when 2 or 4 meaningful musical attributes are imposed on the generative model. To date, there have been no systematic comparisons of such models at this level of combinatorial detail. Our findings show that measureVAE has better reconstruction performance than adversarialVAE, which has better musical attribute independence. measureVAE was able to generate music across genres with interpretable musical dimensions of control, and performs best with low-complexity music such as pop and rock. We recommend a 32- or 64-dimensional latent space as optimal for 4 regularised dimensions when using measureVAE to generate music across genres. Our results are the first detailed comparisons of configurations of state-of-the-art generative AI models for music and can be used to help select and configure AI models, musical features, and datasets for more understandable music generation.
High-resolution satellite images are becoming increasingly available for urban multi-temporal semantic understanding. However, few datasets can be used for land-use/land-cover (LULC) classification, binary change detection (BCD), and semantic change detection (SCD) simultaneously, because classification datasets always have a single time phase and BCD datasets focus only on the changed locations, ignoring the changed classes. Public SCD datasets are rare but much needed. To solve these problems, a tri-temporal SCD dataset built from Gaofen-2 (GF-2) remote sensing imagery, with 11 LULC classes and 60 change directions, was constructed in this study: the Wuhan Urban Semantic Understanding (WUSU) dataset. Popular deep learning based methods for LULC classification, BCD, and SCD are tested to verify the reliability of WUSU. A Siamese-based multi-task joint framework with a multi-task joint loss (MJ loss), named ChangeMJ, is proposed to restore object boundaries; it obtains the best results in LULC classification, BCD, and SCD compared with the state-of-the-art (SOTA) methods. Finally, a large-scale mapping of the Wuhan central urban area verifies that the WUSU dataset and the ChangeMJ framework have good application value.
With the advancement of technology and the increase in user demands, gesture recognition plays a pivotal role in human-computer interaction. Among various sensing devices, Time-of-Flight (ToF) sensors are widely applied due to their low cost. This paper explores the implementation of a human hand posture recognition system using ToF sensors and residual neural networks. First, it reviews typical applications of human hand recognition. Second, it designs a hand gesture recognition system using the VL53L5 ToF sensor. Data preprocessing is then conducted, followed by training of the constructed residual neural network. Analysis of the recognition results indicates that gesture recognition based on the residual neural network achieves an accuracy of 98.5% in a 5-class classification scenario. Finally, the paper discusses existing issues and future research directions.
Land use/cover change is an important parameter in the climate and ecological simulations. Although they had been widely used in the community, SAGE dataset and HYDE dataset, the two representative global historical l...Land use/cover change is an important parameter in the climate and ecological simulations. Although they had been widely used in the community, SAGE dataset and HYDE dataset, the two representative global historical land use datasets, were little assessed about their accuracies in regional scale. Here, we carried out some assessments for the traditional cultivated region of China (TCRC) over last 300 years, by comparing SAGE2010 and HYDE (v3.1) with Chinese Historical Cropland Dataset (CHCD). The comparisons were performed at three spatial scales: entire study area, provincial area and 60 km by 60 km grid cell. The results show that (1) the cropland area from SAGE2010 was much more than that from CHCD moreover, the growth at a rate of 0.51% from 1700 to 1950 and -0.34% after 1950 were also inconsistent with that from CHCD. (2) HYDE dataset (v3.1) was closer to CHCD dataset than SAGE dataset on entire study area. However, the large biases could be detected at provincial scale and 60 km by 60 km grid cell scale. The percent of grid cells having biases greater than 70% (〈-70% or 〉70%) and 90% (〈-90% or 〉90%) accounted for 56%-63% and 40%-45% of the total grid cells respectively while those having biases range from -10% to 10% and from -30% to 30% account for only 5%-6% and 17% of the total grid cells respectively. (3) Using local historical archives to reconstruct historical dataset with high accuracy would be a valu- able way to improve the accuracy of climate and ecological simulation.展开更多
Funding: the Natural Science Foundation of China (Grant Numbers 72074014 and 72004012).
Abstract: Purpose: Many science, technology and innovation (STI) resources carry several different labels. To assign the appropriate labels to a given instance automatically, many approaches with good performance on benchmark datasets have been proposed in the literature for the multi-label classification task. Furthermore, several open-source tools implementing these approaches have also been developed. However, the characteristics of real-world multi-label patent and publication datasets are not completely in line with those of benchmark ones. Therefore, the main purpose of this paper is to evaluate seven multi-label classification methods comprehensively on real-world datasets. Research limitations: The three real-world datasets differ in the following aspects: statement, data quality, and purposes. Additionally, open-source tools designed for multi-label classification also have intrinsic differences in their approaches to data processing and feature selection, which in turn affect the performance of a multi-label classification approach. In the near future, we will enhance experimental precision and reinforce the validity of the conclusions by exercising more rigorous control over variables through expanded parameter settings. Practical implications: The Macro F1 and Micro F1 scores observed on real-world datasets typically fall short of those achieved on benchmark datasets, underscoring the complexity of real-world multi-label classification tasks. Approaches leveraging deep learning techniques offer promising solutions by accommodating the hierarchical relationships and interdependencies among labels. With ongoing enhancements in deep learning algorithms and large-scale models, the efficacy of multi-label classification is expected to improve significantly, reaching a level of practical utility in the foreseeable future. Originality/value: (1) Seven multi-label classification methods are comprehensively compared on three real-world datasets. (2) The TextCNN and TextRCNN models perform better on small-scale datasets with a more complex hierarchical label structure and a more balanced document-label distribution. (3) The MLkNN method works better on the larger-scale dataset with a more unbalanced document-label distribution.
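Macro F1 averages the per-label F1 scores, so rare labels count as much as frequent ones, while Micro F1 pools true/false positives across all labels, so frequent labels dominate; the gap between the two is exactly what unbalanced document-label distributions expose. A minimal pure-Python sketch on toy indicator vectors (illustrative data, not the paper's):

```python
# Macro vs Micro F1 for multi-label predictions over binary label vectors.

def f1_scores(y_true, y_pred, n_labels):
    """Return (macro_f1, micro_f1) for multi-label indicator data."""
    tp = [0] * n_labels
    fp = [0] * n_labels
    fn = [0] * n_labels
    for t, p in zip(y_true, y_pred):
        for j in range(n_labels):
            if p[j] and t[j]:
                tp[j] += 1
            elif p[j] and not t[j]:
                fp[j] += 1
            elif not p[j] and t[j]:
                fn[j] += 1
    # Macro F1: unweighted mean of per-label F1 (rare labels count equally).
    per_label = []
    for j in range(n_labels):
        denom = 2 * tp[j] + fp[j] + fn[j]
        per_label.append(2 * tp[j] / denom if denom else 0.0)
    macro = sum(per_label) / n_labels
    # Micro F1: pool the counts over all labels (frequent labels dominate).
    tp_all, fp_all, fn_all = sum(tp), sum(fp), sum(fn)
    denom = 2 * tp_all + fp_all + fn_all
    micro = 2 * tp_all / denom if denom else 0.0
    return macro, micro

# Two documents, three labels: doc 1 predicted perfectly, doc 2 misses label 2.
y_true = [[1, 0, 1], [0, 1, 1]]
y_pred = [[1, 0, 1], [0, 1, 0]]
macro, micro = f1_scores(y_true, y_pred, 3)
```

The single missed assignment drags the macro score down through one whole label's F1, while the micro score only loses one pooled false negative.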
Abstract: This article delves into the analysis of performance and utilization of Support Vector Machines (SVMs) for the critical task of forest fire detection using image datasets. With the increasing threat of forest fires to ecosystems and human settlements, the need for rapid and accurate detection systems is of utmost importance. SVMs, renowned for their strong classification capabilities, exhibit proficiency in recognizing patterns associated with fire within images. By training on labeled data, SVMs acquire the ability to identify distinctive attributes associated with fire, such as flames, smoke, or alterations in the visual characteristics of the forest area. The article thoroughly examines the use of SVMs, covering crucial elements like data preprocessing, feature extraction, and model training. It rigorously evaluates parameters such as accuracy, efficiency, and practical applicability. The knowledge gained from this study aids in the development of efficient forest fire detection systems, enabling prompt responses and improving disaster management. Moreover, the correlation between SVM accuracy and the difficulties presented by high-dimensional datasets is carefully investigated, demonstrated through a revealing case study. The relationship between accuracy scores and the different resolutions used for resizing the training datasets is also discussed. These comprehensive studies result in a definitive overview of the difficulties faced and the potential sectors requiring further improvement and focus.
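The resizing resolution directly sets the feature dimensionality the SVM sees, which is why the abstract links resolution to accuracy. The abstract does not say which resizing method was used; purely as an illustration, block-average downsampling of a toy 4x4 intensity grid looks like this:

```python
# Block-average downsampling: a simple, hypothetical stand-in for the
# dataset-resizing step whose resolution is correlated with SVM accuracy.

def downsample(img, factor):
    """Average non-overlapping factor x factor blocks of a 2-D grid."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(0, h - h % factor, factor):
        row = []
        for j in range(0, w - w % factor, factor):
            block = [img[i + di][j + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / (factor * factor))
        out.append(row)
    return out

img = [[0, 0, 8, 8],
       [0, 0, 8, 8],
       [4, 4, 0, 0],
       [4, 4, 0, 0]]
small = downsample(img, 2)  # halves each dimension, quartering feature count
```

Each halving of resolution quarters the number of pixel features, trading fine detail (thin smoke plumes) for a smaller, easier-to-separate input space.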
Funding: National Natural Science Foundation of China (No. 61971036); Fundamental Research Funds for the Central Universities (No. 2023CX01011); Beijing Nova Program (No. 20230484361).
Abstract: This paper proposes a method to generate semi-experimental biomedical datasets based on full-wave simulation software. System noise such as antenna port coupling is fully considered in the proposed datasets, which makes them more realistic than synthetic datasets. In this paper, datasets containing different shapes are constructed based on the relative permittivities of human tissues. Then, a back-propagation scheme is used to obtain rough reconstructions, which are fed into a U-net convolutional neural network (CNN) to recover high-resolution images. Numerical results show that a network trained on the datasets generated by the proposed method obtains satisfying reconstruction results and is promising for application in real-time biomedical imaging.
Abstract: Phishing attacks pose a significant security threat by masquerading as trustworthy entities to steal sensitive information, a problem that persists despite user awareness. This study addresses the pressing issue of phishing attacks on websites and assesses the performance of three prominent Machine Learning (ML) models: Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM), utilizing authentic datasets sourced from the Kaggle and Mendeley repositories. Extensive experimentation and analysis reveal that the CNN model achieves the best accuracy, 98%, whereas LSTM shows the lowest accuracy, 96%. These findings underscore the potential of ML techniques in enhancing phishing detection systems and bolstering cybersecurity measures against evolving phishing tactics, offering a promising avenue for safeguarding sensitive information and online security.
Funding: supported in part by the 2021 Autonomous Driving Development Innovation Project of the Ministry of Science and ICT, 'Development of Technology for Security and Ultra-High-Speed Integrity of the Next-Generation Internal Network of Autonomous Vehicles' (No. 2021-0-01348), and in part by the National Research Foundation of Korea (NRF) grant funded by the Korean Government Ministry of Science and ICT (MSIT) under Grant NRF-2021R1A2C2014428.
Abstract: Recently, automotive intrusion detection systems (IDSs) have emerged as promising defense approaches to counter attacks on in-vehicle networks (IVNs). However, the effectiveness of IDSs relies heavily on the quality of the datasets used for training and evaluation. Despite the availability of several datasets for automotive IDSs, there has been a lack of comprehensive analysis focusing on assessing these datasets. This paper aims to address the need for dataset assessment in the context of automotive IDSs. It proposes qualitative and quantitative metrics that are independent of specific automotive IDSs to evaluate the quality of datasets. These metrics take into consideration various aspects such as dataset description, collection environment, and attack complexity. This paper evaluates eight commonly used datasets for automotive IDSs using the proposed metrics. The evaluation reveals biases in the datasets, particularly in terms of limited contexts and lack of diversity. Additionally, it highlights that the attacks in the datasets were mostly injected without considering normal behaviors, which poses challenges for training and evaluating machine learning-based IDSs. This paper emphasizes the importance of addressing the identified limitations in existing datasets to improve the performance and adaptability of automotive IDSs. The proposed metrics can serve as valuable guidelines for researchers and practitioners in selecting and constructing high-quality datasets for automotive security applications. Finally, this paper presents the requirements for high-quality datasets, including the need for representativeness, diversity, and balance.
Funding: National Science Foundation of China (42030611, 91937301); The Second Tibetan Plateau Scientific Expedition and Research (STEP) Program (2019QZKK0105).
Abstract: The CRA-Interim trial production of the global atmospheric reanalysis for the 10 years from 2007 to 2016 was carried out by the China Meteorological Administration in 2017. The structural characteristics of the horizontal shear line over the Tibetan Plateau (TPHSL) based on the CRA-Interim datasets are examined by objectively identifying the shear line, and are compared with the analysis results of the European Centre for Medium-Range Weather Forecasts reanalysis data (ERA-Interim). The case occurred at 18 UTC on July 5, 2016. The results show that both the ERA-Interim and CRA-Interim datasets can reveal the circulation background and the dynamic and thermal structure characteristics of the TPHSL well, and they show some similar features. The middle and high latitudes at 500 hPa are characterized by a circulation situation of "two troughs and two ridges", and at 200 hPa the TPHSL is located in the northeast quadrant of the South Asian High Pressure (SAHP). The TPHSL lies in the positive vorticity zone and passes through the positive vorticity center corresponding to the ascending motion. Near the TPHSL, the contours of pseudo-equivalent potential temperature (θse) tend to be dense, with a high-value center on the south side of the TPHSL. The TPHSL can extend to 460 hPa and inclines northward with height. There is a positive vorticity zone near the TPHSL, which likewise inclines northward with height; the ascending motion near the TPHSL can extend to 300 hPa, and the atmospheric layer above the TPHSL is stable. However, the intensities of the TPHSL's structural characteristics analyzed with the two datasets are different, the CRA-Interim datasets showing relatively strong geopotential height, vertical velocity, vorticity and divergence fields. In addition, the vertical profiles of the dynamic and water vapor thermal physical quantities of the two datasets are also consistent in the east and west parts of the TPHSL. In summary, the reliable and usable CRA-Interim datasets show excellent properties in the analysis of the structural characteristics of a horizontal shear line over the Tibetan Plateau.
Abstract: Based on C-LSAT2.0, using high- and low-frequency component reconstruction methods combined with observation-constraint masking, a reconstructed C-LSAT2.0 with 756 ensemble members from the 1850s to 2018 has been developed. These ensemble versions have been merged with the ERSSTv5 ensemble dataset, and an upgraded version of the CMST-Interim dataset with 5° × 5° resolution has been developed. The CMST-Interim dataset significantly improves the coverage of global surface temperature data. After reconstruction, the data coverage before 1950 increased from 78%−81% of the original CMST to 81%−89%. The total coverage after 1955 reached about 93%, including more than 98% in the Northern Hemisphere and 81%−89% in the Southern Hemisphere. Through the reconstruction ensemble experiments with different parameters, a good basis is provided for a more systematic uncertainty assessment of C-LSAT2.0 and CMST-Interim. In comparison with the original CMST, the global mean surface temperatures are estimated to be cooler in the second half of the 19th century and warmer during the 21st century, which shows that the global warming trend is further amplified. The global warming trends are updated from 0.085 ± 0.004℃ (10 yr)^(−1) and 0.128 ± 0.006℃ (10 yr)^(−1) to 0.089 ± 0.004℃ (10 yr)^(−1) and 0.137 ± 0.007℃ (10 yr)^(−1), respectively, since the start and the second half of the 20th century.
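The warming trends quoted above are linear (least-squares) slopes expressed per decade. A self-contained sketch of that unit convention, run on a synthetic temperature series rather than the CMST data:

```python
# Ordinary least-squares trend of an annual series, reported in degC per
# decade as in the abstract. The series below is synthetic, for illustration.

def trend_per_decade(years, temps):
    n = len(years)
    mean_y = sum(years) / n
    mean_t = sum(temps) / n
    slope = (sum((y - mean_y) * (t - mean_t) for y, t in zip(years, temps))
             / sum((y - mean_y) ** 2 for y in years))  # degC per year
    return slope * 10                                   # degC per decade

years = list(range(1950, 2019))
temps = [0.01 * (y - 1950) for y in years]  # exact 0.1 degC/decade ramp
trend = trend_per_decade(years, temps)
```

Multiplying the per-year slope by 10 is the only step separating the fitted coefficient from the "(10 yr)^(-1)" figures in the abstract.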
Funding: supported by the National Basic Research Program of China (Grant Nos. 2010CB950400 and 2013CB430301); the National Natural Science Foundation of China (Grant Nos. 41276025 and 41176023); and the R&D Special Fund for Public Welfare Industry (Meteorology) (Grant No. GYHY201106036). The OFES simulation was conducted on the Earth Simulator with the support of JAMSTEC. Also supported by the Data Sharing Infrastructure of Earth System Science, Data Sharing Service Center of the South China Sea and adjacent regions.
Abstract: In this study, the upper ocean heat content (OHC) variations in the South China Sea (SCS) during 1993-2006 were investigated by examining ocean temperatures in seven datasets, including World Ocean Atlas 2009 (WOA09) (climatology), the Ishii datasets, the Ocean General Circulation Model for the Earth Simulator (OFES), the Simple Ocean Data Assimilation system (SODA), the Global Ocean Data Assimilation System (GODAS), the China Oceanic ReAnalysis system (CORA), and an ocean reanalysis dataset for the joining area of Asia and the Indian-Pacific Ocean (AIPO1.0). Among these datasets, two were independent of any numerical model, four relied on data assimilation, and one was generated without any data assimilation. The annual cycles revealed by the seven datasets were similar, but the interannual variations were different. Vertical structures of temperatures along the 18°N, 12.75°N, and 120°E sections were compared with data collected during open cruises in 1998 and 2005-08. The results indicated that Ishii, OFES, CORA, and AIPO1.0 were more consistent with the observations. Through the comparisons, we found that each dataset had its own systematic shortcomings and advantages in presenting the upper OHC in the SCS.
Funding: supported by the National Key Basic Research and Development Plan (No. 2015CB953900) and the Natural Science Foundation of China (Nos. 41330960 and 41776032).
Abstract: Version 4 (v4) of the Extended Reconstructed Sea Surface Temperature (ERSST) dataset is compared with its predecessor, the widely used version 3b (v3b). The essential upgrades applied to v4 lead to remarkable differences in the characteristics of the sea surface temperature (SST) anomaly (SSTa) in both the temporal and spatial domains. First, the largest discrepancy of the global mean SSTa values, around the 1940s, is due to ship-observation corrections made to reconcile observations from buckets and engine-intake thermometers. Second, differences in global and regional mean SSTa values between v4 and v3b exhibit a downward trend (around −0.032℃ per decade) before the 1940s, an upward trend (around 0.014℃ per decade) during the period 1950-2015, and interdecadal oscillation with one peak around the 1980s and two troughs during the 1960s and 2000s, respectively. This does not derive from treatments of the polar or other data-void regions, since the difference of the SSTa does not share the common features. Third, the spatial pattern of the ENSO-related variability of v4 exhibits a wider but weaker cold tongue in the tropical Pacific Ocean compared with that of v3b, which could be attributed to differences in gap-filling assumptions, since the latter features satellite observations whereas the former features in situ ones. This intercomparison confirms that the structural uncertainty arising from underlying assumptions on the treatment of diverse SST observations, even within the same SST product family, is the main source of significant SST differences in the temporal domain. Why this uncertainty introduces artificial decadal oscillations remains unknown.
Abstract: Anomaly-based approaches in network intrusion detection suffer in evaluation, comparison and deployment because of the scarcity of adequate publicly available network trace datasets. Also, publicly available datasets are either outdated or generated in a controlled environment. Due to the ubiquity of cloud computing environments in commercial and government internet services, there is a need to assess the impacts of network attacks in cloud data centers. To the best of our knowledge, there is no publicly available dataset which captures the normal and anomalous network traces in the interactions between cloud users and cloud data centers. In this paper, we present an experimental platform designed to represent a practical interaction between cloud users and cloud services and to collect network traces resulting from this interaction to conduct anomaly detection. We use the Amazon Web Services (AWS) platform for conducting our experiments.
Abstract: In the past few decades, meteorological datasets derived from remote sensing techniques have been used by various researchers and managers in agricultural and water resources management. Based on the literature, meteorological datasets are not more accurate than synoptic stations, but their various advantages, such as spatial coverage, time coverage, accessibility, and free use, have made these techniques superior, and sometimes we can use them instead of synoptic stations. In this study, we used four meteorological datasets, including the Climatic Research Unit gridded Time Series (CRU TS), the Global Precipitation Climatology Centre (GPCC), the Agricultural National Aeronautics and Space Administration Modern-Era Retrospective Analysis for Research and Applications (AgMERRA), and the Agricultural Climate Forecast System Reanalysis (AgCFSR), to estimate climate variables, i.e., precipitation, maximum temperature, and minimum temperature, and crop variables, i.e., reference evapotranspiration, irrigation requirement, biomass, and yield of maize, in Qazvin Province of Iran during 1980-2009. First, data were gathered from the four meteorological datasets and the synoptic station in this province, and the climate variables were calculated. Then, after using the AquaCrop model to calculate the crop variables, we compared the results of the synoptic station and the meteorological datasets. All four meteorological datasets showed strong performance for estimating climate variables. AgMERRA and AgCFSR had more accurate estimations for precipitation and maximum temperature, although their normalized root mean square error was inferior to CRU for minimum temperature. Furthermore, they were all very efficient for estimating the biomass and yield of maize in this province. For reference evapotranspiration and irrigation requirement, CRU TS and GPCC were more efficient than AgMERRA and AgCFSR, but for the estimation of biomass and yield, all four meteorological datasets were reliable. To sum up, GPCC and AgCFSR were the two best datasets in this study. This study suggests the use of meteorological datasets in water resource and agricultural management to monitor past changes and estimate recent trends.
Funding: financial support provided by the National Natural Science Foundation of China (51508272, 11832013, 51878350, and 51678304).
Abstract: In this study, through experimental research and an investigation of large datasets of durability parameters in ocean engineering, the values, ranges, and distribution types of the durability parameters employed for durability design in ocean engineering in northern China were confirmed. Based on a modified theoretical model of chloride diffusion and reliability theory, the service lives of concrete structures exposed to the splash, tidal, and underwater zones were calculated. Concrete mix proportions meeting the requirement of a service life of 100 or 120 years were designed, and a cover thickness requirement was proposed. In addition, the effects of different time-varying relationships of the boundary condition (Cs) and diffusion coefficient (Df) on the service life were compared; the results showed that the time-varying relationships used in this study (i.e., Cs continuously increased and then remained stable, and Df continuously decreased and then remained stable) were beneficial for the durability design of concrete structures in the marine environment.
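The service-life calculations rest on a chloride-diffusion model; the simplest such model (not the modified one used in the study) is the error-function solution of Fick's second law, C(x, t) = Cs(1 − erf(x / (2√(Df·t)))). The sketch below uses illustrative parameter values, not the calibrated Cs/Df relationships from the paper:

```python
# Classic error-function solution for chloride ingress into concrete.
# All parameter values here are assumed, purely for illustration.
import math

def chloride_content(x_m, t_s, cs, d_m2_s):
    """Chloride concentration at depth x_m (m) after t_s (s) of exposure."""
    return cs * (1 - math.erf(x_m / (2 * math.sqrt(d_m2_s * t_s))))

cs = 0.6                   # surface chloride level, % of binder (assumed)
d = 5e-12                  # apparent diffusion coefficient, m^2/s (assumed)
t = 100 * 365.25 * 86400   # 100-year design life, in seconds
c_at_cover = chloride_content(0.05, t, cs, d)  # at a 50 mm cover depth
```

A durability check of this kind compares c_at_cover against a critical chloride threshold; increasing the cover thickness pushes the concentration down, which is the lever behind the cover requirement proposed in the study.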
Funding: National Natural Science Foundation of China (41465003, 41665006); China National Special Funding Project for Meteorology (GYHY201406010, GYHY201306071).
Abstract: The differences in the climatology of extratropical transition (ET) of western North Pacific tropical cyclones (TCs) were investigated in this study using the TC best-track datasets of the China Meteorological Administration (CMA), the Japan Meteorological Agency (JMA) and the Joint Typhoon Warning Center (JTWC). The results show that the ET identification, ET completion time, and post-ET duration reported in the JTWC dataset differ greatly from those in the CMA and JMA datasets during 2004-2010. However, the key differences between the CMA and JMA datasets from 1951 to 2010 are the ET identification and the post-ET duration, because of the inconsistent objective ET criteria used by the centers. Further analysis indicates that the annual ET percentage of CMA was lower than that of JMA and exhibited an interannual decreasing trend, while that of JMA showed no trend. Western North Pacific ET events occurred mainly from June to November. The latitude of ET occurrence shifted northward from February to August, followed by a southward shift. Most ET events were observed between 35°N and 45°N. From a regional perspective, TCs tended to undergo ET in Japan and the ocean east of it. It is found that TCs which experienced the ET process at higher latitudes were generally more intense at the ET completion time. TCs completing the ET over land or offshore were weaker than those finishing the ET over the ocean. Most of the TCs weakened 24 h before the completion of ET. In contrast, 21% (27%) of the TCs showed an intensification process based on the CMA (JMA) dataset during the post-ET period. The results presented in this study indicate that consistent ET determination criteria are needed to reduce the uncertainty involved in ET identification among the centers.
Funding: funded by the University Transportation Center for Underground Transportation Infrastructure (UTC-UTI) at the Colorado School of Mines under Grant No. 69A3551747118 from the US Department of Transportation (DOT).
Abstract: Prediction of tunneling-induced ground settlements is an essential task, particularly for tunneling in urban settings. Ground settlements should be limited within a tolerable threshold to avoid damage to aboveground structures. Machine learning (ML) methods are becoming popular in many fields, including tunneling and underground excavations, as a powerful learning and predicting technique. However, the datasets collected from a tunneling project are usually small from the perspective of applying ML methods. Can ML algorithms effectively predict tunneling-induced ground settlements when the available datasets are small? In this study, seven ML methods are utilized to predict tunneling-induced ground settlement using 14 contributing factors measured before or during tunnel excavation. These methods include multiple linear regression (MLR), decision tree (DT), random forest (RF), gradient boosting (GB), support vector regression (SVR), back-propagation neural network (BPNN), and permutation importance-based BPNN (PI-BPNN) models. All methods except BPNN and PI-BPNN are shallow-structure ML methods. The effectiveness of these seven ML approaches on small datasets is evaluated using model accuracy and stability. Model accuracy is measured by the coefficient of determination (R2) of the training and testing datasets, and the stability of a learning algorithm indicates robust predictive performance. Also, the quantile error (QE) criterion is introduced to assess model predictive performance considering underpredictions and overpredictions. Our study reveals that the RF algorithm outperforms all the other models, with the highest model prediction accuracy (0.9) and stability (3.02 × 10^(−27)). Deep-structure ML models do not perform well for small datasets, with relatively low model accuracy (0.59) and stability (5.76). The PI-BPNN architecture is proposed and designed for small datasets, showing better performance than the typical BPNN. Six important contributing factors of ground settlements are identified, including tunnel depth, the distance between the tunnel face and surface monitoring points (DTM), weighted average soil compressibility modulus (ACM), grouting pressure, penetration rate and thrust force.
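The accuracy figures above (0.9 for RF, 0.59 for the deep models) are coefficients of determination: one minus the ratio of the residual sum of squares to the total sum of squares. A quick pure-Python version on made-up settlement values (mm), not the project's data:

```python
# Coefficient of determination R^2 for observed vs predicted settlements.

def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot; 1.0 means a perfect fit."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

obs = [1.0, 2.0, 3.0, 4.0]    # hypothetical measured settlements, mm
pred = [1.1, 1.9, 3.2, 3.8]   # hypothetical model predictions, mm
r2 = r_squared(obs, pred)
```

R^2 compares the model against the trivial "predict the mean" baseline, which is why a value near 0.59 signals a model only modestly better than no model at all.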
Abstract: Data-driven algorithms for predicting mechanical properties with small datasets are evaluated in a case study on gear steel hardenability. The limitations of current data-driven algorithms and empirical models are identified. Challenges in analysing small datasets are discussed, and a solution is proposed to handle small datasets with multiple variables. Gaussian methods in combination with novel predictive algorithms are utilized to overcome the challenges in analysing gear steel hardenability data and to gain insight into alloying element interactions and structure homogeneity. The fundamental knowledge gained, integrated with machine learning, is shown to be superior to the empirical equations in predicting hardenability. Metallurgical property relationships between chemistry, sample size, and hardness are predicted via two optimized machine learning algorithms: neural networks (NNs) and extreme gradient boosting (XGBoost). A comparison is drawn between all algorithms, evaluating their performance on small datasets. The results reveal that XGBoost has the highest potential for predicting hardenability using small datasets with class imbalance and large inhomogeneity issues.
Abstract: Hydro-climatological study is difficult in most developing countries due to the paucity of monitoring stations. Gridded climatological data provide an opportunity to extrapolate climate to areas without monitoring stations, based on their ability to replicate the spatio-temporal distribution and variability of observed datasets. Simple correlation and error analyses are not enough to predict the variability and distribution of precipitation and temperature. In this study, the coefficient of correlation (R2), root mean square error (RMSE), mean bias error (MBE) and mean wet and dry spell lengths were used to evaluate the performance of three widely used daily gridded precipitation, maximum and minimum temperature datasets, from the Climatic Research Unit (CRU), Princeton University Global Meteorological Forcing (PGF) and Climate Forecast System Reanalysis (CFSR), available over the Niger Delta part of Nigeria. The Standardised Precipitation Index (SPI) was used to assess the confidence of using gridded precipitation products in water resource management. Results of the correlation, error, and spell length analyses revealed that the CRU and PGF datasets performed much better than the CFSR datasets. SPI values also indicate a good association between station and CRU precipitation products. The CFSR datasets, in comparison with the other data products, overestimated and underestimated the SPI in many years. This indicates weak predictive accuracy, hence they are not reliable for water resource management in the study area. However, CRU data products were found to perform much better in most of the statistical assessments conducted. This makes the methods used in this study useful for the assessment of various gridded datasets in various hydrological and climatic applications.
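RMSE penalizes large station-versus-grid disagreements quadratically, while MBE keeps the sign and so reveals systematic over- or underestimation; the two together are a large part of the comparison above. A few-line sketch on invented daily rainfall values (mm), purely for illustration:

```python
# RMSE and mean bias error (MBE) between station and gridded estimates.
import math

def rmse(obs, est):
    """Root mean square error: magnitude of disagreement, sign-blind."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))

def mbe(obs, est):
    """Mean bias error: positive when the gridded product overestimates."""
    return sum(e - o for o, e in zip(obs, est)) / len(obs)

station = [0.0, 5.0, 12.0, 3.0]   # hypothetical station rainfall, mm
gridded = [1.0, 4.0, 14.0, 3.0]   # hypothetical gridded estimates, mm
err = rmse(station, gridded)
bias = mbe(station, gridded)
```

A product can have a small MBE (errors cancel) and still a large RMSE, which is why the study reports both before trusting a dataset for drought indices such as the SPI.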
Abstract: Generative AI models for music and the arts in general are increasingly complex and hard to understand. The field of explainable AI (XAI) seeks to make complex and opaque AI models such as neural networks more understandable to people. One approach to making generative AI models more understandable is to impose a small number of semantically meaningful attributes on generative AI models. This paper contributes a systematic examination of the impact that different combinations of variational auto-encoder models (measureVAE and adversarialVAE), configurations of the latent space in the AI model (from 4 to 256 latent dimensions), and training datasets (Irish folk, Turkish folk, classical, and pop) have on music generation performance when 2 or 4 meaningful musical attributes are imposed on the generative model. To date, there have been no systematic comparisons of such models at this level of combinatorial detail. Our findings show that measureVAE has better reconstruction performance than adversarialVAE, which has better musical attribute independence. Results demonstrate that measureVAE was able to generate music across music genres with interpretable musical dimensions of control, and performs best with low-complexity music such as pop and rock. We recommend that a 32- or 64-dimensional latent space is optimal for 4 regularised dimensions when using measureVAE to generate music across genres. Our results are the first detailed comparisons of configurations of state-of-the-art generative AI models for music and can be used to help select and configure AI models, musical features, and datasets for more understandable generation of music.
Funding: supported by the National Key Research and Development Program of China under grant number 2022YFB3903404; the National Natural Science Foundation of China under grant numbers 42325105 and 42071350; and LIESMARS Special Research Funding.
Abstract: High-resolution satellite images are becoming increasingly available for urban multi-temporal semantic understanding. However, few datasets can be used for land-use/land-cover (LULC) classification, binary change detection (BCD) and semantic change detection (SCD) simultaneously, because classification datasets always have one time phase and BCD datasets focus only on the changed location, ignoring the changed classes. Public SCD datasets are rare but much needed. To solve the above problems, a tri-temporal SCD dataset made up of Gaofen-2 (GF-2) remote sensing imagery (with 11 LULC classes and 60 change directions) was built in this study, namely, the Wuhan Urban Semantic Understanding (WUSU) dataset. Popular deep learning based methods for LULC classification, BCD and SCD are tested to verify the reliability of WUSU. A Siamese-based multi-task joint framework with a multi-task joint loss (MJ loss), named ChangeMJ, is proposed to restore object boundaries; it obtains the best results in LULC classification, BCD and SCD compared to the state-of-the-art (SOTA) methods. Finally, a large spatial-scale mapping of the Wuhan central urban area is carried out to verify that the WUSU dataset and the ChangeMJ framework have good application value.
Abstract: With the advancement of technology and the increase in user demands, gesture recognition plays a pivotal role in the field of human-computer interaction. Among various sensing devices, Time-of-Flight (ToF) sensors are widely applied due to their low cost. This paper explores the implementation of a human hand posture recognition system using ToF sensors and residual neural networks. Firstly, this paper reviews the typical applications of human hand recognition. Secondly, this paper designs a hand gesture recognition system using a ToF sensor, the VL53L5. Subsequently, data preprocessing is conducted, followed by training of the constructed residual neural network. Then, the recognition results are analyzed, indicating that gesture recognition based on the residual neural network achieves an accuracy of 98.5% in a 5-class classification scenario. Finally, the paper discusses existing issues and future research directions.
Funding: China Global Change Research Program, No. 2010CB950901; National Natural Science Foundation of China, Nos. 41271227 and 41001122.
Abstract: Land use/cover change is an important parameter in climate and ecological simulations. Although they have been widely used in the community, the SAGE and HYDE datasets, the two representative global historical land use datasets, have seldom been assessed for their accuracy at the regional scale. Here, we carried out such an assessment for the traditional cultivated region of China (TCRC) over the last 300 years, by comparing SAGE2010 and HYDE (v3.1) with the Chinese Historical Cropland Dataset (CHCD). The comparisons were performed at three spatial scales: the entire study area, provincial areas, and 60 km by 60 km grid cells. The results show that (1) the cropland area from SAGE2010 was much larger than that from CHCD; moreover, its growth rates of 0.51% from 1700 to 1950 and −0.34% after 1950 were also inconsistent with those from CHCD. (2) The HYDE (v3.1) dataset was closer to the CHCD dataset than the SAGE dataset over the entire study area. However, large biases could be detected at the provincial scale and the 60 km by 60 km grid cell scale: the percentages of grid cells having biases greater than 70% (<−70% or >70%) and 90% (<−90% or >90%) accounted for 56%−63% and 40%−45% of the total grid cells respectively, while those having biases ranging from −10% to 10% and from −30% to 30% accounted for only 5%−6% and 17% of the total grid cells respectively. (3) Using local historical archives to reconstruct historical datasets with high accuracy would be a valuable way to improve the accuracy of climate and ecological simulations.