By employing the unique phenological feature of winter wheat extracted from the peak before winter (PBW) and the advantages of moderate resolution imaging spectroradiometer (MODIS) data, with high temporal resolution and intermediate spatial resolution, a remote sensing-based model for mapping winter wheat on the North China Plain was built through integration with Landsat images and land-use data. First, a phenological window, the PBW, was drawn from time-series MODIS data. Next, feature extraction was performed on the PBW to reduce feature dimensionality and enhance its information content. Finally, a regression model was built to relate the phenological feature to the sample data. The information content of the PBW was evaluated and compared with that of the main peak (MP). The relative precision of the mapping reached up to 92% in comparison with the Landsat sample data, and ranged between 87% and 96% in comparison with the statistical data. These results are sufficient to satisfy the accuracy requirements for winter wheat mapping at a large scale. Moreover, the proposed method can obtain the distribution of winter wheat earlier than previous approaches. This study could shed light on the monitoring of winter wheat in China using its unique phenological feature.
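The regression step above is not specified in detail in the abstract; as a minimal, hedged illustration of relating a PBW-derived phenological feature to winter wheat area fractions, here is a sketch using scikit-learn (the feature values, fractions, and variable names are illustrative assumptions, not the paper's actual data):

```python
# Hedged sketch: regress winter wheat area fraction on a PBW-derived
# phenological feature. Data and names are illustrative placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
pbw_feature = rng.uniform(0.2, 0.8, size=(200, 1))   # PBW feature per pixel/cell
wheat_fraction = 1.1 * pbw_feature[:, 0] - 0.1 + rng.normal(0, 0.05, 200)
wheat_fraction = wheat_fraction.clip(0, 1)           # fractions live in [0, 1]

model = LinearRegression().fit(pbw_feature, wheat_fraction)
print("slope:", model.coef_[0], "intercept:", model.intercept_)
print("R^2:", model.score(pbw_feature, wheat_fraction))
```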
Clustering is used to gain an intuition of the structures in the data. Most current clustering algorithms produce a clustering structure even on data that possess no such structure; in these cases, the algorithms force a structure onto the data instead of discovering one. To avoid finding false structures in the relations of data, a novel clusterability assessment method called the density-based clusterability measure is proposed in this paper. It measures the prominence of the clustering structure in the data to evaluate whether a cluster analysis could produce a meaningful insight into the relationships in the data. This is especially useful for time-series data, since visualizing the structure in time-series data is hard. The performance of the clusterability measure is evaluated against several synthetic data sets and time-series data sets, which illustrates that the density-based clusterability measure can successfully indicate the clustering structure of time-series data.
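The paper's density-based measure is not detailed in this abstract. As a hedged stand-in that illustrates the general idea of clusterability assessment (values near 1 suggest clustering tendency, values near 0.5 suggest uniform, structure-free data), here is a Hopkins-statistic sketch, which is a related but different measure:

```python
# Hedged sketch: Hopkins statistic as a generic clusterability indicator.
# This is NOT the paper's density-based measure, only a related illustration.
import numpy as np
from scipy.spatial import cKDTree

def hopkins(X, n_samples=50, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    tree = cKDTree(X)
    # Distances from random points in the bounding box to their nearest datum.
    uniform = rng.uniform(X.min(axis=0), X.max(axis=0), size=(n_samples, d))
    u_dist, _ = tree.query(uniform, k=1)
    # Distances from sampled data points to their nearest *other* datum.
    sample = X[rng.choice(n, n_samples, replace=False)]
    w_dist, _ = tree.query(sample, k=2)
    w_dist = w_dist[:, 1]  # k=1 would return the point itself (distance 0)
    return u_dist.sum() / (u_dist.sum() + w_dist.sum())

clustered = np.vstack([np.random.randn(100, 2), np.random.randn(100, 2) + 6])
print(hopkins(clustered))  # close to 1: strong clustering tendency
```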
Underground coal fires are one of the most common and serious geohazards in most coal-producing countries in the world. Monitoring their spatio-temporal changes plays an important role in controlling and preventing the effects of coal fires and their environmental impact. In this study, the spatio-temporal changes of underground coal fires in the Khanh Hoa coal field (north-east Viet Nam) were analyzed using Landsat time-series data over the 2008-2016 period. Based on land surface temperatures retrieved from Landsat thermal data, thermal anomalies related to underground coal fires were identified using the MEDIAN+1.5×IQR (IQR: interquartile range) threshold technique. The locations of underground coal fires were validated against a coal fire map produced from field survey data and cross-validated using daytime ASTER thermal infrared imagery. Based on the fires extracted from seven Landsat thermal images, the spatio-temporal changes of underground coal fire areas were analyzed. The results showed that the thermal-anomalous zones were correlated with known coal fires. Cross-validation of coal fires using ASTER TIR data showed a high consistency of 79.3%. The largest coal fire area of 184.6 hectares was detected in 2010, followed by 2014 (181.1 hectares) and 2016 (178.5 hectares). Smaller coal fire areas of 133.6 and 152.5 hectares were extracted in 2011 and 2009, respectively. Underground coal fires were mainly detected in the northern and southern parts of the coal field and tended to spread towards its north-west.
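A minimal sketch of the MEDIAN+1.5×IQR thresholding step described above, assuming land surface temperatures have already been retrieved into a NumPy array (the scene and variable names are illustrative):

```python
# Hedged sketch: flag thermal anomalies with a MEDIAN + 1.5*IQR threshold.
import numpy as np

lst = np.random.normal(25.0, 3.0, size=(512, 512))   # placeholder LST scene (deg C)
lst[100:110, 200:210] += 15.0                        # synthetic hot spot

q1, median, q3 = np.percentile(lst, [25, 50, 75])
threshold = median + 1.5 * (q3 - q1)                 # MEDIAN + 1.5*IQR
anomaly_mask = lst > threshold

print(f"threshold = {threshold:.2f} deg C, "
      f"{anomaly_mask.sum()} anomalous pixels")
```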
Based on 16-day-composite MODIS (moderate resolution imaging spectroradiometer) NDVI (normalized difference vegetation index) time-series data for 2004, vegetation in the North Tibet Plateau was classified, and seasonal variations of pixels selected from different vegetation types were analyzed. The Savitzky-Golay filtering algorithm was applied to filter the MODIS-NDVI time-series data, so that the processed time-series curves reflect the real trend of vegetation growth. The NDVI time-series curves of coniferous forest, high-cold meadow, high-cold meadow steppe and high-cold steppe all show a single peak during vegetation growth, with the maximum occurring in August. A decision-tree classification model was established according to the NDVI time-series and land surface temperature data, and vegetation was then classified with this model based on the NDVI time-series curves. An accuracy test illustrates that the classification results are of high accuracy and credibility, and that the model is useful for studying climate variation and estimating vegetation production at regional and even global scales.
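A minimal sketch of the Savitzky-Golay smoothing step applied to a 16-day-composite NDVI year, using SciPy; the window length and polynomial order here are illustrative choices, not the paper's parameters:

```python
# Hedged sketch: smooth a 16-day-composite NDVI year (23 points) with
# a Savitzky-Golay filter. Parameters are illustrative, not the paper's.
import numpy as np
from scipy.signal import savgol_filter

t = np.arange(23)                                          # 23 composites per year
ndvi = 0.25 + 0.35 * np.exp(-0.5 * ((t - 14) / 3.5) ** 2)  # mono-peak shape
ndvi_noisy = ndvi + np.random.normal(0, 0.03, t.size)      # cloud/noise effects

ndvi_smooth = savgol_filter(ndvi_noisy, window_length=7, polyorder=3)
print("peak at composite", ndvi_smooth.argmax())           # ~composite 14 (August)
```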
In order to effectively perform emotion recognition on spontaneous, non-prototypical and unsegmented speech, and thereby create more natural human-machine interaction, a novel speech emotion recognition algorithm based on the combination of the emotional data field (EDF) and the ant colony search (ACS) strategy, called the EDF-ACS algorithm, is proposed. More specifically, the interrelationships among the turn-based acoustic feature vectors of different labels are established using the potential function in the EDF. To perform spontaneous speech emotion recognition, an artificial ant colony is used to mimic the turn-based acoustic feature vectors. Then, the canonical ACS strategy is used to investigate the movement direction of each artificial ant in the EDF, which is regarded as the emotional label of the corresponding turn-based acoustic feature vector. The proposed EDF-ACS algorithm is evaluated on the continuous audio/visual emotion challenge (AVEC) 2012 dataset, which contains spontaneous, non-prototypical and unsegmented speech emotion data. The experimental results show that the proposed EDF-ACS algorithm outperforms the existing state-of-the-art algorithm in turn-based speech emotion recognition.
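The abstract does not give the EDF potential function explicitly. Data field methods commonly use a Gaussian-style potential, so the following is a hedged sketch of computing such a potential over acoustic feature vectors; the kernel form, sigma, and feature data are assumptions, not the paper's definition:

```python
# Hedged sketch: a Gaussian-style data-field potential, a common choice in
# data field methods; the paper's exact EDF potential may differ.
import numpy as np

def potential(x, data, sigma=1.0):
    """Potential at point x induced by all feature vectors in `data`."""
    dists = np.linalg.norm(data - x, axis=1)
    return np.exp(-(dists / sigma) ** 2).sum()

features = np.random.randn(100, 8)          # placeholder acoustic feature vectors
print(potential(features[0], features))     # high potential near dense regions
```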
Four levels of the data from the search coil magnetometer (SCM) onboard the China Seismo-Electromagnetic Satellite (CSES) are defined and described. The data at all levels contain three components of the waveform and/or spectrum of the induced magnetic field around the orbit in the frequency range of 10 Hz to 20 kHz; these are divided into an ultra-low-frequency band (ULF, 10–200 Hz), an extremely low frequency band (ELF, 200–2200 Hz), and a very low frequency band (VLF, 1.8–20 kHz). Examples of data products for Level-2, Level-3, and Level-4 are presented. The initial results obtained in the commissioning test phase demonstrated that the SCM was in normal operational status and that the data are of high enough quality to reliably capture most space weather events related to low-frequency geomagnetic disturbances.
Influenza is an infectious disease that spreads quickly and widely, and its outbreaks have brought huge losses to society. In this paper, four major categories of flu keywords were set: “prevention phase”, “symptom phase”, “treatment phase”, and “commonly-used phrase”. A Python web crawler was used to obtain relevant influenza data from the National Influenza Center's weekly influenza surveillance reports and the Baidu Index. Support vector regression (SVR), least absolute shrinkage and selection operator (LASSO), and convolutional neural network (CNN) prediction models were established through machine learning, taking into account the seasonal characteristics of influenza, and a time-series model (ARMA) was also established. The results show that it is feasible to predict influenza from web search data: machine learning achieves a certain forecasting effect and will have reference value for influenza prediction in the future, while the ARMA(3,0) model predicts better results and generalizes more strongly. Finally, the limitations of this work and future research directions are given.
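As a hedged illustration of the modeling step (not the paper's actual features or data), the sketch below fits SVR, LASSO, and an ARMA(3,0) model to a toy weekly case-count series with a search-index predictor:

```python
# Hedged sketch: flu-count prediction from a search index, illustrating the
# SVR, LASSO, and ARMA(3,0) models named above. Data are synthetic.
import numpy as np
from sklearn.svm import SVR
from sklearn.linear_model import Lasso
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(1)
weeks = np.arange(156)
season = 50 + 30 * np.sin(2 * np.pi * weeks / 52)       # seasonal signal
search_index = season + rng.normal(0, 5, weeks.size)    # proxy predictor
cases = 0.8 * search_index + rng.normal(0, 4, weeks.size)

X, y = search_index[:-4].reshape(-1, 1), cases[:-4]     # hold out 4 weeks
print("SVR R^2:", SVR(C=10.0).fit(X, y).score(X, y))
print("LASSO R^2:", Lasso(alpha=0.1).fit(X, y).score(X, y))

arma = ARIMA(cases[:-4], order=(3, 0, 0)).fit()         # ARMA(3,0) on counts
print("4-week ARMA forecast:", arma.forecast(steps=4))
```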
Multi-level searching is called drill-down search. At present, no drill-down search feature is available in existing search engines like Google, Yahoo, Bing and Baidu. Drill-down search is very useful for end users to find the exact search results among huge paginated search results. A deeper level of drill-down search with category-based search leads to the most accurate search results, but it increases the number and size of the files in the file system. The purpose of this manuscript is to implement a big data storage reduction binary file system model for a category-based drill-down search engine that offers fast multi-level filtering capability. The basic methodology of the proposed model stores the search engine data in a binary file system model. To verify the effectiveness of the proposed file system model, 5 million unique keyword data were stored in a binary file, and the proposed file system was analyzed for efficiency. Experimental results based on real data show the speed and superiority of our storage model: the file system expansion ratio is constant, disk storage space is reduced by up to 30% compared with a conventional database/file system, and search performance increases for all levels of search. The paper starts with a short introduction to drill-down search, followed by a detailed discussion of the important technologies used to implement the big data storage reduction system.
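The paper's binary layout is not given in this abstract; as a hedged sketch of the general technique (fixed-width binary records that can be binary-searched without a database), here is an illustrative example whose record format is an assumption:

```python
# Hedged sketch: fixed-width binary keyword records, searchable via bisection.
# The record layout is an illustrative assumption, not the paper's format.
import struct

RECORD = struct.Struct("32s I")          # 32-byte keyword + 4-byte postings id

def write_index(path, keywords):
    with open(path, "wb") as f:
        for i, kw in enumerate(sorted(keywords)):
            f.write(RECORD.pack(kw.encode("utf-8")[:32].ljust(32, b"\0"), i))

def lookup(path, keyword):
    target = keyword.encode("utf-8")[:32].ljust(32, b"\0")
    with open(path, "rb") as f:
        f.seek(0, 2)
        lo, hi = 0, f.tell() // RECORD.size
        while lo < hi:                   # classic binary search over records
            mid = (lo + hi) // 2
            f.seek(mid * RECORD.size)
            kw, pid = RECORD.unpack(f.read(RECORD.size))
            if kw == target:
                return pid
            lo, hi = (mid + 1, hi) if kw < target else (lo, mid)
    return None

write_index("index.bin", ["apple", "banana", "cherry"])
print(lookup("index.bin", "banana"))     # -> record id of "banana"
```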
The study investigated user experience, display complexity, display type (tables versus graphs), and task difficulty as variables affecting the user's ability to navigate through complex visual data. A total of 64 participants, 39 undergraduate students (novice users) and 25 graduate students (intermediate-level users), took part in the study. The experiment used a 2 × 2 × 2 × 3 mixed design with two between-subject variables (display complexity, user experience) and two within-subject variables (display format, question difficulty). The results indicated that response time was superior for graphs (relative to tables), especially when the questions were difficult. The intermediate users seemed to adopt more extensive search strategies than novices, as revealed by an analysis of the number of changes they made to the display prior to answering questions. It was concluded that designers of data displays should consider the (a) type of display, (b) difficulty of the task, and (c) expertise level of the user to obtain optimal levels of performance.
This paper improves a keyword-based search strategy over encrypted cloud data and presents a method that uses different strategies on the client and the server to improve search efficiency. The client uses Chinese and English to construct synonyms for the keywords, establishes fuzzy-syllable word and synonym sets for the keywords, and implements the fuzzy search strategy over keyword-based encrypted cloud data. The server side, by analyzing the user's query request, provides keywords for users to choose from, and topic words and secondary words are picked out. The system matches topic words against historical queries in time order, and the new query result for the request is then obtained directly. The analysis of the simulation experiment shows that the fuzzy search strategy can make better use of historical results on the basis of privacy protection to realize efficient data search, saving search time and improving search efficiency.
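The abstract does not spell out how the fuzzy keyword sets are built. A common construction in fuzzy searchable-encryption work is the wildcard-based edit-distance-1 set, sketched below as an assumption rather than the paper's exact method:

```python
# Hedged sketch: wildcard-based fuzzy keyword set for edit distance 1,
# a common construction in fuzzy searchable encryption (an assumption here).
def fuzzy_set_ed1(word):
    variants = {word}
    for i in range(len(word)):
        variants.add(word[:i] + "*" + word[i + 1:])   # substitution slot
    for i in range(len(word) + 1):
        variants.add(word[:i] + "*" + word[i:])       # insertion slot
    return variants

# The same expansion applied to a query lets the server match misspellings
# by set intersection on (typically hashed/trapdoored) variants.
print(sorted(fuzzy_set_ed1("cat")))
```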
Problems exist in similarity measurement and index tree construction that affect the performance of nearest neighbor search over high-dimensional data. The equidistance problem is solved by using the NPsim function to calculate similarity, and a sequential NPsim matrix is built to improve indexing performance. Building on these innovations, a nearest neighbor search algorithm for high-dimensional data based on the sequential NPsim matrix is proposed and compared with nearest neighbor search algorithms based on the KD-tree and the SR-tree on the Munsell spectral data set. Experimental results show that the similarity of the proposed algorithm is better than that of the other algorithms, and its search speed is more than a thousand times theirs. In addition, the slow construction of the sequential NPsim matrix can be accelerated by parallel computing.
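NPsim itself is not defined in this abstract, so it is not reproduced here; for context, the following is a hedged sketch of the KD-tree baseline that the proposed method is compared against, using SciPy:

```python
# Hedged sketch: the KD-tree nearest-neighbor baseline referenced above
# (NPsim is not public in this abstract, so it is not reproduced).
import numpy as np
from scipy.spatial import cKDTree

data = np.random.rand(10_000, 31)        # e.g., 31-band spectral vectors
tree = cKDTree(data)

query = np.random.rand(31)
dist, idx = tree.query(query, k=5)       # 5 nearest neighbors by Euclidean distance
print(idx, dist)
```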
Many search-based algorithms have been successfully applied in several software engineering activities. Genetic algorithms (GAs) are the most used by scholars in scientific domains to solve software testing problems; they imitate the theory of natural selection and evolution. The harmony search algorithm (HSA) is one of the most recent search algorithms; it imitates the behavior of a musician finding the best harmony. Scholars have estimated the similarities and differences between genetic algorithms and the harmony search algorithm in diverse research domains. The test data generation process represents a critical task in software validation. Unfortunately, no prior work compares the performance of genetic algorithms and the harmony search algorithm in the test data generation process. This paper studies the similarities and differences between genetic algorithms and the harmony search algorithm based on the ability and speed of finding the required test data. The current research performs an empirical comparison of the HSA and the GAs, and the significance of the results is then estimated using the t-test. The study investigates the efficiency of the harmony search algorithm and the genetic algorithms according to (1) the time performance, (2) the significance of the generated test data, and (3) the adequacy of the generated test data to satisfy a given testing criterion. The results showed that the harmony search algorithm is significantly faster than the genetic algorithms, because the t-test showed that the p-value of the time values is 0.026 < α (α is the significance level = 0.05 at the 95% confidence level). In contrast, there is no significant difference between the two algorithms in generating adequate test data, because the t-test showed that the p-value of the fitness values is 0.25 > α.
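A minimal sketch of the significance test used above, assuming two independent samples of run times collected from the HSA and the GA (the timing numbers are placeholders):

```python
# Hedged sketch: two-sample t-test on run times, mirroring the comparison
# described above. The timing samples here are synthetic placeholders.
import numpy as np
from scipy.stats import ttest_ind

hsa_times = np.random.normal(1.8, 0.3, 30)   # seconds per run (placeholder)
ga_times = np.random.normal(2.2, 0.3, 30)

t_stat, p_value = ttest_ind(hsa_times, ga_times)
alpha = 0.05
print(f"p = {p_value:.3f};",
      "significant difference" if p_value < alpha else "no significant difference")
```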
Digital broadcasting is a novel paradigm for the next generation of broadcasting. Its goal is to provide not only better picture quality but also a variety of services that are impossible in traditional airwave broadcasting. One of the important factors in this new broadcasting environment is interoperability among broadcasting applications, since the environment is distributed. The broadcasting metadata therefore becomes increasingly important, and one of the metadata standards for digital broadcasting is TV-Anytime metadata. TV-Anytime metadata is defined using XML Schema, so its instances are XML data. In order to achieve interoperability, a standard query language is also required, and XQuery is a natural choice. There has been some research into handling broadcasting metadata. In our previous study, we proposed a method for efficiently managing broadcasting metadata at a service provider. However, the environment of a set-top box for digital broadcasting is constrained, with low cost and a low-end configuration, so applying general metadata management approaches to the set-top box requires some care. This paper proposes a method for efficiently managing broadcasting metadata on the set-top box, together with a prototype metadata management system for evaluating our method. Our system consists of a storage engine to store the metadata and an XQuery engine to search the stored metadata, and it uses a special index for storing and searching. The two engines are designed independently of the hardware platform, so they can be used in any low-cost application that manages broadcasting metadata.
This paper describes the nearest neighbor (NN) search algorithm on the GBD (generalized BD) tree. The GBD tree is a spatial data structure suitable for two- or three-dimensional data and has good performance characteristics with respect to dynamic data environments. In GIS and CAD systems, the R-tree and its successors have been used, and NN search algorithms have been proposed in an attempt to obtain good performance from the R-tree. The GBD tree, on the other hand, is superior to the R-tree with respect to exact match retrieval, because the GBD tree holds auxiliary data that uniquely determines the position of an object in the structure. The proposed NN search algorithm depends on this property of the GBD tree. The NN search algorithm on the GBD tree was studied, and its performance was evaluated through experiments.
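The GBD tree internals are not given in this abstract. As a hedged illustration of the classic branch-and-bound NN search pattern that such tree-based algorithms follow (a best-first traversal with a priority queue ordered by minimum possible distance), here is a generic sketch over axis-aligned bounding boxes, not the GBD tree's specific structure:

```python
# Hedged sketch: generic best-first NN search over a hierarchical bounding-box
# tree. This illustrates the standard pattern, not the GBD tree's specifics.
import heapq, itertools, math

def mindist(point, box):
    """Smallest possible distance from `point` to any point inside `box`."""
    return math.sqrt(sum(max(lo - p, 0, p - hi) ** 2
                         for p, (lo, hi) in zip(point, box)))

def nn_search(root, point):
    counter = itertools.count()                  # tie-breaker for the heap
    heap = [(0.0, next(counter), root)]
    best, best_d = None, float("inf")
    while heap:
        d, _, node = heapq.heappop(heap)
        if d >= best_d:                          # no closer object can remain
            break
        if node.get("leaf"):
            for obj, pos in node["objects"]:
                od = math.dist(point, pos)
                if od < best_d:
                    best, best_d = obj, od
        else:
            for child in node["children"]:
                heapq.heappush(heap, (mindist(point, child["box"]),
                                      next(counter), child))
    return best, best_d

leaf = {"leaf": True, "box": [(0, 1), (0, 1)],
        "objects": [("a", (0.2, 0.3)), ("b", (0.9, 0.9))]}
root = {"leaf": False, "children": [leaf]}
print(nn_search(root, (0.25, 0.25)))             # -> ('a', ~0.07)
```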
This study compares websites using live data and search engine optimization (SEO). Search engine optimization is a series of steps that can help a website rank highly in search engine results. There are two types of websites: static and dynamic. Static websites require programming expertise compatible with SEO, whereas dynamic websites can utilize readily available plugins/modules. The fundamental issue for all website holders is the low level of page rank, congestion, utilization, and exposure of the website on the search engine. Here, the authors have studied the live data of four websites, since real-time data indicate how an SEO strategy may be applied to website page rank, page-difficulty removal, brand query, and so on. It is also necessary to choose relevant keywords for any website: the right keyword can help increase the brand query while lowering the page difficulty both on and off the page. To calculate off-page SEO, on-page SEO, and SEO difficulty, the authors examined live data for four well-known Indian university and institute websites: www.caluniv.ac.in, www.jnu.ac.in, www.iima.ac.in, and www.iitb.ac.in. The off-page SEO of www.caluniv.ac.in was found to be lower than that of www.jnu.ac.in, www.iima.ac.in, and www.iitb.ac.in by 9%, 7%, and 7%, respectively, while its on-page SEO is 4%, 1%, and 1% higher, respectively. Every university has maintained its own brand query. Additionally, www.caluniv.ac.in has slightly lower SEO difficulty than the other websites. The final computed results have been displayed and compared.
Unlike consumers in malls or supermarkets, online consumers are “intangible”, and their purchasing behavior is affected by multiple factors, including product pricing, promotions and discounts, the quality of products and brands, and the platforms on which they search for the product. In this research, I study the relationship between product sales and consumer characteristics, the relationship between product sales and product qualities, demand curve analysis, and the search friction effect for different platforms. I utilize data from a randomized field experiment involving more than 400 thousand customers and 30 thousand products on JD.com, one of the world's largest online retailing platforms. The research has two focuses: 1) how different consumer characteristics affect sales; 2) how to set prices and handle search friction for different channels. I find that JD Plus membership, education level and age have no significant relationship with product sales, while a higher user level leads to higher sales. Sales are highly skewed: products with very high sales volumes make up only a small percentage of the total. Consumers living in more industrialized cities have more purchasing power, and women and singles spend more. Also, the better a product performs, the more it sells, and moderate pricing can increase product sales. Based on the analysis of search volume in different channels, it is suggested that sellers focus on app sales. Knowing these results, producers can adjust the target consumers of different products and run targeted advertisements to maximize sales. An appropriate price for a product is also crucial to a seller. Finally, knowing the search friction of different channels can help producers rearrange platform layouts so that search friction is reduced and more potential deals may be made.
With the development of information technology, the online retrieval of remote electronic data has become an important method for investigative agencies to collect evidence. In the current normative documents, the online retrieval of electronic data is positioned as a new type of arbitrary investigative measure. However, a study of its actual operation finds that the online retrieval of electronic data does not fully comply with the characteristics of arbitrary investigative measures. The root causes are an inaccurately defined nature due to analogy errors, an emphasis on the authenticity of electronic data at the cost of rights protection, the insufficient effectiveness of normative documents in breaking through the boundaries of law, and the superficial inconsistency found in mechanical comparison with the nature of existing investigative measures. The nature of electronic data retrieved online should be defined according to the circumstances. The retrieval of electronic data disclosed on the Internet is an arbitrary investigative measure, and following procedural specifications should be sufficient. When investigators conceal their true identities and enter the cyberspace of the suspected crime through a registered account to extract dynamic electronic data on criminal activities, this is essentially a covert investigation in cyberspace, and they should follow the normative requirements for covert investigations. The retrieval of dynamic electronic data from private spaces is a technical investigative measure and should be implemented in accordance with technical investigative procedures. The retrieval of remote “non-public electronic data involving privacy” is a mandatory investigative measure and is essentially a search of virtual space; therefore, its procedural specifications should be set in accordance with the standards for searches.
The architecture and working principle of a coordinated search and rescue system of unmanned/manned aircraft, which is composed of unmanned aircraft and manned aircraft that cooperate with each other to complete search and rescue tasks, are first introduced. Secondly, a threat assessment method based on meteorological data is proposed: potential meteorological threats, such as storms and rainfall, can be predicted by collecting and analyzing meteorological data. Finally, an experiment was carried out to evaluate the performance of the proposed method in different scenarios. The experimental results show that the coordinated search and rescue system of unmanned/manned aircraft can effectively assess meteorological threats and provide accurate search and rescue guidance.