Influenza is an infectious disease that spreads quickly and widely, and influenza outbreaks have caused huge losses to society. In this paper, four major categories of flu keywords were defined: “prevention phase”, “symptom phase”, “treatment phase”, and “commonly-used phrase”. A Python web crawler was used to obtain influenza data from the National Influenza Center’s weekly influenza surveillance reports and from Baidu Index. Support vector regression (SVR), least absolute shrinkage and selection operator (LASSO), and convolutional neural network (CNN) prediction models were built through machine learning, and, to account for the seasonal characteristics of influenza, a time series model (ARMA) was also established. The results show that it is feasible to predict influenza from web search data. Machine learning shows some forecasting power in this setting and may serve as a reference for future influenza prediction. The ARMA(3,0) model gives better predictions and generalizes better. Finally, the limitations of this study and directions for future research are discussed.
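As a reading aid for the ARMA(3,0) model named above: an ARMA(p, q) model with q = 0 has no moving-average terms, so ARMA(3,0) is an autoregressive model of order 3, x_t = c + phi_1*x_(t-1) + phi_2*x_(t-2) + phi_3*x_(t-3) + e_t. The sketch below computes a one-step forecast from that formula. The function name, the constant, the coefficients, and the sample series are all hypothetical values chosen for illustration; the abstract does not report the fitted parameters.

```python
from typing import Sequence


def ar3_one_step_forecast(
    history: Sequence[float], const: float, phi: Sequence[float]
) -> float:
    """One-step-ahead forecast of an AR(3) process, i.e. ARMA(3,0).

    history: past observations, oldest first.
    const:   intercept c of the model (hypothetical here).
    phi:     the three autoregressive coefficients (hypothetical here).
    """
    if len(phi) != 3:
        raise ValueError("an AR(3) model needs exactly three coefficients")
    if len(history) < 3:
        raise ValueError("need at least three past observations")
    # The noise term e_t has zero mean, so it drops out of the forecast.
    lag1, lag2, lag3 = history[-1], history[-2], history[-3]
    return const + phi[0] * lag1 + phi[1] * lag2 + phi[2] * lag3


# Made-up weekly values and coefficients, not data from the paper.
forecast = ar3_one_step_forecast([2.0, 4.0, 6.0], const=1.0, phi=(0.5, 0.25, 0.125))
print(forecast)  # 1.0 + 0.5*6.0 + 0.25*4.0 + 0.125*2.0 = 5.25
```

In practice the coefficients would be estimated from the surveillance series with a statistics package rather than set by hand.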
Four levels of data from the search coil magnetometer (SCM) onboard the China Seismo-Electromagnetic Satellite (CSES) are defined and described. The data at every level contain three components of the waveform and/or spectrum of the induced magnetic field around the orbit in the frequency range of 10 Hz to 20 kHz; these are divided into an ultra-low-frequency band (ULF, 10–200 Hz), an extremely low frequency band (ELF, 200–2200 Hz), and a very low frequency band (VLF, 1.8–20 kHz). Examples of data products for Level-2, Level-3, and Level-4 are presented. The initial results obtained in the commission test phase demonstrated that the SCM was in a normal operational status and that the data are of high enough quality to reliably capture most space weather events related to low-frequency geomagnetic disturbances.
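The band limits quoted above can be written down as a small lookup, which also makes visible that the stated ELF and VLF ranges overlap between 1.8 kHz and 2.2 kHz. This is only an illustrative sketch: the names `SCM_BANDS` and `scm_bands_for` are hypothetical, and treating both band edges as inclusive is an assumption, since the abstract does not say how edge frequencies are assigned.

```python
from typing import List

# Band edges in Hz, as stated in the abstract.
SCM_BANDS = {
    "ULF": (10.0, 200.0),
    "ELF": (200.0, 2200.0),
    "VLF": (1800.0, 20000.0),
}


def scm_bands_for(frequency_hz: float) -> List[str]:
    """Return every band whose stated range contains the frequency.

    Edges are treated as inclusive (an assumption), so a frequency on a
    shared edge or in the ELF/VLF overlap is reported in both bands.
    """
    return [
        name
        for name, (low, high) in SCM_BANDS.items()
        if low <= frequency_hz <= high
    ]


print(scm_bands_for(50.0))     # ['ULF']
print(scm_bands_for(2000.0))   # ['ELF', 'VLF'] (inside the 1.8-2.2 kHz overlap)
print(scm_bands_for(5.0))      # [] (below the 10 Hz lower limit)
```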
This paper improves a keyword-based search strategy over encrypted cloud data and presents a method that applies different strategies on the client and the server to improve search efficiency. The client builds synonym sets for keywords in both Chinese and English, establishes the sets of fuzzy-syllable words and synonyms for each keyword, and implements a keyword-based fuzzy search strategy over the encrypted cloud data. The server analyzes the user’s query request, offers keywords for the user to choose from, and picks out topic words and secondary words. The system then matches the topic words against historical queries in time order, so that the result of the new query can be obtained directly. Analysis of the simulation experiment shows that, while protecting privacy, the fuzzy search strategy makes better use of historical results to achieve efficient data search, reducing search time and improving search efficiency.
The study investigated user experience, display complexity, display type (tables versus graphs), and task difficulty as variables affecting the user’s ability to navigate through complex visual data. A total of 64 participants, 39 undergraduate students (novice users) and 25 graduate students (intermediate-level users), took part in the study. The experiment used a 2 × 2 × 2 × 3 mixed design with two between-subject variables (display complexity, user experience) and two within-subject variables (display format, question difficulty). The results indicated that response time was superior for graphs (relative to tables), especially when the questions were difficult. The intermediate users seemed to adopt more extensive search strategies than novices, as revealed by an analysis of the number of changes they made to the display prior to answering questions. It was concluded that designers of data displays should consider (a) the type of display, (b) the difficulty of the task, and (c) the expertise level of the user to obtain optimal levels of performance.
This paper describes a nearest neighbor (NN) search algorithm on the GBD (generalized BD) tree. The GBD tree is a spatial data structure suitable for two- or three-dimensional data and has good performance characteristics in dynamic data environments. In GIS and CAD systems, the R-tree and its successors have been used, and an NN search algorithm has also been proposed for the R-tree in an attempt to obtain good performance from it. The GBD tree, on the other hand, is superior to the R-tree with respect to exact match retrieval, because it holds auxiliary data that uniquely determines the position of an object in the structure. The proposed NN search algorithm depends on this property of the GBD tree. The NN search algorithm on the GBD tree was studied and its performance was evaluated through experiments.
Digital broadcasting is a novel paradigm for next-generation broadcasting. Its goal is to provide not only better picture quality but also a variety of services that are impossible in traditional airwave broadcasting. Because the environment is distributed, one of the important factors in this new broadcasting environment is interoperability among broadcasting applications. Broadcasting metadata therefore becomes increasingly important, and one of the metadata standards for digital broadcasting is TV-Anytime metadata. TV-Anytime metadata is defined using XML Schema, so its instances are XML data. To achieve interoperability, a standard query language is also required, and XQuery is a natural choice. Several studies have dealt with broadcasting metadata. In our previous study, we proposed a method for efficiently managing broadcasting metadata at a service provider. However, the environment of a Set-Top Box for digital broadcasting is limited (low-cost and low-setting), so some care is needed when applying general metadata management approaches to the Set-Top Box. This paper proposes a method for efficiently managing broadcasting metadata on the Set-Top Box and a prototype metadata management system for evaluating the method. The system consists of a storage engine that stores the metadata and an XQuery engine that searches the stored metadata, and it uses a special index for storing and searching. The two engines are designed to be independent of the hardware platform, so they can be used in any low-cost application that manages broadcasting metadata.
Unlike consumers in malls or supermarkets, online consumers are “intangible”, and their purchasing behavior is affected by multiple factors, including product pricing, promotions and discounts, the quality of products and brands, and the platforms on which they search for products. In this research, I study the relationship between product sales and consumer characteristics, the relationship between product sales and product quality, demand curves, and the search friction effect on different platforms. I use data from a randomized field experiment involving more than 400 thousand customers and 30 thousand products on JD.com, one of the world’s largest online retailing platforms. The research has two focuses: 1) how different consumer characteristics affect sales; 2) how to set prices, and the possible search friction in different channels. I find that JD Plus membership, education level, and age have no significant relationship with product sales, while a higher user level leads to higher sales. Sales are highly skewed, with products that sell in very high numbers making up only a small percentage of the total. Consumers living in more industrialized cities have more purchasing power. Women and single consumers spend more. In addition, the better a product performs, the more it sells, and moderate pricing can increase product sales. The results on search volume in different channels suggest that it is better to focus on app sales. With these results, producers can adjust the target consumers for different products and run targeted advertisements to maximize sales. An appropriate price for a product is also crucial to a seller. Finally, knowing the search friction of different channels can help producers rearrange the platform layout so that search friction is reduced and more potential deals are made.
Funding: supported by the State Key R&D Project (Grant No. 2016YFE0122200), the Civil Aerospace Scientific Research Project “Data calibration and validation for CSES,” and the Central-Level Public Welfare Research Projects of the Institute of Crustal Dynamics, China Earthquake Administration (Grant No. ZDJ2017-21).