The composition of base oils affects the performance of lubricants made from them. This paper proposes a hybrid model based on gradient-boosted decision trees (GBDT) to analyze the effect of different ratios of the KN4010, PAO40, and PriEco3000 components in a composite base-oil system on lubricant performance. The study was conducted under small laboratory-sample conditions, and a data-expansion method using the Gaussian copula function was proposed to improve the prediction ability of the hybrid model. The study also compared four optimization algorithms, the slime mould algorithm (SMA), genetic algorithm (GA), whale optimization algorithm (WOA), and seagull optimization algorithm (SOA), for predicting the lubricant's kinematic viscosity at 40 °C, kinematic viscosity at 100 °C, viscosity index, and oxidation induction time. The results showed that the Gaussian copula data-expansion method improved the prediction ability of the hybrid model in the small-sample case. The SOA-GBDT hybrid model converged fastest and gave the best predictions, with coefficients of determination (R²) for the four lubricant indicators reaching 0.98, 0.99, 0.96 and 0.96, respectively. Thus, the model can significantly reduce prediction error and has good predictive ability.
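The Gaussian-copula expansion described here can be sketched compactly: map each feature to normal scores through its empirical CDF, sample from a multivariate normal with the estimated correlation, and back-transform through the empirical quantile functions. This is a minimal illustration of the general technique, not the paper's implementation; the function name and parameters are my own.

```python
import numpy as np
from scipy.stats import norm

def gaussian_copula_expand(data, n_new, seed=0):
    """Draw synthetic rows that preserve each column's marginal
    distribution and the rank correlation between columns."""
    rng = np.random.default_rng(seed)
    n, d = data.shape
    # map each column to normal scores through its empirical CDF
    ranks = data.argsort(axis=0).argsort(axis=0) + 1
    z = norm.ppf(ranks / (n + 1))          # (n+1) keeps u strictly in (0, 1)
    # estimate the Gaussian copula correlation and sample from it
    corr = np.corrcoef(z, rowvar=False)
    z_new = rng.multivariate_normal(np.zeros(d), corr, size=n_new)
    u_new = norm.cdf(z_new)
    # back-transform through each column's empirical quantile function
    return np.column_stack(
        [np.quantile(data[:, j], u_new[:, j]) for j in range(d)]
    )
```

The expanded sample can then be fed to any regressor (e.g. a GBDT) in place of the small original sample.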
The COVID-19 pandemic has had a significant impact on the global economy and health. While the pandemic continues to cause casualties in the millions, many countries have gone under lockdown. During this period, people have had to stay indoors and have become more attached to social networks, expressing their emotions and sympathy via these online platforms. Thus, popular social media (Twitter and Facebook) have become rich sources of information for opinion mining and sentiment analysis on COVID-19-related issues. We used aspect-based sentiment analysis to anticipate the polarity of public opinion on different aspects on Twitter during the lockdown and stepwise unlock phases. The goal of this study is to gauge how Indians felt about the lockdown initiative taken by the Government of India to stop the spread of the coronavirus. India-specific COVID-19 tweets were annotated to analyse public sentiment. A deep learning model is proposed to classify the Twitter data set, achieving accuracies of 82.35% on the Lockdown and 83.33% on the Unlock data sets. The suggested method outperforms many contemporary approaches (long short-term memory, bidirectional long short-term memory, gated recurrent units, etc.). This study highlights public sentiment on the lockdown and stepwise unlocks imposed by the Indian Government across various aspects during the corona outbreak.
Cloud storage is widely used by large companies to store vast amounts of data and files, offering flexibility, financial savings, and security. However, information shoplifting poses significant threats, potentially leading to poor performance and privacy breaches. Blockchain-based cognitive computing can help protect and maintain information security and privacy in cloud platforms, ensuring that businesses can focus on development. To ensure data security in cloud platforms, this research proposes a blockchain-based Hybridized Data Driven Cognitive Computing (HD2C) model. The HD2C framework addresses breaches of the private information of mixed Internet of Things (IoT) participants in the cloud. HD2C is developed by combining Federated Learning (FL) with a blockchain consensus algorithm that connects smart contracts with Proof of Authority. The "data island" problem can be solved by FL's emphasis on privacy and fast processing, while the blockchain provides a decentralized incentive structure that is resistant to poisoning. FL with blockchain allows quick consensus through smart member selection and verification. The HD2C paradigm significantly improves the computational efficiency of intelligent manufacturing. Extensive analysis of IIoT datasets confirms HD2C's superiority. Compared with other consensus algorithms, the foundational cost of blockchain PoA is significant. The accuracy and memory-utilization results indicate the overall benefits of the system: compared with the values 0.004 and 0.04, the value 0.4 achieves good accuracy, and the number of transactions per second has minimal impact on memory requirements. The findings of this study resulted in a brand-new blockchain-based IIoT framework.
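The FL half of such a design usually reduces to a weighted aggregation of client models. As a rough, generic sketch (this is the standard FedAvg rule, not necessarily HD2C's exact aggregator):

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """FedAvg aggregation: average the clients' parameter vectors,
    weighting each client by its local dataset size."""
    coef = np.asarray(client_sizes, dtype=float)
    coef /= coef.sum()                     # normalized per-client weights
    return coef @ np.stack(client_weights)
```

In a blockchain-backed variant, each aggregation round would be validated and recorded by the Proof-of-Authority consensus before the global model is redistributed.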
1 Introduction Information technology has been playing an ever-increasing role in geoscience. Sophisticated database platforms are essential for geological data storage, analysis, and the exchange of Big Data (Feblowitz, 2013; Zhang et al., 2016; Teng et al., 2016; Tian and Li, 2018). The United States has built an information-sharing platform for state-owned scientific data as a national strategy.
This paper first puts forward a case-based system framework built on data mining techniques. It then examines the possibility of using neural networks as the retrieval method in such a case-based system. Within this system, we propose data mining algorithms to discover case knowledge, along with other supporting algorithms.
Since web-based GIS processes large volumes of spatial geographic information over the Internet, the efficiency of spatial data query processing and transmission must be improved. This paper presents two efficient methods for this purpose: division transmission and progressive transmission. In the division transmission method, a map is divided into several parts, called "tiles", and only the tiles a client requests are transmitted. In the progressive transmission method, a map is split into several phase views based on the significance of vertices; the server produces a target object and transmits it progressively when a client requests that spatial object. To realize these methods, this paper proposes the "tile division" and "priority order estimation" algorithms together with the corresponding data-transmission strategies. Compared with such traditional methods as total map transmission and layer transmission, the web-based GIS data transmission proposed here increases transmission efficiency by a large margin.
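The tile-division idea can be sketched in a few lines: cut the map extent into fixed-size tiles, then answer a client request with only the tiles that intersect its window. A minimal sketch under my own naming, not the paper's algorithm:

```python
def tile_division(min_x, min_y, max_x, max_y, tile_size):
    """Split a map extent into square tiles (clipped at the edges)."""
    tiles = []
    y = min_y
    while y < max_y:
        x = min_x
        while x < max_x:
            tiles.append((x, y, min(x + tile_size, max_x),
                                min(y + tile_size, max_y)))
            x += tile_size
        y += tile_size
    return tiles

def tiles_for_request(tiles, req):
    """Return only the tiles intersecting the client's bounding box."""
    rx0, ry0, rx1, ry1 = req
    return [t for t in tiles
            if t[0] < rx1 and t[2] > rx0 and t[1] < ry1 and t[3] > ry0]
```

The server then transmits just the selected tiles instead of the whole map.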
Cyber Threat Intelligence (CTI) is a valuable resource for cybersecurity defense, but its multi-source and heterogeneous nature also poses challenges: security personnel may be unable to use CTI effectively to understand the state and trend of a cyberattack and respond promptly. To address these challenges, we propose a novel approach consisting of three steps. First, we construct the attack and defense analysis of cybersecurity ontology (ADACO) model by integrating multiple cybersecurity databases. Second, we develop the threat evolution prediction algorithm (TEPA), which can automatically detect threats at device nodes, correlate and map multi-source threat information, and dynamically infer the threat-evolution process. TEPA leverages knowledge graphs to represent comprehensive threat scenarios and achieves better performance in simulated experiments by combining the structural and textual features of entities. Third, we design the intelligent defense decision algorithm (IDDA), which recommends the most suitable defense techniques to security personnel. IDDA outperforms the baseline methods in comparative experiments.
Sensor nodes in a wireless sensor network (WSN) are typically battery-powered, so their energy is constrained. Our design goal is to utilize the energy of each sensor node efficiently to extend its lifetime, and thus prolong the lifetime of the whole WSN. In this paper, we propose a path-based data aggregation scheme (PBDAS) for grid-based wireless sensor networks. To extend the lifetime of a WSN, we construct a grid infrastructure by partitioning the whole sensor field into a grid of cells. Each cell has a head responsible for aggregating its own data with the data sensed by the other nodes in the same cell and then transmitting the result. To transmit the data to the base station (BS) efficiently and rapidly, we link the cell heads into a chain. Each cell head on the chain takes a turn as the chain leader responsible for transmitting data to the BS. Aggregated data moves from head to head along the chain, and finally the chain leader transmits it to the BS. In PBDAS, only the cell heads transmit data toward the BS, so transmissions to the BS decrease substantially. Moreover, the cell heads and chain leader are designated in turn according to their energy levels, so energy depletion is evenly distributed across nodes. Simulation results show that the proposed PBDAS extends the lifetime of sensor nodes, and hence of the whole network.
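One round of such a chain scheme can be sketched schematically: order the cell heads into a chain, let the head with the most residual energy act as leader, and charge a small cost per chain hop plus a larger cost for the leader's uplink. The data structures and energy costs below are illustrative assumptions, not the paper's simulation parameters.

```python
def build_chain(heads):
    """Order cell heads row-major across the grid to form the chain."""
    return sorted(heads, key=lambda h: h["cell"])

def pbdas_round(heads):
    """One data-gathering round: data is aggregated hop by hop along
    the chain and only the leader (here, the head with the highest
    residual energy) uplinks to the base station."""
    chain = build_chain(heads)
    leader = max(chain, key=lambda h: h["energy"])
    for h in chain:
        h["energy"] -= 1          # nominal intra-chain hop cost
    leader["energy"] -= 5         # nominal long-range uplink cost
    return leader["id"]
```

Rotating leadership toward the highest-energy head is what spreads depletion evenly across the network.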
An 8×10 GHz receiver optical sub-assembly (ROSA), consisting of an 8-channel arrayed waveguide grating (AWG) and an 8-channel PIN photodetector (PD) array, is designed and fabricated based on silica hybrid integration technology. Multimode output waveguides in the silica AWG with a 2% refractive-index difference are used to obtain flat-top spectra. The output waveguide facet is polished to a 45° bevel to redirect the light into the mesa-type PIN PD, which simplifies packaging. Experimental results show that the single-channel 1 dB bandwidth of the AWG ranges from 2.12 nm to 3.06 nm, the ROSA responsivity ranges from 0.097 A/W to 0.158 A/W, and the 3 dB bandwidth reaches 11 GHz. The device is promising for eight-lane WDM transmission systems in data-center interconnects.
Wireless mobile ad hoc sensor networks have excellent potential for moving through and monitoring disaster areas on a real-time basis. Recent challenges in Mobile Ad Hoc Networks (MANETs) include scalability, localization, heterogeneous networking, self-organization, and self-sufficient operation. Against this background, the current study focuses on specially designed communication-link establishment for high connection stability in wireless mobile sensor networks, especially disaster-area networks. Existing protocols focus on location-dependent communications and use networks based on the typical Internet Protocol (IP) architecture. However, IP-based communications have limitations such as inefficient bandwidth utilization, high processing overhead, low transfer speeds, and excessive memory consumption. To overcome these challenges, the number of neighbors (node density) is minimized and high-mobility nodes (node speed) are avoided. The proposed Geographic Drone Based Route Optimization (GDRO) method reduces the overall overhead considerably and significantly improves performance by identifying the disaster region. The drone communicates with an anchor node periodically and shares information with it, introducing a drone-based disaster network over the area. Geographic routing is a promising approach for enhancing routing efficiency in MANETs. The algorithm reaches the anchor (target) node with the help of Geographical Graph-Based Mapping (GGM). The Global Positioning System (GPS) is enabled on the anchor node's mobile network, which regularly broadcasts its location to aid discovery. In the first step, a node searches for the local and remote anticipated Expected Transmission Count (ETX) and calculates the estimated distance; Received Signal Strength Indicator (RSSI) results are stored in the node's local memory. The node then calculates the least remote anticipated ETX, the link loss rate, and the information needed for the new location. A freeway heuristic algorithm improves data speed and efficiency and solves the path-optimization problem. Compared with other models, the proposed method yields efficient communication, increases throughput, and reduces end-to-end delay, energy consumption, and packet loss in disaster-area networks.
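The ETX metric mentioned in the first step has a standard definition: the expected number of transmissions (including retries) for a packet and its acknowledgment to both succeed, ETX = 1 / (df · dr), where df and dr are the forward and reverse delivery ratios. A minimal sketch of ETX-based next-hop selection (the neighbour fields are illustrative, not GDRO's exact state):

```python
def etx(forward_delivery, reverse_delivery):
    """Expected Transmission Count of a link: ETX = 1 / (df * dr)."""
    return 1.0 / (forward_delivery * reverse_delivery)

def best_next_hop(neighbours):
    """Pick the neighbour minimising its link ETX plus its advertised
    remaining-path ETX toward the anchor (target) node."""
    return min(neighbours,
               key=lambda n: etx(n["df"], n["dr"]) + n["path_etx"])
```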
In opportunistic networks, most existing buffer-management policies, including scheduling and passive dropping policies, are designed mainly for routing protocols. In this paper, we propose a Utility-based Buffer Management strategy (UBM) for data dissemination in opportunistic networks. In UBM, we first design a method for computing the utility value of a cached message from the interests of nodes and the delivery probability of the message, and then propose an overall buffer-management policy based on this utility. Driven by receivers, UBM implements not only caching policies and passive and proactive dropping policies, but also the scheduling policies of senders. Simulation results show that, compared with some classical dropping strategies, UBM achieves a higher delivery ratio and lower delay at a smaller network cost.
The traditional threat score, based on fixed thresholds, is sensitive to intensity bias when verifying precipitation forecasts. In this study, the neighborhood precipitation threat score is modified by defining the thresholds in terms of percentiles of the overall precipitation instead of fixed values, which reduces the impact of intensity bias on the calculated score. The method is tested on forecasts of a tropical storm that re-intensified after making landfall and caused heavy flooding. The forecasts are produced with and without radar data assimilation. The forecast assimilating both radial velocity and reflectivity produces precipitation patterns that better match observations but has a large positive intensity bias. With fixed thresholds, the neighborhood threat scores fail to reward forecasts whose patterns match observations well, because of this bias. In contrast, the percentile-based neighborhood method yields the highest score for the forecast with the best pattern match and the smallest position error. The percentile-based method also yields scores that are more consistent with object-based verification, which is less sensitive to intensity bias, demonstrating the potential value of percentile-based verification.
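The mechanism is easy to see in code: thresholding each field at its own percentile makes the score invariant to a uniform intensity bias. A minimal sketch (without the neighborhood smoothing step, which the paper applies on top):

```python
import numpy as np

def percentile_threat_score(fcst, obs, pct):
    """Threat score (CSI) where each field's event threshold is its
    own pct-th percentile, removing sensitivity to intensity bias."""
    f_event = fcst >= np.percentile(fcst, pct)
    o_event = obs >= np.percentile(obs, pct)
    hits = np.sum(f_event & o_event)
    misses = np.sum(~f_event & o_event)
    false_alarms = np.sum(f_event & ~o_event)
    denom = hits + misses + false_alarms
    return hits / denom if denom else np.nan
```

A forecast that doubles every observed value has a perfect pattern and would score 1.0 here, whereas a fixed threshold would register widespread false alarms.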
In this paper, we propose a knowledge-based rule-management system for data cleaning. The system combines features of rule-based systems and rule-based data-cleaning frameworks. Its advantages are threefold. First, it proposes a strong, unified rule form based on first-order structures that permits the representation and management of all types of rules, and of their quality via a set of characteristics. Second, it increases the quality of the rules, which conditions the quality of data cleaning. Third, it uses an appropriate knowledge-acquisition process, which is the weakest task in current rule- and knowledge-based systems. Since several research works have shown that data cleaning is driven by domain knowledge rather than by data, we identified and analyzed the properties that distinguish knowledge and rules from data in order to better determine the components of the proposed system. To illustrate the system, we also present a first experiment with a case study in the health sector, where we demonstrate how the system improves data quality. The autonomy, extensibility, and platform independence of the proposed rule-management system facilitate its incorporation into any system concerned with data-quality management.
In studies of HIV, interval-censored data occur naturally: the HIV infection time is usually not known exactly, only that it occurred before the survey, within some time interval, or had not occurred by the time of the survey. Infections are often clustered within geographical areas such as enumerator areas (EAs), inducing unobserved frailty. In this paper we consider an approach for estimating parameters when infection time is unknown and assumed correlated within an EA, modeling the dependency as frailties with a normal distribution and the baseline hazards with a Weibull distribution. The data came from a household-based population survey that used a multi-stage stratified sample design to randomly select 23,275 interviewed individuals from 10,584 households, of whom 15,851 were further tested for HIV (crude prevalence = 9.1%). A further test among those who tested HIV-positive found 181 (12.5%) recently infected. The results show a high degree of heterogeneity in the distribution of HIV between EAs, translating to a modest correlation of 0.198. Intervention strategies should target geographical areas that contribute disproportionately to the HIV epidemic. Further research needs to identify such hot-spot areas and understand what makes them prone to HIV.
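The interval-censored likelihood at the heart of such models is simple: a subject known to be infected between times L and R contributes S(L) − S(R) under the Weibull baseline, with R = ∞ for right censoring. A minimal sketch of that contribution, omitting the normal frailty term (which the paper integrates over per EA):

```python
import math

def weibull_surv(t, shape, scale):
    """Weibull survival function S(t) = exp(-(t/scale)^shape)."""
    return math.exp(-((t / scale) ** shape))

def interval_censored_loglik(intervals, shape, scale):
    """Log-likelihood for interval-censored times: each (L, R) pair
    contributes log(S(L) - S(R)); use R = inf for right censoring."""
    ll = 0.0
    for L, R in intervals:
        sL = weibull_surv(L, shape, scale) if L > 0 else 1.0
        sR = weibull_surv(R, shape, scale) if R != float("inf") else 0.0
        ll += math.log(sL - sR)
    return ll
```

Maximizing this over the shape and scale parameters (plus the frailty variance in the full model) yields the estimates.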
Freebase is a large collaborative knowledge base and database of general structured information for public use. Its structured data has been harvested from many sources, including individual, user-submitted wiki contributions. Its aim is to create a global resource so that people (and machines) can access common information more effectively; it is mostly available in English. In this research work, we develop a technique for creating a Freebase for the Bengali language. The number of Bengali articles on the Internet is growing day by day, so a structured data store in Bengali has become necessary. It consists of different types of concepts (topics) and relationships between those topics, covering areas such as popular culture (e.g., films, music, books, sports, television), location information (restaurants, geolocations, businesses), scholarly information (linguistics, biology, astronomy), birthplaces (of poets, politicians, actors, and actresses), and general knowledge (Wikipedia). It will be helpful for relation extraction and other Natural Language Processing (NLP) work on the Bengali language. In this work, we identified the technique for creating the Bengali Freebase and assembled a collection of Bengali data. We applied the SPARQL query language to extract information from natural-language (Bengali) sources such as Wikidata, which is typically in RDF (Resource Description Framework) triple format.
A painting reflects the artist's style, most visibly in the texture and shape of the brush strokes. Computer simulation can reproduce an artist's painting by taking such strokes and pasting them onto an image; this is called stroke-based rendering. Because the image is built from strokes, the quality of the result depends on the number and quality of the strokes, yet there is a limit to how many strokes can be scanned, so rendering from a large amount of stroke information is not easy. In this work, we produce rendering results using mass data, generating large numbers of strokes by expanding existing strokes through warping. With this approach we obtain results of higher quality than previous studies. Finally, we also examine the correlation between the amount of data and the quality of the results.
Over the last few years, the Internet of Things (IoT) has become an omnipresent term. The IoT expands the existing common concepts of anytime and anyplace to connectivity for anything. The proliferation of IoT offers opportunities but may also bear risks. A hitherto neglected aspect is the possible increase in power consumption, as smart devices in IoT applications are expected to be reachable by other devices at all times. This implies that a device consumes electrical energy even when it is not in use for its primary function. Many research communities have started to address the storage capabilities, such as the cache memory, of smart devices using Named Data Networking (NDN) to achieve a more energy-efficient communication model. In NDN, memory or buffer overflow is a common challenge: when a node's internal memory exceeds its limit, data with the highest degree of freshness may not be accommodated, and the entire scenario behaves like a traditional network. In such cases, intermediate nodes do not perform data caching to guarantee the highest degree of freshness. For the periodic updates sent by data producers, data consumers demand up-to-date information at the least energy cost. Consequently, there is a challenge in maintaining the trade-off between freshness and energy consumption during publisher-subscriber interaction. In our work, we propose an architecture that overcomes the cache-strategy issue with a smart caching algorithm that improves memory management and data freshness. The smart caching strategy updates the data at precise intervals while taking garbage data into consideration. Experiments also show that data redundancy can easily be reduced by ignoring or dropping data packets carrying information that is not of interest to the other participating nodes in the network, ultimately optimizing the trade-off between freshness and the energy required.
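A freshness-aware content store of this kind can be sketched as a cache whose entries carry a freshness lifetime: stale ("garbage") entries are purged before any eviction decision, so they never crowd out fresh content. The class below is an illustrative sketch, not the paper's smart caching algorithm.

```python
import time

class FreshnessCache:
    """NDN-style content store sketch: entries expire after their
    freshness lifetime; stale entries are purged before eviction."""
    def __init__(self, capacity, clock=time.monotonic):
        self.capacity, self.clock, self.store = capacity, clock, {}

    def put(self, name, data, freshness_s):
        self._purge()
        if len(self.store) >= self.capacity:
            # evict the entry closest to expiry
            oldest = min(self.store, key=lambda k: self.store[k][1])
            del self.store[oldest]
        self.store[name] = (data, self.clock() + freshness_s)

    def get(self, name):
        self._purge()
        entry = self.store.get(name)
        return entry[0] if entry else None

    def _purge(self):
        now = self.clock()
        for k in [k for k, (_, exp) in self.store.items() if exp <= now]:
            del self.store[k]
```

Injecting a fake clock (as in a test) makes the freshness behaviour easy to verify without real delays.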
Extracting and parameterizing ionospheric waves globally and statistically is a long-standing problem. Building on the multichannel maximum entropy method (MMEM) used to study ionospheric waves in previous work, we calculate the parameters of ionospheric waves by applying the MMEM to numerous temporally approximate and spatially close global-positioning-system radio-occultation total-electron-content profile triples provided by the uniquely clustered satellite flight between 2006 and 2007, right after the launch of the Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC) mission. The results show that the amplitude of ionospheric waves increases at low and high latitudes (~0.15 TECU) and decreases at mid-latitudes (~0.05 TECU). The vertical wavelength increases at mid-latitudes (e.g., ~50 km at altitudes of 200-250 km) and decreases at low and high latitudes (e.g., ~35 km at altitudes of 200-250 km). The horizontal wavelength shows a similar pattern (e.g., ~1400 km at mid-latitudes and ~800 km at low and high latitudes).
The Moon-based Ultraviolet Telescope (MUVT) is one of the payloads on the Chang'e-3 (CE-3) lunar lander. Because of the advantages of having no atmospheric disturbance and the slow rotation of the Moon, we can make long-term continuous observations of a series of important celestial objects in the near-ultraviolet band (245-340 nm) and perform a sky survey of selected areas, which cannot be completed on Earth. By analyzing image data from the MUVT, we can find characteristic changes in celestial brightness with time, and deduce the radiation mechanism and physical properties of these celestial objects after comparison with a physical model. To explain the scientific purposes of the MUVT, this article analyzes the preprocessing of MUVT image data and makes a preliminary evaluation of data quality. The results demonstrate that the methods used for data collection and preprocessing are effective, and that the Level 2A and 2B image data satisfy the requirements of follow-up scientific research.
Funding: This work was financially supported by the Beijing Natural Science Foundation (Grant 2232066) and the Open Project Foundation of the State Key Laboratory of Solid Lubrication (Grant LSL-2212).
Funding: granted by the National Science & Technology Major Projects of China (Grant No. 2016ZX05033).
Abstract: 1 Introduction Information technology has been playing an ever-increasing role in geoscience. Sophisticated database platforms are essential for geological data storage, analysis and exchange of Big Data (Feblowitz, 2013; Zhang et al., 2016; Teng et al., 2016; Tian and Li, 2018). The United States has built an information-sharing platform for state-owned scientific data as a national strategy.
Funding: Supported by the National Science Foundation of China (60075015) and the Key Project of the Scientific and Technological Department in Anhui.
Abstract: This paper first puts forward a case-based system framework based on data mining techniques. Then the paper examines the possibility of using neural networks as a method of retrieval in such a case-based system. In this system we propose data mining algorithms to discover case knowledge, together with other supporting algorithms.
Abstract: Since web-based GIS processes large amounts of spatial geographic information on the internet, we should try to improve the efficiency of spatial data query processing and transmission. This paper presents two efficient methods for this purpose: the division transmission and progressive transmission methods. In the division transmission method, a map is divided into several parts, called "tiles", and only tiles are transmitted at the request of a client. In the progressive transmission method, a map is split into several phase views based on the significance of vertices, and a server produces a target object and then transmits it progressively when this spatial object is requested by a client. To realize these methods, the "tile division" and "priority order estimation" algorithms and the corresponding data transmission strategies are proposed in this paper. Compared with such traditional methods as "total map transmission" and "layer transmission", the web-based GIS data transmission proposed in this paper increases data transmission efficiency by a great margin.
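The tile-division idea above can be sketched in a few lines (the fixed tile size and grid-indexing scheme are assumptions for illustration, not the paper's algorithm): the map extent is cut into fixed-size tiles, and a client's bounding-box request is answered with only the tiles it intersects.

```python
# Tile-division sketch (hypothetical tile size and indexing): return
# the (row, col) indices of all tiles intersecting a requested bbox,
# so only those tiles, not the whole map, need to be transmitted.
def covering_tiles(bbox, tile_size):
    xmin, ymin, xmax, ymax = bbox
    c0, c1 = int(xmin // tile_size), int(xmax // tile_size)
    r0, r1 = int(ymin // tile_size), int(ymax // tile_size)
    return [(r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]

# A client viewing the box (120, 35)-(250, 190) with 100-unit tiles
# needs only four tiles instead of the full map.
print(covering_tiles((120, 35, 250, 190), 100))
```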
Abstract: Cyber Threat Intelligence (CTI) is a valuable resource for cybersecurity defense, but it also poses challenges due to its multi-source and heterogeneous nature. Security personnel may be unable to use CTI effectively to understand the condition and trend of a cyberattack and respond promptly. To address these challenges, we propose a novel approach that consists of three steps. First, we construct the attack and defense analysis of the cybersecurity ontology (ADACO) model by integrating multiple cybersecurity databases. Second, we develop the threat evolution prediction algorithm (TEPA), which can automatically detect threats at device nodes, correlate and map multi-source threat information, and dynamically infer the threat evolution process. TEPA leverages knowledge graphs to represent comprehensive threat scenarios and achieves better performance in simulated experiments by combining structural and textual features of entities. Third, we design the intelligent defense decision algorithm (IDDA), which can provide intelligent recommendations for security personnel regarding the most suitable defense techniques. IDDA outperforms the baseline methods in the comparative experiment.
Funding: supported by the NSC under Grant Nos. NSC-101-2221-E-239-032 and NSC-102-2221-E-239-020.
Abstract: Sensor nodes in a wireless sensor network (WSN) are typically powered by batteries, so their energy is constrained. It is our design goal to efficiently utilize the energy of each sensor node to extend its lifetime, so as to prolong the lifetime of the whole WSN. In this paper, we propose a path-based data aggregation scheme (PBDAS) for grid-based wireless sensor networks. In order to extend the lifetime of a WSN, we construct a grid infrastructure by partitioning the whole sensor field into a grid of cells. Each cell has a head responsible for aggregating its own data with the data sensed by the others in the same cell and then transmitting the result out. In order to efficiently and rapidly transmit the data to the base station (BS), we link the cell heads to form a chain. Each cell head on the chain takes turns becoming the chain leader responsible for transmitting data to the BS. Aggregated data moves from head to head along the chain, and finally the chain leader transmits it to the BS. In PBDAS, only the cell heads need to transmit data toward the BS; therefore, the number of data transmissions to the BS substantially decreases. Besides, the cell heads and chain leader are designated in turn according to their energy levels so that the energy depletion of nodes is evenly distributed. Simulation results show that the proposed PBDAS extends the lifetime of sensor nodes, so as to make the lifetime of the whole network longer.
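The energy-based rotation described above can be sketched directly (field names and the simple max-energy rule are assumptions; PBDAS's actual designation policy may differ): each cell elects its highest-energy node as head, and the highest-energy head becomes this round's chain leader.

```python
# PBDAS-style election sketch (hypothetical data layout): per-cell
# heads are the nodes with the most residual energy, and the richest
# head is designated chain leader for the current round.
def elect(cells):
    heads = {cell: max(nodes, key=lambda n: n["energy"])
             for cell, nodes in cells.items()}
    leader_cell = max(heads, key=lambda c: heads[c]["energy"])
    return heads, leader_cell

cells = {
    "A": [{"id": 1, "energy": 0.9}, {"id": 2, "energy": 0.4}],
    "B": [{"id": 3, "energy": 0.7}, {"id": 4, "energy": 0.8}],
}
heads, leader = elect(cells)
print([h["id"] for h in heads.values()], leader)
```

Re-running the election each round as energies drain is what spreads depletion evenly across nodes.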
Funding: Supported by the National High Technology Research and Development Program of China under Grant No. 2015AA016902, the National Natural Science Foundation of China under Grant Nos. 61435013 and 61405188, and the K.C. Wong Education Foundation.
Abstract: An 8×10 GHz receiver optical sub-assembly (ROSA) consisting of an 8-channel arrayed waveguide grating (AWG) and an 8-channel PIN photodetector (PD) array is designed and fabricated based on silica hybrid integration technology. Multimode output waveguides in the silica AWG with 2% refractive index difference are used to obtain flat-top spectra. The output waveguide facet is polished to a 45° bevel to change the light propagation direction into the mesa-type PIN PD, which simplifies the packaging process. The experimental results show that the single-channel 1 dB bandwidth of the AWG ranges from 2.12 nm to 3.06 nm, the ROSA responsivity ranges from 0.097 A/W to 0.158 A/W, and the 3 dB bandwidth is up to 11 GHz. The device is promising for application in the eight-lane WDM transmission system in data center interconnection.
Abstract: Wireless sensor mobile ad hoc networks have excellent potential for moving and monitoring disaster area networks on a real-time basis. The recent challenges faced in Mobile Ad Hoc Networks (MANETs) include scalability, localization, heterogeneous networks, self-organization, and self-sufficient operation. Against this background, the current study focuses on a specially designed communication link establishment for high connection stability of wireless mobile sensor networks, especially in disaster area networks. Existing protocols focus on location-dependent communications and use networks based on the typically used Internet Protocol (IP) architecture. However, IP-based communications have a few limitations such as inefficient bandwidth utilization, high processing overhead, low transfer speeds, and excessive memory intake. To overcome these challenges, the number of neighbors (node density) is minimized and high-mobility nodes (node speed) are avoided. The proposed Geographic Drone Based Route Optimization (GDRO) method reduces the overall overhead to a considerable level in an efficient manner and significantly improves performance by identifying the disaster region. The drone communicates with an anchor node periodically and shares information with it so as to introduce a drone-based disaster network in the area. Geographic routing is a promising approach to enhance routing efficiency in MANETs. The algorithm helps in reaching the anchor (target) node with the help of Geographical Graph-Based Mapping (GGM). The Global Positioning System (GPS) is enabled on the mobile network of the anchor node, which regularly broadcasts its location information to help in finding the location. In the first step, the node searches for the local and remote anticipated Expected Transmission Count (ETX), thereby calculating the estimated distance. Received Signal Strength Indicator (RSSI) results are stored in the local memory of the node. The node then calculates the least remote anticipated ETX, the link loss rate, and the information to the new location. The Freeway Heuristic algorithm improves the data speed and efficiency and determines the path and the optimization problem. In comparison with other models, the proposed method yielded efficient communication, increased throughput, and reduced end-to-end delay, energy consumption and packet loss in disaster area networks.
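The ETX metric leaned on above has a standard definition worth making concrete (the formula below is the usual expected-transmission-count definition, assumed here; the abstract does not give GDRO's exact variant): with df and dr the forward and reverse delivery ratios measured from probe packets, ETX estimates the expected number of transmissions per delivered packet, so lower is better.

```python
# Standard ETX sketch (not GDRO's exact computation): expected number
# of transmissions to deliver a packet over a link, given forward and
# reverse probe delivery ratios df and dr.
def etx(df, dr):
    return 1.0 / (df * dr)

# Candidate next hops with hypothetical (df, dr) measurements; the
# route selection prefers the neighbor with the smallest ETX.
links = {"A": (0.9, 0.8), "B": (0.6, 0.95), "C": (0.99, 0.99)}
best = min(links, key=lambda k: etx(*links[k]))
print(best, round(etx(*links["A"]), 3))
```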
Funding: supported by the National Natural Science Fund of China under Grant No. 61472097, the Education Ministry Doctoral Research Foundation of China (20132304110017), and the International Exchange Program of Harbin Engineering University for Innovation-oriented Talents Cultivation.
Abstract: In opportunistic networks, most existing buffer management policies, including scheduling and passive dropping policies, are mainly designed for routing protocols. In this paper, we propose a Utility-based Buffer Management strategy (UBM) for data dissemination in opportunistic networks. In UBM, we first design a method of computing the utility values of cached messages according to the interest of nodes and the delivery probability of messages, and then propose an overall buffer management policy based on this utility. UBM, driven by receivers, implements not only caching policies and passive and proactive dropping policies, but also the scheduling policies of senders. Simulation results show that, compared with some classical dropping strategies, UBM can obtain a higher delivery ratio and lower delivery latency at a smaller network cost.
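The abstract names the two ingredients of the utility, node interest and delivery probability, without giving the combining rule. A minimal sketch (the product form and field names are assumptions, not UBM's published formula) shows how such a utility drives dropping when the buffer overflows:

```python
# Utility-based buffer sketch (hypothetical utility = interest *
# delivery probability; UBM's actual formula may differ): on overflow,
# the lowest-utility message is dropped.
def insert(buffer, msg, capacity):
    msg["utility"] = msg["interest"] * msg["delivery_prob"]
    buffer.append(msg)
    buffer.sort(key=lambda m: m["utility"], reverse=True)
    return buffer[:capacity]   # proactively drop lowest-utility overflow

buf = []
for m in [{"id": "a", "interest": 0.9, "delivery_prob": 0.5},
          {"id": "b", "interest": 0.2, "delivery_prob": 0.9},
          {"id": "c", "interest": 0.8, "delivery_prob": 0.8}]:
    buf = insert(buf, m, capacity=2)
print([m["id"] for m in buf])
```

The same sorted order can serve as the sender's scheduling order, which is how one utility value can drive caching, dropping, and scheduling at once.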
Funding: Supported by the National Natural Science Foundation of China (61304079, 61125306, 61034002), the Open Research Project from SKLMCCS (20120106), the Fundamental Research Funds for the Central Universities (FRF-TP-13-018A), and the China Postdoctoral Science Foundation (2013M530527).
Abstract: A novel optimal tracking control method is proposed in this paper for a class of discrete-time systems with actuator saturation and unknown dynamics. The scheme is based on the iterative adaptive dynamic programming (ADP) algorithm. In order to implement the control scheme, a data-based identifier is first constructed for the unknown system dynamics. By introducing the M-network, an explicit formula for the steady control is obtained. In order to eliminate the effect of actuator saturation, a nonquadratic performance functional is introduced, and an iterative ADP algorithm with convergence analysis is then established to obtain the optimal tracking control solution. To realize the optimal control method, neural networks are used to build the data-based identifier, compute the performance index function, approximate the optimal control policy, and solve for the steady control, respectively. Simulation examples are provided to verify the effectiveness of the presented optimal tracking control scheme.
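The abstract does not spell out the nonquadratic performance functional. A common choice in the saturation-handling ADP literature (an assumption here, not necessarily the paper's exact form) replaces the quadratic control cost $u^{\top}Ru$ with a bounded integrand:

```latex
W(u) \;=\; 2\int_{0}^{u} \bigl(\lambda \tanh^{-1}(s/\lambda)\bigr)^{\top} R \, \mathrm{d}s ,
```

where $\lambda$ is the saturation bound. Minimizing a cost built from $W(u)$ yields a control of the form $u = \lambda \tanh(\cdot)$, which satisfies $|u| \le \lambda$ by construction, so the saturation constraint never has to be enforced separately.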
Funding: primarily supported by the National 973 Fundamental Research Program of China (Grant No. 2013CB430103) and the Department of Transportation Federal Aviation Administration (Grant No. NA17RJ1227) through the National Oceanic and Atmospheric Administration, and also supported by the National Science Foundation of China (Grant No. 41405100) and the Fundamental Research Funds for the Central Universities (Grant No. 20620140343).
Abstract: The traditional threat score based on fixed thresholds for precipitation verification is sensitive to intensity forecast bias. In this study, the neighborhood precipitation threat score is modified by defining the thresholds in terms of the percentiles of overall precipitation instead of fixed threshold values. The impact of intensity forecast bias on the calculated threat score is reduced. The method is tested with the forecasts of a tropical storm that re-intensified after making landfall and caused heavy flooding. The forecasts are produced with and without radar data assimilation. The forecast with assimilation of both radial velocity and reflectivity produces precipitation patterns that better match observations but have a large positive intensity bias. When using fixed thresholds, the neighborhood threat scores fail to yield high scores for forecasts that have a good pattern match with observations, due to the large intensity bias. In contrast, the percentile-based neighborhood method yields the highest score for the forecast with the best pattern match and the smallest position error. The percentile-based method also yields scores that are more consistent with object-based verifications, which are less sensitive to intensity bias, demonstrating the potential value of percentile-based verification.
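The core idea can be shown in miniature (a 1-D toy without the neighborhood step, so this is an illustration of percentile thresholding, not the paper's full neighborhood score): taking the event threshold as the same percentile of each field cancels a uniform intensity bias between forecast and observation.

```python
# Percentile vs. fixed thresholding sketch (1-D toy, no neighborhood):
# threat score TS = hits / (forecast events + observed events - hits).
def percentile(values, q):
    s = sorted(values)
    return s[min(int(q / 100 * len(s)), len(s) - 1)]

def ts_fixed(fcst, obs, thresh):
    hits = sum(f >= thresh and o >= thresh for f, o in zip(fcst, obs))
    events = sum(f >= thresh for f in fcst) + sum(o >= thresh for o in obs)
    return hits / (events - hits)

def ts_percentile(fcst, obs, q):
    tf, to = percentile(fcst, q), percentile(obs, q)
    hits = sum(f >= tf and o >= to for f, o in zip(fcst, obs))
    events = sum(f >= tf for f in fcst) + sum(o >= to for o in obs)
    return hits / (events - hits)

obs = [0, 1, 2, 5, 10, 20]
fcst = [0, 2, 4, 10, 20, 40]   # same spatial pattern, doubled intensity
print(ts_fixed(fcst, obs, 10), ts_percentile(fcst, obs, 50))
```

With a fixed threshold of 10, the doubled-intensity forecast is penalized despite its perfect pattern match, while the 50th-percentile thresholds score it perfectly.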
Abstract: In this paper, we propose a rule management system for data cleaning that is based on knowledge. This system combines features of both rule-based systems and rule-based data cleaning frameworks. The important advantages of our system are threefold. First, it proposes a strong and unified rule form based on first-order structure that permits the representation and management of all types of rules and their quality via some characteristics. Second, it increases the quality of rules, which determines the quality of data cleaning. Third, it uses an appropriate knowledge acquisition process, which is the weakest task in current rule- and knowledge-based systems. As several research works have shown that data cleaning is driven by domain knowledge rather than by data, we have identified and analyzed the properties that distinguish knowledge and rules from data to better determine the components of the proposed system. In order to illustrate our system, we also present a first experiment with a case study in the health sector, where we demonstrate how the system is useful for the improvement of data quality. The autonomy, extensibility and platform-independence of the proposed rule management system facilitate its incorporation in any system that is interested in data quality management.
Abstract: In studies of HIV, interval-censored data occur naturally. HIV infection time is not usually known exactly, only that it occurred before the survey, within some time interval, or has not occurred at the time of the survey. Infections are often clustered within geographical areas such as enumerator areas (EAs), thus inducing unobserved frailty. In this paper we consider an approach for estimating parameters when infection time is unknown and assumed correlated within an EA, where dependency is modeled as frailties with a normal distribution for the frailties and a Weibull distribution for the baseline hazards. The data were from a household-based population survey that used a multi-stage stratified sample design to randomly select 23,275 interviewed individuals from 10,584 households, of whom 15,851 interviewed individuals were further tested for HIV (crude prevalence = 9.1%). A further test conducted among those that tested HIV positive found 181 (12.5%) recently infected. Results show a high degree of heterogeneity in HIV distribution between EAs, translating to a modest correlation of 0.198. Intervention strategies should target geographical areas that contribute disproportionately to the HIV epidemic. Further research needs to identify such hot spot areas and understand what factors make these areas prone to HIV.
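The model described in words above can be written explicitly; the notation below is the standard shared-frailty formulation (assumed here, not quoted from the paper). With $b_i$ the frailty of enumerator area $i$, the hazard for individual $j$ in EA $i$ is

```latex
h_{ij}(t) \;=\; \rho \lambda t^{\rho-1}\,
    \exp\!\bigl(\mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + b_i\bigr),
\qquad b_i \sim N(0, \sigma^{2}),
```

where $\rho \lambda t^{\rho-1}$ is the Weibull baseline hazard. For an infection time known only to lie in the interval $(L_{ij}, R_{ij}]$, the likelihood contribution is $S_{ij}(L_{ij}) - S_{ij}(R_{ij})$, with $S_{ij}$ the survival function implied by the hazard above; right-censored individuals contribute $S_{ij}(L_{ij})$ alone.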
Abstract: Freebase is a large collaborative knowledge base and database of general, structured information for public use. Its structured data has been harvested from many sources, including individual, user-submitted wiki contributions. Its aim is to create a global resource so that people (and machines) can access common information more effectively, which is mostly available in English. In this research work, we have tried to build a technique for creating a Freebase for the Bengali language. Today the number of Bengali articles on the internet is growing day by day, so it has become necessary to have a structured data store in Bengali. It consists of different types of concepts (topics) and relationships between those topics. These include different types of areas like popular culture (e.g. films, music, books, sports, television), location information (restaurants, geolocations, businesses), scholarly information (linguistics, biology, astronomy), birthplaces (of poets, politicians, actors, actresses) and general knowledge (Wikipedia). It will be much more helpful for relation extraction or any kind of Natural Language Processing (NLP) work on the Bengali language. In this work, we identified the technique for creating the Bengali Freebase and made a collection of Bengali data. We applied the SPARQL query language to extract information from natural-language (Bengali) documents such as Wikidata, which is typically in RDF (Resource Description Framework) triple format.
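The RDF triple format and SPARQL-style querying mentioned above reduce to a simple pattern: facts are (subject, predicate, object) triples, and a query is a triple with variables. A tiny in-memory sketch (the topic names are hypothetical placeholders, not actual Freebase or Wikidata identifiers):

```python
# RDF-triple sketch with a SPARQL-like pattern match (hypothetical
# topics, not real Wikidata IDs): None plays the role of a SPARQL
# variable and matches anything.
triples = [
    ("Rabindranath_Tagore", "occupation", "poet"),
    ("Rabindranath_Tagore", "birthplace", "Kolkata"),
    ("Satyajit_Ray", "occupation", "film_director"),
]

def query(store, s=None, p=None, o=None):
    return [t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Analogue of: SELECT ?s WHERE { ?s :occupation :poet }
print(query(triples, p="occupation", o="poet"))
```

A real Bengali Freebase would hold the same shape of data at scale, queried through a SPARQL endpoint rather than a list comprehension.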
Funding: This research was supported by the Chung-Ang University Research Scholarship Grants in 2017.
Abstract: Painting is done according to the artist's style. The most representative elements of the style are the texture and shape of the brush strokes. Computer simulations allow the artist's painting to be reproduced by taking such strokes and pasting them onto the image. This is called stroke-based rendering. The quality of the result depends on the number and quality of the strokes, since the strokes are sampled to create the image. It is not easy to render using a large amount of information, as there is a limit to how many strokes can be scanned. In this work, we produce rendering results using mass data, generating large numbers of strokes by expanding existing strokes through warping. Through this, we have produced results of higher quality than conventional studies. Finally, we also examine the correlation between the amount of data and the results.
Abstract: Over the last few years, the Internet of Things (IoT) has become an omnipresent term. The IoT expands the existing common concepts, anytime and anyplace, to connectivity for anything. The proliferation of the IoT offers opportunities but may also bear risks. A hitherto neglected aspect is the possible increase in power consumption, as smart devices in IoT applications are expected to be reachable by other devices at all times. This implies that the device consumes electrical energy even when it is not in use for its primary function. Many research communities have started addressing the storage ability, such as the cache memory of smart devices, using a concept called Named Data Networking (NDN) to achieve a more energy-efficient communication model. In NDN, memory or buffer overflow is a common challenge, especially when the internal memory of a node exceeds its limit and data with the highest degree of freshness cannot be accommodated, so the entire scenario behaves like a traditional network. In such cases, data caching is not performed by intermediate nodes to guarantee the highest degree of freshness. With periodical updates sent from data producers, it is strongly demanded that data consumers get up-to-date information at the least energy cost. Consequently, there is a challenge in maintaining the tradeoff between freshness and energy consumption during publisher-subscriber interaction. In our work, we propose an architecture to overcome the cache strategy issue with a Smart Caching Algorithm that improves memory management and data freshness. The smart caching strategy updates the data at precise intervals while taking garbage data into consideration. It is also observed from experiments that data redundancy can be easily reduced by ignoring/dropping data packets carrying information that is not of interest to other participating nodes in the network, ultimately optimizing the tradeoff between freshness and the energy required.
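A freshness-aware cache of the kind described can be sketched in a few lines (the eviction policy and field names are assumptions, not the paper's Smart Caching Algorithm): each entry carries a freshness deadline, stale entries are treated as garbage and evicted first, and capacity is enforced only afterwards.

```python
# Freshness-aware NDN-style cache sketch (hypothetical policy): stale
# ("garbage") entries are purged before capacity-based eviction, which
# drops the entry closest to expiry.
def put(cache, name, data, now, lifetime, capacity):
    cache[name] = {"data": data, "expires": now + lifetime}
    for k in [k for k, v in cache.items() if v["expires"] <= now]:
        del cache[k]                      # purge stale entries first
    while len(cache) > capacity:
        del cache[min(cache, key=lambda k: cache[k]["expires"])]
    return cache

c = {}
put(c, "/sensor/temp", 21.5, now=0, lifetime=5, capacity=2)
put(c, "/sensor/hum", 40, now=4, lifetime=5, capacity=2)
put(c, "/sensor/co2", 600, now=6, lifetime=5, capacity=2)  # temp is stale now
print(sorted(c))
```

Purging stale data before enforcing capacity is what keeps the freshest content cached without holding garbage that costs memory and, ultimately, energy.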
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 41774158, 41474129 and 41704148, the Chinese Meridian Project, and the Youth Innovation Promotion Association of the Chinese Academy of Sciences under Grant No. 2011324.
Abstract: Extracting and parameterizing ionospheric waves globally and statistically is a longstanding problem. Based on the multichannel maximum entropy method (MMEM) used for studying ionospheric waves in previous work, we calculate the parameters of ionospheric waves by applying the MMEM to numerous temporally approximate and spatially close global-positioning-system radio occultation total electron content profile triples provided by the unique clustered-satellite flight between 2006 and 2007, right after the launch of the Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC) mission. The results show that the amplitude of ionospheric waves increases at the low and high latitudes (~0.15 TECU) and decreases in the mid-latitudes (~0.05 TECU). The vertical wavelength of the ionospheric waves increases in the mid-latitudes (e.g., ~50 km at altitudes of 200-250 km) and decreases at the low and high latitudes (e.g., ~35 km at altitudes of 200-250 km). The horizontal wavelength shows a similar result (e.g., ~1400 km in the mid-latitudes and ~800 km at the low and high latitudes).
Abstract: The Moon-based Ultraviolet Telescope (MUVT) is one of the payloads on the Chang'e-3 (CE-3) lunar lander. Because of the advantages of having no atmospheric disturbances and the slow rotation of the Moon, we can make long-term continuous observations of a series of important celestial objects in the near-ultraviolet band (245-340 nm) and perform a sky survey of selected areas, which cannot be completed on Earth. We can find characteristic changes in celestial brightness with time by analyzing image data from the MUVT, and deduce the radiation mechanism and physical properties of these celestial objects after comparing with a physical model. In order to explain the scientific purposes of the MUVT, this article analyzes the preprocessing of MUVT image data and makes a preliminary evaluation of data quality. The results demonstrate that the methods used for data collection and preprocessing are effective, and the Level 2A and 2B image data satisfy the requirements of follow-up scientific research.