This paper focuses on developing a system that allows presentation authors to effectively retrieve slides for reuse from a large volume of existing presentation materials. We assume that the authors' episodic memories can serve as contextual keywords in query expressions, locating the expected slides more efficiently than keyword queries over slide descriptions alone. We propose a new slide repository composed of slide material collections, slide content data, and pieces of information from the authors' episodic memories related to each slide and presentation, together with a retrieval application that enables authors to use these episodic memories as part of their queries. Our experimental results show that queries incorporating episodic memories provide better discoverability than purely keyword-based queries. Additionally, we discuss an improved retrieval model for further slide-finding efficiency that expands the episodic-memory model in the repository with links to author- and slide-related data and events posted on private and social media sites.
This paper investigates the problem of ranking linked data from relational databases using a ranking framework. The core idea is to group relationships by their types, then rank the types, and finally rank the instances attached to each type. The ranking criteria at each step consider the mapping rules and the heterogeneous graph structure of the data web. Tests on a social network dataset show that the linked data ranking is effective and easier for people to understand. The approach benefits from utilizing relationships deduced from mapping rules based on table schemas and from distinguishing relationship types, which results in better ranking and visualization of the linked data.
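As a rough illustration of the two-stage idea, the sketch below groups relationship instances by type, ranks the types, and then ranks the instances within each type. The frequency-based weights are placeholders, not the paper's actual criteria, which draw on mapping rules and graph structure.

```python
# Hypothetical sketch of type-then-instance ranking; weights are placeholders.
from collections import defaultdict

def rank_linked_data(triples, type_weight, instance_weight):
    """triples: iterable of (subject, relation_type, object) tuples."""
    by_type = defaultdict(list)
    for s, rel, o in triples:
        by_type[rel].append((s, o))
    # Step 1: rank relationship types; Step 2: rank instances within each type.
    ranked_types = sorted(by_type, key=type_weight, reverse=True)
    return [(t, sorted(by_type[t], key=instance_weight, reverse=True))
            for t in ranked_types]

triples = [("alice", "friendOf", "bob"), ("alice", "friendOf", "carol"),
           ("alice", "worksAt", "acme")]
ranking = rank_linked_data(
    triples,
    type_weight=lambda t: sum(1 for _, rel, _ in triples if rel == t),
    instance_weight=lambda pair: len(pair[1]))  # trivial stand-in criterion
print(ranking)
```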
Treatment plan selection is a complex process because it requires sufficient experience and clinical information. Nowadays it is even harder for doctors to select an appropriate treatment plan for a given patient, since they may have difficulty obtaining the right information and analyzing diverse clinical data. To improve the effectiveness of clinical decision making in complicated information system environments, we first propose a linked data-based approach for treatment plan selection. The approach integrates patients' clinical records in hospitals with open linked data sources outside hospitals. Then, based on the linked data network, treatment plan selection is carried out, aided by similar historical therapy cases. Finally, we reorganize the electronic medical records of 97 colon cancer patients using the linked data model and compute the similarity of these records to support treatment selection. The experiment shows the usability of our method in supporting clinical decisions.
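A minimal sketch of the case-based step, under assumed attributes and weights (not the paper's actual record schema or similarity measure): score historical cases by their weighted attribute overlap with the new patient's record and surface the plan of the closest match.

```python
# Illustrative similarity over linked records; attribute names and weights
# are invented for this sketch.
def record_similarity(a, b, weights):
    shared = [k for k in weights
              if a.get(k) is not None and a.get(k) == b.get(k)]
    return sum(weights[k] for k in shared) / sum(weights.values())

weights = {"stage": 0.4, "histology": 0.3, "age_band": 0.2, "comorbidity": 0.1}
new_case = {"stage": "II", "histology": "adeno", "age_band": "60-69"}
history = [{"stage": "II", "histology": "adeno", "age_band": "50-59",
            "plan": "surgery+chemo"},
           {"stage": "III", "histology": "adeno", "age_band": "60-69",
            "plan": "chemo"}]
best = max(history, key=lambda h: record_similarity(new_case, h, weights))
print(best["plan"], record_similarity(new_case, best, weights))
```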
Abundant sensor data are now available online from a wealth of sources, which greatly enhances research efforts on the Digital Earth. The combination of distributed sensor networks and expanding citizen-sensing capabilities provides a more synchronized image of Earth's social and physical landscapes. However, it remains difficult for researchers to use such heterogeneous Sensor Webs for scientific applications, since data are published under different standards and protocols and in arbitrary formats. In this paper, we investigate the core challenges faced when consuming multiple sources for environmental applications using the Linked Data approach. We design and implement a system that achieves better data interoperability and integration by republishing real-world data as linked geo-sensor data. Our contributions include: (1) best practices for reusing and matching the W3C Semantic Sensor Network (SSN) ontology and other popular ontologies for heterogeneous data modeling in the water resources application domain, (2) a newly developed spatial analysis tool for creating links, and (3) a set of RESTful, OGC Sensor Observation Service (SOS)-like Linked Data APIs. Our results show how a Linked Sensor Web can be built and used within an integrated water resource decision support application.
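To make the republishing step concrete, here is a small sketch using rdflib and the W3C SOSA/SSN vocabulary to express one water-gauge reading as linked data; the example.org URIs and the streamflow property are invented placeholders, and the paper's actual ontology alignment is richer than this.

```python
# Sketch: one sensor observation as SOSA/SSN linked data (placeholder URIs).
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

SOSA = Namespace("http://www.w3.org/ns/sosa/")
EX = Namespace("http://example.org/water/")

g = Graph()
g.bind("sosa", SOSA)
obs = EX["obs/1"]
g.add((obs, RDF.type, SOSA.Observation))
g.add((obs, SOSA.madeBySensor, EX["sensor/gauge-42"]))
g.add((obs, SOSA.observedProperty, EX["property/streamflow"]))
g.add((obs, SOSA.hasSimpleResult, Literal(12.7, datatype=XSD.double)))
g.add((obs, SOSA.resultTime,
       Literal("2013-06-01T08:00:00Z", datatype=XSD.dateTime)))
print(g.serialize(format="turtle"))
```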
How to query Linked Data effectively is a challenge because of its heterogeneous datasets. There are three types of heterogeneity: different structures representing entities, different predicates with the same meaning, and different literal formats used in objects. Approaches based on ontology mapping or Information Retrieval (IR) cannot deal with all of these. To address these limitations, we propose a hierarchical multi-hop language model (HMPM). It distinguishes three types of predicates, descriptive, out-associated, and in-associated, and generates multi-hop models for each. All predicate similarities between the query and an entity are organized into a hierarchy, with predicate types on the first level and the predicates of each type on the second level. All candidates are ranked in ascending order. We evaluated HMPM on three datasets: DBpedia, LinkedMDB and Yago. The experimental results show that HMPM outperforms existing approaches in effectiveness and generality.
With the rise of linked data and knowledge graphs, the need becomes compelling to find suitable solutions to increase the coverage and correctness of datasets, to add missing knowledge, and to identify and remove errors. Several approaches, mostly relying on machine learning and natural language processing techniques, have been proposed to address this refinement goal; they usually need a partial gold standard, i.e., some "ground truth" to train automatic models. Gold standards are manually constructed, either by involving domain experts or by adopting crowdsourcing and human computation solutions. In this paper, we present an open source software framework to build Games with a Purpose (GWAP) for linked data refinement, i.e., Web applications that crowdsource partial ground truth by motivating user participation through fun incentives. We detail the impact of this new resource by explaining the specific data linking "purposes" supported by the framework (creation, ranking, and validation of links) and by defining the respective crowdsourcing tasks to achieve those goals. We also introduce our approach to incremental truth inference over the contributions provided by players of Games with a Purpose: we motivate the need for such a method with the specificities of GWAP versus traditional crowdsourcing; we explain and formalize the proposed process, explain its positive consequences, and illustrate the results of an experimental comparison with state-of-the-art approaches. To show this resource's versatility, we describe a set of diverse applications that we built on top of it; to demonstrate its reusability and extensibility potential, we provide references to detailed documentation, including a full tutorial which, in a few hours, guides new adopters in customizing and adapting the framework to a new use case.
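The following is a toy reliability-weighted voting scheme in the spirit of incremental truth inference, not the authors' formalized process: each player's vote counts in proportion to an estimated reliability, which is updated as links get resolved. The prior, threshold, and update step are all invented for illustration.

```python
# Toy incremental truth inference over GWAP contributions (illustrative only).
from collections import defaultdict

reliability = defaultdict(lambda: 0.6)   # assumed prior trust per player

def resolve(votes, threshold=1.0):
    """votes: list of (player, label). Returns a label once enough weight accrues."""
    score = defaultdict(float)
    for player, label in votes:
        score[label] += reliability[player]
    label, weight = max(score.items(), key=lambda kv: kv[1])
    if weight < threshold:
        return None                       # keep collecting contributions
    for player, l in votes:               # incremental reliability update
        reliability[player] += 0.1 if l == label else -0.1
    return label

print(resolve([("p1", "valid"), ("p2", "valid"), ("p3", "invalid")]))
```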
Purpose: To develop a set of metrics and identify criteria for assessing the functionality of LOD KOS products, while providing common guiding principles that LOD KOS producers and users can apply to maximize the functions and usages of LOD KOS products. Design/methodology/approach: Data collection and analysis were conducted at three time periods, in 2015–16, 2017 and 2019. The sample used in the comprehensive data analysis comprises all datasets tagged as types of KOS in the Datahub, extracted through their respective SPARQL endpoints. A comparative study of the LOD KOS collected from the terminology services Linked Open Vocabularies (LOV) and BioPortal was also performed. Findings: The study proposes a set of Functional, Impactful and Transformable (FIT) metrics for LOD KOS as value vocabularies. The FAIR principles, with additional recommendations, are presented for LOD KOS as open data. Research limitations: The metrics need to be further tested and aligned with the best practices and international standards of both open data and the various types of KOS. Practical implications: Assessments performed with the FAIR and FIT metrics support the creation and delivery of user-friendly, discoverable and interoperable LOD KOS datasets, which can be used for innovative applications, act as knowledge bases, form a foundation for semantic analysis and entity extraction, and enhance research in science and the humanities. Originality/value: Our research provides best practice guidelines for LOD KOS as value vocabularies.
Linked data is a decentralized space of interlinked Resource Description Framework (RDF) graphs that are published, accessed, and manipulated by a multitude of Web agents. Here, we present a multi-agent framework for mining hypothetical semantic relations from linked data, in which the discovery, management, and validation of relations can be carried out independently by different agents. These agents collaborate in relation mining by publishing and exchanging interdependent knowledge elements, e.g., hypotheses, evidence, and proofs, giving rise to an evidentiary network that connects and ranks diverse knowledge elements. Simulation results show that the framework is scalable in a multi-agent environment. Real-world applications show that it is suitable for interdisciplinary and collaborative relation discovery tasks in social domains.
In light of the escalating demand for, and intricacy of, services in contemporary terrestrial, maritime, and aerial combat operations, there is a compelling need for enhanced service quality and efficiency in airborne cluster communication networks. Software-Defined Networking (SDN) offers a viable solution for the multifaceted task of cooperative communication transmission and management across different operational domains in complex combat contexts, owing to its intrinsic ability to flexibly allocate and centrally administer network resources. This study centers on optimizing SDN controller deployment within airborne data link clusters, and proposes a collaborative multi-controller architecture predicated on such clusters. Within this architectural framework, the controller deployment issue is reframed as a twofold problem: subdomain partitioning and central interaction node selection. We advocate a subdomain segmentation approach grounded in node value ranking (NDVR) and a central interaction node selection methodology based on an improved Artificial Fish Swarm Algorithm (AFSA). The resulting NDVR-IAFSA (Node Value Ranking, Improved Artificial Fish Swarm Algorithm) makes use of a chaos algorithm for population initialization, boosting population diversity and circumventing premature convergence. Through the integration of adaptive strategies and the incorporation of the genetic algorithm's crossover and mutation operations, the algorithm's search range adaptability is enhanced, increasing the possibility of obtaining globally optimal solutions while concurrently augmenting cluster reliability. Simulation results verify the advantages of the NDVR-IAFSA algorithm: it achieves a better load-balancing effect, improves the reliability of the aviation data link cluster, and significantly reduces the average propagation delay and disconnection rate by 12.8% and 11.7%, respectively. This shows that the optimization scheme has practical significance and can meet the high requirements that modern sea, land, and air operations place on airborne aviation communication networks.
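The chaos-based initialization can be illustrated with the classic logistic map, a common choice for such schemes (whether the paper uses this particular map is an assumption): successive iterates are scaled to the search bounds to seed a well-spread initial population.

```python
# Sketch of chaotic population initialization via the logistic map.
import numpy as np

def chaotic_population(pop_size, dim, low, high, mu=4.0, x=0.37):
    seq = np.empty(pop_size * dim)
    for i in range(seq.size):
        x = mu * x * (1.0 - x)          # logistic map iteration (mu=4: chaotic)
        seq[i] = x
    return low + seq.reshape(pop_size, dim) * (high - low)

print(chaotic_population(pop_size=5, dim=3, low=0.0, high=100.0))
```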
Based on the M-ary spread spectrum (M-ary SS), direct sequence spread spectrum (DS-SS), and orthogonal frequency division multiplexing (OFDM), a novel anti-jamming scheme named orthogonal code time division multi-subchannels spread spectrum modulation (OC-TDMSCSSM) is proposed to enhance the anti-jamming ability of the unmanned aerial vehicle (UAV) data link. The anti-jamming system and its mathematical model are presented first, and the signal formats of the transmitter and receiver are then derived. The receiver's bit error rate (BER) is derived, and an anti-jamming performance analysis is carried out for an additive white Gaussian noise (AWGN) channel. Theoretical research and simulation results show that the anti-jamming performance of the proposed scheme is better than that of the hybrid direct sequence/frequency hopping spread spectrum (DS/FH SS) system. The jamming margin of the OC-TDMSCSSM system is 5 dB higher than that of the DS/FH SS system under Rician channel and full-band jamming conditions, and 6 dB higher under Rician channel and partial-band jamming conditions.
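As a generic reference point (not the paper's derivation), BER analyses of this kind typically build on the coherent BPSK error rate in AWGN, BER = Q(sqrt(2*Eb/N0)), before adding processing gain and jamming terms:

```python
# Textbook BPSK-in-AWGN bit error rate; illustrative baseline only.
import numpy as np
from scipy.special import erfc

def bpsk_ber(ebn0_db):
    ebn0 = 10.0 ** (np.asarray(ebn0_db) / 10.0)
    return 0.5 * erfc(np.sqrt(ebn0))     # Q(sqrt(2x)) == 0.5 * erfc(sqrt(x))

for db in (0, 5, 10):
    print(db, "dB ->", bpsk_ber(db))
```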
Purpose: The interdisciplinary nature and rapid development of the Semantic Web have led to the mass publication of RDF data in a large number of widely accepted serialization formats, creating a need for purpose-specific RDF data processing. The paper reports on an assessment of the chief challenges around RDF data endpoints and introduces RDFAdaptor, a set of plugins for RDF data processing that covers the whole life-cycle with high efficiency. Design/methodology/approach: RDFAdaptor is built on the prominent ETL tool Pentaho Data Integration, which provides a user-friendly, intuitive interface and connects to various data sources and formats, and reuses the Java framework RDF4J as middleware to access data repositories, SPARQL endpoints, and all leading RDF database solutions with SPARQL 1.1 support. It supports effortless services with various configuration templates in multi-scenario applications, and helps extend data processing tasks in other services or tools to complement missing functions. Findings: The proposed comprehensive RDF ETL solution, RDFAdaptor, provides an easy-to-use and intuitive interface, supports data integration and federation over multi-source heterogeneous repositories or endpoints, and manages linked data in hybrid storage mode. Research limitations: The plugin set can support several RDF data processing scenarios, but error detection/checking and interaction with other graph repositories remain to be improved. Practical implications: The plugin set provides a user interface and configuration templates that enable its use in applications such as RDF data generation, multi-format data conversion, remote RDF data migration, and RDF graph updates during semantic query processing. Originality/value: This is the first attempt to develop components, rather than systems, that can extract, consolidate, and store RDF data on the basis of an ecologically mature data warehousing environment.
Detecting sophisticated cyberattacks, mainly Distributed Denial of Service (DDoS) attacks with unexpected patterns, remains challenging in modern networks. Traditional detection systems often struggle to mitigate such attacks in both conventional and software-defined networking (SDN) environments. While Machine Learning (ML) models can distinguish between benign and malicious traffic, their limited feature scope hinders the detection of new zero-day or low-rate DDoS attacks and requires frequent retraining. In this paper, we propose a novel DDoS detection framework that combines ML and Ensemble Learning (EL) techniques to improve DDoS attack detection and mitigation in SDN environments. Our model uses the "DDoS SDN" dataset for training and evaluation and employs a dynamic feature selection mechanism that improves detection accuracy by focusing on the most relevant features. This adaptive approach addresses the limitations of conventional ML models and detects a wider range of DDoS attack scenarios more accurately. Our ensemble model introduces an additional layer of detection, increasing reliability through the application of ensemble techniques. The proposed solution significantly enhances the model's ability to identify and respond to dynamic threats in SDNs, providing a strong foundation for proactive DDoS detection and mitigation against evolving threats. A comprehensive runtime analysis of Simultaneous Multi-Threading (SMT) on identical configurations shows superior accuracy and efficiency with significantly reduced computation time, making the approach suitable for real-time DDoS detection in dynamic, rapidly changing SDNs. Experimental results demonstrate that our model outperforms traditional algorithms, achieving 99% accuracy with Random Forest (RF) and K-Nearest Neighbors (KNN) and 98% accuracy with XGBoost.
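A minimal sketch of the ensemble idea with scikit-learn: RF and KNN predictions combined by soft voting. The synthetic feature matrix stands in for the "DDoS SDN" dataset, whose actual features, preprocessing, and dynamic feature selection are not reproduced here.

```python
# Illustrative RF + KNN soft-voting ensemble on placeholder data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

ensemble = VotingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
                ("knn", KNeighborsClassifier(n_neighbors=5))],
    voting="soft")                       # average predicted class probabilities
ensemble.fit(X_tr, y_tr)
print("accuracy:", ensemble.score(X_te, y_te))
```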
In order to test the anti-interference ability of an Unmanned Aerial Vehicle (UAV) data link in a complex electromagnetic environment, a method for simulating the dynamic electromagnetic interference of an indoor wireless environment is proposed. The method uses Grey Relational Analysis (GRA) theory to estimate the relational degree between the actual situation of a UAV data link in an interference environment and the simulation scenarios in an anechoic chamber. The dynamic drive of the microwave instrument produces a corresponding interference signal in real time and realizes scene mapping. The experimental results show that the maximal correlation between the interference signal in the real scene and the angular domain of the radiation antenna in the anechoic chamber is 0.9593. Further, the relational degree of the Signal-to-Interference Ratio (SIR) at the UAV's reception terminal indoors versus in the anechoic chamber is 0.9968, and the instrument drive time is only approximately 10 μs. Together, these results show that the method can produce a simulation close to the real-field dynamic electromagnetic interference signal of an indoor UAV data link.
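For reference, the textbook form of the grey relational degree (Deng's coefficient with resolution rho = 0.5) is sketched below; the paper applies GRA to compare real-scene and anechoic-chamber interference sequences, though its exact normalization is not given in the abstract, so the data here are invented.

```python
# Grey Relational Analysis: mean of Deng's grey relational coefficients.
import numpy as np

def grey_relational_degree(reference, comparison, rho=0.5):
    x0 = np.asarray(reference, dtype=float)
    x1 = np.asarray(comparison, dtype=float)
    delta = np.abs(x0 - x1)
    coeff = (delta.min() + rho * delta.max()) / (delta + rho * delta.max())
    return coeff.mean()                  # relational degree = mean coefficient

real_scene = [0.82, 0.91, 0.77, 0.88]   # illustrative sequences
chamber    = [0.80, 0.93, 0.74, 0.90]
print(grey_relational_degree(real_scene, chamber))
```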
Tactical Data Link (TDL) is a communication system that utilizes a particular message format and protocol to transmit data via wireless channels in an instant, automatic, and secure way. So far, TDL has shown its excellence in military applications. Current TDL adopts a distributed architecture to enhance anti-destruction capacity. However, it still faces a problem of data inconsistency and thus cannot well support cooperation across multiple military domains. To tackle this problem, we propose to leverage blockchain to build an automatic and adaptive data transmission control scheme for TDL. It achieves automatic data transmission and realizes information consistency among different TDL entities. Moreover, applying blockchain-based smart contracts further enables data transmission policies to be adjusted automatically. Security analysis and experimental results based on simulations illustrate the effectiveness and efficiency of the proposed scheme.
This paper describes how data records can be matched across large datasets using a technique called the Identity Correlation Approach (ICA). The ICA technique is then compared with a string matching exercise. Both were employed for a big data project carried out by the CSO, the SESADP (Structure of Earnings Survey Administrative Data Project), which involved linking the 2011 Irish Census dataset to a large public sector dataset. The ICA technique provides a mathematical tool to link the datasets, and the matching rate for an exact match can be calculated before the matching process begins. Based on the number of variables and the size of the population, the matching rate is calculated in the ICA approach from the MRUI (Matching Rate for Unique Identifier) formula, and false positives are eliminated. No string matching is used in the ICA, so names are not required on the dataset, making the data more secure and ensuring confidentiality. The SESADP project was highly successful using the ICA technique. A comparison of the results of the string matching exercise and the ICA for the SESADP is discussed here.
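The MRUI formula itself is not reproduced in the abstract. As a back-of-envelope stand-in for the same quantity: if k matching variables are assumed independent and uniformly distributed with cardinalities c1..ck, a record's value combination is unique in a population of N with probability roughly (1 - 1/C)^(N-1), where C = c1 * ... * ck. The sketch below uses this simplification with invented variables; it is not the CSO's formula.

```python
# NOT the MRUI formula -- an illustrative uniqueness model under
# independence and uniformity assumptions.
import math

def approx_unique_rate(cardinalities, population):
    combos = math.prod(cardinalities)       # C = product of cardinalities
    return (1.0 - 1.0 / combos) ** (population - 1)

# Hypothetical variables: sex (2), year of birth (80), county (26), occupation (400)
print(approx_unique_rate([2, 80, 26, 400], population=4_588_252))
```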
Privacy protection for big data linking is discussed here in relation to the Central Statistics Office (CSO), Ireland's, big data linking project, the 'Structure of Earnings Survey - Administrative Data Project' (SESADP). The result of the project was the creation of datasets and statistical outputs for the years 2011 to 2014 to meet Eurostat's annual earnings statistics requirements and the Structure of Earnings Survey (SES) Regulation. Record linking across the Census and various public sector datasets enabled the necessary information to be acquired to meet the Eurostat earnings requirements. However, the risk of statistical disclosure (i.e., identifying an individual on the dataset) is high unless privacy and confidentiality safeguards are built into the data matching process. This paper looks at the three methods of linking records on big datasets employed on the SESADP, and at how to anonymise the data to protect the identity of individuals where potentially disclosive variables exist.
Based on an analysis of the very high frequency (VHF) self-organized time division multiple access (S-TDMA) aviation data link, a new dynamic slot assignment scheme is proposed in this paper; it adopts a variable data frame structure and can eliminate the effect of idle slots on message delay. Using queueing theory, analysis models of the new scheme and the previous scheme are presented, and message delay and system throughput are analyzed under both. The simulation results show that the new scheme outperforms the previous one in message delay and system throughput.
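As a generic point of reference (not the paper's actual queueing model), the mean sojourn time of an M/M/1 queue, W = 1/(mu - lambda), shows how removing idle slots, which effectively raises the service rate mu, cuts message delay:

```python
# Illustrative M/M/1 mean delay; parameters are invented.
def mm1_mean_delay(arrival_rate, service_rate):
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: lambda must be < mu")
    return 1.0 / (service_rate - arrival_rate)

# Messages/s arriving vs. slots/s served; fewer idle slots => higher mu.
print(mm1_mean_delay(arrival_rate=40.0, service_rate=50.0))   # 0.10 s
print(mm1_mean_delay(arrival_rate=40.0, service_rate=60.0))   # 0.05 s
```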
Data link communication requires that the data communication process provide reliability, availability, confidentiality, integrity, non-repudiation, and controllability. These properties are essential to ensuring normal communication functions. In this paper, the author, drawing on many years of work experience, first discusses the establishment of a risk assessment system for the data link, and then focuses on the problem of index weight assessment. This article provides some references for data communication security.
Aiming at the problem that only some types of SPARQL (SPARQL Protocol and RDF Query Language) queries can be answered by the current resource description framework link traversal based query execution (RDF-LTE) approach, this paper discusses how the execution order of triple patterns affects query results and cost, based on concrete SPARQL queries, and analyzes two properties of the web of linked data: missing backward links and missing contingency solutions. Three heuristic principles for logical query plan optimization are then proposed: the filtered basic graph pattern (FBGP) principle, the triple pattern chain principle, and the seed URIs principle. The three principles help decrease intermediate solutions and increase the types of queries that can be answered. The effectiveness and feasibility of the proposed approach are evaluated; the experimental results show that more query results can be returned at less cost, enabling users to realize the full potential of the web of linked data.
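To give a flavor of the seed URIs principle (a simplified reading, not the paper's full heuristics), the sketch below reorders a basic graph pattern so that triple patterns with the most constants, which give a link-traversal engine dereferenceable seeds, execute first:

```python
# Toy reordering of a basic graph pattern: most-bound patterns first.
def is_var(term):
    return term.startswith("?")

def seed_first(bgp):
    def bound_terms(tp):
        return sum(0 if is_var(t) else 1 for t in tp)
    return sorted(bgp, key=bound_terms, reverse=True)

bgp = [("?film", "movie:director", "?dir"),
       ("?dir", "owl:sameAs", "dbpedia:Tim_Burton"),   # seed URI pattern
       ("?film", "rdfs:label", "?title")]
for tp in seed_first(bgp):
    print(tp)
```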