An attempt has been made to develop a distributed software infrastructure model for onboard data fusion system simulation, which is also applied to netted radar systems, onboard distributed detection systems and advan...An attempt has been made to develop a distributed software infrastructure model for onboard data fusion system simulation, which is also applied to netted radar systems, onboard distributed detection systems and advanced C3I systems. Two architectures are provided and verified: one is based on pure TCP/IP protocol and C/S model, and implemented with Winsock, the other is based on CORBA (common object request broker architecture). The performance of data fusion simulation system, i.e. reliability, flexibility and scalability, is improved and enhanced by two models. The study of them makes valuable explore on incorporating the distributed computation concepts into radar system simulation techniques.展开更多
This paper presents a data fusion method in distributed multi-sensor system including GPS and INS sensors’ data processing. First, a residual χ 2 \|test strategy with the corresponding algorithm is designed. Then a ...This paper presents a data fusion method in distributed multi-sensor system including GPS and INS sensors’ data processing. First, a residual χ 2 \|test strategy with the corresponding algorithm is designed. Then a coefficient matrices calculation method of the information sharing principle is derived. Finally, the federated Kalman filter is used to combine these independent, parallel, real\|time data. A pseudolite (PL) simulation example is given.展开更多
It is difficult to parallelize a subsistent sequential algorithm. Through analyzing the sequential algorithm of a Global Atmospheric Data Objective Analysis System, this article puts forward a distributed parallel alg...It is difficult to parallelize a subsistent sequential algorithm. Through analyzing the sequential algorithm of a Global Atmospheric Data Objective Analysis System, this article puts forward a distributed parallel algorithm that statically distributes data on a massively parallel processing (MPP) computer. The algorithm realizes distributed parailelization by extracting the analysis boxes and model grid point Iatitude rows with leaped steps, and by distributing the data to different processors. The parallel algorithm achieves good load balancing, high parallel efficiency, and low parallel cost. Performance experiments on a MPP computer arc also presented.展开更多
The security of Federated Learning(FL)/Distributed Machine Learning(DML)is gravely threatened by data poisoning attacks,which destroy the usability of the model by contaminating training samples,so such attacks are ca...The security of Federated Learning(FL)/Distributed Machine Learning(DML)is gravely threatened by data poisoning attacks,which destroy the usability of the model by contaminating training samples,so such attacks are called causative availability indiscriminate attacks.Facing the problem that existing data sanitization methods are hard to apply to real-time applications due to their tedious process and heavy computations,we propose a new supervised batch detection method for poison,which can fleetly sanitize the training dataset before the local model training.We design a training dataset generation method that helps to enhance accuracy and uses data complexity features to train a detection model,which will be used in an efficient batch hierarchical detection process.Our model stockpiles knowledge about poison,which can be expanded by retraining to adapt to new attacks.Being neither attack-specific nor scenario-specific,our method is applicable to FL/DML or other online or offline scenarios.展开更多
This paper describes the architecture of global distributed storage system for data grid. It focue on the management and the capability for the maximum users and maximum resources on the Internet, as well as performan...This paper describes the architecture of global distributed storage system for data grid. It focue on the management and the capability for the maximum users and maximum resources on the Internet, as well as performance and other issues.展开更多
The scale and complexity of big data are growing continuously,posing severe challenges to traditional data processing methods,especially in the field of clustering analysis.To address this issue,this paper introduces ...The scale and complexity of big data are growing continuously,posing severe challenges to traditional data processing methods,especially in the field of clustering analysis.To address this issue,this paper introduces a new method named Big Data Tensor Multi-Cluster Distributed Incremental Update(BDTMCDIncreUpdate),which combines distributed computing,storage technology,and incremental update techniques to provide an efficient and effective means for clustering analysis.Firstly,the original dataset is divided into multiple subblocks,and distributed computing resources are utilized to process the sub-blocks in parallel,enhancing efficiency.Then,initial clustering is performed on each sub-block using tensor-based multi-clustering techniques to obtain preliminary results.When new data arrives,incremental update technology is employed to update the core tensor and factor matrix,ensuring that the clustering model can adapt to changes in data.Finally,by combining the updated core tensor and factor matrix with historical computational results,refined clustering results are obtained,achieving real-time adaptation to dynamic data.Through experimental simulation on the Aminer dataset,the BDTMCDIncreUpdate method has demonstrated outstanding performance in terms of accuracy(ACC)and normalized mutual information(NMI)metrics,achieving an accuracy rate of 90%and an NMI score of 0.85,which outperforms existing methods such as TClusInitUpdate and TKLClusUpdate in most scenarios.Therefore,the BDTMCDIncreUpdate method offers an innovative solution to the field of big data analysis,integrating distributed computing,incremental updates,and tensor-based multi-clustering techniques.It not only improves the efficiency and scalability in processing large-scale high-dimensional datasets but also has been validated for its effectiveness and accuracy through experiments.This method shows great potential in real-world applications where dynamic data growth is common,and it is of significant importance for advancing the development of data analysis technology.展开更多
Due to the restricted satellite payloads in LEO mega-constellation networks(LMCNs),remote sensing image analysis,online learning and other big data services desirably need onboard distributed processing(OBDP).In exist...Due to the restricted satellite payloads in LEO mega-constellation networks(LMCNs),remote sensing image analysis,online learning and other big data services desirably need onboard distributed processing(OBDP).In existing technologies,the efficiency of big data applications(BDAs)in distributed systems hinges on the stable-state and low-latency links between worker nodes.However,LMCNs with high-dynamic nodes and long-distance links can not provide the above conditions,which makes the performance of OBDP hard to be intuitively measured.To bridge this gap,a multidimensional simulation platform is indispensable that can simulate the network environment of LMCNs and put BDAs in it for performance testing.Using STK's APIs and parallel computing framework,we achieve real-time simulation for thousands of satellite nodes,which are mapped as application nodes through software defined network(SDN)and container technologies.We elaborate the architecture and mechanism of the simulation platform,and take the Starlink and Hadoop as realistic examples for simulations.The results indicate that LMCNs have dynamic end-to-end latency which fluctuates periodically with the constellation movement.Compared to ground data center networks(GDCNs),LMCNs deteriorate the computing and storage job throughput,which can be alleviated by the utilization of erasure codes and data flow scheduling of worker nodes.展开更多
Due to the limited scenes that synthetic aperture radar(SAR)satellites can detect,the full-track utilization rate is not high.Because of the computing and storage limitation of one satellite,it is difficult to process...Due to the limited scenes that synthetic aperture radar(SAR)satellites can detect,the full-track utilization rate is not high.Because of the computing and storage limitation of one satellite,it is difficult to process large amounts of data of spaceborne synthetic aperture radars.It is proposed to use a new method of networked satellite data processing for improving the efficiency of data processing.A multi-satellite distributed SAR real-time processing method based on Chirp Scaling(CS)imaging algorithm is studied in this paper,and a distributed data processing system is built with field programmable gate array(FPGA)chips as the kernel.Different from the traditional CS algorithm processing,the system divides data processing into three stages.The computing tasks are reasonably allocated to different data processing units(i.e.,satellites)in each stage.The method effectively saves computing and storage resources of satellites,improves the utilization rate of a single satellite,and shortens the data processing time.Gaofen-3(GF-3)satellite SAR raw data is processed by the system,with the performance of the method verified.展开更多
Multimodal sentiment analysis utilizes multimodal data such as text,facial expressions and voice to detect people’s attitudes.With the advent of distributed data collection and annotation,we can easily obtain and sha...Multimodal sentiment analysis utilizes multimodal data such as text,facial expressions and voice to detect people’s attitudes.With the advent of distributed data collection and annotation,we can easily obtain and share such multimodal data.However,due to professional discrepancies among annotators and lax quality control,noisy labels might be introduced.Recent research suggests that deep neural networks(DNNs)will overfit noisy labels,leading to the poor performance of the DNNs.To address this challenging problem,we present a Multimodal Robust Meta Learning framework(MRML)for multimodal sentiment analysis to resist noisy labels and correlate distinct modalities simultaneously.Specifically,we propose a two-layer fusion net to deeply fuse different modalities and improve the quality of the multimodal data features for label correction and network training.Besides,a multiple meta-learner(label corrector)strategy is proposed to enhance the label correction approach and prevent models from overfitting to noisy labels.We conducted experiments on three popular multimodal datasets to verify the superiority of ourmethod by comparing it with four baselines.展开更多
Distributed Data Mining is expected to discover preciously unknown, implicit and valuable information from massive data set inherently distributed over a network. In recent years several approaches to distributed data...Distributed Data Mining is expected to discover preciously unknown, implicit and valuable information from massive data set inherently distributed over a network. In recent years several approaches to distributed data mining have been developed, but only a few of them make use of intelligent agents. This paper provides the reason for applying Multi-Agent Technology in Distributed Data Mining and presents a Distributed Data Mining System based on Multi-Agent Technology that deals with heterogeneity in such environment. Based on the advantages of both the CS model and agent-based model, the system is being able to address the specific concern of increasing scalability and enhancing performance.展开更多
A DMVOCC-MVDA (distributed multiversion optimistic concurrency control with multiversion dynamic adjustment) protocol was presented to process mobile distributed real-time transaction in mobile broadcast environment...A DMVOCC-MVDA (distributed multiversion optimistic concurrency control with multiversion dynamic adjustment) protocol was presented to process mobile distributed real-time transaction in mobile broadcast environments. At the mobile hosts, all transactions perform local pre-validation. The local pre-validation process is carried out against the committed transactions at the server in the last broadcast cycle. Transactions that survive in local pre-validation must be submitted to the server for local final validation. The new protocol eliminates conflicts between mobile read-only and mobile update transactions, and resolves data conflicts flexibly by using multiversion dynamic adjustment of serialization order to avoid unnecessary restarts of transactions. Mobile read-only transactions can be committed with no-blocking, and respond time of mobile read-only transactions is greatly shortened. The tolerance of mobile transactions of disconnections from the broadcast channel is increased. In global validation mobile distributed transactions have to do check to ensure distributed serializability in all participants. The simulation results show that the new concurrency control protocol proposed offers better performance than other protocols in terms of miss rate, restart rate, commit rate. Under high work load (think time is ls) the miss rate of DMVOCC-MVDA is only 14.6%, is significantly lower than that of other protocols. The restart rate of DMVOCC-MVDA is only 32.3%, showing that DMVOCC-MVDA can effectively reduce the restart rate of mobile transactions. And the commit rate of DMVOCC-MVDA is up to 61.2%, which is obviously higher than that of other protocols.展开更多
Aiming at the shortcomings in intrusion detection systems (IDSs) used incommercial and research fields, we propose the MA-IDS system, a distributed intrusion detectionsystem based on data mining. In this model, misuse...Aiming at the shortcomings in intrusion detection systems (IDSs) used incommercial and research fields, we propose the MA-IDS system, a distributed intrusion detectionsystem based on data mining. In this model, misuse intrusion detection system CM1DS) and anomalyintrusion de-lection system (AIDS) are combined. Data mining is applied to raise detectionperformance, and distributed mechanism is employed to increase the scalability and efficiency. Host-and network-based mining algorithms employ an improved. Bayes-ian decision theorem that suits forreal security environment to minimize the risks incurred by false decisions. We describe the overallarchitecture of the MA-IDS system, and discuss specific design and implementation issue.展开更多
HT-7 is the first superconducting tokamak device for fusion research in China. Many experiments have been done in the machine since 1994, and lots of satisfactory results have been achieved in the fusion research fiel...HT-7 is the first superconducting tokamak device for fusion research in China. Many experiments have been done in the machine since 1994, and lots of satisfactory results have been achieved in the fusion research field on HT-7 tokamak [1]. With the development of fusion research, remote control of experiment becomes more and more important to improve experimental efficiency and expand research results. This paper will describe a RCS (Remote Control System), the combined model of Browser/Server and Client/Server, based on Internet of HT-7 distributed data acquisition system (HT7DAS). By means of RCS, authorized users all over the world can control and configure HT7DAS remotely. The RCS is designed to improve the flexibility, opening, reliability and efficiency of HT7DAS. In the paper, the whole process of design along with implementation of the system and some key items are discussed in detail. The System has been successfully operated during HT-7 experiment in 2002 campaign period.展开更多
A distributed processing system (DPS) contains many autonomous nodes, which contribute their own computing power. DPS is considered a unified logical structure, operating in a distributed manner;the processing tasks a...A distributed processing system (DPS) contains many autonomous nodes, which contribute their own computing power. DPS is considered a unified logical structure, operating in a distributed manner;the processing tasks are divided into fragments and assigned to various nodes for processing. That type of operation requires and involves a great deal of communication. We propose to use the decentralized approach, based on a distributed hash table, to reduce the communication overhead and remove the server unit, thus avoiding having a single point of failure in the system. This paper proposes a mathematical model and algorithms that are implemented in a dedicated experimental system. Using the decentralized approach, this study demonstrates the efficient operation of a decentralized system which results in a reduced energy emission.展开更多
Product data management (PDM) has been accepted as an important tool for the manufacturing industries. In recent years, more and mor e researches have been conducted in the development of PDM. Their research area s in...Product data management (PDM) has been accepted as an important tool for the manufacturing industries. In recent years, more and mor e researches have been conducted in the development of PDM. Their research area s include system design, integration of object-oriented technology, data distri bution, collaborative and distributed manufacturing working environment, secur ity, and web-based integration. However, there are limitations on their rese arches. In particular, they cannot cater for PDM in distributed manufacturing e nvironment. This is especially true in South China, where many Hong Kong (HK) ma nufacturers have moved their production plants to different locations in Pearl R iver Delta for cost reduction. However, they retain their main offices in HK. Development of PDM system is inherently complex. Product related data cover prod uct name, product part number (product identification), drawings, material speci fications, dimension requirement, quality specification, test result, log size, production schedules, product data version and date of release, special tooling (e.g. jig and fixture), mould design, project engineering in charge, cost spread sheets, while process data includes engineering release, engineering change info rmation management, and other workflow related to the process information. Accor ding to Cornelissen et al., the contemporary PDM system should contains manageme nt functions in structure, retrieval, release, change, and workflow. In system design, development and implementation, a formal specification is nece ssary. However, there is no formal representation model for PDM system. Theref ore a graphical representation model is constructed to express the various scena rios of interactions between users and the PDM system. Statechart is then used to model the operations of PDM system, Fig.1. Statechart model bridges the curr ent gap between requirements, scenarios, and the initial design specifications o f PDM system. After properly analyzing the PDM system, a new distributed PDM (DPDM) system is proposed. Both graphical representation and statechart models are constructed f or the new DPDM system, Fig.2. New product data of DPDM and new system function s are then investigated to support product information flow in the new distribut ed environment. It is found that statecharts allow formal representations to capture the informa tion and control flows of both PDM and DPDM. In particular, statechart offers a dditional expressive power, when compared to conventional state transition diagr am, in terms of hierarchy, concurrency, history, and timing for DPDM behavioral modeling.展开更多
This report presents the design and implementation of a Distributed Data Acquisition、 Monitoring and Processing System (DDAMAP)。It is assumed that operations of a factory are organized into two-levels: client machin...This report presents the design and implementation of a Distributed Data Acquisition、 Monitoring and Processing System (DDAMAP)。It is assumed that operations of a factory are organized into two-levels: client machines at plant-level collect real-time raw data from sensors and measurement instrumentations and transfer them to a central processor over the Ethernets, and the central processor handles tasks of real-time data processing and monitoring. This system utilizes the computation power of Intel T2300 dual-core processor and parallel computations supported by multi-threading techniques. Our experiments show that these techniques can significantly improve the system performance and are viable solutions to real-time high-speed data processing.展开更多
Based on MATRIXx, a universal real-time visual distributed simulation system is developed. The system can receive different input data from network or local terminal. Application models in the simulation modules can a...Based on MATRIXx, a universal real-time visual distributed simulation system is developed. The system can receive different input data from network or local terminal. Application models in the simulation modules can automatically get such data to be analyzed and calculated, and then produce real-time simulation control information. Meanwhile, this paper designs relevant simulation components to implement the input and output data, which can guarantee the real-time and universal of the data transmission. Result of the experimental system shows that the real-time performance of the simulation is perfect.展开更多
There are two key issues in distributed intrusion detection system,that is,maintaining load balance of system and protecting data integrity.To address these issues,this paper proposes a new distributed intrusion detec...There are two key issues in distributed intrusion detection system,that is,maintaining load balance of system and protecting data integrity.To address these issues,this paper proposes a new distributed intrusion detection model for big data based on nondestructive partitioning and balanced allocation.A data allocation strategy based on capacity and workload is introduced to achieve local load balance,and a dynamic load adjustment strategy is adopted to maintain global load balance of cluster.Moreover,data integrity is protected by using session reassemble and session partitioning.The simulation results show that the new model enjoys favorable advantages such as good load balance,higher detection rate and detection efficiency.展开更多
In wastewater treatment process(WWTP), the accurate and real-time monitoring values of key variables are crucial for the operational strategies. However, most of the existing methods have difficulty in obtaining the r...In wastewater treatment process(WWTP), the accurate and real-time monitoring values of key variables are crucial for the operational strategies. However, most of the existing methods have difficulty in obtaining the real-time values of some key variables in the process. In order to handle this issue, a data-driven intelligent monitoring system, using the soft sensor technique and data distribution service, is developed to monitor the concentrations of effluent total phosphorous(TP) and ammonia nitrogen(NH_4-N). In this intelligent monitoring system, a fuzzy neural network(FNN) is applied for designing the soft sensor model, and a principal component analysis(PCA) method is used to select the input variables of the soft sensor model. Moreover, data transfer software is exploited to insert the soft sensor technique to the supervisory control and data acquisition(SCADA) system. Finally, this proposed intelligent monitoring system is tested in several real plants to demonstrate the reliability and effectiveness of the monitoring performance.展开更多
On 21 May 2021(UTC),an MW 7.4 earthquake jolted the east Bayan Har block in the Tibetan Plateau.The earthquake received widespread attention as it is the largest event in the Tibetan Plateau and its surroundings since...On 21 May 2021(UTC),an MW 7.4 earthquake jolted the east Bayan Har block in the Tibetan Plateau.The earthquake received widespread attention as it is the largest event in the Tibetan Plateau and its surroundings since the 2008 Wenchuan earthquake,and especially in proximity to the seismic gaps on the east Kunlun fault.Here we use satellite interferometric synthetic aperture radar data and subpixel offset observations along the range directions to characterize the coseismic deformation of the earthquake.Range offset displacements depict clear surface ruptures with a total length of~170 km involving two possible activated fault segments in the earthquake.Coseismic modeling results indicate that the earthquake was dominated by left-lateral strike-slip motions of up to 7 m within the top 12 km of the crust.The well-resolved slip variations are characterized by five major slip patches along strike and 64%of shallow slip deficit,suggesting a young seismogenic structure.Spatial-temporal changes of the postseismic deformation are mapped from early 6-day and 24-day InSAR observations,and are well explained by time-dependent afterslip models.Analysis of Global Navigation Satellite System(GNSS)velocity profiles and strain rates suggests that the eastward extrusion of plateau is diffusely distributed across the east Bayan Har block,but exhibits significant lateral heterogeneities,as evidenced by magnetotelluric observations.The block-wide distributed deformation of the east Bayan Har block along with the significant co-and post-seismic stress loadings from the Madoi earthquake imply high seismic risks along regional faults,especially the Tuosuo Lake and Maqên-Maqu segments of the Kunlun fault that are known as seismic gaps.展开更多
文摘An attempt has been made to develop a distributed software infrastructure model for onboard data fusion system simulation, which is also applied to netted radar systems, onboard distributed detection systems and advanced C3I systems. Two architectures are provided and verified: one is based on pure TCP/IP protocol and C/S model, and implemented with Winsock, the other is based on CORBA (common object request broker architecture). The performance of data fusion simulation system, i.e. reliability, flexibility and scalability, is improved and enhanced by two models. The study of them makes valuable explore on incorporating the distributed computation concepts into radar system simulation techniques.
文摘This paper presents a data fusion method in distributed multi-sensor system including GPS and INS sensors’ data processing. First, a residual χ 2 \|test strategy with the corresponding algorithm is designed. Then a coefficient matrices calculation method of the information sharing principle is derived. Finally, the federated Kalman filter is used to combine these independent, parallel, real\|time data. A pseudolite (PL) simulation example is given.
文摘It is difficult to parallelize a subsistent sequential algorithm. Through analyzing the sequential algorithm of a Global Atmospheric Data Objective Analysis System, this article puts forward a distributed parallel algorithm that statically distributes data on a massively parallel processing (MPP) computer. The algorithm realizes distributed parailelization by extracting the analysis boxes and model grid point Iatitude rows with leaped steps, and by distributing the data to different processors. The parallel algorithm achieves good load balancing, high parallel efficiency, and low parallel cost. Performance experiments on a MPP computer arc also presented.
基金supported in part by the“Pioneer”and“Leading Goose”R&D Program of Zhejiang(Grant No.2022C03174)the National Natural Science Foundation of China(No.92067103)+4 种基金the Key Research and Development Program of Shaanxi,China(No.2021ZDLGY06-02)the Natural Science Foundation of Shaanxi Province(No.2019ZDLGY12-02)the Shaanxi Innovation Team Project(No.2018TD-007)the Xi'an Science and technology Innovation Plan(No.201809168CX9JC10)the Fundamental Research Funds for the Central Universities(No.YJS2212)and National 111 Program of China B16037.
文摘The security of Federated Learning(FL)/Distributed Machine Learning(DML)is gravely threatened by data poisoning attacks,which destroy the usability of the model by contaminating training samples,so such attacks are called causative availability indiscriminate attacks.Facing the problem that existing data sanitization methods are hard to apply to real-time applications due to their tedious process and heavy computations,we propose a new supervised batch detection method for poison,which can fleetly sanitize the training dataset before the local model training.We design a training dataset generation method that helps to enhance accuracy and uses data complexity features to train a detection model,which will be used in an efficient batch hierarchical detection process.Our model stockpiles knowledge about poison,which can be expanded by retraining to adapt to new attacks.Being neither attack-specific nor scenario-specific,our method is applicable to FL/DML or other online or offline scenarios.
文摘This paper describes the architecture of global distributed storage system for data grid. It focue on the management and the capability for the maximum users and maximum resources on the Internet, as well as performance and other issues.
基金sponsored by the National Natural Science Foundation of China(Nos.61972208,62102194 and 62102196)National Natural Science Foundation of China(Youth Project)(No.62302237)+3 种基金Six Talent Peaks Project of Jiangsu Province(No.RJFW-111),China Postdoctoral Science Foundation Project(No.2018M640509)Postgraduate Research and Practice Innovation Program of Jiangsu Province(Nos.KYCX22_1019,KYCX23_1087,KYCX22_1027,KYCX23_1087,SJCX24_0339 and SJCX24_0346)Innovative Training Program for College Students of Nanjing University of Posts and Telecommunications(No.XZD2019116)Nanjing University of Posts and Telecommunications College Students Innovation Training Program(Nos.XZD2019116,XYB2019331).
文摘The scale and complexity of big data are growing continuously,posing severe challenges to traditional data processing methods,especially in the field of clustering analysis.To address this issue,this paper introduces a new method named Big Data Tensor Multi-Cluster Distributed Incremental Update(BDTMCDIncreUpdate),which combines distributed computing,storage technology,and incremental update techniques to provide an efficient and effective means for clustering analysis.Firstly,the original dataset is divided into multiple subblocks,and distributed computing resources are utilized to process the sub-blocks in parallel,enhancing efficiency.Then,initial clustering is performed on each sub-block using tensor-based multi-clustering techniques to obtain preliminary results.When new data arrives,incremental update technology is employed to update the core tensor and factor matrix,ensuring that the clustering model can adapt to changes in data.Finally,by combining the updated core tensor and factor matrix with historical computational results,refined clustering results are obtained,achieving real-time adaptation to dynamic data.Through experimental simulation on the Aminer dataset,the BDTMCDIncreUpdate method has demonstrated outstanding performance in terms of accuracy(ACC)and normalized mutual information(NMI)metrics,achieving an accuracy rate of 90%and an NMI score of 0.85,which outperforms existing methods such as TClusInitUpdate and TKLClusUpdate in most scenarios.Therefore,the BDTMCDIncreUpdate method offers an innovative solution to the field of big data analysis,integrating distributed computing,incremental updates,and tensor-based multi-clustering techniques.It not only improves the efficiency and scalability in processing large-scale high-dimensional datasets but also has been validated for its effectiveness and accuracy through experiments.This method shows great potential in real-world applications where dynamic data growth is common,and it is of significant importance for advancing the development of data analysis technology.
基金supported by National Natural Sciences Foundation of China(No.62271165,62027802,62201307)the Guangdong Basic and Applied Basic Research Foundation(No.2023A1515030297)+2 种基金the Shenzhen Science and Technology Program ZDSYS20210623091808025Stable Support Plan Program GXWD20231129102638002the Major Key Project of PCL(No.PCL2024A01)。
文摘Due to the restricted satellite payloads in LEO mega-constellation networks(LMCNs),remote sensing image analysis,online learning and other big data services desirably need onboard distributed processing(OBDP).In existing technologies,the efficiency of big data applications(BDAs)in distributed systems hinges on the stable-state and low-latency links between worker nodes.However,LMCNs with high-dynamic nodes and long-distance links can not provide the above conditions,which makes the performance of OBDP hard to be intuitively measured.To bridge this gap,a multidimensional simulation platform is indispensable that can simulate the network environment of LMCNs and put BDAs in it for performance testing.Using STK's APIs and parallel computing framework,we achieve real-time simulation for thousands of satellite nodes,which are mapped as application nodes through software defined network(SDN)and container technologies.We elaborate the architecture and mechanism of the simulation platform,and take the Starlink and Hadoop as realistic examples for simulations.The results indicate that LMCNs have dynamic end-to-end latency which fluctuates periodically with the constellation movement.Compared to ground data center networks(GDCNs),LMCNs deteriorate the computing and storage job throughput,which can be alleviated by the utilization of erasure codes and data flow scheduling of worker nodes.
基金Project(2017YFC1405600)supported by the National Key R&D Program of ChinaProject(18JK05032)supported by the Scientific Research Project of Education Department of Shaanxi Province,China。
文摘Due to the limited scenes that synthetic aperture radar(SAR)satellites can detect,the full-track utilization rate is not high.Because of the computing and storage limitation of one satellite,it is difficult to process large amounts of data of spaceborne synthetic aperture radars.It is proposed to use a new method of networked satellite data processing for improving the efficiency of data processing.A multi-satellite distributed SAR real-time processing method based on Chirp Scaling(CS)imaging algorithm is studied in this paper,and a distributed data processing system is built with field programmable gate array(FPGA)chips as the kernel.Different from the traditional CS algorithm processing,the system divides data processing into three stages.The computing tasks are reasonably allocated to different data processing units(i.e.,satellites)in each stage.The method effectively saves computing and storage resources of satellites,improves the utilization rate of a single satellite,and shortens the data processing time.Gaofen-3(GF-3)satellite SAR raw data is processed by the system,with the performance of the method verified.
基金supported by STI 2030-Major Projects 2021ZD0200400National Natural Science Foundation of China(62276233 and 62072405)Key Research Project of Zhejiang Province(2023C01048).
文摘Multimodal sentiment analysis utilizes multimodal data such as text,facial expressions and voice to detect people’s attitudes.With the advent of distributed data collection and annotation,we can easily obtain and share such multimodal data.However,due to professional discrepancies among annotators and lax quality control,noisy labels might be introduced.Recent research suggests that deep neural networks(DNNs)will overfit noisy labels,leading to the poor performance of the DNNs.To address this challenging problem,we present a Multimodal Robust Meta Learning framework(MRML)for multimodal sentiment analysis to resist noisy labels and correlate distinct modalities simultaneously.Specifically,we propose a two-layer fusion net to deeply fuse different modalities and improve the quality of the multimodal data features for label correction and network training.Besides,a multiple meta-learner(label corrector)strategy is proposed to enhance the label correction approach and prevent models from overfitting to noisy labels.We conducted experiments on three popular multimodal datasets to verify the superiority of ourmethod by comparing it with four baselines.
文摘Distributed Data Mining is expected to discover preciously unknown, implicit and valuable information from massive data set inherently distributed over a network. In recent years several approaches to distributed data mining have been developed, but only a few of them make use of intelligent agents. This paper provides the reason for applying Multi-Agent Technology in Distributed Data Mining and presents a Distributed Data Mining System based on Multi-Agent Technology that deals with heterogeneity in such environment. Based on the advantages of both the CS model and agent-based model, the system is being able to address the specific concern of increasing scalability and enhancing performance.
基金Project(20030533011)supported by the National Research Foundation for the Doctoral Program of Higher Education of China
文摘A DMVOCC-MVDA (distributed multiversion optimistic concurrency control with multiversion dynamic adjustment) protocol was presented to process mobile distributed real-time transaction in mobile broadcast environments. At the mobile hosts, all transactions perform local pre-validation. The local pre-validation process is carried out against the committed transactions at the server in the last broadcast cycle. Transactions that survive in local pre-validation must be submitted to the server for local final validation. The new protocol eliminates conflicts between mobile read-only and mobile update transactions, and resolves data conflicts flexibly by using multiversion dynamic adjustment of serialization order to avoid unnecessary restarts of transactions. Mobile read-only transactions can be committed with no-blocking, and respond time of mobile read-only transactions is greatly shortened. The tolerance of mobile transactions of disconnections from the broadcast channel is increased. In global validation mobile distributed transactions have to do check to ensure distributed serializability in all participants. The simulation results show that the new concurrency control protocol proposed offers better performance than other protocols in terms of miss rate, restart rate, commit rate. Under high work load (think time is ls) the miss rate of DMVOCC-MVDA is only 14.6%, is significantly lower than that of other protocols. The restart rate of DMVOCC-MVDA is only 32.3%, showing that DMVOCC-MVDA can effectively reduce the restart rate of mobile transactions. And the commit rate of DMVOCC-MVDA is up to 61.2%, which is obviously higher than that of other protocols.
文摘Aiming at the shortcomings in intrusion detection systems (IDSs) used incommercial and research fields, we propose the MA-IDS system, a distributed intrusion detectionsystem based on data mining. In this model, misuse intrusion detection system CM1DS) and anomalyintrusion de-lection system (AIDS) are combined. Data mining is applied to raise detectionperformance, and distributed mechanism is employed to increase the scalability and efficiency. Host-and network-based mining algorithms employ an improved. Bayes-ian decision theorem that suits forreal security environment to minimize the risks incurred by false decisions. We describe the overallarchitecture of the MA-IDS system, and discuss specific design and implementation issue.
基金The project supported by the Meg-science Engineering Project of the Chinese Academy of Sciences
文摘HT-7 is the first superconducting tokamak device for fusion research in China. Many experiments have been done in the machine since 1994, and lots of satisfactory results have been achieved in the fusion research field on HT-7 tokamak [1]. With the development of fusion research, remote control of experiment becomes more and more important to improve experimental efficiency and expand research results. This paper will describe a RCS (Remote Control System), the combined model of Browser/Server and Client/Server, based on Internet of HT-7 distributed data acquisition system (HT7DAS). By means of RCS, authorized users all over the world can control and configure HT7DAS remotely. The RCS is designed to improve the flexibility, opening, reliability and efficiency of HT7DAS. In the paper, the whole process of design along with implementation of the system and some key items are discussed in detail. The System has been successfully operated during HT-7 experiment in 2002 campaign period.
文摘A distributed processing system (DPS) contains many autonomous nodes, which contribute their own computing power. DPS is considered a unified logical structure, operating in a distributed manner;the processing tasks are divided into fragments and assigned to various nodes for processing. That type of operation requires and involves a great deal of communication. We propose to use the decentralized approach, based on a distributed hash table, to reduce the communication overhead and remove the server unit, thus avoiding having a single point of failure in the system. This paper proposes a mathematical model and algorithms that are implemented in a dedicated experimental system. Using the decentralized approach, this study demonstrates the efficient operation of a decentralized system which results in a reduced energy emission.
文摘Product data management (PDM) has been accepted as an important tool for the manufacturing industries. In recent years, more and mor e researches have been conducted in the development of PDM. Their research area s include system design, integration of object-oriented technology, data distri bution, collaborative and distributed manufacturing working environment, secur ity, and web-based integration. However, there are limitations on their rese arches. In particular, they cannot cater for PDM in distributed manufacturing e nvironment. This is especially true in South China, where many Hong Kong (HK) ma nufacturers have moved their production plants to different locations in Pearl R iver Delta for cost reduction. However, they retain their main offices in HK. Development of PDM system is inherently complex. Product related data cover prod uct name, product part number (product identification), drawings, material speci fications, dimension requirement, quality specification, test result, log size, production schedules, product data version and date of release, special tooling (e.g. jig and fixture), mould design, project engineering in charge, cost spread sheets, while process data includes engineering release, engineering change info rmation management, and other workflow related to the process information. Accor ding to Cornelissen et al., the contemporary PDM system should contains manageme nt functions in structure, retrieval, release, change, and workflow. In system design, development and implementation, a formal specification is nece ssary. However, there is no formal representation model for PDM system. Theref ore a graphical representation model is constructed to express the various scena rios of interactions between users and the PDM system. Statechart is then used to model the operations of PDM system, Fig.1. Statechart model bridges the curr ent gap between requirements, scenarios, and the initial design specifications o f PDM system. After properly analyzing the PDM system, a new distributed PDM (DPDM) system is proposed. Both graphical representation and statechart models are constructed f or the new DPDM system, Fig.2. New product data of DPDM and new system function s are then investigated to support product information flow in the new distribut ed environment. It is found that statecharts allow formal representations to capture the informa tion and control flows of both PDM and DPDM. In particular, statechart offers a dditional expressive power, when compared to conventional state transition diagr am, in terms of hierarchy, concurrency, history, and timing for DPDM behavioral modeling.
文摘This report presents the design and implementation of a Distributed Data Acquisition、 Monitoring and Processing System (DDAMAP)。It is assumed that operations of a factory are organized into two-levels: client machines at plant-level collect real-time raw data from sensors and measurement instrumentations and transfer them to a central processor over the Ethernets, and the central processor handles tasks of real-time data processing and monitoring. This system utilizes the computation power of Intel T2300 dual-core processor and parallel computations supported by multi-threading techniques. Our experiments show that these techniques can significantly improve the system performance and are viable solutions to real-time high-speed data processing.
文摘Based on MATRIXx, a universal real-time visual distributed simulation system is developed. The system can receive different input data from network or local terminal. Application models in the simulation modules can automatically get such data to be analyzed and calculated, and then produce real-time simulation control information. Meanwhile, this paper designs relevant simulation components to implement the input and output data, which can guarantee the real-time and universal of the data transmission. Result of the experimental system shows that the real-time performance of the simulation is perfect.
文摘There are two key issues in distributed intrusion detection system,that is,maintaining load balance of system and protecting data integrity.To address these issues,this paper proposes a new distributed intrusion detection model for big data based on nondestructive partitioning and balanced allocation.A data allocation strategy based on capacity and workload is introduced to achieve local load balance,and a dynamic load adjustment strategy is adopted to maintain global load balance of cluster.Moreover,data integrity is protected by using session reassemble and session partitioning.The simulation results show that the new model enjoys favorable advantages such as good load balance,higher detection rate and detection efficiency.
基金Supported by the National Natural Science Foundation of China(61622301,61533002)Beijing Natural Science Foundation(4172005)Major National Science and Technology Project(2017ZX07104)
文摘In wastewater treatment process(WWTP), the accurate and real-time monitoring values of key variables are crucial for the operational strategies. However, most of the existing methods have difficulty in obtaining the real-time values of some key variables in the process. In order to handle this issue, a data-driven intelligent monitoring system, using the soft sensor technique and data distribution service, is developed to monitor the concentrations of effluent total phosphorous(TP) and ammonia nitrogen(NH_4-N). In this intelligent monitoring system, a fuzzy neural network(FNN) is applied for designing the soft sensor model, and a principal component analysis(PCA) method is used to select the input variables of the soft sensor model. Moreover, data transfer software is exploited to insert the soft sensor technique to the supervisory control and data acquisition(SCADA) system. Finally, this proposed intelligent monitoring system is tested in several real plants to demonstrate the reliability and effectiveness of the monitoring performance.
基金supported by the Natural Science Foundation of Jiangsu Province(Grant No.SBK2020043202)by Key Laboratory of Geospace Environment and Geodesy,Ministry of Education,Wuhan University(No.19-01-08).
文摘On 21 May 2021(UTC),an MW 7.4 earthquake jolted the east Bayan Har block in the Tibetan Plateau.The earthquake received widespread attention as it is the largest event in the Tibetan Plateau and its surroundings since the 2008 Wenchuan earthquake,and especially in proximity to the seismic gaps on the east Kunlun fault.Here we use satellite interferometric synthetic aperture radar data and subpixel offset observations along the range directions to characterize the coseismic deformation of the earthquake.Range offset displacements depict clear surface ruptures with a total length of~170 km involving two possible activated fault segments in the earthquake.Coseismic modeling results indicate that the earthquake was dominated by left-lateral strike-slip motions of up to 7 m within the top 12 km of the crust.The well-resolved slip variations are characterized by five major slip patches along strike and 64%of shallow slip deficit,suggesting a young seismogenic structure.Spatial-temporal changes of the postseismic deformation are mapped from early 6-day and 24-day InSAR observations,and are well explained by time-dependent afterslip models.Analysis of Global Navigation Satellite System(GNSS)velocity profiles and strain rates suggests that the eastward extrusion of plateau is diffusely distributed across the east Bayan Har block,but exhibits significant lateral heterogeneities,as evidenced by magnetotelluric observations.The block-wide distributed deformation of the east Bayan Har block along with the significant co-and post-seismic stress loadings from the Madoi earthquake imply high seismic risks along regional faults,especially the Tuosuo Lake and Maqên-Maqu segments of the Kunlun fault that are known as seismic gaps.