Funding: Funded by the National Natural Science Foundation, China (No. 62172123); the Key Research and Development Program of Heilongjiang (Grant No. 2022ZX01A36); the Special Projects for the Central Government to Guide the Development of Local Science and Technology, China (No. ZY20B11); and the Harbin Manufacturing Technology Innovation Talent Project (No. CXRC20221104236).
Abstract: Diagnosing multi-stage diseases typically requires doctors to consider multiple data sources, including clinical symptoms, physical signs, biochemical test results, imaging findings, pathological examination data, and even genetic data. When machine learning is applied to predict and diagnose multi-stage diseases, several challenges need to be addressed. First, the model needs to handle multimodal data, as the data used by doctors for diagnosis include image data, natural language data, and structured data. Second, the privacy of patients' data needs to be protected, as these data contain the most sensitive and private information. Last, for the model to be practical, its computational requirements should not be too high. To address these challenges, this paper proposes a privacy-preserving federated deep learning diagnostic method for multi-stage diseases. The method modifies the forward and backward propagation processes of deep neural network modeling algorithms and introduces a homomorphic encryption step to design a federated modeling algorithm that does not need an arbiter. It also uses dedicated integrated circuits to implement the Paillier algorithm in hardware, providing accelerated support for homomorphic encryption in modeling. Finally, this paper designs and conducts experiments to evaluate the proposed solution. The experimental results show that, in privacy-preserving federated deep learning diagnostic modeling, the proposed method achieves the same modeling performance as ordinary modeling without privacy protection and a higher modeling speed than similar algorithms.
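The additive property that makes Paillier encryption useful for federated modeling can be illustrated with a toy sketch. This is not the paper's hardware implementation or its federated protocol: the function names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`, `mul_plain`) are hypothetical, the primes are far too small to be secure, and the generator choice g = n + 1 is a common textbook simplification assumed here.

```python
import math
import random

def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two small primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                     # assumed simple generator choice
    mu = pow(lam, -1, n)          # modular inverse; valid for g = n + 1
    return (n, g), (lam, mu)

def encrypt(pub, m: int, rng: random.Random) -> int:
    """Encrypt an integer 0 <= m < n with fresh randomness r coprime to n."""
    n, g = pub
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c: int) -> int:
    """Recover the plaintext using L(u) = (u - 1) / n."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add_encrypted(pub, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)

def mul_plain(pub, c: int, k: int) -> int:
    """Raising a ciphertext to a plain integer k multiplies the plaintext by k (mod n)."""
    n, _ = pub
    return pow(c, k, n * n)
```

In a federated setting, a party could in principle sum encrypted intermediate values from another party with `add_encrypted` and scale them with `mul_plain` without ever seeing the plaintexts; real deployments use much larger keys and an encoding for floating-point values.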
Funding: The authors would like to acknowledge the support of Lloyd's Register Singapore, Lloyd's Register Consulting Energy AB (Sweden), Nanyang Technological University, Singapore Institute of Technology and the Singapore Economic Development Board (EDB) under the Industrial Postgraduate Program in the undertaking of this work (RCA-15/424).
Abstract: Decommissioning of offshore facilities involves changing risk profiles at different decommissioning phases. Bayesian Belief Networks (BBN) are used as part of the proposed risk assessment method to capture the multiple interactions of a decommissioning activity. The BBN is structured from the data learning of an accident database and a modification of the BBN nodes to incorporate human reliability and barrier performance modelling. The analysis covers one case study of one area of decommissioning operations by extrapolating well workover data to well plugging and abandonment. Initial analysis of a 5-node BBN built from well workover data provided insights on two different levels of accident severity, the 'Accident' and 'Incident' levels, and on their respective profiles of initiating events and investigation-reported human causes. The initial results demonstrate that the data learnt from the database can be used to structure the BBN, that they give insights on how human reliability pertaining to well activities can be modelled, and that the relative frequencies from the count analysis can act as initial data input for the proposed nodes. It is also proposed that the integrated treatment of various sources of information (database and expert judgement) through a BBN model can support the risk assessment of a dynamic situation such as offshore decommissioning.
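The idea that relative frequencies from a count analysis can seed a BBN node can be sketched as follows. This is only an illustration under assumed inputs: the record fields (`cause`, `severity`) and the function name `conditional_frequencies` are hypothetical and do not come from the accident database used in the study.

```python
from collections import Counter, defaultdict

def conditional_frequencies(records, parent, child):
    """Estimate P(child | parent) as relative frequencies of counted records."""
    counts = defaultdict(Counter)
    for rec in records:
        counts[rec[parent]][rec[child]] += 1
    # Normalise each parent's counts so the child states sum to 1.
    return {
        p: {c: k / sum(cs.values()) for c, k in cs.items()}
        for p, cs in counts.items()
    }

# Hypothetical records, for illustration only.
example_records = [
    {"cause": "procedure", "severity": "Incident"},
    {"cause": "procedure", "severity": "Incident"},
    {"cause": "procedure", "severity": "Accident"},
    {"cause": "equipment", "severity": "Accident"},
]
example_cpt = conditional_frequencies(example_records, "cause", "severity")
```

A table estimated this way would only be a starting point; the abstract notes that expert judgement is combined with the database when the nodes are refined.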
Funding: Partially supported by the National Natural Science Foundation of China (Nos. 61402089, 61472069, and 61501101); the Fundamental Research Funds for the Central Universities (Nos. N161904001, N161602003, and N150408001); the Natural Science Foundation of Liaoning Province (No. 2015020553); the China Postdoctoral Science Foundation (No. 2016M591447); and the Postdoctoral Science Foundation of Northeastern University (No. 20160203).
Abstract: The Extreme Learning Machine (ELM) and its variants are effective in many machine learning applications such as Imbalanced Learning (IL) or Big Data (BD) learning. However, they cannot handle imbalanced and large-volume data learning problems at the same time. This study addresses the IL problem in BD applications. The Distributed and Weighted ELM (DW-ELM) algorithm, which is based on the MapReduce framework, is proposed. To confirm the feasibility of parallel computation, it is first shown that the matrix multiplication operators are decomposable. Then, to further improve the computational efficiency, an Improved DW-ELM algorithm (IDW-ELM) is developed that uses only one MapReduce job. The successful operation of the proposed DW-ELM and IDW-ELM algorithms is finally validated through experiments.
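Why decomposability enables a map/reduce formulation can be seen in a small sketch. It assumes, as in the standard weighted ELM, that the costly product is the weighted Gram matrix H^T W H with a diagonal weight matrix W; the abstract does not spell out which products are decomposed, so this choice and the names `partial_gram`, `add_matrices`, and `distributed_gram` are assumptions made for illustration.

```python
from functools import reduce

def partial_gram(rows, weights):
    """Map step: weighted Gram contribution sum_i w_i * h_i h_i^T for one chunk of rows."""
    dim = len(rows[0])
    acc = [[0.0] * dim for _ in range(dim)]
    for h, w in zip(rows, weights):
        for a in range(dim):
            for b in range(dim):
                acc[a][b] += w * h[a] * h[b]
    return acc

def add_matrices(x, y):
    """Reduce step: element-wise sum of two partial results."""
    return [[xa + ya for xa, ya in zip(xr, yr)] for xr, yr in zip(x, y)]

def distributed_gram(rows, weights, n_chunks):
    """Split the rows into chunks, compute partial Gram matrices, and sum them."""
    size = -(-len(rows) // n_chunks)  # ceiling division
    parts = [
        partial_gram(rows[i:i + size], weights[i:i + size])
        for i in range(0, len(rows), size)
    ]
    return reduce(add_matrices, parts)
```

Because each row contributes an independent term to the sum, the chunks can be processed on different workers and the result equals the single-machine product, which is the property a MapReduce job relies on.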
Funding: Jointly supported by the Opening Fund of Key Laboratory of Low-grade Energy Utilization Technologies and Systems of Ministry of Education of China (Chongqing University) (LLEUTS-202305); the Opening Fund of State Key Laboratory of Green Building in Western China (LSKF202316); the Open Foundation of Anhui Province Key Laboratory of Intelligent Building and Building Energy-saving (IBES2022KF11); "The 14th Five-Year Plan" Hubei Provincial advantaged characteristic disciplines (groups) project of Wuhan University of Science and Technology (2023D0504, 2023D0501); the National Natural Science Foundation of China (51906181); the 2021 Construction Technology Plan Project of Hubei Province (2021-83); and the Science and Technology Project of Guizhou Province: Integrated Support of Guizhou [2023] General 393.
Abstract: A shortage of available modelling data makes it difficult to guarantee the performance of data-driven building energy prediction (BEP) models for both newly built buildings and existing information-poor buildings. Both knowledge transfer learning (KTL) and data incremental learning (DIL) can address the data shortage issue of such buildings. For new building scenarios with continuous data accumulation, the performance of BEP models has not been fully investigated with the data accumulation dynamics taken into account. DIL, which can learn dynamic features from accumulated data, adapt to the developing trend of new building time-series data, and extend the BEP model's knowledge, has rarely been studied. Previous studies have shown that the performance of KTL models trained with fixed data can be further improved in scenarios with dynamically changing data. Hence, this study proposes an improved transfer learning cross-BEP strategy that is continuously updated in a coarse data incremental (CDI) manner. The hybrid KTL-DIL strategy (LSTM-DANN-CDI) uses a domain adversarial neural network (DANN) for KTL and long short-term memory (LSTM) as the baseline BEP model. A performance evaluation is conducted to systematically assess the effectiveness and applicability of KTL and the improved KTL-DIL. Real-world data from 36 buildings of six types are adopted to evaluate the performance of KTL and KTL-DIL in data-driven BEP tasks, considering factors such as the model increment time interval and the available target and source building data volumes. The results indicate that, compared with LSTM, KTL (LSTM-DANN) and the proposed KTL-DIL (LSTM-DANN-CDI) can significantly improve the BEP performance for new buildings with limited data. Compared with the pure KTL strategy LSTM-DANN, the improved KTL-DIL strategy LSTM-DANN-CDI has better prediction performance, with an average performance improvement ratio of 60%.
Funding: The National Science Foundation of the USA for Facilitating Funded Network Building, Convergence Exploration, and Equity Concept Development (Nos. 2115405, 2241237, and 2115453). Each of these funded efforts was distinctly meaningful in the development of this perspective.
Abstract: Socioecological inequities in environmental data science, such as inequities deriving from data-driven approaches and machine learning (ML), are current issues subject to debate and evolution. There is growing consensus around embedding equity throughout all research and design domains, from inception to administration, while also addressing procedural, distributive, and recognitional factors. Yet, practically doing so may seem onerous or daunting to some. The current perspective helps to alleviate these types of concerns by providing substantiation for the connection between environmental data science and socioecological inequity, using the Systemic Equity Framework, and provides the foundation for a paradigmatic shift toward normalizing the use of equity-centered approaches in environmental data science and ML settings. Bolstering the integrity of environmental data science and ML is just beginning from an equity-centered tool development and rigorous application standpoint. To this end, this perspective also provides relevant future directions and challenges by overviewing some meaningful tools and strategies, such as applying the Wells-Du Bois Protocol, employing fairness metrics, and systematically addressing irreproducibility; discusses emerging needs and proposals, such as addressing data-proxy bias and supporting convergence research; and establishes a ten-step path forward. After all, the work that environmental scientists and engineers do ultimately affects the well-being of us all.
Abstract: In the design process of berm breakwaters, front slope recession plays an essential role, and this parameter has been studied in a large number of model tests. This research draws its data from Moghim's and Shekari's experimental results. These experiments consist of two different 2D model tests in two wave flumes, in which the response of berm recession to different sea state and structural parameters was studied. Irregular waves with a JONSWAP spectrum were used in both test series. A total of 412 test results were used to cover the impact on berm recession parameters of sea state conditions such as wave height, wave period, storm duration and water depth at the toe of the structure, and of structural parameters such as berm elevation from still water level, berm width and stone diameter. In this paper, a new set of equations for berm recession is derived using the M5' model tree as a machine learning approach. The estimations of the new formula are compared with those of formulae recently given by other researchers to show the advantage of the new M5' approach.
Funding: National Natural Science Foundation of China (Grant Nos. 60433020, 60673099, 60673023); "985" project of Jilin University.
Abstract: Detecting the boundaries of protein domains is an important and challenging task in both experimental and computational structural biology. In this paper, a promising method for detecting the domain structure of a protein from sequence information alone is presented. The method is based on analyzing multiple sequence alignments derived from a database search. Multiple measures are defined to quantify the domain information content of each position along the sequence. They are then combined into a single predictor using a Support Vector Machine (SVM). More importantly, domain detection is treated for the first time as an imbalanced data learning problem. A novel undersampling method based on distance-based maximal entropy in the SVM feature space is proposed. The overall precision is about 80%. Simulation results demonstrate that the method can help not only in predicting the complete 3D structure of a protein but also in machine learning systems on general imbalanced datasets.
Funding: Supported by the Key Program of Natural Science Foundation of China (Grant No. 61631018); Anhui Provincial Natural Science Foundation (Grant No. 1908085MF177); and Huawei Technology Innovative Research (YBN2018095087).
Abstract: The 5th generation (5G) mobile network has been put into service across a number of markets. It aims to provide subscribers with high bit rates, low latency, high capacity, and many new services and vertical applications. Research and development on 6G has therefore been put on the agenda. Given the demands and characteristics of future 6G, artificial intelligence (A), big data (B) and cloud computing (C) will play indispensable roles in achieving the highest efficiency and the largest benefits. Interestingly, the initials of these three aspects remind us of the significance of vitamins A, B and C to the human body. In this article we specifically expound on the three elements of ABC and the relationships between them. We analyze the basic characteristics of wireless big data (WBD) and the corresponding technical action in A and C, which are the high-dimensional feature and spatial separation, the predictive ability, and the characteristics of knowledge. Based on the abilities of WBD, a new learning approach for wireless AI called the knowledge+data-driven deep learning (KD-DL) method and a layered computing architecture for mobile networks integrating cloud/edge/terminal computing are proposed, and their achievable efficiency is discussed. This progress will be conducive to the development of future 6G.
Funding: The National Natural Science Foundation of China under contract Nos 61273245 and 41306028; the Beijing Natural Science Foundation under contract No. 4152031; the National Special Research Fund for Non-Profit Marine Sector under contract Nos 201405022-3 and 2013418026-4; the Ocean Science and Technology Program of North China Sea Branch of State Oceanic Administration under contract No. 2017A01; and the Operational Marine Forecasting Program of State Oceanic Administration.
Abstract: Accurate forecasts of typhoon tracks are of vital importance for reducing injuries and economic losses. The meteorological department has accumulated a huge amount of typhoon observations, but they are yet to be adequately utilized. Employing machine learning is an effective way to perform forecasts. A long short-term memory (LSTM) neural network is trained on typhoon observations from 1949–2011 in China's Mainland, combined with big data and data mining technologies, and a machine-learning-based forecast model for the prediction of typhoon tracks is developed. The results show that the employed algorithm produces desirable 6–24 h nowcasting of typhoon tracks with improved precision.
Abstract: The present aim is to update, upon arrival of new learning data, the parameters of a score constructed with an ensemble method involving linear discriminant analysis and logistic regression in an online setting, without the need to store all of the previously obtained data. Poisson bootstrap and stochastic approximation processes, whose convergence has been established theoretically, were used with online standardized data to avoid numerical explosions. The empirical convergence of online ensemble scores to a reference "batch" score was studied on five different datasets from which data streams were simulated, comparing six different processes to construct the online scores. For each score, 50 replications using a total of 10N observations (N being the size of the dataset) were performed to assess the convergence and the stability of the method, computing the mean and standard deviation of a convergence criterion. A complementary study using 100N observations was also performed. All tested processes on all datasets converged after N iterations, except for one process on one dataset. The best processes were averaged processes using online standardized data and a piecewise constant step-size.
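Two of the ingredients named above, online standardization and the Poisson bootstrap, can be sketched for a one-dimensional stream. This is not the study's score-updating process: the running statistics use Welford's update and the bootstrap keeps only a weighted mean per replicate, both chosen here as simple assumed stand-ins, and the names `RunningStats`, `poisson`, and `stream_bootstrap_means` are hypothetical.

```python
import math
import random

class RunningStats:
    """Running mean and variance (Welford's update), so past data need not be stored."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        return math.sqrt(self.m2 / self.n) if self.n > 0 else 0.0

    def standardize(self, x):
        s = self.std()
        return (x - self.mean) / s if s > 0 else 0.0

def poisson(lam, rng):
    """Draw a Poisson(lam) count with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= limit:
            return k - 1

def stream_bootstrap_means(stream, n_replicates, rng):
    """Each replicate sees every observation a Poisson(1) number of times."""
    totals = [0.0] * n_replicates
    counts = [0] * n_replicates
    for x in stream:
        for b in range(n_replicates):
            k = poisson(1.0, rng)
            counts[b] += k
            totals[b] += k * x
    return [t / c if c else float("nan") for t, c in zip(totals, counts)]
```

Drawing a Poisson(1) count per observation approximates resampling with replacement without knowing the stream length in advance, which is what makes the bootstrap usable online.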
Funding: Supported by the National Natural Science Foundation of China (11474168 and 61401222), the Natural Science Foundation of Jiangsu Province (BK20151502), the Qing Lan Project in Jiangsu Province, and a Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions.
Abstract: Distributed secure quantum machine learning (DSQML) enables a classical client with little quantum technology to delegate a remote quantum machine learning task to a quantum server while preserving the privacy of its data. Moreover, DSQML can be extended to a more general case in which the client does not have enough data and resorts to both a remote quantum server and remote databases to perform secure machine learning. Here we propose a DSQML protocol in which the client can classify two-dimensional vectors into different clusters by resorting to a remote small-scale photonic quantum computation processor. The protocol is secure and leaks no relevant information to Eve: any eavesdropper who attempts to intercept and disturb the learning process can be detected. In principle, this protocol can be used to classify high-dimensional vectors and may provide a new viewpoint and application for future "big data".
Funding: The National Key Basic Research and Development (973) Program of China (Nos. 2012CB315801 and 2011CB302805), the National Natural Science Foundation of China (Nos. 61161140320 and 61233016), and the Intel Research Council under the project titled Security Vulnerability Analysis based on Cloud Platform with Intel IA Architecture.
Abstract: With the explosive increase in mobile apps, more and more threats are migrating from the traditional PC client to mobile devices. Whereas the Win+Intel alliance dominated the PC, the Android+ARM alliance dominates the mobile Internet, and apps have replaced PC client software as the major target of malicious usage. In this paper, to improve the security of current mobile apps, we propose a methodology for evaluating mobile apps based on a cloud computing platform and data mining. We also present a prototype system named MobSafe to identify whether a mobile app is malicious or benign. Compared with traditional methods, such as permission-pattern-based methods, MobSafe combines dynamic and static analysis to evaluate an Android app comprehensively. In the implementation, we adopt the Android Security Evaluation Framework (ASEF) and the Static Android Analysis Framework (SAAF), two representative dynamic and static analysis methods, to evaluate Android apps and to estimate the total time needed to evaluate all the apps stored in one mobile app market. Based on a real trace from a commercial mobile app market called AppChina, we collect statistics on the number of active Android apps, the average number of apps installed on one Android device, and the expansion ratio of mobile apps. Because the mobile app market serves as the main line of defence against mobile malware, our evaluation results show that it is practical to use a cloud computing platform and data mining to verify all stored apps routinely and filter malware out of mobile app markets. As future work, MobSafe can make extensive use of machine learning to conduct automated forensic analysis of mobile apps based on the multifaceted data generated in this stage.
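The abstract above mentions estimating the total time needed to evaluate every app in a market. The arithmetic behind such an estimate can be sketched as follows. This is a back-of-envelope model under stated assumptions (perfect parallelism across workers, a fixed per-app cost for each analysis); the function names and every number are placeholders chosen for illustration, not measurements from MobSafe, ASEF, SAAF, or AppChina.

```python
import math

SECONDS_PER_DAY = 86_400


def days_to_scan_market(n_apps, static_sec, dynamic_sec, n_workers):
    """Wall-clock days to run both analyses once on every app.

    Assumes the work divides evenly across n_workers with no overhead.
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    total_sec = n_apps * (static_sec + dynamic_sec)
    return total_sec / n_workers / SECONDS_PER_DAY


def workers_needed(n_apps, static_sec, dynamic_sec, deadline_days):
    """Smallest worker count that finishes within deadline_days."""
    if deadline_days <= 0:
        raise ValueError("deadline_days must be positive")
    total_sec = n_apps * (static_sec + dynamic_sec)
    return math.ceil(total_sec / (deadline_days * SECONDS_PER_DAY))


# Placeholder inputs: 100,000 apps, 30 s static and 120 s dynamic per app.
days = days_to_scan_market(100_000, 30, 120, n_workers=50)
workers = workers_needed(100_000, 30, 120, deadline_days=1)
```

With these placeholder inputs the total work is 15,000,000 worker-seconds, so 50 workers would need about 3.5 days and a one-day deadline would need 174 workers. Substituting measured per-app times and the market's real app count gives the corresponding estimate for an actual deployment.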