Electrocardiogram (ECG) is a low-cost, simple, fast, and non-invasive test. It can reflect the heart's electrical activity and provide valuable diagnostic clues about the health of the entire body. Therefore, ECG has been widely used in various biomedical applications such as arrhythmia detection, disease-specific detection, mortality prediction, and biometric recognition. In recent years, ECG-related studies have been carried out using a variety of publicly available datasets, with many differences in the datasets used, data preprocessing methods, targeted challenges, and modeling and analysis techniques. Here we systematically summarize and analyze ECG-based automatic analysis methods and applications. Specifically, we first review 22 commonly used public ECG datasets and provide an overview of data preprocessing processes. Then we describe some of the most widely used applications of ECG signals and analyze the advanced methods involved in these applications. Finally, we elucidate some of the challenges in ECG analysis and provide suggestions for further research.
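To make the preprocessing stage concrete, here is a minimal sketch of two steps that recur across such pipelines: band-pass filtering and R-peak detection. The 0.5–40 Hz band, the 360 Hz sampling rate, and the synthetic trace are illustrative assumptions, not values taken from any particular reviewed study.

```python
# A minimal ECG preprocessing sketch: band-pass filtering plus R-peak detection.
# The 0.5-40 Hz band and the toy signal are illustrative assumptions.
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

fs = 360.0                                    # sampling rate (Hz); MIT-BIH uses 360 Hz
t = np.arange(0, 10, 1 / fs)
ecg = np.sin(2 * np.pi * 1.2 * t) + 0.1 * np.random.randn(t.size)  # toy signal

# Band-pass filter to suppress baseline wander (<0.5 Hz) and high-frequency noise.
b, a = butter(3, [0.5 / (fs / 2), 40.0 / (fs / 2)], btype="band")
clean = filtfilt(b, a, ecg)

# Simple R-peak detection: peaks above an amplitude threshold, at least 0.3 s apart.
peaks, _ = find_peaks(clean, height=0.5, distance=int(0.3 * fs))
print(f"{peaks.size} candidate R-peaks detected")
```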
At present, water pollution has become an important factor affecting and restricting national and regional economic development. Total phosphorus is one of the main sources of water pollution and eutrophication, so the prediction of total phosphorus in water quality is of real research significance. This paper selects total phosphorus and turbidity data for analysis, obtained by crawling a water quality monitoring platform. By constructing an attribute-object mapping relationship, the correlation between the two indicators was analyzed and used to predict future data. Firstly, after cleaning outliers, the monthly and daily mean concentrations of total phosphorus and turbidity were calculated, and the correlation between them was analyzed. Secondly, the correlation coefficients at different times and frequencies were used to predict the values for the next five days, and the data trend was visualized with Python. Finally, the real values were compared with the predicted values, and the results showed that the correlation between total phosphorus and turbidity is useful for predicting water quality.
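A toy sketch of the core correlation step follows, assuming hypothetical column names and a naive slope-based forecasting rule; it does not reproduce the paper's exact attribute-object mapping.

```python
# Daily means of total phosphorus (TP) and turbidity, Pearson correlation,
# and a naive correlation-based one-step forecast. Data are synthetic.
import pandas as pd

df = pd.DataFrame({
    "date": pd.date_range("2021-01-01", periods=60, freq="D"),
    "tp": [0.02 + 0.001 * i for i in range(60)],
    "turbidity": [5.0 + 0.05 * i for i in range(60)],
}).set_index("date")

daily = df.resample("D").mean()               # daily means (a no-op on daily data)
r = daily["tp"].corr(daily["turbidity"])      # Pearson correlation coefficient
print(f"corr(TP, turbidity) = {r:.3f}")

# Naive forecast: scale an assumed turbidity change by the regression slope.
slope = daily["tp"].cov(daily["turbidity"]) / daily["turbidity"].var()
next_turb = daily["turbidity"].iloc[-1] + 0.05
tp_forecast = daily["tp"].iloc[-1] + slope * (next_turb - daily["turbidity"].iloc[-1])
print(f"TP forecast for next day: {tp_forecast:.4f}")
```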
Sparse representation is an effective data classification algorithm that depends on known training samples to categorise a test sample. It has been widely used in various image classification tasks. Sparseness in sparse representation means that only a few instances selected from all training samples can effectively convey the essential class-specific information of the test sample, which is very important for classification. For deformable images such as human faces, pixels at the same location in different images of the same subject usually have different intensities. Therefore, extracting features from and correctly classifying such deformable objects is very hard. Moreover, lighting, attitude and occlusion cause further difficulty. Considering the problems and challenges listed above, a novel image representation and classification algorithm is proposed. First, the authors' algorithm generates virtual samples by a non-linear variation method. This method can effectively extract the low-frequency information of the space-domain features of the original image, which is very useful for representing deformable objects. The combination of the original and virtual samples is beneficial for improving the classification performance and robustness of the algorithm. The authors' algorithm then calculates the expression coefficients of the original and virtual samples separately using the sparse representation principle and obtains the final score by a designed efficient score fusion scheme. The weighting coefficients in the score fusion scheme are set entirely automatically. Finally, the algorithm classifies the samples based on the final scores. The experimental results show that our method performs better classification than conventional sparse representation algorithms.
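The class-residual rule at the heart of sparse representation classification can be sketched in a few lines. The snippet below uses orthogonal matching pursuit as the sparse solver and random toy data; it does not include the paper's virtual-sample generation or score fusion.

```python
# A minimal sparse-representation classification (SRC) sketch: represent the
# test sample sparsely over the training dictionary, then classify by the
# smallest class-specific reconstruction residual. Data are toy values.
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
n_per_class, dim = 20, 50
classes = [0, 1, 2]
# Toy dictionary: columns are training samples grouped by class.
D = np.hstack([rng.normal(c, 1.0, (dim, n_per_class)) for c in classes])
labels = np.repeat(classes, n_per_class)
y = rng.normal(1.0, 1.0, dim)                 # toy test sample, closest to class 1

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=5, fit_intercept=False)
omp.fit(D, y)
coef = omp.coef_

residuals = {c: np.linalg.norm(y - D[:, labels == c] @ coef[labels == c])
             for c in classes}
print("predicted class:", min(residuals, key=residuals.get))
```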
Water is one of the basic resources for human survival. Water pollution monitoring and protection have become a major problem for many countries all over the world. Most traditional water quality monitoring systems, however, generally focus only on water quality data collection, ignoring data analysis and data mining. In addition, dirty data and data loss may occur due to power or transmission failures, further affecting data analysis and its application. To meet these needs, using Internet of Things, cloud computing, and big data technologies, we designed and implemented a water quality monitoring data intelligent service platform in C# and PHP. The platform includes modules for monitoring point addition, monitoring point map labeling, monitoring data uploading, monitoring data processing, early warning of monitoring indicators exceeding standards, and other functions. Using this platform, we can realize automatic collection of water quality monitoring data, data cleaning, data analysis, intelligent early warning and warning information push, among other functions. For better security and convenience, we deployed the system in Tencent Cloud and tested it. The testing results showed that the data analysis platform runs well and will provide decision support for water resource protection.
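The platform itself is implemented in C# and PHP; as a language-neutral illustration, the following Python sketch shows the kind of threshold check behind an early-warning module. The indicator limits are assumptions loosely based on surface-water quality standards, not the platform's actual configuration.

```python
# Illustrative logic for an "exceeding the standard" early-warning check;
# the thresholds below are assumed values, not the platform's real settings.
LIMITS = {"TP": 0.2, "NH3N": 1.0, "COD": 20.0}   # mg/L, assumed limits

def check_record(record: dict) -> list[str]:
    """Return a warning message for every indicator above its limit."""
    warnings = []
    for key, limit in LIMITS.items():
        value = record.get(key)
        if value is not None and value > limit:
            warnings.append(f"{key}={value} exceeds limit {limit}")
    return warnings

print(check_record({"TP": 0.35, "NH3N": 0.4, "COD": 25.0}))
```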
Many countries are paying more and more attention to the protection of water resources, and how to protect them has received extensive attention from society. Water quality monitoring is key to water resources protection, and efficiently collecting and analyzing water quality monitoring data is an important aspect of it. In this paper, Python programming tools and regular expressions were used to design a web crawler for acquiring water quality monitoring data from Global Freshwater Quality Database (GEMStat) sites, and multi-thread parallelism was added to improve efficiency in downloading and parsing. To analyze and process the crawled water quality data, Pandas and Pyecharts were used to visualize the data and show its intrinsic correlations and spatiotemporal relationships.
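A condensed sketch of the described crawler pattern follows, combining regular expressions with a thread pool; the endpoint URL and row regex are placeholders rather than real GEMStat structures.

```python
# Crawler sketch: regex parsing of records plus a thread pool for parallel
# downloads. STATION_URL and ROW_RE are hypothetical placeholders.
import re
from concurrent.futures import ThreadPoolExecutor

import requests

STATION_URL = "https://example.org/gemstat/station/{sid}"   # hypothetical endpoint
ROW_RE = re.compile(r"<td>(\d{4}-\d{2}-\d{2})</td>\s*<td>([\d.]+)</td>")

def fetch_station(sid: str) -> list[tuple[str, float]]:
    html = requests.get(STATION_URL.format(sid=sid), timeout=10).text
    return [(date, float(val)) for date, val in ROW_RE.findall(html)]

with ThreadPoolExecutor(max_workers=8) as pool:             # multi-thread parallelism
    results = list(pool.map(fetch_station, ["S001", "S002", "S003"]))
```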
The sedimentary record of climate change in the Arctic region is useful for understanding global warming. Kongsfjord, located in the subpolar Arctic, is a suitable site for studying climate change. Glacier retreat is occurring in this region due to climate change, leading to an increase in meltwater outflow with a high debris content. In August 2017, we collected sediment core Z3 from the central fjord near the Yellow River Station. We then used the widely applied ^(210)Pb and ^(137)Cs chronology methods, together with other parameters, to reconstruct the record of climate change in the sedimentary environment of Kongsfjord. The results showed that after the mid-to-late 1990s, the mass accumulation rate of this core increased from 0.10 g/(cm^(2)·a) to 0.34 g/(cm^(2)·a), while the flux of ^(210)Pb_(ex) increased from 125 Bq/(m^(2)·a) to 316 Bq/(m^(2)·a). The higher sedimentary inventory of ^(210)Pb_(ex) in Kongsfjord compared to global fallout might have been caused by sediment focusing, boundary scavenging, and riverine input. Similarity between the inventory of ^(137)Cs and global fallout indicated that terrestrial particulate matter was the main source of ^(137)Cs in fjord sediments. The sedimentation rate increased after 1997, possibly due to the increased influx of glacial meltwater containing debris. In addition, the ^(137)Cs activity, the percentage of organic carbon (OC), and the OC to total nitrogen (TN) concentration ratio showed increasing trends toward the top of the core since 1997, corresponding to a decrease in the mass balance of glaciers in the region. The δ^(13)C and δ^(15)N values and the OC/TN concentration ratio showed that both terrestrial and marine sources contributed to the organic matter in core Z3. The relative contribution of terrestrial organic matter, calculated by a two-endmember model, showed an increasing trend since the mid-1990s. All these data indicate that global climate change has a significant impact on Arctic glaciers.
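The two-endmember model mentioned above reduces to a linear isotope mass balance; a sketch follows, with typical literature endmember values for terrestrial and marine δ^(13)C that are assumptions rather than the values used in this study.

```python
# Two-endmember mixing model for apportioning sedimentary organic matter.
# Endmember delta13C values are typical literature choices, assumed here.
DELTA_TERR = -27.0    # assumed terrestrial organic-matter delta13C (per mil)
DELTA_MAR = -20.5     # assumed marine organic-matter delta13C (per mil)

def terrestrial_fraction(delta_sample: float) -> float:
    """Fraction of terrestrial OC from a linear delta13C mass balance."""
    f = (DELTA_MAR - delta_sample) / (DELTA_MAR - DELTA_TERR)
    return min(max(f, 0.0), 1.0)              # clip to the physical range [0, 1]

for d13c in (-26.0, -24.0, -22.0):
    print(d13c, "->", round(terrestrial_fraction(d13c), 2))
```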
Two packet scheduling algorithms for rechargeable sensor networks are proposed based on the signal to interference plus noise ratio model. They allocate different transmission slots to conflicting packets and overcome the challenges posed by a rapidly changing, uncontrollable channel state. The first algorithm proposes a priority-based framework for packet scheduling in rechargeable sensor networks. Every packet is assigned a priority related to the transmission delay and the remaining energy of rechargeable batteries, and the packets with higher priority are scheduled first. The second algorithm mainly focuses on the energy efficiency of batteries. The priorities are related to the transmission distance of packets, and the packets with short transmission distances are scheduled first. The sensors are equipped with low-capacity rechargeable batteries, and the harvest-store-use model is used. We consider imperfect batteries; that is, the battery capacity is limited, and battery energy leaks over time. The energy harvesting rate, energy retention rate, and transmission power are known. Extensive simulation results indicate that the battery capacity has little effect on the packet scheduling delay. Therefore, the algorithms proposed in this paper are very suitable for wireless sensor networks with low-capacity batteries.
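As a rough illustration of the first, priority-based algorithm, the sketch below orders packets with a heap using a priority that combines waiting delay and residual battery energy; the weighting formula is an assumption, not the paper's exact definition.

```python
# Toy priority-based packet scheduler: long-waiting packets from nodes with
# little remaining energy are served first. The weighting is an assumption.
import heapq

def priority(delay: float, energy: float, w: float = 0.5) -> float:
    # Lower value = scheduled earlier.
    return -(w * delay + (1.0 - w) * (1.0 / max(energy, 1e-9)))

packets = [("p1", 3.0, 0.8), ("p2", 1.0, 0.1), ("p3", 5.0, 0.5)]  # (name, delay, energy)
heap = [(priority(d, e), name) for name, d, e in packets]
heapq.heapify(heap)
while heap:
    _, name = heapq.heappop(heap)
    print("schedule", name)
```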
Temperature trends in the upper stratosphere are investigated using satellite measurements from the Stratospheric Sounding Unit (SSU) and simulations from chemistry-climate models (CCMs) and the Coupled Model Intercomparison Project Phase 6 (CMIP6). Observational evidence shows a lack of cooling in the Antarctic, in contrast to strong cooling at other latitudes, during austral winter over 1979-97. Analysis of CCM simulations for the longer period of 1961-97 also shows a significant contrast in the upper stratospheric temperature trends between the Antarctic and lower latitudes. Results from two sets of model integrations with fixed ozone-depleting substances (ODSs) and fixed greenhouse gases (GHGs) at their 1960 levels suggest that the ODSs have made a major contribution to the lack of cooling in the Antarctic upper stratosphere. Results from CMIP6 simulations with prescribed GHGs and ozone confirm that changes in the dynamical processes associated with observed ozone depletion are largely responsible for the lack of cooling in the Antarctic upper stratosphere. The lack of cooling is found to be dynamically induced through increased upward wave activity into the upper stratosphere, which is attributed mainly to ODS forcing. Specifically, the radiative cooling caused by ozone depletion results in a stronger meridional temperature gradient between middle and high latitudes in the upper stratosphere, allowing more planetary waves to propagate upward and warm the Antarctic upper stratosphere. These findings improve our understanding of chemistry-climate coupling in the southern upper stratosphere.
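For readers unfamiliar with how such trends are quantified, a minimal least-squares trend estimate over an annual-mean series looks like this; the series is synthetic, not SSU data.

```python
# Least-squares linear trend of a synthetic annual-mean temperature series.
import numpy as np

years = np.arange(1979, 1998)
temps = -0.5 * (years - 1979) / 10 + np.random.default_rng(1).normal(0, 0.1, years.size)

slope, intercept = np.polyfit(years, temps, 1)
print(f"trend: {slope * 10:.2f} K per decade")   # cooling shows up as a negative slope
```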
Water resources are an indispensable and valuable resource for human survival and development. Water quality prediction plays an important role in the protection and development of water resources. It is difficult to predict water quality because of its random and trend changes. Therefore, a method of predicting water quality that combines the Auto Regressive Integrated Moving Average (ARIMA) and a clustering model was proposed in this paper. Taking the water quality monitoring data of a certain river basin as a sample, the water quality Total Phosphorus (TP) index was selected as the prediction object. Firstly, the sample data were cleaned and analyzed for stationarity and white noise. Secondly, appropriate parameters were selected according to the Bayesian Information Criterion (BIC) principle, and the trend component characteristics were obtained by using ARIMA to conduct water quality prediction. Thirdly, the relationship between precipitation and the TP index in the monitored water field was analyzed by the K-means clustering method, and the random incremental characteristics of precipitation on water quality changes were calculated. Finally, by combining the trend component characteristics and the random incremental characteristics, the water quality prediction results were calculated. Compared with the ARIMA water quality prediction method, experiments showed that the proposed method has higher accuracy: its Mean Absolute Error (MAE), Mean Square Error (MSE), and Mean Absolute Percentage Error (MAPE) were reduced by 44.6%, 56.8%, and 45.8%, respectively.
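A compact sketch of the ARIMA half of the scheme follows, selecting (p, d, q) by BIC and forecasting the trend component; the series is synthetic, and the K-means precipitation step is omitted.

```python
# ARIMA trend forecasting with BIC-based order selection (statsmodels).
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
tp = pd.Series(0.1 + 0.001 * np.arange(100) + rng.normal(0, 0.005, 100))

best = min(
    ((p, d, q) for p in range(3) for d in range(2) for q in range(3)),
    key=lambda order: ARIMA(tp, order=order).fit().bic,   # BIC model selection
)
trend_forecast = ARIMA(tp, order=best).fit().forecast(steps=5)
print("selected order:", best)
print(trend_forecast)
```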
Hepatocellular carcinoma (HCC), one of the most common gastrointestinal cancers, has been considered a worldwide threat due to its high incidence and poor prognosis. In recent years, with the continuous emergence and promotion of new sequencing technologies, omics approaches such as genomics, transcriptomics, and proteomics, together with liquid biopsy, have been used to assess HCC heterogeneity from different perspectives and have become a hotspot in the field of tumor precision medicine. In addition, with the continuous improvement of machine learning and deep learning algorithms, radiomics has made great progress in ultrasound, CT, and MRI analysis of HCC. This article mainly reviews the research progress of biological big data and radiomics in HCC, providing new methods and ideas for the diagnosis, prognosis, and therapy of HCC.
There are many factors influencing fiscal revenue, and traditional forecasting methods cannot handle the feature dimensions well, which leads to serious over-fitting of the forecast results and an inability to make a good estimate of the true future trend. The grey neural network model fused with Lasso regression is a comprehensive prediction model that combines the grey prediction model and the BP neural network model after dimensionality reduction using Lasso. It can reduce the dimensionality of the original data, make separate predictions for each explanatory variable, and then use neural networks to make multivariate predictions, thereby making up for the insufficient prediction accuracy of traditional methods. In this paper, we took the fiscal revenue data of China's Hunan Province from 2005 to 2019 as the object of analysis. Firstly, we used Lasso regression to reduce the dimensionality of the data. Then, because the grey prediction model has excellent predictive performance for small data volumes, we chose it to obtain the predicted values of all explanatory variables for 2020 and 2021 from the 2005–2019 data. Finally, considering that fiscal revenue is affected by many factors, we applied the BP neural network, which handles multiple inputs well, to make the final forecast of fiscal revenue. The experimental results show that the combined model performs well in fiscal revenue forecasting.
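The grey prediction step can be illustrated with a minimal GM(1,1) implementation, shown below for one toy explanatory-variable series; the Lasso and BP neural network stages are omitted.

```python
# Minimal GM(1,1) grey-model forecast for one small series (toy values).
import numpy as np

def gm11_forecast(x0: np.ndarray, steps: int) -> np.ndarray:
    x1 = np.cumsum(x0)                           # accumulated generating operation
    z1 = 0.5 * (x1[1:] + x1[:-1])                # background values
    B = np.column_stack([-z1, np.ones_like(z1)])
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]   # develop/grey coefficients
    k = np.arange(1, len(x0) + steps)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a  # accumulated predictions
    x0_hat = np.diff(np.concatenate([[x0[0]], x1_hat]))
    return x0_hat[-steps:]                       # the out-of-sample predictions

series = np.array([120.0, 132.0, 147.0, 160.0, 178.0])   # toy yearly values
print(gm11_forecast(series, steps=2))                     # e.g. 2020 and 2021
```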
Accurate prediction of tropical cyclone (TC) intensity remains a challenge due to the complex physical processes involved in TC intensity changes. A seven-day TC intensity prediction scheme based on the logistic growth equation (LGE) for the western North Pacific (WNP) has been developed using observed and reanalysis data. In the LGE, TC intensity change is determined by a growth term and a decay term. These two terms comprise four free parameters: a time-dependent growth rate, a maximum potential intensity (MPI), and two constants. Using 33 years of training samples, optimal predictors are selected first, and then the two constants are determined by the least squares method, forcing the growth rate regressed from the optimal predictors to be as close to the observed rate as possible. The estimation of the growth rate is further refined based on a step-wise regression (SWR) method and a machine learning (ML) method for the period 1982−2014. Using the LGE-based scheme, a total of 80 TCs during 2015−17 are used to make independent forecasts. Results show that the root mean square errors of the LGE-based scheme are much smaller than those of the official intensity forecasts from the China Meteorological Administration (CMA), especially for TCs in the coastal regions of East Asia. Moreover, the scheme based on ML demonstrates better forecast skill than that based on SWR. The new prediction scheme offers strong potential both for improving forecasts of rapid TC intensification and weakening and for extending the 5-day forecasts currently issued by the CMA to 7 days.
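One common form of the LGE balances a growth term against a decay term; the sketch below integrates it with a simple Euler step, using illustrative parameter values rather than the scheme's fitted constants.

```python
# One common form of the logistic growth equation for TC intensity:
# dV/dt = kappa*V - beta*(V/V_mpi)**n * V  (growth term minus decay term).
# All parameter values are illustrative assumptions.
def lge_step(v: float, kappa: float, beta: float, v_mpi: float, n: float,
             dt: float = 6.0) -> float:
    dvdt = kappa * v - beta * (v / v_mpi) ** n * v
    return v + dvdt * dt                      # Euler step, dt in hours

v, kappa, beta, v_mpi, n = 25.0, 0.02, 0.02, 70.0, 2.5   # m/s, toy values
for hour in range(0, 168, 6):                            # a 7-day integration
    v = lge_step(v, kappa, beta, v_mpi, n)
print(f"predicted intensity after 7 days: {v:.1f} m/s")
```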
Gastrointestinal (GI) cancers are prevalent and account for an extremely high number of cancer deaths worldwide. The traditional treatment strategies, including surgery, chemotherapy, radiotherapy, and targeted therapy, have a limited therapeutic effect for advanced GI cancers. Recently, immunotherapy has shown promise in treating various refractory malignancies, including GI cancers with mismatch repair deficiency (dMMR) or microsatellite instability (MSI). Thus, immunotherapy could be a promising treatment approach for GI cancers. Unfortunately, only a small proportion of GI cancer patients currently respond to immunotherapy. Therefore, it is important to discover predictive biomarkers for stratifying GI cancer patients' response to immunotherapy. Certain genomic features, such as dMMR/MSI, tumor mutation burden (TMB), and tumor aneuploidy, have been associated with tumor immunity and immunotherapy response and may serve as predictive biomarkers for cancer immunotherapy. In this review, we examined the correlations between tumor immunity and three genomic features: dMMR/MSI, TMB, and tumor aneuploidy. We also explored their correlations using The Cancer Genome Atlas data and confirmed that dMMR/MSI status, high TMB, and low tumor aneuploidy are associated with elevated tumor immunity in GI cancers. To improve the immunotherapeutic potential in GI cancers, more genetic or genomic features associated with tumor immune response need to be identified. Furthermore, it is worth exploring combinations of different immunotherapeutic methods and combinations of immunotherapy with other therapeutic approaches for cancer therapy.
Computer clusters with the shared-nothing architecture are the major computing platforms for big data processing and analysis. In cluster computing, data partitioning and sampling are two fundamental strategies to speed up the computation of big data and increase scalability. In this paper, we present a comprehensive survey of the methods and techniques of data partitioning and sampling with respect to big data processing and analysis. We start with an overview of the mainstream big data frameworks on Hadoop clusters. The basic methods of data partitioning are then discussed, including three classical horizontal partitioning schemes: range, hash, and random partitioning. Data partitioning on Hadoop clusters is also discussed, with a summary of new strategies for big data partitioning, including the new Random Sample Partition (RSP) distributed model. The classical methods of data sampling are then investigated, including simple random sampling, stratified sampling, and reservoir sampling. Two common methods of big data sampling on computing clusters are also discussed: record-level sampling and block-level sampling. Record-level sampling is not as efficient as block-level sampling on big distributed data. On the other hand, block-level sampling on data blocks generated with the classical data partitioning methods does not necessarily produce good representative samples for approximate computing of big data. In this survey, we also summarize the prevailing strategies and related work on sampling-based approximation on Hadoop clusters. We believe that data partitioning and sampling should be considered together to build approximate cluster computing frameworks that are reliable in both the computational and statistical respects.
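Of the classical sampling methods surveyed, reservoir sampling is the least obvious; Algorithm R keeps a uniform random sample of fixed size k in a single pass over a stream, as the sketch below shows.

```python
# Classical reservoir sampling (Algorithm R): a one-pass uniform sample of size k.
import random

def reservoir_sample(stream, k: int) -> list:
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)            # fill the reservoir first
        else:
            j = random.randint(0, i)          # keep item with probability k/(i+1)
            if j < k:
                reservoir[j] = item
    return reservoir

print(reservoir_sample(range(1_000_000), k=5))
```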
AIM: To identify the contributing factors of the hospitalization expenses of senile cataract patients (HECP) and build an area-specific senile cataract diagnosis related group (DRG) for Shanghai, thereby formulating the reference range of HECP and providing a scientific basis for the fair use and supervision of the health care insurance fund. METHODS: The data were collected from the first page of the medical records of 22 097 patients hospitalized in tertiary hospitals in Shanghai from 2010 to 2012 whose major diagnosis was senile cataract. First, we analyzed the influencing factors of HECP using univariate and multivariate analysis. DRG grouping was conducted according to the exhaustive Chi-squared automatic interaction detector (E-CHAID) model, using HECP as the target variable. Finally, we evaluated the grouping results using non-parametric tests such as the Kruskal-Wallis H test, RIV, CV, etc. RESULTS: Six DRGs and the corresponding HECP criteria were established, using age, sex, type of surgery, and whether complications/comorbidities occurred as the key classification variables of senile cataract cases. CONCLUSION: The grouping of senile cataract cases based on the E-CHAID algorithm is reasonable, and the HECP criteria based on DRGs can provide a feasible means of management for the fair use and supervision of the medical insurance fund.
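The evaluation step can be illustrated with synthetic expense data: a Kruskal-Wallis H test across groups plus the within-group coefficient of variation (CV), as sketched below.

```python
# Toy grouping-evaluation sketch: Kruskal-Wallis H test across DRG groups and
# the within-group coefficient of variation. Expense figures are synthetic.
import numpy as np
from scipy.stats import kruskal

groups = [
    np.array([5200, 5400, 5100, 5600]),     # toy HECP values, DRG 1
    np.array([6900, 7200, 7100, 6800]),     # DRG 2
    np.array([8800, 9300, 9100, 8900]),     # DRG 3
]
h, p = kruskal(*groups)
print(f"Kruskal-Wallis H={h:.2f}, p={p:.4f}")  # small p: groups differ

for i, g in enumerate(groups, 1):
    print(f"DRG {i} CV = {g.std(ddof=1) / g.mean():.3f}")  # low CV: homogeneous group
```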
Spectrum management and resource allocation (RA) problems are challenging and critical in a vast number of research areas such as wireless communications and computer networks. The traditional approaches for solving such problems usually consume much time and memory, especially for large-size problems. Recently, different machine learning approaches have been considered as potentially promising techniques for combinatorial optimization problems, especially the generative models of deep neural networks. In this work, we propose a resource allocation deep autoencoder network, as one of the promising generative models, for enabling spectrum sharing in underlay device-to-device (D2D) communication by solving linear sum assignment problems (LSAPs). Specifically, we investigate the performance of three different architectures for conditional variational autoencoders (CVAE). The three proposed architectures are the convolutional neural network (CVAE-CNN) autoencoder, the feed-forward neural network (CVAE-FNN) autoencoder, and the hybrid (H-CVAE) autoencoder. The simulation results show that the proposed approach could be used as a replacement for conventional RA techniques, such as the Hungarian algorithm, due to its ability to find solutions to LSAPs of different sizes with high accuracy and very fast execution time. Moreover, the simulation results reveal that the accuracy of the proposed hybrid autoencoder architecture outperforms the other proposed architectures and state-of-the-art DNN techniques.
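The conventional baseline the autoencoders are compared against is exact LSAP solving; in Python this is one call to SciPy's Hungarian-method implementation, with the cost matrix below standing in for, say, D2D-pair-to-channel costs.

```python
# Exact LSAP baseline via the Hungarian method in SciPy; the cost matrix is toy data.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
cost = rng.uniform(0, 1, size=(6, 6))        # toy D2D-pair x channel cost matrix

rows, cols = linear_sum_assignment(cost)     # optimal one-to-one assignment
print(list(zip(rows.tolist(), cols.tolist())))
print("total cost:", cost[rows, cols].sum())
```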
The “Park City” concept presents a new urban development model that can be used as a guide for planning the growth of a city. Advancing the transformation of Chengdu into a park city is an important and unique strategic option for maintaining and strengthening the Chengdu-Chongqing economic circle. The construction of a park city in Chengdu has many potentially positive and favorable outcomes, including maintaining natural ecosystems while improving biodiversity and livability and enhancing the city's historical and cultural heritage, all while establishing Chengdu as a national leader for this new urban development model. In recent years, park city scenescape modeling and the resulting ecological value transformations have been analyzed through theory, mechanism design, and limited practice, but despite these explorations, the actual construction of a park city still faces many problems. This article argues that Chengdu should learn from the advanced experience of urban ecological construction efforts in other cities, both in China and overseas, by focusing on sustained scenescape building; theoretical research into ecological value transformations; establishing meaningful and compatible interaction mechanisms between ecological value transformations and capital markets; improving the planning and design needed for scenescape building; establishing innovative management systems and mechanisms for scenescape building and ecological value transformations; and cultivating park city scenescape brands. Related and detailed measurement and policy suggestions for scenescape building and ecological value transformation are also provided in this article.
Event Extraction (EE) is a key task in information extraction, which requires high-quality annotated data that are often costly to obtain. Traditional classification-based methods suffer in low-resource scenarios due to the lack of label semantics and fine-grained annotations. While recent approaches have endeavored to address EE through a more data-efficient generative process, they often overlook event keywords, which are vital for EE. To tackle these challenges, we introduce KeyEE, a multi-prompt learning strategy that improves low-resource event extraction by Event Keywords Extraction (EKE). We suggest employing an auxiliary EKE sub-prompt and concurrently training both EE and EKE with a shared pre-trained language model. With the auxiliary sub-prompt, KeyEE learns event keyword knowledge implicitly, thereby reducing the dependence on annotated data. Furthermore, we investigate and analyze various EKE sub-prompt strategies to encourage further research in this area. Our experiments on the benchmark datasets ACE2005 and ERE show that KeyEE achieves significant improvement in low-resource settings and sets new state-of-the-art results.
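A minimal sketch of the multi-prompt idea follows, assuming a T5-style seq2seq model and made-up prompt templates and targets; it only shows how the EE and EKE sub-prompts share one model and one combined loss, not KeyEE's actual prompt design.

```python
# Joint EE + EKE training step with a shared seq2seq model; the prompt
# templates, targets, and 0.5 weight are hypothetical stand-ins.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

sentence = "The company hired three engineers in March."
ee_src = "event extraction: " + sentence           # main EE sub-prompt (assumed template)
eke_src = "event keywords: " + sentence            # auxiliary EKE sub-prompt (assumed)
ee_tgt = "Hire: agent=company, person=engineers"   # toy structured target
eke_tgt = "hired"                                  # toy keyword target

def seq2seq_loss(src: str, tgt: str) -> torch.Tensor:
    enc = tokenizer(src, return_tensors="pt")
    labels = tokenizer(tgt, return_tensors="pt").input_ids
    return model(**enc, labels=labels).loss

loss = seq2seq_loss(ee_src, ee_tgt) + 0.5 * seq2seq_loss(eke_src, eke_tgt)
loss.backward()   # one shared model receives gradients from both sub-prompts
```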
The construction and development of the digital economy, digital society, and digital government face some common basic problems. Among them, the construction of the data governance system and the improvement of data governance capacity are weak links, which have seriously restricted that construction and development. At present, the broad concept of data governance goes beyond the scope of traditional data governance and “involves at least four aspects: the establishment of data asset status, management systems and mechanisms, sharing and openness, and security and privacy protection”. Traditional information technologies and methods cannot comprehensively solve these problems, so it is urgent to improve our understanding and find another way: reconstructing the information technology architecture to provide a scientific and reasonable technical system for effectively solving the problems of data governance. This paper redefines the information technology architecture and proposes the data architecture as the connection link and application support system between the traditional hardware architecture and software architecture. The data registration system is the core component of the data architecture, and the public key encryption and authentication system is its key enabling component. The data governance system based on this data architecture supports complex, comprehensive, collaborative, and cross-domain business application scenarios. It provides scientific and feasible basic support for the construction and development of the digital economy, digital society, and digital government.
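As an illustration of the public key primitive underpinning such a data registration system, the sketch below signs and verifies a made-up registration record with Ed25519 from the Python cryptography package; the record format is hypothetical, not the paper's specification.

```python
# Signing and verifying a hypothetical data-registration entry with Ed25519.
from cryptography.hazmat.primitives.asymmetric import ed25519

record = b'{"dataset": "river-tp-2021", "hash": "abc123", "owner": "agency-x"}'

private_key = ed25519.Ed25519PrivateKey.generate()
signature = private_key.sign(record)                 # registrant signs the entry

public_key = private_key.public_key()
public_key.verify(signature, record)                 # raises InvalidSignature if tampered
print("registration entry verified")
```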
Large-scale graphs usually exhibit global sparsity with local cohesiveness, and mining representative cohesive subgraphs is a fundamental problem in graph analysis. The k-truss is one of the most commonly studied cohesive subgraphs, in which each edge is contained in at least k−2 triangles. A critical issue in mining a k-truss lies in the computation of the trussness of each edge, which is the maximum value of k for which an edge can be in a k-truss. Existing works mostly focus on truss computation in static graphs using sequential models. However, real-world graphs change dynamically and constantly. We study distributed truss computation in dynamic graphs in this paper. In particular, we compute the trussness of edges based on the local nature of the k-truss in a synchronized node-centric distributed model. The proposed distributed decomposition algorithm iteratively decomposes the trussness of edges relying only on local topological information. Moreover, the distributed maintenance algorithm only needs to update a small amount of dynamic information to complete the computation. Extensive experiments have been conducted to show the scalability and efficiency of the proposed algorithm.
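A compact single-machine sketch of trussness computation by support peeling is given below; the paper's distributed node-centric algorithm applies the same local rule (edge support = number of common neighbors) in synchronized rounds.

```python
# Truss decomposition by iterative support peeling: an edge removed while
# building the k-truss gets trussness k-1.
import networkx as nx

def truss_decomposition(g: nx.Graph) -> dict:
    g = g.copy()
    truss = {}
    k = 3
    while g.number_of_edges() > 0:
        changed = True
        while changed:
            changed = False
            for u, v in list(g.edges()):
                support = len(set(g[u]) & set(g[v]))   # triangles through (u, v)
                if support < k - 2:
                    truss[frozenset((u, v))] = k - 1   # edge leaves at level k
                    g.remove_edge(u, v)
                    changed = True
        k += 1
    return truss

g = nx.complete_graph(4)          # every edge of K4 sits in two triangles
g.add_edge(0, 9)                  # plus one pendant edge (trussness 2)
for e, t in truss_decomposition(g).items():
    print(sorted(e), "trussness", t)
```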
基金Supported by the NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Informatization(U1909208)the Science and Technology Major Project of Changsha(kh2202004)the Changsha Municipal Natural Science Foundation(kq2202106).
文摘Electrocardiogram(ECG)is a low-cost,simple,fast,and non-invasive test.It can reflect the heart’s electrical activity and provide valuable diagnostic clues about the health of the entire body.Therefore,ECG has been widely used in various biomedical applications such as arrhythmia detection,disease-specific detection,mortality prediction,and biometric recognition.In recent years,ECG-related studies have been carried out using a variety of publicly available datasets,with many differences in the datasets used,data preprocessing methods,targeted challenges,and modeling and analysis techniques.Here we systematically summarize and analyze the ECGbased automatic analysis methods and applications.Specifically,we first reviewed 22 commonly used ECG public datasets and provided an overview of data preprocessing processes.Then we described some of the most widely used applications of ECG signals and analyzed the advanced methods involved in these applications.Finally,we elucidated some of the challenges in ECG analysis and provided suggestions for further research.
基金the National Natural Science Foundation of China(No.51775185)Natural Science Foundation of Hunan Province(No.2022JJ90013)+1 种基金Intelligent Environmental Monitoring Technology Hunan Provincial Joint Training Base for Graduate Students in the Integration of Industry and Education,and Hunan Normal University University-Industry Cooperation.the 2011 Collaborative Innovation Center for Development and Utilization of Finance and Economics Big Data Property,Universities of Hunan Province,Open Project,Grant Number 20181901CRP04.
文摘At present,water pollution has become an important factor affecting and restricting national and regional economic development.Total phosphorus is one of the main sources of water pollution and eutrophication,so the prediction of total phosphorus in water quality has good research significance.This paper selects the total phosphorus and turbidity data for analysis by crawling the data of the water quality monitoring platform.By constructing the attribute object mapping relationship,the correlation between the two indicators was analyzed and used to predict the future data.Firstly,the monthly mean and daily mean concentrations of total phosphorus and turbidity outliers were calculated after cleaning,and the correlation between them was analyzed.Secondly,the correlation coefficients of different times and frequencies were used to predict the values for the next five days,and the data trend was predicted by python visualization.Finally,the real value was compared with the predicted value data,and the results showed that the correlation between total phosphorus and turbidity was useful in predicting the water quality.
文摘Sparse representation is an effective data classification algorithm that depends on the known training samples to categorise the test sample.It has been widely used in various image classification tasks.Sparseness in sparse representation means that only a few of instances selected from all training samples can effectively convey the essential class-specific information of the test sample,which is very important for classification.For deformable images such as human faces,pixels at the same location of different images of the same subject usually have different intensities.Therefore,extracting features and correctly classifying such deformable objects is very hard.Moreover,the lighting,attitude and occlusion cause more difficulty.Considering the problems and challenges listed above,a novel image representation and classification algorithm is proposed.First,the authors’algorithm generates virtual samples by a non-linear variation method.This method can effectively extract the low-frequency information of space-domain features of the original image,which is very useful for representing deformable objects.The combination of the original and virtual samples is more beneficial to improve the clas-sification performance and robustness of the algorithm.Thereby,the authors’algorithm calculates the expression coefficients of the original and virtual samples separately using the sparse representation principle and obtains the final score by a designed efficient score fusion scheme.The weighting coefficients in the score fusion scheme are set entirely automatically.Finally,the algorithm classifies the samples based on the final scores.The experimental results show that our method performs better classification than conventional sparse representation algorithms.
基金the National Natural Science Foundation of China(No.61304208)Scientific Research Fund of Hunan Province Education Department(18C0003)+5 种基金Researchproject on teaching reform in colleges and universities of Hunan Province Education Department(20190147)Changsha City Science and Technology Plan Program(K1501013-11)Hunan NormalUniversity University-Industry Cooperation.This work is implemented at the 2011 Collaborative Innovation Center for Development and Utilization of Finance and Economics Big Data PropertyUniversities of Hunan ProvinceOpen projectgrant number 20181901CRP04.
文摘Water is one of the basic resources for human survival.Water pollution monitoring and protection have been becoming a major problem for many countries all over the world.Most traditional water quality monitoring systems,however,generally focus only on water quality data collection,ignoring data analysis and data mining.In addition,some dirty data and data loss may occur due to power failures or transmission failures,further affecting data analysis and its application.In order to meet these needs,by using Internet of things,cloud computing,and big data technologies,we designed and implemented a water quality monitoring data intelligent service platform in C#and PHP language.The platform includes monitoring point addition,monitoring point map labeling,monitoring data uploading,monitoring data processing,early warning of exceeding the standard of monitoring indicators,and other functions modules.Using this platform,we can realize the automatic collection of water quality monitoring data,data cleaning,data analysis,intelligent early warning and early warning information push,and other functions.For better security and convenience,we deployed the system in the Tencent Cloud and tested it.The testing results showed that the data analysis platform could run well and will provide decision support for water resource protection.
基金This research was funded by the National Natural Science Foundation of China(No.51775185)Scientific Research Fund of Hunan Province Education Department(18C0003)+2 种基金Research project on teaching reform in colleges and universities of Hunan Province Education Department(20190147)Innovation and Entrepreneurship Training Program for College Students in Hunan Province(2021-1980)Hunan Normal University University-Industry Cooperation.This work is implemented at the 2011 Collaborative Innovation Center for Development and Utilization of Finance and Economics Big Data Property,Universities of Hunan Province,Open project,Grant Number 20181901CRP04.
文摘Many countries are paying more and more attention to the protection of water resources at present,and how to protect water resources has received extensive attention from society.Water quality monitoring is the key work to water resources protection.How to efficiently collect and analyze water quality monitoring data is an important aspect of water resources protection.In this paper,python programming tools and regular expressions were used to design a web crawler for the acquisition of water quality monitoring data from Global Freshwater Quality Database(GEMStat)sites,and the multi-thread parallelism was added to improve the efficiency in the process of downloading and parsing.In order to analyze and process the crawled water quality data,Pandas and Pyecharts are used to visualize the water quality data to show the intrinsic correlation and spatiotemporal relationship of the data.
基金The National Natural Science Foundation of China under contract Nos 42107251 and 41706089the Natural Science Foundation of Fujian Province under contract No.2020J05232.
文摘The sedimentary record of climate change in the Arctic region is useful for understanding global warming.Kongsfjord is located in the subpolar region of the Arctic and is a suitable site for studying climate change.Glacier retreat is occurring in this region due to climate change,leading to an increase in meltwater outflow with a high debris content.In August 2017,we collected a sediment Core Z3 from the central fjord near the Yellow River Station.Then,we used the widely used chronology method of 210Pb,^(137)Cs,and other parameters to reflect the climate change record in the sedimentary environment of Kongsfjord.The results showed that after the mid-late 1990s,the mass accumulation rate of this core increased from 0.10 g/(cm^(2)·a)to 0.34 g/(cm^(2)·a),while the flux of^(210)Pb_(ex)increased from 125 Bq/(m^(2)·a)to 316 Bq/(m^(2)·a).The higher sedimentary inventory of^(210)Pb_(ex)in Kongsfjord compared to global fallout might have been caused by sediment focusing,boundary scavenging,and riverine input.Similarities between the inventory of^(137)Cs and global fallout indicated that terrestrial particulate matter was the main source of^(137)Cs in fjord sediments.The sedimentation rate increased after 1997,possibly due to the increased influx of glacial meltwater containing debris.In addition,the^(137)Cs activity,percentage of organic carbon(OC),and OC/total nitrogen concentration ratio showed increasing trends toward the top of the core since 1997,corresponding to a decrease in the mass balance of glaciers in the region.The results ofδ^(13)C,δ^(15)N and OC/TN concentration ratio showed both terrestrial and marine sources contributed to the organic matter in Core Z3.The relative contribution of terrestrial organic matter which was calculated by a two-endmember model showed an increased trend since mid-1990s.All these data indicate that global climate change has a significant impact on Arctic glaciers.
基金supported by the National Natural Science Foundation of China under Grants 62272256,61832012,and 61771289Major Program of Shandong Provincial Natural Science Foundation for the Fundamental Research under Grant ZR2022ZD03+1 种基金the Pilot Project for Integrated Innovation of Science,Education and Industry of Qilu University of Technology(Shandong Academy of Sciences)under Grant 2022XD001Shandong Province Fundamental Research under Grant ZR201906140028。
文摘Two packet scheduling algorithms for rechargeable sensor networks are proposed based on the signal to interference plus noise ratio model.They allocate different transmission slots to conflicting packets and overcome the challenges caused by the fact that the channel state changes quickly and is uncontrollable.The first algorithm proposes a prioritybased framework for packet scheduling in rechargeable sensor networks.Every packet is assigned a priority related to the transmission delay and the remaining energy of rechargeable batteries,and the packets with higher priority are scheduled first.The second algorithm mainly focuses on the energy efficiency of batteries.The priorities are related to the transmission distance of packets,and the packets with short transmission distance are scheduled first.The sensors are equipped with low-capacity rechargeable batteries,and the harvest-store-use model is used.We consider imperfect batteries.That is,the battery capacity is limited,and battery energy leaks over time.The energy harvesting rate,energy retention rate and transmission power are known.Extensive simulation results indicate that the battery capacity has little effect on the packet scheduling delay.Therefore,the algorithms proposed in this paper are very suitable for wireless sensor networks with low-capacity batteries.
基金supported by Grant Nos.41875047 and 91837206 from the National Natural Science Foundation of China(NSFC)Grant No.JIH2308007 from Fudan University。
文摘Temperature trends in the upper stratosphere are investigated using satellite measurements from Stratospheric Sounding Unit(SSU)outputs and simulations from chemistry-climate models(CCMs)and the Coupled Model Intercomparison Project Phase 6(CMIP6).Observational evidence shows a lack of cooling in the Antarctic,in contrast to strong cooling at other latitudes,during austral winter over 1979-97.Analysis of CCM simulations for a longer period of1961-97 also shows a significant contrast in the upper stratospheric temperature trends between the Antarctic and lower latitudes.Results from two sets of model integrations with fixed ozone-depleting substances(ODSs)and fixed greenhouse gases(GHGs)at their 1960 levels suggest that the ODSs have made a major contribution to the lack of cooling in the Antarctic upper stratosphere.Results from CMIP6 simulations with prescribed GHGs and ozone confirm that changes in the dynamical processes associated with observed ozone depletion are largely responsible for the lack of cooling in the Antarctic upper stratosphere.The lack of cooling is found to be dynamically induced through increased upward wave activity into the upper stratosphere,which is attributed mainly to ODSs forcing.Specifically,the radiative cooling caused by the ozone depletion results in a stronger meridional temperature gradient between middle and high latitudes in the upper stratosphere,allowing more planetary waves propagating upward to warm the Antarctic upper stratosphere.These findings improve our understanding of the chemistry-climate coupling in the southern upper stratosphere.
基金funded by the National Natural Science Foundation of China(No.51775185),Natural Science Foundation of Hunan Province(2022JJ90013)Scientific Research Fund of Hunan Province Education Department(18C0003)+1 种基金Research project on teaching reform in colleges and universities of Hunan Province Education Department(20190147)Hunan Normal University University-Industry Cooperation.This work is implemented at the 2011 Collaborative Innovation Center for Development and Utilization of Finance and Economics Big Data Property,Universities of Hunan Province,Open project,Grant Number 20181901CRP04.
文摘Water resources are an indispensable and valuable resource for human survival and development.Water quality predicting plays an important role in the protection and development of water resources.It is difficult to predictwater quality due to its random and trend changes.Therefore,amethod of predicting water quality which combines Auto Regressive Integrated Moving Average(ARIMA)and clusteringmodelwas proposed in this paper.By taking thewater qualitymonitoring data of a certain river basin as a sample,thewater quality Total Phosphorus(TP)index was selected as the prediction object.Firstly,the sample data was cleaned,stationary analyzed,and white noise analyzed.Secondly,the appropriate parameters were selected according to the Bayesian Information Criterion(BIC)principle,and the trend component characteristics were obtained by using ARIMA to conduct water quality predicting.Thirdly,the relationship between the precipitation and the TP index in themonitoring water field was analyzed by the K-means clusteringmethod,and the random incremental characteristics of precipitation on water quality changes were calculated.Finally,by combining with the trend component characteristics and the random incremental characteristics,the water quality prediction results were calculated.Compared with the ARIMA water quality prediction method,experiments showed that the proposed method has higher accuracy,and its Mean Absolute Error(MAE),Mean Square Error(MSE),and Mean Absolute Percentage Error(MAPE)were respectively reduced by 44.6%,56.8%,and 45.8%.
基金supported by grants from the Natural Science Foundation of Fujian Province(2021J011283)a demonstration study on the application of domestic high-end endoscopy system and minimally invasive instruments for precise resection of hepatobiliary and pancreatic tumors(2022YFC2407304).
文摘Hepatocellular carcinoma(HCC),one of the most common gastrointestinal cancers,has been considered a worldwide threat due to its high incidence and poor prognosis.In recent years,with the continuous emergence and promotion of new sequencing technologies in omics,genomics,transcriptomics,proteomics,and liquid biopsy are used to assess HCC heterogeneity from different perspectives and become a hotspot in the field of tumor precision medicine.In addition,with the continuous improvement of machine learning algorithms and deep learning algorithms,radiomics has made great progress in the field of ultrasound,CT and MRI for HCC.This article mainly reviews the research progress of biological big data and radiomics in HCC,and it provides new methods and ideas for the diagnosis,prognosis,and therapy of HCC.
基金This research was funded by the National Natural Science Foundation of China(No.61304208)Scientific Research Fund of Hunan Province Education Department(18C0003)+2 种基金Research project on teaching reform in colleges and universities of Hunan Province Education Department(20190147)Changsha City Science and Technology Plan Program(K1501013-11)Hunan Normal University University-Industry Cooperation.This work is implemented at the 2011 Collaborative Innovation Center for Development and Utilization of Finance and Economics Big Data Property,Universities of Hunan Province,Open project,grant number 20181901CRP04.
文摘There are many influencing factors of fiscal revenue,and traditional forecasting methods cannot handle the feature dimensions well,which leads to serious over-fitting of the forecast results and unable to make a good estimate of the true future trend.The grey neural network model fused with Lasso regression is a comprehensive prediction model that combines the grey prediction model and the BP neural network model after dimensionality reduction using Lasso.It can reduce the dimensionality of the original data,make separate predictions for each explanatory variable,and then use neural networks to make multivariate predictions,thereby making up for the shortcomings of traditional methods of insufficient prediction accuracy.In this paper,we took the financial revenue data of China’s Hunan Province from 2005 to 2019 as the object of analysis.Firstly,we used Lasso regression to reduce the dimensionality of the data.Because the grey prediction model has the excellent predictive performance for small data volumes,then we chose the grey prediction model to obtain the predicted values of all explanatory variables in 2020,2021 by using the data of 2005–2019.Finally,considering that fiscal revenue is affected by many factors,we applied the BP neural network,which has a good effect on multiple inputs,to make the final forecast of fiscal revenue.The experimental results show that the combined model has a good effect in financial revenue forecasting.
基金This study is supported by the National Key R&D Program of China(Grant Nos.2017YFC1501604 and 2019YFC1509101)the National Natural Science Foundation of China(Grant Nos.41875114,41875057,and 91937302).
文摘Accurate prediction of tropical cyclone(TC)intensity remains a challenge due to the complex physical processes involved in TC intensity changes.A seven-day TC intensity prediction scheme based on the logistic growth equation(LGE)for the western North Pacific(WNP)has been developed using the observed and reanalysis data.In the LGE,TC intensity change is determined by a growth term and a decay term.These two terms are comprised of four free parameters which include a time-dependent growth rate,a maximum potential intensity(MPI),and two constants.Using 33 years of training samples,optimal predictors are selected first,and then the two constants are determined based on the least square method,forcing the regressed growth rate from the optimal predictors to be as close to the observed as possible.The estimation of the growth rate is further refined based on a step-wise regression(SWR)method and a machine learning(ML)method for the period 1982−2014.Using the LGE-based scheme,a total of 80 TCs during 2015−17 are used to make independent forecasts.Results show that the root mean square errors of the LGE-based scheme are much smaller than those of the official intensity forecasts from the China Meteorological Administration(CMA),especially for TCs in the coastal regions of East Asia.Moreover,the scheme based on ML demonstrates better forecast skill than that based on SWR.The new prediction scheme offers strong potential for both improving the forecasts for rapid intensification and weakening of TCs as well as for extending the 5-day forecasts currently issued by the CMA to 7-day forecasts.
基金the China Pharmaceutical University,No:3150120001
文摘Gastrointestinal(GI) cancers prevail and account for an extremely high number of cancer deaths worldwide. The traditional treatment strategies, including surgery, chemotherapy, radiotherapy, and targeted therapy, have a limited therapeutic effect for advanced GI cancers. Recently, immunotherapy has shown promise in treating various refractory malignancies, including the GI cancers with mismatch repair deficiency(dMMR) or microsatellite instability(MSI). Thus,immunotherapy could be a promising treatment approach for GI cancers.Unfortunately, only a small proportion of GI cancer patients currently respond to immunotherapy. Therefore, it is important to discover predictive biomarkers for stratifying GI cancer patients response to immunotherapy. Certain genomic features, such as dMMR/MSI, tumor mutation burden(TMB), and tumor aneuploidy have been associated with tumor immunity and im-munotherapy response and may serve as predictive biomarkers for cancer immunotherapy. In this review, we examined the correlations between tumor immunity and three genomic features: dMMR/MSI, TMB, and tumor aneuploidy. We also explored their correlations using The Cancer Genome Atlas data and confirmed that the dMMR/MSI status, high TMB, and low tumor aneuploidy are associated with elevated tumor immunity in GI cancers. To improve the immunotherapeutic potential in GI cancers, more genetic or genomic features associated with tumor immune response need to be identified. Furthermore, it is worth exploring the combination of different immunotherapeutic methods and the combination of immunotherapy with other therapeutic approaches for cancer therapy.
基金Supported in part by the National Natural Science Foundation of China(No.61972261)the National Key R&D Program of China(No.2017YFC0822604-2)
文摘Computer clusters with the shared-nothing architecture are the major computing platforms for big data processing and analysis.In cluster computing,data partitioning and sampling are two fundamental strategies to speed up the computation of big data and increase scalability.In this paper,we present a comprehensive survey of the methods and techniques of data partitioning and sampling with respect to big data processing and analysis.We start with an overview of the mainstream big data frameworks on Hadoop clusters.The basic methods of data partitioning are then discussed including three classical horizontal partitioning schemes:range,hash,and random partitioning.Data partitioning on Hadoop clusters is also discussed with a summary of new strategies for big data partitioning,including the new Random Sample Partition(RSP)distributed model.The classical methods of data sampling are then investigated,including simple random sampling,stratified sampling,and reservoir sampling.Two common methods of big data sampling on computing clusters are also discussed:record-level sampling and blocklevel sampling.Record-level sampling is not as efficient as block-level sampling on big distributed data.On the other hand,block-level sampling on data blocks generated with the classical data partitioning methods does not necessarily produce good representative samples for approximate computing of big data.In this survey,we also summarize the prevailing strategies and related work on sampling-based approximation on Hadoop clusters.We believe that data partitioning and sampling should be considered together to build approximate cluster computing frameworks that are reliable in both the computational and statistical respects.
Funding: Supported by the Key Research and Development Program of Hunan Province (No. 2017SK2011)
Abstract: AIM: To identify the factors contributing to the hospitalization expenses of senile cataract patients (HECP) and to build an area-specific senile cataract diagnosis-related group (DRG) scheme for Shanghai, thereby formulating the reference range of HECP and providing a scientific basis for the fair use and supervision of the health care insurance fund. METHODS: The data were collected from the first page of the medical records of 22 097 patients hospitalized in tertiary hospitals in Shanghai from 2010 to 2012 whose major diagnosis was senile cataract. Firstly, we analyzed the influencing factors of HECP using univariate and multivariate analysis. DRG grouping was then conducted according to the exhaustive Chi-squared automatic interaction detector (E-CHAID) model, using HECP as the target variable. Finally, we evaluated the grouping results using non-parametric tests such as the Kruskal-Wallis H test, as well as indices such as RIV and CV. RESULTS: Six DRGs were established, along with the corresponding HECP criteria, using age, sex, type of surgery, and whether complications/comorbidities occurred as the key variables at the classification nodes of senile cataract cases. CONCLUSION: The grouping of senile cataract cases based on the E-CHAID algorithm is reasonable, and the HECP criteria based on DRGs can provide a feasible way to manage the fair use and supervision of the medical insurance fund.
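To make the evaluation step concrete, here is a hedged Python sketch of the measures named above (the Kruskal-Wallis H test plus the RIV and CV indices) applied to simulated expense data; the six lognormal groups are fabricated stand-ins, since the real hospitalization records are not public:

    import numpy as np
    from scipy import stats

    # Fabricated expense samples, one array per hypothetical DRG.
    rng = np.random.default_rng(42)
    groups = [rng.lognormal(8.5 + 0.1 * g, 0.3, size=200) for g in range(6)]

    # Kruskal-Wallis H test: do expense distributions differ across DRGs?
    h_stat, p_value = stats.kruskal(*groups)

    # CV within each group: lower means more homogeneous expenses per DRG.
    cv = [g.std(ddof=1) / g.mean() for g in groups]

    # RIV: share of total variance explained by the grouping.
    all_costs = np.concatenate(groups)
    within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    total = ((all_costs - all_costs.mean()) ** 2).sum()
    riv = 1 - within / total

    print(f"H={h_stat:.1f}, p={p_value:.3g}, RIV={riv:.2f}")
    print("CV per group:", [round(c, 2) for c in cv])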
基金supported in part by the China NSFC Grant 61872248Guangdong NSF 2017A030312008+1 种基金Fok Ying-Tong Education Foundation for Young Teachers in the Higher Education Institutions of China (Grant No.161064)GDUPS (2015)
Abstract: Spectrum management and resource allocation (RA) problems are challenging and critical in a vast number of research areas, such as wireless communications and computer networks. Traditional approaches for solving such problems usually consume considerable time and memory, especially for large-size problems. Recently, different machine learning approaches have been considered as potentially promising techniques for combinatorial optimization problems, especially the generative models of deep neural networks. In this work, we propose a resource allocation deep autoencoder network, as one of the promising generative models, for enabling spectrum sharing in underlay device-to-device (D2D) communication by solving linear sum assignment problems (LSAPs). Specifically, we investigate the performance of three different architectures for conditional variational autoencoders (CVAE): the convolutional neural network (CVAE-CNN) autoencoder, the feed-forward neural network (CVAE-FNN) autoencoder, and the hybrid (H-CVAE) autoencoder. The simulation results show that the proposed approach could be used as a replacement for conventional RA techniques, such as the Hungarian algorithm, due to its ability to find solutions to LSAPs of different sizes with high accuracy and very fast execution time. Moreover, the simulation results reveal that the accuracy of the proposed hybrid autoencoder architecture outperforms the other proposed architectures and state-of-the-art DNN techniques.
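For reference, the conventional baseline mentioned above, the Hungarian algorithm, is available in SciPy as linear_sum_assignment; the Rayleigh-distributed gain matrix below is an illustrative stand-in for a D2D channel model, not the paper's simulation setup:

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    # Toy utility matrix: rows are D2D pairs, columns are spectrum
    # resources; gains are negated so the solver's minimum is our maximum.
    rng = np.random.default_rng(0)
    cost = -rng.rayleigh(scale=1.0, size=(8, 8))

    row_ind, col_ind = linear_sum_assignment(cost)  # Hungarian algorithm
    print("assignment:", list(zip(row_ind, col_ind)))
    print("total utility:", -cost[row_ind, col_ind].sum())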
Abstract: The “Park City” concept presents a new urban development model that can be used as a guide for planning the growth of a city. Advancing the transformation of Chengdu into a park city is an important and unique strategic option for maintaining and strengthening the Chengdu-Chongqing economic circle. The construction of a park city in Chengdu has many potentially positive and favorable outcomes, including maintaining natural ecosystems while improving biodiversity and livability and enhancing the city's historical and cultural heritage, all while establishing Chengdu as a national leader for this new urban development model. In recent years, park city scenescape modeling and the resulting ecological value transformations have been analyzed through theory, mechanism designs, and limited practice, but despite these explorations, the actual construction of a park city still faces many problems. This article argues that Chengdu should learn from the advanced experience of urban ecological construction in other cities, both in China and overseas, by focusing on sustained scenescape building; theoretical research on ecological value transformations; establishing meaningful and compatible interaction mechanisms between ecological value transformations and capital markets; improving the planning and design needed for scenescape building; establishing innovative management systems and mechanisms for scenescape building and ecological value transformations; and cultivating park city scenescape brands. Related and detailed measurement and policy suggestions for scenescape building and ecological value transformation are also provided in this article.
基金supported by the National Key Research and Development Program of China(No.2021YFF1201200)the Science and Technology Major Project of Changsha(No.kh2202004)the Natural Science Foundation of China(No.62006251)。
Abstract: Event Extraction (EE) is a key task in information extraction, which requires high-quality annotated data that are often costly to obtain. Traditional classification-based methods suffer in low-resource scenarios due to the lack of label semantics and fine-grained annotations. While recent approaches have endeavored to address EE through a more data-efficient generative process, they often overlook event keywords, which are vital for EE. To tackle these challenges, we introduce KeyEE, a multi-prompt learning strategy that improves low-resource event extraction by Event Keywords Extraction (EKE). We suggest employing an auxiliary EKE sub-prompt and concurrently training both EE and EKE with a shared pre-trained language model. With the auxiliary sub-prompt, KeyEE learns event keyword knowledge implicitly, thereby reducing the dependence on annotated data. Furthermore, we investigate and analyze various EKE sub-prompt strategies to encourage further research in this area. Our experiments on the benchmark datasets ACE2005 and ERE show that KeyEE achieves significant improvement in low-resource settings and sets new state-of-the-art results.
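A hedged sketch of the shared-model, multi-prompt idea using Hugging Face T5; the prompt wordings, target formats, and loss weight are invented for illustration and are not KeyEE's actual sub-prompt designs:

    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    # Two sub-prompts over one shared model: the EE prompt targets the
    # full event structure, the auxiliary EKE prompt only the keywords.
    sentence = "The company fired its CEO on Monday."
    ee_in = tokenizer("extract events: " + sentence, return_tensors="pt")
    ee_out = tokenizer("End-Position (trigger: fired)", return_tensors="pt")
    eke_in = tokenizer("extract event keywords: " + sentence, return_tensors="pt")
    eke_out = tokenizer("fired", return_tensors="pt")

    # Joint objective: the EKE sub-task injects keyword knowledge into
    # the shared parameters; lam is a balancing hyperparameter.
    lam = 0.5
    loss_ee = model(input_ids=ee_in.input_ids, labels=ee_out.input_ids).loss
    loss_eke = model(input_ids=eke_in.input_ids, labels=eke_out.input_ids).loss
    (loss_ee + lam * loss_eke).backward()  # an optimizer step would follow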
Abstract: The construction and development of the digital economy, digital society, and digital government face some common basic problems. Among them, the construction of the data governance system and the improvement of data governance capacity are weak links that have seriously restricted this construction and development. At present, the broad concept of data governance goes beyond the scope of traditional data governance, which "involves at least four aspects: the establishment of data asset status, management systems and mechanisms, sharing and openness, and security and privacy protection". Traditional information technologies and methods are powerless to comprehensively solve these problems, so it is urgent to improve understanding and find another way to reconstruct the information technology architecture, providing a scientific and reasonable technical system for effectively solving the problems of data governance. This paper redefines the information technology architecture and proposes the data architecture as the connection link and application support system between the traditional hardware architecture and software architecture. The data registration system is the core component of the data architecture, and the public key encryption and authentication system is its key enabling component. The data governance system based on this data architecture supports complex, comprehensive, collaborative, and cross-domain business application scenarios, providing scientific and feasible basic support for the construction and development of the digital economy, digital society, and digital government.
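As one hedged illustration of how the public key component could underpin data registration, the sketch below signs and verifies a registration entry with Ed25519 from the Python "cryptography" package; the entry schema and field names are assumptions, not the paper's specification:

    import json
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    # Hypothetical registration entry; a real data registration system
    # would define its own schema, identifiers, and content hash.
    entry = json.dumps(
        {"data_id": "dataset-001", "owner": "agency-a", "hash": "sha256:demo"},
        sort_keys=True,
    ).encode()

    private_key = Ed25519PrivateKey.generate()
    signature = private_key.sign(entry)  # the owner signs the entry
    public_key = private_key.public_key()
    public_key.verify(signature, entry)  # raises InvalidSignature if tampered
    print("registration entry verified")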
基金supported in part by the National Key Research and Development Program of China(No.2020YFB1005900)in part by National Natural Science Foundation of China(No.62122042)in part by Shandong University Multidisciplinary Research and Innovation Team of Young Scholars(No.2020QNQT017)。
Abstract: Large-scale graphs usually exhibit global sparsity with local cohesiveness, and mining representative cohesive subgraphs is a fundamental problem in graph analysis. The k-truss is one of the most commonly studied cohesive subgraphs, in which each edge is contained in at least k-2 triangles. A critical issue in mining a k-truss lies in the computation of the trussness of each edge, which is the maximum value of k for which the edge can be in a k-truss. Existing works mostly focus on truss computation in static graphs using sequential models. However, real-world graphs are constantly changing. In this paper, we study distributed truss computation in dynamic graphs. In particular, we compute the trussness of edges based on the local nature of the k-truss in a synchronized node-centric distributed model. The proposed distributed decomposition algorithm iteratively decomposes the trussness of edges by relying only on local topological information. Moreover, the distributed maintenance algorithm only needs to update a small amount of dynamic information to complete the computation. Extensive experiments have been conducted to show the scalability and efficiency of the proposed algorithms.
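For intuition, here is a minimal sequential (non-distributed) truss-decomposition sketch in Python: it repeatedly peels the minimum-support edge and assigns its trussness as support-at-removal plus two; this is only a reference baseline, not the paper's node-centric distributed algorithm, and karate_club_graph is merely a convenient test graph:

    import networkx as nx

    def trussness(G):
        # Support of an edge (u, v) = number of triangles through it,
        # i.e. the number of common neighbors of u and v.
        G = G.copy()
        support = {frozenset(e): len(set(G[e[0]]) & set(G[e[1]]))
                   for e in G.edges()}
        truss, k = {}, 2
        while support:
            e = min(support, key=support.get)  # peel minimum-support edge
            u, v = tuple(e)
            if support[e] > k - 2:
                k = support[e] + 2             # trussness levels never decrease
            truss[(u, v)] = k
            for w in set(G[u]) & set(G[v]):    # update incident triangles
                support[frozenset((u, w))] -= 1
                support[frozenset((v, w))] -= 1
            G.remove_edge(u, v)
            del support[e]
        return truss

    t = trussness(nx.karate_club_graph())
    print("max trussness:", max(t.values()))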