Censored Composite Conditional Quantile Screening for High-Dimensional Survival Data
1
Authors: LIU Wei, LI Yingqiu. 《应用概率统计》 (Chinese Journal of Applied Probability and Statistics), CSCD, PKU Core, 2024, No. 5, pp. 783-799 (17 pages)
In this paper, we introduce the censored composite conditional quantile coefficient (cCCQC) to rank the relative importance of each predictor in high-dimensional censored regression. The cCCQC takes advantage of all useful information across quantiles and can effectively detect nonlinear effects, including interactions and heterogeneity. Furthermore, the proposed screening method based on cCCQC is robust to outliers and enjoys the sure screening property. Simulation results demonstrate that the proposed method performs competitively on survival datasets with high-dimensional predictors, particularly when the variables are highly correlated.
Keywords: high-dimensional survival data; censored composite conditional quantile coefficient; sure screening property; rank consistency property
Optimal Estimation of High-Dimensional Covariance Matrices with Missing and Noisy Data
2
Authors: Meiyin Wang, Wanzhou Ye. 《Advances in Pure Mathematics》, 2024, No. 4, pp. 214-227 (14 pages)
The estimation of covariance matrices is very important in many fields, such as statistics. In real applications, data are frequently high-dimensional and noisy. However, most relevant studies are based on complete data. This paper studies the optimal estimation of high-dimensional covariance matrices based on missing and noisy samples under the norm. First, the model with sub-Gaussian additive noise is presented. The generalized sample covariance is then modified to define a hard thresholding estimator, and the minimax upper bound is derived. After that, the minimax lower bound is derived, and it is concluded that the estimator presented in this article is rate-optimal. Finally, numerical simulation analysis is performed. The results show that for missing samples with sub-Gaussian noise, if the true covariance matrix is sparse, the hard thresholding estimator outperforms the traditional estimation method.
Keywords: High-dimensional Covariance Matrix; Missing Data; Sub-Gaussian Noise; Optimal Estimation
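The idea of hard thresholding mentioned in this abstract can be illustrated with a minimal sketch. It assumes complete data and the ordinary sample covariance, so it is not the paper's estimator, which modifies a generalized sample covariance to handle missing and noisy samples. The function names and the threshold `tau` are hypothetical choices for illustration.

```python
from typing import List

Matrix = List[List[float]]


def sample_covariance(rows: Matrix) -> Matrix:
    """Unbiased sample covariance of complete data (one observation per row)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[0.0] * p for _ in range(p)]
    for j in range(p):
        for k in range(j, p):
            s = sum((r[j] - means[j]) * (r[k] - means[k]) for r in rows)
            cov[j][k] = cov[k][j] = s / (n - 1)
    return cov


def hard_threshold(cov: Matrix, tau: float) -> Matrix:
    """Keep the diagonal; zero every off-diagonal entry with magnitude below tau."""
    p = len(cov)
    return [
        [cov[j][k] if j == k or abs(cov[j][k]) >= tau else 0.0 for k in range(p)]
        for j in range(p)
    ]


# Illustrative data: the first two variables are strongly related, the third is not.
data = [[1.0, 2.0, 0.0], [2.0, 4.0, 1.0], [3.0, 6.0, 0.0], [4.0, 8.0, 1.0]]
estimate = hard_threshold(sample_covariance(data), tau=1.0)
```

With this toy data the large covariance between the first two variables survives, while the small entries involving the third variable are set to zero, which is the behaviour a sparse true covariance matrix rewards.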
A phenology-based vegetation index for improving ratoon rice mapping using harmonized Landsat and Sentinel-2 data (Cited by: 1)
3
Authors: Yunping Chen, Jie Hu, Zhiwen Cai, Jingya Yang, Wei Zhou, Qiong Hu, Cong Wang, Liangzhi You, Baodong Xu. 《Journal of Integrative Agriculture》, SCIE, CAS, CSCD, 2024, No. 4, pp. 1164-1178 (15 pages)
Ratoon rice, which refers to a second harvest of rice obtained from the regenerated tillers originating from the stubble of the first harvested crop, plays an important role in both food security and agroecology while requiring minimal agricultural inputs. However, accurately identifying ratoon rice crops is challenging due to the similarity of their spectral features with those of other rice cropping systems (e.g., double rice). Moreover, images with a high spatiotemporal resolution are essential since ratoon rice is generally cultivated in fragmented croplands within regions that frequently exhibit cloudy and rainy weather. In this study, taking Qichun County in Hubei Province, China as an example, we developed a new phenology-based ratoon rice vegetation index (PRVI) for the purpose of ratoon rice mapping at a 30 m spatial resolution using a robust time series generated from Harmonized Landsat and Sentinel-2 (HLS) images. The PRVI, which incorporates the red, near-infrared, and shortwave infrared 1 bands, was developed based on the analysis of spectro-phenological separability and feature selection. Based on actual field samples, the performance of the PRVI for ratoon rice mapping was carefully evaluated by comparing it to several vegetation indices, including the normalized difference vegetation index (NDVI), enhanced vegetation index (EVI) and land surface water index (LSWI). The results suggested that the PRVI could sufficiently capture the specific characteristics of ratoon rice, leading to a favorable separability between ratoon rice and other land cover types. Furthermore, the PRVI showed the best performance for identifying ratoon rice in the phenological phases characterized by grain filling and harvesting to tillering of the ratoon crop (GHS-TS2), indicating that only several images are required to obtain an accurate ratoon rice map. Finally, the PRVI performed better than NDVI, EVI, LSWI and their combination at the GHS-TS2 stages, with producer's accuracy and user's accuracy of 92.22% and 89.30%, respectively. These results demonstrate that the proposed PRVI based on HLS data can effectively identify ratoon rice in fragmented croplands at crucial phenological stages, which is promising for identifying the earliest timing of ratoon rice planting and can provide a fundamental dataset for crop management activities.
Keywords: ratoon rice; phenology-based ratoon rice vegetation index (PRVI); phenological phase; feature selection; Harmonized Landsat Sentinel-2 data
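The three baseline indices named in this abstract have widely used standard definitions, sketched below for a single pixel of surface reflectance. The PRVI formula itself is not given in the abstract and is deliberately not reproduced here. The function names and the sample reflectance values are illustrative assumptions, and the EVI coefficients are the commonly used ones (gain 2.5, 6 and 7.5 for the aerosol terms, 1 for the canopy background).

```python
def _normalized_difference(a: float, b: float) -> float:
    """(a - b) / (a + b), the form shared by NDVI and LSWI."""
    denominator = a + b
    if denominator == 0:
        raise ValueError("band sum is zero; the index is undefined for this pixel")
    return (a - b) / denominator


def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index."""
    return _normalized_difference(nir, red)


def lswi(nir: float, swir1: float) -> float:
    """Land surface water index, using the shortwave infrared 1 band."""
    return _normalized_difference(nir, swir1)


def evi(nir: float, red: float, blue: float) -> float:
    """Enhanced vegetation index with the commonly used coefficients."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


# Hypothetical reflectances for one vegetated pixel.
pixel = {"blue": 0.05, "red": 0.10, "nir": 0.50, "swir1": 0.25}
indices = {
    "NDVI": ndvi(pixel["nir"], pixel["red"]),
    "LSWI": lswi(pixel["nir"], pixel["swir1"]),
    "EVI": evi(pixel["nir"], pixel["red"], pixel["blue"]),
}
```

Computing these per acquisition date gives the kind of index time series against which a phenology-based index such as the PRVI is compared.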
Indexing the bit-code and distance for fast KNN search in high-dimensional spaces
4
Authors: LIANG Jun-jie, FENG Yu-cai. 《Journal of Zhejiang University-Science A(Applied Physics & Engineering)》, SCIE, EI, CAS, CSCD, 2007, No. 6, pp. 857-863 (7 pages)
Various index structures have recently been proposed to facilitate high-dimensional KNN queries, among which the techniques of approximate vector presentation and one-dimensional (1D) transformation can break the curse of dimensionality. Based on the two techniques above, a novel high-dimensional index is proposed, called Bit-code and Distance based index (BD). BD is based on a special partitioning strategy which is optimized for high-dimensional data. By the definitions of bit code and transformation function, a high-dimensional vector can be first approximately represented and then transformed into a 1D vector, the key managed by a B+-tree. A new KNN search algorithm is also proposed that exploits the bit code and distance to prune the search space more effectively. Results of extensive experiments using both synthetic and real data demonstrated that BD outperforms the existing index structures for KNN search in high-dimensional spaces.
Keywords: high-dimensional spaces; KNN search; Bit-code and distance based index (BD); Approximate vector
Observation points classifier ensemble for high-dimensional imbalanced classification (Cited by: 1)
5
Authors: Yulin He, Xu Li, Philippe Fournier‐Viger, Joshua Zhexue Huang, Mianjie Li, Salman Salloum. 《CAAI Transactions on Intelligence Technology》, SCIE, EI, 2023, No. 2, pp. 500-517 (18 pages)
In this paper, an Observation Points Classifier Ensemble (OPCE) algorithm is proposed to deal with High-Dimensional Imbalanced Classification (HDIC) problems based on data processed using the Multi-Dimensional Scaling (MDS) feature extraction technique. First, the dimensionality of the original imbalanced data is reduced using MDS so that distances between any two different samples are preserved as well as possible. Second, a novel OPCE algorithm is applied to classify imbalanced samples by placing optimised observation points in a low-dimensional data space. Third, optimization of the observation point mappings is carried out to obtain a reliable assessment of the unknown samples. Exhaustive experiments have been conducted to evaluate the feasibility, rationality, and effectiveness of the proposed OPCE algorithm using seven benchmark HDIC data sets. Experimental results show that (1) the OPCE algorithm can be trained faster on low-dimensional imbalanced data than on high-dimensional data; (2) the OPCE algorithm can correctly identify samples as the number of optimised observation points is increased; and (3) statistical analysis reveals that OPCE yields better HDIC performances on the selected data sets in comparison with eight other HDIC algorithms. This demonstrates that OPCE is a viable algorithm to deal with HDIC problems.
Keywords: classifier ensemble; feature transformation; high-dimensional data classification; imbalanced learning; observation point mechanism
A Length-Adaptive Non-Dominated Sorting Genetic Algorithm for Bi-Objective High-Dimensional Feature Selection
6
Authors: Yanlu Gong, Junhai Zhou, Quanwang Wu, MengChu Zhou, Junhao Wen. 《IEEE/CAA Journal of Automatica Sinica》, SCIE, EI, CSCD, 2023, No. 9, pp. 1834-1844 (11 pages)
As a crucial data preprocessing method in data mining, feature selection (FS) can be regarded as a bi-objective optimization problem that aims to maximize classification accuracy and minimize the number of selected features. Evolutionary computing (EC) is promising for FS owing to its powerful search capability. However, in traditional EC-based methods, feature subsets are represented via a length-fixed individual encoding. It is ineffective for high-dimensional data, because it results in a huge search space and prohibitive training time. This work proposes a length-adaptive non-dominated sorting genetic algorithm (LA-NSGA) with a length-variable individual encoding and a length-adaptive evolution mechanism for bi-objective high-dimensional FS. In LA-NSGA, an initialization method based on correlation and redundancy is devised to initialize individuals of diverse lengths, and a Pareto dominance-based length change operator is introduced to guide individuals to explore the promising search space adaptively. Moreover, a dominance-based local search method is employed for further improvement. The experimental results based on 12 high-dimensional gene datasets show that the Pareto front of feature subsets produced by LA-NSGA is superior to those of existing algorithms.
Keywords: Bi-objective optimization; feature selection (FS); genetic algorithm; high-dimensional data; length-adaptive
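The two objectives described in this abstract (maximize accuracy, minimize the number of selected features) define a Pareto dominance relation. The sketch below shows only that relation and a naive extraction of the non-dominated set. It is not LA-NSGA, and the candidate values are made up for illustration.

```python
from typing import List, Tuple

# A candidate feature subset summarized as (classification accuracy, number of features).
Solution = Tuple[float, int]


def dominates(a: Solution, b: Solution) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(solutions: List[Solution]) -> List[Solution]:
    """Return the solutions not dominated by any other (quadratic-time scan)."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


# Hypothetical candidates: (accuracy, feature count).
candidates = [(0.90, 10), (0.85, 5), (0.80, 8), (0.90, 12), (0.70, 3)]
front = pareto_front(candidates)
```

Here (0.80, 8) is dominated by (0.85, 5) and (0.90, 12) by (0.90, 10), so three trade-off solutions remain. A non-dominated sorting genetic algorithm repeats this kind of ranking over successive fronts.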
Similarity measurement method of high-dimensional data based on normalized net lattice subspace (Cited by: 4)
7
Authors: Li Wenfa (李文法), Wang Gongming, Li Ke, Huang Su. 《High Technology Letters》, EI, CAS, 2017, No. 2, pp. 179-184 (6 pages)
The performance of conventional similarity measurement methods is seriously affected by the curse of dimensionality of high-dimensional data. The reason is that the data difference between sparse and noisy dimensionalities occupies a large proportion of the similarity, leading to dissimilarities between any results. A similarity measurement method of high-dimensional data based on normalized net lattice subspace is proposed. The data range of each dimension is divided into several intervals, and the components in different dimensions are mapped onto the corresponding interval. Only the components in the same or adjacent intervals are used to calculate the similarity. To validate this method, three data types are used, and seven common similarity measurement methods are compared. The experimental results indicate that the relative difference of the method increases with the dimensionality and is approximately two or three orders of magnitude higher than that of the conventional methods. In addition, the similarity range of this method in different dimensions is [0,1], which is suitable for similarity analysis after dimensionality reduction.
Keywords: high-dimensional data; the curse of dimensionality; similarity; normalization; subspace; NPsim
Joint inversion of gravity and vertical gradient data based on modified structural similarity index for the structural and petrophysical consistency constraint
8
Authors: Sheng Liu, Xiangyun Wan, Shuanggen Jin, Bin Jia, Quan Lou, Songbai Xuan, Binbin Qin, Yiju Tang, Dali Sun. 《Geodesy and Geodynamics》, EI, CSCD, 2023, No. 5, pp. 485-499 (15 pages)
Joint inversion is one of the most effective methods for reducing non-uniqueness in geophysical inversion. The current joint inversion methods can be divided into structural consistency constraint and petrophysical consistency constraint methods, which are mutually independent. Currently, there is a need for joint inversion methods that can comprehensively consider both structural and petrophysical consistency constraints. This paper develops the structural similarity index (SSIM) as a new structural and petrophysical consistency constraint for the joint inversion of gravity and vertical gradient data. The SSIM constraint is in the form of a fraction, which may have analytical singularities. Therefore, converting the fractional form to the subtractive form can solve the problem of analytic singularity and finally form a modified structural consistency index of the joint inversion, which enhances the stability of the SSIM constraint applied to the joint inversion. Compared to the reconstructed results from the cross-gradient inversion, the proposed method presents good performance and stability. The SSIM algorithm is a new joint inversion method for petrophysical and structural constraints. It can promote the consistency of the recovered models in terms of both the distribution and the structure of the physical property values. Applications to synthetic data illustrate that the proposed algorithm processes the data well and yields good reconstructed results.
Keywords: Joint inversion; Gravity and vertical gradient data; Modified structural similarity index
Variance Estimation for High-Dimensional Varying Index Coefficient Models
9
Authors: Miao Wang, Hao Lv, Yicun Wang. 《Open Journal of Statistics》, 2019, No. 5, pp. 555-570 (16 pages)
This paper studies the re-adjusted cross-validation method and a semiparametric regression model called the varying index coefficient model. We use the profile spline modal estimator method to estimate the coefficients of the parametric part of the Varying Index Coefficient Model (VICM), while the unknown function part is expanded using B-splines. Moreover, we combine the above two estimation methods under the assumption of high-dimensional data. The results of data simulation and empirical analysis show that for the varying index coefficient model, the re-adjusted cross-validation method is better in terms of accuracy and stability than traditional methods based on ordinary least squares.
Keywords: High-dimensional Data; Refitted Cross-Validation; Varying Index Coefficient Models; Variance Estimation
A nearest neighbor search algorithm of high-dimensional data based on sequential NPsim matrix
10
Authors: Li Wenfa (李文法), Wang Gongming, Ma Nan, Liu Hongzhe. 《High Technology Letters》, EI, CAS, 2016, No. 3, pp. 241-247 (7 pages)
Problems exist in similarity measurement and index tree construction which affect the performance of nearest neighbor search of high-dimensional data. The equidistance problem is solved by using the NPsim function to calculate similarity, and a sequential NPsim matrix is built to improve indexing performance. Combining these innovations, a nearest neighbor search algorithm of high-dimensional data based on the sequential NPsim matrix is proposed and compared with nearest neighbor search algorithms based on KD-tree or SR-tree on the Munsell spectral data set. Experimental results show that the similarity of the proposed algorithm is better than that of the other algorithms and that its searching speed is thousands of times faster. In addition, the slow construction of the sequential NPsim matrix can be sped up by using parallel computing.
Keywords: nearest neighbor search; high-dimensional data; similarity; indexing tree; NPsim; KD-tree; SR-tree; Munsell
A method for extracting the preseismic gravity anomalies over the Tibetan Plateau based on the maximum shear strain using GRACE data
11
Authors: Hui Wang, DongMei Song, XinJian Shan, Bin Wang. 《Earth and Planetary Physics》, EI, CAS, CSCD, 2024, No. 4, pp. 589-608 (20 pages)
The occurrence of earthquakes is closely related to crustal geotectonic movement and the migration of mass, which consequently cause changes in gravity. The Gravity Recovery And Climate Experiment (GRACE) satellite data can be used to detect gravity changes associated with large earthquakes. However, previous GRACE satellite-based seismic gravity-change studies have focused more on coseismic gravity changes than on preseismic gravity changes. Moreover, the noise of the north-south stripe in GRACE data is difficult to eliminate, thereby resulting in the loss of some gravity information related to tectonic activities. To explore the preseismic gravity anomalies in a more refined way, we first propose a method of characterizing gravity variation based on the maximum shear strain of gravity, inspired by the concept of crustal strain. The offset index method is then adopted to describe the gravity anomalies, and the spatial and temporal characteristics of gravity anomalies before earthquakes are analyzed at the scales of the fault zone and plate, respectively. In this work, experiments are carried out on the Tibetan Plateau and its surrounding areas, and the following findings are obtained: First, from the observation scale of the fault zone, we detect the occurrence of large-area gravity anomalies near the epicenter, oftentimes about half a year before an earthquake, and these anomalies were distributed along the fault zone. Second, from the observation scale of the plate, we find that when an earthquake occurred on the Tibetan Plateau, a large number of gravity anomalies also occurred at the boundary of the Tibetan Plateau and the Indian Plate. Moreover, the aforementioned experiments confirm that the proposed method can successfully capture the preseismic gravity anomalies of large earthquakes with a magnitude of less than 8, which suggests a new idea for the application of gravity satellite data to earthquake research.
Keywords: Gravity Recovery And Climate Experiment (GRACE) data; maximum shear strain; offset index K; preseismic gravity anomalies; Tibetan Plateau; fault zone
BC-iDistance:an optimized high-dimensional index for KNN processing
12
Authors: Liang Junjie (梁俊杰), Feng Yucai (冯玉才). 《Journal of Harbin Institute of Technology(New Series)》, EI, CAS, 2008, No. 6, pp. 856-861 (6 pages)
To facilitate high-dimensional KNN queries, based on techniques of approximate vector presentation and one-dimensional transformation, an optimal index is proposed, namely Bit-Code based iDistance (BC-iDistance). To overcome the heavy information loss of iDistance in one-dimensional transformation, BC-iDistance adopts a novel representation that compresses a d-dimensional vector into a two-dimensional vector, and employs the concepts of bit code and one-dimensional distance to reflect, respectively, the location and similarity of the data point relative to the corresponding reference point. By employing the classical B+tree, this representation realizes a two-level pruning process and facilitates the use of a single index structure to further speed up the processing. Experimental evaluations using synthetic data and real data demonstrate that BC-iDistance outperforms iDistance and sequential scan for KNN search in high-dimensional spaces.
Keywords: high-dimensional index; KNN search; bit code; approximate vector
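The one-dimensional transformation that this abstract builds on can be sketched with the key mapping of the original iDistance scheme: each point is assigned to its nearest reference point and keyed by the partition number times a constant plus its distance to that reference point. The sketch below covers only that baseline mapping. The bit-code component of BC-iDistance is not specified in the abstract and is not modelled. The reference points, the constant `c`, and the use of a sorted list in place of a B+-tree are illustrative assumptions.

```python
import math
from bisect import bisect_left, bisect_right
from typing import List, Sequence, Tuple

Point = Sequence[float]


def idistance_key(point: Point, refs: List[Point], c: float) -> float:
    """Map a point to the 1D key i * c + dist(point, refs[i]) for its nearest reference i.

    c must exceed the largest distance inside any partition so key ranges do not overlap.
    """
    dists = [math.dist(point, r) for r in refs]
    i = min(range(len(refs)), key=dists.__getitem__)
    return i * c + dists[i]


def build_index(points: List[Point], refs: List[Point], c: float) -> List[Tuple[float, Point]]:
    """Sorted (key, point) pairs; a sorted list stands in for the B+-tree here."""
    return sorted((idistance_key(p, refs, c), tuple(p)) for p in points)


def key_range(index: List[Tuple[float, Point]], low: float, high: float) -> List[Point]:
    """All points whose key lies in [low, high], found by binary search on the keys."""
    keys = [k for k, _ in index]
    return [p for _, p in index[bisect_left(keys, low):bisect_right(keys, high)]]


refs = [(0.0, 0.0), (10.0, 10.0)]          # hypothetical reference points
points = [(3.0, 4.0), (10.0, 7.0), (1.0, 0.0), (9.0, 10.0)]
index = build_index(points, refs, c=100.0)
near_second_ref = key_range(index, 100.0, 102.0)  # keys within distance 2 of refs[1]
```

A KNN search over such an index repeatedly widens a key interval around the query's own distance in each partition, which is why a single ordered structure suffices.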
Dimensionality Reduction of High-Dimensional Highly Correlated Multivariate Grapevine Dataset
13
Authors: Uday Kant Jha, Peter Bajorski, Ernest Fokoue, Justine Vanden Heuvel, Jan van Aardt, Grant Anderson. 《Open Journal of Statistics》, 2017, No. 4, pp. 702-717 (16 pages)
Viticulturists traditionally have a keen interest in studying the relationship between the biochemistry of grapevines’ leaves/petioles and their associated spectral reflectance in order to understand the fruit ripening rate, water status, nutrient levels, and disease risk. In this paper, we use imaging spectroscopy (hyperspectral) reflectance data for the reflective 330-2510 nm wavelength region (986 total spectral bands) to assess vineyard nutrient status; this constitutes a high-dimensional dataset with a covariance matrix that is ill-conditioned. The identification of the variables (wavelength bands) that contribute useful information for nutrient assessment and prediction plays a pivotal role in multivariate statistical modeling. In recent years, researchers have successfully developed many continuous, nearly unbiased, sparse and accurate variable selection methods to overcome this problem. This paper compares four regularized and one functional regression methods for wavelength variable selection: Elastic Net, Multi-Step Adaptive Elastic Net, Minimax Concave Penalty, iterative Sure Independence Screening, and Functional Data Analysis. Thereafter, the predictive performance of these regularized sparse models is enhanced using stepwise regression. This comparative study of regression methods using a high-dimensional and highly correlated grapevine hyperspectral dataset revealed that Elastic Net for variable selection yields the best predictive ability.
Keywords: High-dimensional Data; Multi-Step Adaptive Elastic Net; Minimax Concave Penalty; Sure Independence Screening; Functional Data Analysis
Making Short-term High-dimensional Data Predictable
14
Authors: CHEN Luonan. 《Bulletin of the Chinese Academy of Sciences》, 2018, No. 4, pp. 243-244 (2 pages)
Making accurate forecasts or predictions is a challenging task in the big data era, in particular for those datasets involving high-dimensional variables but short-term time series points, which are generally available from real-world systems. To address this issue, Prof.
Keywords: RDE MAKING SHORT-TERM high-dimensional data Predictable
Data-driven Surrogate-assisted Method for High-dimensional Multi-area Combined Economic/Emission Dispatch
15
Authors: Chenhao Lin, Huijun Liang, Aokang Pang, Jianwei Zhong, Yongchao Yang. 《Journal of Modern Power Systems and Clean Energy》, SCIE, EI, CSCD, 2024, No. 1, pp. 52-64 (13 pages)
Multi-area combined economic/emission dispatch (MACEED) problems are generally studied using analytical functions. However, as the scale of power systems increases, existing solutions become time-consuming and may not meet operational constraints. To overcome excessive computational expense in high-dimensional MACEED problems, a novel data-driven surrogate-assisted method is proposed. First, a cosine-similarity-based deep belief network combined with a back-propagation (DBN+BP) neural network is utilized to replace cost and emission functions. Second, transfer learning is applied with a pretraining and fine-tuning method to improve DBN+BP regression surrogate models, thus realizing fast construction of surrogate models between different regional power systems. Third, a multi-objective antlion optimizer with a novel general single-dimension retention bi-objective optimization policy is proposed to execute MACEED optimization to obtain scheduling decisions. The proposed method not only ensures the convergence, uniformity, and extensibility of the Pareto front, but also greatly reduces the computational time. Finally, a 4-area 40-unit test system with different constraints is employed to demonstrate the effectiveness of the proposed method.
Keywords: Multi-area combined economic/emission dispatch; high-dimensional power system; deep belief network; data driven; transfer learning
Edge-assisted indexing for highly dynamic and static data in mixed reality connected autonomous vehicles
16
Authors: Daniel Mawunyo Doe, Dawei Chen, Kyungtae Han, Haoxin Wang, Jiang Xie, Zhu Han. 《Intelligent and Converged Networks》, EI, 2024, No. 2, pp. 167-179 (13 pages)
The integration of Mixed Reality (MR) technology into Autonomous Vehicles (AVs) has ushered in a new era for the automotive industry, offering heightened safety, convenience, and passenger comfort. However, the substantial and varied data generated by MR-Connected AVs (MR-CAVs), encompassing both highly dynamic and static information, presents formidable challenges for efficient data management and retrieval. In this paper, we formulate our indexing problem as a constrained optimization problem, with the aim of maximizing the utility function that represents the overall performance of our indexing system. This optimization problem encompasses multiple decision variables and constraints, rendering it mathematically infeasible to solve directly. Therefore, we propose a heuristic algorithm to address the combinatorial complexity of the problem. Our heuristic indexing algorithm efficiently divides data into highly dynamic and static categories, distributing the index across Roadside Units (RSUs) and optimizing query processing. Our approach takes advantage of the computational capabilities of edge servers or RSUs to perform indexing operations, thereby shifting the burden away from the vehicles themselves. Our algorithm strategically places data in the cache, optimizing cache hit rate and space utilization while reducing latency. The quantitative evaluation demonstrates the superiority of our proposed scheme, with significant reductions in latency (averaging 27%-49.25%), a 30.75% improvement in throughput, a 22.50% enhancement in cache hit rate, and a 32%-50.75% improvement in space utilization compared to baseline schemes.
Keywords: mixed reality; autonomous vehicles; data indexing; edge computing; query optimization
Randomized Latent Factor Model for High-dimensional and Sparse Matrices from Industrial Applications (Cited by: 13)
17
Authors: Mingsheng Shang, Xin Luo, Zhigang Liu, Jia Chen, Ye Yuan, MengChu Zhou. 《IEEE/CAA Journal of Automatica Sinica》, EI, CSCD, 2019, No. 1, pp. 131-141 (11 pages)
Latent factor (LF) models are highly effective in extracting useful knowledge from High-Dimensional and Sparse (HiDS) matrices, which are commonly seen in various industrial applications. An LF model usually adopts iterative optimizers, which may consume many iterations to achieve a local optimum, resulting in considerable time cost. Hence, determining how to accelerate the training process for LF models has become a significant issue. To address this, this work proposes a randomized latent factor (RLF) model. It incorporates the principle of randomized learning techniques from neural networks into the LF analysis of HiDS matrices, thereby greatly alleviating the computational burden. It also extends a standard learning process for randomized neural networks in the context of LF analysis to make the resulting model represent an HiDS matrix correctly. Experimental results on three HiDS matrices from industrial applications demonstrate that, compared with state-of-the-art LF models, RLF is able to achieve significantly higher computational efficiency and comparable prediction accuracy for missing data. It provides an important alternative approach to LF analysis of HiDS matrices, which is especially desired for industrial applications demanding highly efficient models.
Keywords: Big data; high-dimensional and sparse matrix; latent factor analysis; latent factor model; randomized learning
Establishing evaluation index system for desertification of Keerqin sandy land with remote sensing data (Cited by: 4)
18
Authors: FAN Wen-yi, ZHANG Wen-hua, YU Su-fang, LIU Dan. 《Journal of Forestry Research》, SCIE, CAS, CSCD, 2005, No. 3, pp. 209-212 (4 pages)
Keerqin sandy land is located in the transitional terrain between the Northeast Plain and Inner Mongolia (42°41′-45°15′ N, 118°35′-123°30′ E) in Northeast China, and it is seriously affected by desertification. According to the configuration and ecotope of the earth's surface, the coverage of vegetation, the occupation ratio of bare sandy land and the soil texture were selected as evaluation indexes by using the field investigation data. The evaluation index system of Keerqin sandy desertification was established by using remote sensing data, and the occupation ratio of bare sandy land was obtained by a mixed spectrum model. This index system is validated by the field investigation data, and the results indicate that it is suitable for the desertification evaluation of Keerqin. Foundation Item: This study is supported by a grant from the National Natural Science Foundation of China (No. 30371192).
Keywords: Sandy desertification; Evaluation index system; Remote sensing data; Keerqin sandy land; Inner Mongolia
SLC-index: A scalable skip list-based index for cloud data processing (Cited by: 2)
19
Authors: HE Jing, YAO Shao-wen, CAI Li, ZHOU Wei. 《Journal of Central South University》, SCIE, EI, CAS, CSCD, 2018, No. 10, pp. 2438-2450 (13 pages)
Due to the increasing number of cloud applications, the amount of data in the cloud shows signs of growing faster than ever before. The nature of cloud computing requires cloud data processing systems that can handle huge volumes of data and have high performance. However, most cloud storage systems currently adopt a hash-like approach to retrieving data that only supports simple keyword-based enquiries and lacks other forms of information search. Therefore, a scalable and efficient indexing scheme is clearly required. In this paper, we present a skip list-based cloud index, called SLC-index, which is a novel, scalable skip list-based indexing scheme for cloud data processing. The SLC-index offers a two-layered architecture for extending indexing scope and facilitating better throughput. Dynamic load-balancing for the SLC-index is achieved by online migration of index nodes between servers. Furthermore, it is a flexible system due to its dynamic addition and removal of servers. The SLC-index is efficient for both point and range queries. Experimental results show the efficiency of the SLC-index and its usefulness as an alternative approach for cloud-suitable data structures.
Keywords: cloud computing; distributed index; cloud data processing; skip list
CSFW-SC: Cuckoo Search Fuzzy-Weighting Algorithm for Subspace Clustering Applying to High-Dimensional Clustering (Cited by: 1)
20
Authors: WANG Jindong, HE Jiajing, ZHANG Hengwei, YU Zhiyong. 《China Communications》, SCIE, CSCD, 2015, No. S2, pp. 55-63 (9 pages)
To address the problem that traditional clustering methods are not suited to high-dimensional data, a cuckoo search fuzzy-weighting algorithm for subspace clustering is presented on the basis of the existing soft subspace clustering algorithm. In the proposed algorithm, a novel objective function is first designed by considering the fuzzy weighting within-cluster compactness and the between-cluster separation, and by loosening the constraints of the dimension weight matrix. Then gradual membership and improved cuckoo search, a global search strategy, are introduced to optimize the objective function and search subspace clusters, giving novel learning rules for clustering. Finally, the performance of the proposed algorithm on the clustering analysis of various low- and high-dimensional datasets is experimentally compared with that of several competitive subspace clustering algorithms. Experimental studies demonstrate that the proposed algorithm can obtain better performance than most of the existing soft subspace clustering algorithms.
Keywords: high-dimensional data clustering; soft subspace; cuckoo search; fuzzy clustering