Abstract: To improve the generalization ability of binary decision trees, a new learning algorithm, the MMDT algorithm, is presented. The generalization performance of binary decision trees is analyzed on the basis of statistical learning theory, and an assessment rule is proposed. The MMDT algorithm is implemented under the guidance of this assessment rule. The algorithm maps training examples from the original space to a high-dimensional feature space and constructs a decision tree in that space. In the feature space, a new splitting criterion for decision nodes, the max-min rule, is used, and the margin of each decision node is maximized with a support vector machine to improve generalization performance. Experimental results show that the new learning algorithm is markedly superior to others such as C4.5 and OC1.
Funding: Project (2013HH100055) supported by the Basic Research and Science and Technology Innovation Fund of Foshan City, China.
Abstract: Hot compression tests of spray-formed 7055 aluminum alloy were carried out in the temperature range of 340-450 ℃ and the strain rate range of 0.001-1 s^-1 to study its hot deformation behavior. Three phenomenological models, namely the Johnson-Cook, modified Fields-Backofen and Arrhenius-type models, were introduced to predict the flow stresses during compression. The predictability of the three models was then compared in terms of relative error, correlation coefficient (R), and average absolute relative error (AARE). The results indicate that the Johnson-Cook model and the modified Fields-Backofen model cannot predict the hot deformation behavior well because of the large deviation introduced during linear regression fitting. The Arrhenius-type model gives the best fit because it combines the effects of strain rate and temperature.
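The two summary metrics named in the abstract can be sketched as follows. This is a minimal illustration, assuming the commonly used definitions (AARE as the mean of |measured - predicted| / measured, expressed in percent, and R as the Pearson correlation coefficient); the function names and the stress values are made up for the example and are not taken from the paper.

```python
import math


def aare(measured, predicted):
    """Average absolute relative error in percent (assumed definition)."""
    total = sum(abs((m - p) / m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


def correlation_coefficient(measured, predicted):
    """Pearson correlation coefficient R between measured and predicted values."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)


# Hypothetical flow stresses in MPa, for illustration only.
measured_mpa = [100.0, 80.0, 50.0]
predicted_mpa = [110.0, 76.0, 50.0]

aare_percent = aare(measured_mpa, predicted_mpa)
r_value = correlation_coefficient(measured_mpa, predicted_mpa)
print(aare_percent, r_value)
```

A lower AARE and an R closer to 1 both indicate a better fit; AARE is unbiased by scale because each error is normalized by the measured value.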
Funding: Under the auspices of the National Philosophy and Social Sciences Foundation of China (No. 00BJL051, 03BJL027).
Abstract: This paper examines the dynamic process of regional disparity in China's economic development over the past 50 years from a new perspective, applying rescaled range (R/S) analysis and wavelet analysis to the Theil index sequence at different time scales. The main conclusions are: 1) Regional disparity in China's economic development, including inter-provincial, inter-regional and intra-regional disparity, has existed for many years. The Theil index computed at comparable prices reveals the true trend of comparative disparity in regional economic development from 1952 to 2000. 2) Decomposition of the Theil index indicates that the dynamic trend of comparative inter-provincial disparity in the coastal region is in line with that of inter-provincial disparity in China as a whole. 3) The R/S analysis shows that during 1966-1978 the Hurst exponent was H=0.504, approximately 0.5, which indicates that in that period the evolution of comparative inter-provincial disparity was random in character. In the other periods, i.e. 1952-1965, 1979-1990 and 1991-2000, the Hurst exponent was H>0.5, which indicates that in those periods the evolution of comparative inter-provincial disparity in China was persistent. 4) Wavelet analysis at different time scales shows that the evolution of the disparity of economic development in China is not a simple inverted U shape but a compound of several U shapes. At the higher scale, 2^4 (16 years), the evolutionary plot of inter-provincial disparity follows an inverted U on the whole. That is to say, the disparity tends to rise in the first stage of economic development and to fall slowly after the peak in the second stage. However, if the time scale is shortened to 2^3 (8 years), a chain of several U shapes appears.
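The rescaled range statistic behind the Hurst exponent can be sketched as below. This is a minimal sketch of the textbook R/S statistic for a single window, not the authors' code: the function name and the sample series are assumptions for illustration. In the usual procedure, H is then estimated as the slope of log(R/S) against log(window length) over several window lengths; H near 0.5 suggests a random series and H above 0.5 a persistent one.

```python
import math


def rescaled_range(series):
    """Textbook R/S statistic for one window of a time series."""
    n = len(series)
    mean = sum(series) / n
    # Cumulative sum of deviations from the window mean.
    cumulative = []
    running = 0.0
    for value in series:
        running += value - mean
        cumulative.append(running)
    value_range = max(cumulative) - min(cumulative)
    # Population standard deviation of the window.
    std = math.sqrt(sum((value - mean) ** 2 for value in series) / n)
    return value_range / std


# Illustrative series only; a real analysis would use the Theil index sequence.
example_rs = rescaled_range([1.0, 2.0, 3.0, 4.0, 5.0])
print(example_rs)
```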
Funding: Funded by the National Natural Science Foundation (Nos. 40176014 and 40067013).
Abstract: Multivariate statistical techniques, namely principal component analysis, Q-mode factor analysis, correspondence analysis and fuzzy C-means clustering, were applied to datasets of minor element concentrations in sediment samples from a core collected on the outer shelf of the East China Sea. According to the results, sediment core Q43 can be divided into three strata with different minor element features. The first stratum (unit I) is characterized by higher concentrations of V, Cr, Cd and Sc, which are active and inactive elements. The second stratum (unit II) is controlled by the ultrastable elements V, Ti, Cr, Th, Sc, Pb, etc. The third stratum (unit III) is dominated by Ni, Co, Ba, Rb and Mn, which are authigenic and volcanogenic elements. The geochemical features of core Q43 record environmental changes in the depositional process from the Late Pleistocene to the Holocene.
Funding: Supported by grants from the International Science & Technology Cooperation and Exchange Programs, No. 2008DFA31130; the Joint China/South Africa Science and Technology Agreement; the National Natural Science Foundation of China, No. 81021061, No. 0772507 and No. 30700992; and the State Key Projects for Basic Research of China, No. 2011CB910703.
Abstract: AIM: To determine the association between serum levels of growth-related gene product β (GROβ) and clinical parameters in esophageal squamous cell carcinoma (ESCC). METHODS: Using enzyme-linked immunosorbent assay, serum GROβ levels were measured in ESCC patients (n = 72) and healthy volunteers (n = 83). The association between serum levels of GROβ and clinical parameters of ESCC was analyzed statistically. RESULTS: The serum GROβ levels were much higher in ESCC patients than in healthy controls (median: 645 ng/L vs 269 ng/L, P < 0.05). Serum GROβ levels were correlated positively with tumor size, lymph node metastasis, and tumor-node-metastasis (TNM) staging, but not with gender or the histological grade of tumors in ESCC patients. The sensitivity and specificity of the assay for serum GROβ were 73.61% and 56.63%, respectively. CONCLUSION: GROβ may function as an oncogene product and contribute to tumorigenesis and metastasis of ESCC.
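The reported sensitivity and specificity follow from a 2x2 confusion table. The sketch below uses the standard definitions; the counts (53 of 72 patients testing positive, 47 of 83 controls testing negative) are an assumption back-calculated from the reported percentages and group sizes, since the abstract does not state them.

```python
def sensitivity(true_positive, false_negative):
    """Fraction of diseased subjects the assay flags as positive."""
    return true_positive / (true_positive + false_negative)


def specificity(true_negative, false_positive):
    """Fraction of healthy subjects the assay flags as negative."""
    return true_negative / (true_negative + false_positive)


# Assumed counts, inferred from 73.61% of 72 patients and 56.63% of 83 controls.
tp, fn = 53, 19  # ESCC patients: positive / negative by the assay
tn, fp = 47, 36  # healthy volunteers: negative / positive by the assay

sens_percent = round(100.0 * sensitivity(tp, fn), 2)
spec_percent = round(100.0 * specificity(tn, fp), 2)
print(sens_percent, spec_percent)
```

Under these assumed counts, roughly 43% of healthy volunteers would be flagged as positive, which is worth keeping in mind when reading the modest specificity figure.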
Funding: Supported by the Cultivation Fund of the Key Scientific and Technical Innovation Project from the Ministry of Education of China (No. 706024), the International Science Cooperation Foundation of Shanghai (No. 061307041), and the Excellent Youth Foundation of Shanghai (No. 07A212).
Abstract: Owing to radical changes in the Chinese economy, it is essential to build an effective financial distress prediction model. In this paper, we present a genetic algorithm (GA) approach for optimizing the parameters of a support vector machine (SVM). We validate the proposed model on datasets from the Chinese high-tech manufacturing industry. Experimental results reveal that the proposed GA-SVM model is comparable to, and can even outperform, other existing classifiers. Compared with the grid-search algorithm, the proposed GA-based approach takes less time to optimize the SVM parameters without degrading the prediction accuracy of the SVM.
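The idea of searching SVM parameters with a genetic algorithm can be sketched as below. This is a generic real-coded GA (tournament selection, blend crossover, Gaussian mutation, elitism) and not the authors' implementation. The fitness function is a hypothetical stand-in for cross-validated SVM accuracy over (log2 C, log2 gamma); the bounds, population size and other settings are likewise assumptions chosen for illustration.

```python
import random


def toy_fitness(log_c, log_gamma):
    # Hypothetical stand-in for cross-validated SVM accuracy; peaks at (2, -3).
    return 1.0 / (1.0 + (log_c - 2.0) ** 2 + (log_gamma + 3.0) ** 2)


def genetic_search(fitness, bounds, pop_size=20, generations=30,
                   mutation_scale=0.3, seed=0):
    """Maximize fitness over a box with a simple real-coded GA."""
    rng = random.Random(seed)

    def clip(individual):
        return [min(max(g, lo), hi) for g, (lo, hi) in zip(individual, bounds)]

    def tournament(ranked):
        # Pick the fittest of three randomly drawn individuals.
        return max(rng.sample(ranked, 3), key=lambda ind: fitness(*ind))

    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda ind: fitness(*ind), reverse=True)
        elite = ranked[:2]  # elitism: the two best survive unchanged
        children = []
        while len(children) < pop_size - len(elite):
            parent_a, parent_b = tournament(ranked), tournament(ranked)
            weight = rng.random()  # blend crossover
            child = [weight * x + (1.0 - weight) * y
                     for x, y in zip(parent_a, parent_b)]
            child = [g + rng.gauss(0.0, mutation_scale) for g in child]
            children.append(clip(child))
        population = elite + children
    best = max(population, key=lambda ind: fitness(*ind))
    return best, fitness(*best)


search_bounds = [(-5.0, 15.0), (-15.0, 3.0)]  # assumed log2 ranges for C and gamma
best_params, best_score = genetic_search(toy_fitness, search_bounds)
print(best_params, best_score)
```

In a real setting the fitness call would train and cross-validate an SVM, which dominates the cost; a GA evaluates only pop_size x generations candidates, whereas a fine grid search evaluates every grid point.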
Funding: Supported by grants from the Social Development Projects of Guangdong SciTech Planning (No. 2010B031600201).
Abstract: Objective: This work aims to investigate the expression pattern and clinicopathologic significance of centromere protein H (CENP-H) in uterine cervical cancer (UCC). Methods: The level of CENP-H expression in paraffin sections of 62 UCC cases, all with complete clinicopathologic data, was determined by the SP immunohistochemical method. Statistical analysis was conducted with the SPSS 13.0 software package to evaluate the prognostic and diagnostic significance of CENP-H. Results: Immunohistochemical assay showed strong CENP-H expression in 61.29% (38/62) of the paraffin-embedded cervical cancer tissues. Statistical analysis revealed a strong correlation between CENP-H expression and the clinical classification of the cervical carcinoma (P=0.038). Expression increased with advancing stage. Cox proportional hazards regression analysis suggested that CENP-H expression (P=0.002) and tumor stage (P=0.001) were independent prognostic markers for the survival of UCC patients. Survival analysis showed that the survival rate was significantly lower in patients with high CENP-H expression than in those with low CENP-H expression (P=0.001). Conclusions: CENP-H is likely to be a valuable marker for carcinogenesis and progression of UCC. It might be used as an important diagnostic and prognostic marker for cervical carcinoma patients, especially those at an early stage.
Funding: Project (50371026) supported by the National Natural Science Foundation of China.
Abstract: An OpenMP approach was proposed to parallelize a sequential molecular dynamics (MD) code on shared memory machines. When a code is converted from sequential to parallel form, data dependence is a main problem. A traditional sequential MD code was analyzed to find its data-dependent segments, and two different methods, the recover method and the backward mapping method, were used to eliminate those data dependencies so that the sequential MD code could be parallelized. The performance of the parallelized MD code was analyzed with performance analysis tools. The test results show that the problem size the code can handle increases sharply from 1 million atoms before parallelization to 20 million atoms after parallelization, and that the wall-clock time is greatly reduced. Some hot spots in the code were found and optimized with improved algorithms. The parallel efficiency is 30% higher than before, so calculation time is saved and larger-scale problems can be solved.
Abstract: AIM: To identify whether there could have been changes in survival if lymph node ratio (N ratio) had been used. METHODS: We assessed 334 gastric adenocarcinoma cases retrospectively between 2001 and 2009. Two hundred and sixteen patients out of 334 were included in the study. Patients were grouped according to dissection 1 (D1) or dissection 2 (D2). We compared the estimated survival and actual survival determined by pathologic nodes (pN) class and N ratio, and SPSS 15.0 software was used for statistical analysis. RESULTS: Ninety-six (44.4%) patients underwent D1 dissection and 120 (55.6%) had D2 dissection. When the groups were evaluated, 23 (24.0%) patients in D1 and 21 (17.5%) in D2 had stage migration (P=0.001). When the D1 and D2 groups were evaluated for number of pathological lymph nodes, there was no difference in N ratio between them, yet a statistically significant difference was found with regard to the pN1 and pN2 groups (P=0.047 and P=0.044, respectively). In both D1 and D2, pN0 had the longest survival and pN3 the shortest. CONCLUSION: N ratio is an accurate staging system for defining prognosis and treatment plan, thus decreasing methodological errors in gastric cancer staging.
Abstract: A generalized Lyapunov function was employed to investigate the ultimate bound and positively invariant set of a generalized Lorenz system. We derived an ellipsoidal estimate of the ultimate bound and positively invariant set for the generalized Lorenz system, for all positive values of the system parameters a, b, and c. Our results extend the related result of Li et al. [Li DM, Lu JA, Wu XQ, et al., Estimating the ultimate bound and positively invariant set for the Lorenz system and a unified chaotic system, Journal of Mathematical Analysis and Applications, 2006, 323(2): 844-853].
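What an ultimate bound means can be checked numerically. The abstract does not give the equations of the generalized system, so the sketch below uses the classic Lorenz system as a stand-in (an assumption), with the quadratic function V = x^2 + y^2 + (z - a - c)^2. For the classic system, dV/dt = -2a x^2 - 2y^2 - 2b z^2 + 2b(a + c)z, and for a >= 1, b >= 2 the set V <= b^2 (a + c)^2 / (4(b - 1)) is positively invariant. This is a spherical estimate for the stand-in system, not the paper's ellipsoidal estimate; the parameter values, step size and initial state are illustrative.

```python
def lorenz_rhs(state, a, b, c):
    """Classic Lorenz vector field (stand-in for the generalized system)."""
    x, y, z = state
    return (a * (y - x), c * x - y - x * z, x * y - b * z)


def rk4_step(state, dt, a, b, c):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = lorenz_rhs(state, a, b, c)
    k2 = lorenz_rhs(shifted(state, k1, dt / 2), a, b, c)
    k3 = lorenz_rhs(shifted(state, k2, dt / 2), a, b, c)
    k4 = lorenz_rhs(shifted(state, k3, dt), a, b, c)
    return tuple(
        s + dt / 6 * (p + 2 * q + 2 * r + w)
        for s, p, q, r, w in zip(state, k1, k2, k3, k4)
    )


def lyapunov_v(state, a, c):
    """Quadratic Lyapunov-type function centered at (0, 0, a + c)."""
    x, y, z = state
    return x * x + y * y + (z - a - c) ** 2


a, b, c = 10.0, 8.0 / 3.0, 28.0
analytic_bound = b * b * (a + c) ** 2 / (4.0 * (b - 1.0))

state = (1.0, 1.0, 1.0)  # starts inside the invariant set
max_v = lyapunov_v(state, a, c)
for _ in range(5000):  # integrate to t = 50
    state = rk4_step(state, 0.01, a, b, c)
    max_v = max(max_v, lyapunov_v(state, a, c))

print(max_v, analytic_bound)
```

The largest value of V seen along the trajectory should stay below the analytic bound, which illustrates positive invariance for this one initial condition without proving it.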
Funding: This research was partially supported by the National Natural Science Foundation of China under grants No. 61472164, No. 61402475, No. 61173078, No. 61203105, No. 61173079, No. 61070130, and No. 60903176, and by the Provincial Natural Science Foundation of Shandong under grants No. ZR2012FM010, No. ZR2011FZ001, No. ZR2010FM047, No. ZR2010FQ028 and No. ZR2012FQ016.
Abstract: Accurately identifying network traffic at an early stage is very important for the application of traffic identification. In recent years, more and more research has tried to build effective machine learning models that identify traffic from the first few packets of a flow. However, a basic and important problem is still unresolved: how many packets are most effective for early-stage traffic identification? In this paper, we try to resolve this problem experimentally. We first extract the packet sizes of the first 2-10 packets from 3 traffic data sets. We then run crossover identification experiments with different numbers of packets using 11 well-known machine learning classifiers. Finally, statistical tests are applied to find out which number of packets performs best. Our experimental results show that 5-7 are the best packet numbers for early-stage traffic identification.
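The feature-extraction step described above can be sketched as follows: for each candidate packet count N from 2 to 10, every flow is reduced to the sizes of its first N packets. The function name, the zero-padding of flows shorter than N, and the sample flows are assumptions made for illustration; the abstract does not say how short flows were handled.

```python
def first_packet_sizes(flow_packet_sizes, n_packets):
    """Feature vector: sizes of the first n_packets packets of one flow.

    Flows shorter than n_packets are zero-padded (an assumption).
    """
    head = list(flow_packet_sizes[:n_packets])
    return head + [0] * (n_packets - len(head))


# Hypothetical flows, each a list of packet sizes in bytes.
flows = [
    [60, 1500, 52, 40, 1500, 1500, 52],
    [74, 66, 583],
]

# One feature matrix per candidate packet count, ready for any classifier.
feature_sets = {
    n: [first_packet_sizes(flow, n) for flow in flows]
    for n in range(2, 11)
}
print(feature_sets[5])
```

Each feature matrix would then be fed to the same set of classifiers so that accuracy can be compared across packet counts.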
Abstract: We link the nuclear force with gravity. We use statistical entropy to link the fine-structure constant (α) and the cosmological constant, showing the mystical number 137 (as the reciprocal of the increasing entropy of the universe) as the negative entropy needed for life to exist. If our computational route applies to the physical universe, it should apply to life. Molecular biology is searching for the fundamental source of information that would link to the information in DNA.