Abstract: Measuring the validity of fuzzy clustering is a key problem: once a clustering has been formed, a mechanism is needed to verify its validity. To make mining more accountable and comprehensible, and to yield usable spatial patterns, it is necessary to first detect whether the data set has a clustered structure before clustering. This paper discusses a detection method for clustered patterns and a fuzzy clustering algorithm, and studies the validity function of fuzzy clustering results from two aspects: the uncertainty of classification during fuzzy partition and the spatial location features of spatial data. On this basis, it proposes a new validity function of fuzzy clustering for spatial data. The experimental results indicate that the new validity function can accurately measure the validity of fuzzy clustering results. In particular, for fuzzy clustering of spatial data, it is robust and its classification result is better than that of other indices.
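The abstract does not give the proposed validity function, so the sketch below does not reproduce it. As an illustration of how the classification uncertainty of a fuzzy partition can be scored, it computes two classical indices, the partition coefficient and the partition entropy, from a membership matrix. The layout `u[i][k]` (membership of point k in cluster i, columns summing to 1) and the function names are assumptions made for this example, and the spatial-location aspect mentioned in the abstract is not modelled.

```python
import math


def partition_coefficient(u):
    """Mean of squared memberships; 1.0 for a crisp partition, 1/c when fully fuzzy.

    u[i][k] is the membership of point k in cluster i (assumed layout).
    """
    n = len(u[0])
    return sum(m * m for row in u for m in row) / n


def partition_entropy(u):
    """Mean membership entropy; 0.0 for a crisp partition, log(c) when fully fuzzy."""
    n = len(u[0])
    return -sum(m * math.log(m) for row in u for m in row if m > 0) / n


# Two clusters, three points: a crisp partition and a maximally uncertain one.
crisp = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
fuzzy = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
print(partition_coefficient(crisp), partition_coefficient(fuzzy))
print(partition_entropy(crisp), partition_entropy(fuzzy))
```

A higher partition coefficient and a lower partition entropy both indicate a less ambiguous partition; neither index uses the coordinates of the data, which is the gap a spatial validity function would aim to fill.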
Abstract: In geographic information system (GIS) spatial data operations, queries are much more frequent than insertions and deletions. This paper proposes a new hybrid spatial clustering method for building an R-tree over GIS spatial data. Following the aggregation behavior of the clustering method, the R-tree was used to construct rules and special characteristics of spatial data. The HCR-tree is the R-tree built with the HCR algorithm. To test the efficiency of the HCR algorithm, it was applied not only to the data organization of a static R-tree but also to the node splitting of a dynamic R-tree. The results show that an R-tree built with HCR has advantages such as higher search efficiency and fewer disk accesses.
Funding: Supported by the National Natural Science Foundation of China (No. 30570125) and the Key Construction Laboratory of Marine Biotechnology of Jiangsu Province (No. 2010HS03).
Abstract: Chlorophyta species are common in the southern and northern coastal areas of China. In recent years, frequent green tide incidents in Chinese coastal waters have raised concerns and attracted the attention of scientists. In this paper, we sequenced the 18S rDNA genes, the internal transcribed spacer (ITS) regions and the rbcL genes in seven organisms and obtained 536-566 bp long ITS sequences, 1 377-1 407 bp long rbcL sequences and 1 718-1 761 bp long partial 18S rDNA sequences. The GC base pair content was highest in the ITS regions and lowest in the rbcL genes. The sequencing results showed that the three Ulva prolifera (or U. pertusa) gene sequences from Qingdao and Nan'ao Island were identical. The ITS, 18S rDNA and rbcL genes in U. prolifera and U. pertusa from different sea areas in China were unchanged by geographic distance. U. flexuosa had the least evolutionary distance from U. californica in both the ITS regions (0.009) and the 18S rDNA (0.002). These data verified that Ulva and Enteromorpha are not separate genera.
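To make the two quantities in this abstract concrete, the following minimal sketch computes GC content and an uncorrected pairwise distance (the proportion of differing aligned sites). The abstract does not state which distance model was used, so the p-distance here is an assumption chosen for simplicity, and the short sequences are made-up placeholders, not data from the study.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGTU"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are skipped. This is the
    uncorrected p-distance, assumed here for illustration only.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


# Placeholder sequences, not taken from the paper.
print(gc_content("ATGCGC"))
print(p_distance("ACGT-A", "ACGAAA"))
```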
Funding: Funded by the National 973 Program of China (No. 2003CB415205), the National Natural Science Foundation of China (No. 40523005, No. 60573183, No. 60373019), and the Open Research Fund Program of LIESMARS (No. WKL(04)0303).
Abstract: Spatial objects have two types of attributes, geometrical and non-geometrical, which belong to two different attribute domains (the geometrical and non-geometrical domains). Although geometrically scattered in the geometrical domain, spatial objects may be similar to each other in the non-geometrical domain. Most existing clustering algorithms group spatial datasets into compact regions in the geometrical domain without considering the non-geometrical domain. However, many application scenarios require clustering results in which a cluster has not only high proximity in the geometrical domain but also high similarity in the non-geometrical domain. This means that constraints are imposed on the clustering goal from both domains simultaneously. Such a clustering problem is called dual clustering. As distributed clustering applications become more and more popular, it is necessary to tackle the dual clustering problem in distributed databases. The DCAD algorithm is proposed to solve this problem. DCAD consists of two levels of clustering: local clustering and global clustering. First, clustering is conducted at each local site with a local clustering algorithm, and the features of local clusters are extracted. Second, local features from each site are sent to a central site, where global clustering is obtained based on those features. Experiments on both artificial and real spatial datasets show that DCAD is effective and efficient.
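The two-level flow described in this abstract (local clustering, feature extraction, global clustering at a central site) can be sketched as follows. This is not the DCAD algorithm itself: the abstract does not say which local algorithm, which features, or which merge rule DCAD uses. Here each local cluster is summarised by a hypothetical feature triple (centroid, mean of one numeric non-geometrical attribute, size), and the central site merges local clusters that are close in both domains; `max_dist` and `max_attr_gap` are illustrative thresholds.

```python
from math import dist


def local_features(points, attrs, labels):
    """Summarise each local cluster as (centroid, mean attribute, size)."""
    groups = {}
    for p, a, lab in zip(points, attrs, labels):
        groups.setdefault(lab, []).append((p, a))
    feats = []
    for members in groups.values():
        n = len(members)
        cx = sum(p[0] for p, _ in members) / n
        cy = sum(p[1] for p, _ in members) / n
        mean_attr = sum(a for _, a in members) / n
        feats.append(((cx, cy), mean_attr, n))
    return feats


def global_clustering(features, max_dist, max_attr_gap):
    """Merge local clusters that are close in BOTH domains (union-find)."""
    parent = list(range(len(features)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            (ci, ai, _), (cj, aj, _) = features[i], features[j]
            if dist(ci, cj) <= max_dist and abs(ai - aj) <= max_attr_gap:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(features))]


# Three hypothetical sites; labels stand in for the output of a local algorithm.
site_a = local_features([(0, 0), (1, 0)], [10, 12], [0, 0])
site_b = local_features([(1, 1), (2, 1), (50, 50)], [11, 11, 11], [0, 0, 1])
site_c = local_features([(1, 0)], [90], [0])
groups = global_clustering(site_a + site_b + site_c, max_dist=3, max_attr_gap=2)
print(groups)
```

Only the compact feature triples travel to the central site, which is the point of the two-level design: raw points never leave their local site.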
Funding: Funded by the National Natural Science Foundation of China (No. 40401003).
Abstract: Using remote sensing images from three periods (1980, 1995, 2000), and with the support of GIS and RS, the spatial information of landscape elements of Jilin Province from 1980 to 2000 was interpreted and extracted. Using models of landscape indices such as diversity, fragmentation, and mean patch fractal dimension, dynamic spatio-temporal changes in the landscape patterns of the province were analyzed. The results show that: ① cropland and forestland were the main landscape types, and forestland became the landscape matrix; ② in the study area, landscapes were distributed unevenly, with low heterogeneity, a simple ecosystem structure and a tendency toward irrational landscape patterns. Patches also had simple spatial shapes and strong self-similarity, and the dynamic change analysis showed that patch shapes tended to become simpler; ③ from 1980 to 2000, overall landscape fragmentation was low and changed only slightly. As far as landscape elements were concerned, the fragmentation of grassland, water area, and land for residential areas and factory facilities was relatively low; land for residential areas and factory facilities was dispersed; and cropland and forestland were the most concentrated, an indication that the trend will continue. The combined effects of human activity, local policy, regional climate and environmental change led to these results.
Funding: Supported by the National Natural Science Foundation of China (Nos. 41506020, 41476019, 41528601), the CAS Strategy Pioneering Program (No. XDA110020104), the Foundation for Innovative Research Groups of the National Natural Science Foundation of China (No. 41421005), the NSFC-Shandong Joint Fund for Marine Science Research Centers (No. U1406401), and the Global Change and Air-Sea Interaction (No. GASI-03-01-01-02).
Abstract: Based on historical observed data and modeling results, this paper investigated the seasonal variations in the Taiwan Warm Current Water (TWCW) using a cluster analysis method and examined the contributions of the Kuroshio onshore intrusion and the Taiwan Strait Warm Current (TSWC) to the TWCW on seasonal time scales. The TWCW shows obvious seasonal variation in its horizontal distribution, T-S characteristics and volume. The volume of the TWCW is at its maximum (13746 km³) in winter and its minimum (11397 km³) in autumn. As for the contributions to the TWCW, that of the TSWC is greatest in summer and smallest in winter, while the Kuroshio onshore intrusion northeast of Taiwan Island is strongest in winter and weakest in summer. By comparison, the Kuroshio onshore intrusion makes a greater contribution to the Taiwan Warm Current Surface Water (TWCSW) than the TSWC for most of the year, except in summer (from June to August), while the Kuroshio Subsurface Water (KSSW) dominates the Taiwan Warm Current Deep Water (TWCDW). The analysis results demonstrate that the local monsoon winds are the dominant factor controlling the seasonal variation in the TWCW volume via Ekman dynamics, while the surface heat flux can play a secondary role via the joint effect of baroclinicity and relief.
Funding: National Natural Science Foundation of China (No. 60572065, 60772080, 60532020).
Abstract: The Gap statistic is a well-known index of clustering validity, but its implementation is difficult to understand and its result is hard to determine accurately. A direct method is presented to improve the performance of the Gap statistic: it applies the second-order difference of the within-cluster dispersion in place of the constructed null reference distribution in the Gap statistic. As a result, the Gap statistic is reformulated and becomes easy to implement, and its uncertainty in applications is reduced. In addition, the limitations of the Gap statistic are analyzed through two typical examples, which show that the Gap statistic is difficult to apply to datasets that contain strongly overlapping or uneven-density clusters. Experiments verify the usefulness of the proposed method.
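The idea of replacing the null reference distribution with a second-order difference of the within-cluster dispersion can be sketched as below. The abstract does not give the exact formula, so the form used here, D(k) = W(k-1) - 2W(k) + W(k+1) with the chosen k maximising D(k), is an assumption; the dispersion W(k) is taken as the sum of squared Euclidean distances to cluster means, and the W values in the example are made up.

```python
def within_dispersion(points, labels):
    """Sum of squared Euclidean distances from each point to its cluster mean."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        mean = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - mean[d]) ** 2 for p in members for d in range(dim))
    return total


def second_order_difference(w):
    """Map k -> W(k-1) - 2*W(k) + W(k+1) wherever both neighbours exist.

    w maps a cluster count k to its within-cluster dispersion W(k).
    """
    return {k: w[k - 1] - 2 * w[k] + w[k + 1]
            for k in w if k - 1 in w and k + 1 in w}


def pick_k(w):
    """Cluster count with the largest second-order difference (sharpest elbow)."""
    diffs = second_order_difference(w)
    return max(diffs, key=diffs.get)


# Made-up dispersions: a sharp drop up to k = 3, then a plateau.
w = {1: 100.0, 2: 60.0, 3: 10.0, 4: 8.0, 5: 7.0}
print(second_order_difference(w))
print(pick_k(w))
```

Unlike the original Gap statistic, this needs no Monte Carlo reference datasets, only one clustering run per candidate k; it cannot, however, evaluate k = 1 or the largest candidate k, since both neighbours are required.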
Funding: The National Natural Science Foundation of China (21175041) and the National Basic Research Program (2012CB910602) provided financial support.
Abstract: This work presents a novel application of second-order calibration based on the self-weighted alternating trilinear decomposition (SWATLD) algorithm for analyzing HPLC-DAD data. The proposed method makes it possible to simultaneously determine teflubenzuron, hexaflumuron, flufenoxuron, chlorfluazuron, diflubenzuron and benzoylurea in different fruit samples, i.e. pear, apple and banana, in the selected time region of the chromatogram. The concentration, elution time and spectral information of these benzoylurea insecticides are selectively extracted from complex matrices even in the presence of unknown interferences. The root-mean-square error of prediction (RMSEP) and figures of merit, including sensitivity (SEN), selectivity (SEL) and limit of detection (LOD), are employed to assess the performance of the method. The LODs obtained for these insecticides are within the ranges 0.017–0.26 ppm in pears, 0.039–0.33 ppm in apples and 0.041–0.44 ppm in bananas. Such a chemometrics-based protocol holds great potential as a promising alternative for more practical applications in food safety and quality monitoring.
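Of the performance measures named in this abstract, RMSEP is the simplest to illustrate. The sketch below uses the common definition, the square root of the mean squared difference between predicted and actual concentrations over N prediction samples; the abstract does not state whether the authors divide by N or by N - 1, so the divisor is an assumption, and the concentrations are made-up values.

```python
import math


def rmsep(actual, predicted):
    """Root-mean-square error of prediction over paired concentration values.

    Divides by N (assumed convention); some authors use N - 1 instead.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(sq_err / len(actual))


# Made-up concentrations in ppm, not data from the paper.
actual = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
print(rmsep(actual, predicted))
```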