A method that combines category-based and keyword-based concepts to improve information retrieval is introduced. To improve document clustering, a document similarity measure is proposed that is based on the cosine vector measure and keyword frequency in documents and that also takes an ontology as input. The ontology is domain specific and includes a list of keywords organized by degree of importance to the categories of the ontology. By means of this semantic knowledge, the ontology can improve the document similarity measure and the feedback of information retrieval systems. Two approaches to evaluating the performance of this similarity measure, and a comparison with the standard cosine vector similarity measure, are also described.
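The idea of a cosine measure over keyword frequencies that is adjusted by ontology importance can be illustrated with a minimal sketch. This is not the authors' measure: the function name `weighted_cosine`, the `weights` dictionary standing in for the ontology's keyword importance, and the default weight of 1.0 are all assumptions made for illustration.

```python
import math
from collections import Counter


def weighted_cosine(doc_a, doc_b, weights, default=1.0):
    """Cosine similarity of two token lists, with each term frequency
    scaled by a per-keyword weight (hypothetical stand-in for the
    ontology's degree of importance)."""
    tf_a, tf_b = Counter(doc_a), Counter(doc_b)

    def w(term):
        # Terms missing from the weight table fall back to the default.
        return weights.get(term, default)

    shared = tf_a.keys() & tf_b.keys()
    dot = sum((w(t) * tf_a[t]) * (w(t) * tf_b[t]) for t in shared)
    norm_a = math.sqrt(sum((w(t) * c) ** 2 for t, c in tf_a.items()))
    norm_b = math.sqrt(sum((w(t) * c) ** 2 for t, c in tf_b.items()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


doc_a = ["ontology", "cluster", "the"]
doc_b = ["ontology", "the", "the"]
plain = weighted_cosine(doc_a, doc_b, {})
boosted = weighted_cosine(doc_a, doc_b, {"ontology": 3.0, "the": 0.1})
```

With an empty weight table the function reduces to the standard cosine measure, so `plain` and `boosted` can be compared directly to see how emphasizing a domain keyword changes the score.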
A method for multi-target locating and tracking with multiple sensors in a field artillery system is studied. A general modeling structure of the system is established. Based on the concepts of cluster and closed ball, an algorithm is put forward for multi-sensor multi-target data fusion, and an optimal solution for state estimation is presented. The simulation results show that the algorithm works well for locating multiple stationary targets and tracking multiple moving targets in a sparse target environment. Therefore, this method can be directly applied to the field artillery C3I system.
This study was undertaken to construct a preliminary spatial analysis method for building an urban-suburban-rural classification in a sample area of central California and for providing the distribution characteristics of each category. On this basis, further studies, such as regional patterns of residential wood burning emissions (PM2.5, the term for a mixture of solid particles and liquid droplets found in the air, refers to particulate matter that is 2.5 μm or smaller in size), could be carried out for the residential wood combustion project. Demographic and infrastructure data with spatial characteristics were processed by integrating a Geographic Information System (GIS) with a statistical method (cluster analysis), and the result was output as a category map. The method gives a quantitative, multi-variable description of the major variations in characteristics among urban, suburban and rural areas, and extends TIGER's urban-rural classification scheme by adding a suburban category. Based on free public GIS data, the spatial analysis method provides an easy tool for geographic researchers, environmental planners, urban/regional planners and administrators to delineate categories of regional function at specific locations and to extract the spatial distribution information they want. Furthermore, it allows some parameters to be adjusted when the method is implemented in different regions or with various eco-social models.
Spatial objects have two types of attributes: geometrical attributes and non-geometrical attributes, which belong to two different attribute domains (geometrical and non-geometrical domains). Although geometrically scattered in a geometrical domain, spatial objects may be similar to each other in a non-geometrical domain. Most existing clustering algorithms group spatial datasets into different compact regions in a geometrical domain without considering the non-geometrical domain. However, many application scenarios require clustering results in which a cluster has not only high proximity in a geometrical domain, but also high similarity in a non-geometrical domain. This means constraints are imposed on the clustering goal from both geometrical and non-geometrical domains simultaneously. Such a clustering problem is called dual clustering. As distributed clustering applications become more and more popular, it is necessary to tackle the dual clustering problem in distributed databases. The DCAD algorithm is proposed to solve this problem. DCAD consists of two levels of clustering: local clustering and global clustering. First, clustering is conducted at each local site with a local clustering algorithm, and the features of local clusters are extracted. Second, local features from each site are sent to a central site where global clustering is obtained based on those features. Experiments on both artificial and real spatial datasets show that DCAD is effective and efficient.
Many existing product family design methods assume a given platform. However, selecting the platform and unique variables within a product family is not an intuitive task. Meanwhile, most approaches are single-platform methods, in which design variables are either shared across all product variants or not shared at all. In multiple-platform design, by contrast, platform variables can have special values with regard to a subset of product variants within the product family, which offers opportunities for a superior overall design. An information theoretical approach incorporating fuzzy clustering and Shannon's entropy was proposed for platform variable selection in a multiple-platform product family. A 2-level chromosome genetic algorithm (2LCGA) was proposed and developed for optimizing the corresponding product family in a single stage, simultaneously determining the optimal settings for the product platform and the unique variables. The single-stage approach can yield improvements in the overall performance of the product family compared with two-stage approaches, in which the first stage determines the best settings for the platform and the second stage finds the values of the unique variables for each product. An example of the design of a family of universal motors was used to verify the proposed method.
A new application of cluster states is investigated for quantum information splitting (QIS) of an arbitrary three-qubit state. In our scheme, a four-qubit cluster state and a Bell state are shared by a sender (Alice), a controller (Charlie), and a receiver (Bob). Both the sender and the controller only need to perform Bell-state measurements (BSMs). The receiver can reconstruct the arbitrary three-qubit state by performing appropriate unitary transformations on his qubits after he learns the measurement results of both the sender and the controller. This QIS scheme is deterministic.
This paper introduces some definitions and defines a set of calculation indexes to facilitate the research, and then presents an algorithm for comparing spatial clustering results across different clustering themes. The research shows that, with traditional spatial clustering as the first step, valuable spatial correlation patterns can be found by comparing clustering results across multiple themes. Those patterns can tell us what relations the themes have, and thus help us gain a deeper understanding of the studied spatial entities. An example is also given to demonstrate the principle and process of the method.
According to the structural characteristics of a foreign fiber detection system, a mathematical model of the foreign fiber flow flux and a fiber detection system were designed. An information fusion clustering structure for the foreign fiber flow flux was put forward. The data from the pressure difference, pressure, temperature, and density sensors, all of which affect the flux, were integrated and output by the Adaptive Resonance Theory-2 (ART-2) network and the BP network for clustering analysis of the output space. The clustering control strategy keeps the output flow pressure stable when the output pressure and temperature change.
Based on previous evaluation methods, a new method that combines GIS with a fuzzy clustering algorithm is proposed and applied to evaluating the engineering geological environment of the research area of Xuzhou City. By analyzing the characteristics and formation of the engineering geological environment, the major problems are discussed, including the stability of the basement rock, sandy soil liquefaction and the cultural stratum. According to the factors affecting these problems, the stability of every engineering geological problem in the study area is classified into different classes. Then, the fuzzy clustering method is used to assess all conditions of the engineering geological environment. Finally, the evaluation is carried out over the whole study area. The calculated results show that the method is feasible.
This paper studies the application of the cluster-based approach to enhancing the competitiveness of Thailand's SME industry. The author employed a qualitative method based on in-depth interviews. The results showed that the Ratchaburi orchid cluster in Thailand has adopted the cluster-based approach because its members realized that it was useful and could enable them to produce good quality orchids for the international market. The findings also showed how individuals have worked together and helped each other in order to build a good horizontal support network and create competitive advantages. In addition, the research relates to knowledge management, because knowledge management refers to a method of development that requires cluster members to exchange information, interact with each other, share and distribute information, create closer business relationships and build mutual benefit. Therefore, the cluster members will help each other to create a culture that values learning, through making a commitment and sharing information to strengthen the cluster.
Optimal clustering of web documents is known to be a complicated combinatorial optimization problem, and it is hard to develop a generally applicable optimal algorithm. An accelerated simulated annealing algorithm is developed for automatic web document classification. The web document classification problem is addressed as the problem of best describing a match between a web query and a hypothesized web object. The normalized term frequency and inverse document frequency coefficient is used as a measure of the match. Test beds are generated on-line during the search by transforming model web sites. As a result, web sites can be clustered optimally in terms of the keyword vectors of the corresponding web documents.
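A match measure built from normalized term frequency and inverse document frequency can be sketched as follows. The abstract does not give the exact coefficient, so the normalization (term count divided by the largest count in the document), the natural-log IDF, and the name `tfidf_scores` are assumptions for illustration. The simulated annealing search itself is not shown.

```python
import math
from collections import Counter


def tfidf_scores(query, docs):
    """Score each tokenized document against a tokenized query using
    max-normalized term frequency times log inverse document frequency."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        max_tf = max(tf.values()) if tf else 1
        score = 0.0
        for term in set(query):
            if term in tf:
                score += (tf[term] / max_tf) * math.log(n_docs / doc_freq[term])
        scores.append(score)
    return scores


docs = [
    ["web", "cluster", "cluster"],
    ["web", "query"],
    ["web", "news"],
]
cluster_scores = tfidf_scores(["cluster"], docs)
web_scores = tfidf_scores(["web"], docs)
```

In this toy corpus the term "web" appears in every document, so its inverse document frequency is zero and it contributes nothing to the match, while "cluster" distinguishes the first document from the others.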
A characteristic of spatial data operations in a geographic information system (GIS) is that queries are much more frequent than insertions and deletions. A new hybrid spatial clustering method for building an R-tree for GIS spatial data is proposed in this paper. According to the aggregation of the clustering method, the R-tree was used to construct rules and specialty of spatial data. The HCR-tree is the R-tree built with the HCR algorithm. To test the efficiency of the HCR algorithm, it was applied not only to the data organization of a static R-tree but also to the node splitting of a dynamic R-tree. The results show that an R-tree built with HCR has advantages such as higher search efficiency and fewer disk accesses.
Social media is an information technology that allows users to communicate and share information. With a steadily rising number of users across the globe, individuals participate in various activities, such as connecting with friends and community members, sharing information, posting political messages, supporting disaster recovery, reading daily news, learning, and entertainment, each driven by a different set of motivations. Culture influences the kind of motivation (pro-social or oriented toward personal needs) that drives individuals' social media usage. Using Hofstede's (1984b) cultural dimensions, the paper suggests that each dimension influences social media behaviors differently. Such a cultural perspective helps future social media users plan their activities and information sharing according to the motivation driving the target audience, and helps platform providers design and market accordingly.
To overcome the memory capacity and CPU processing speed bottlenecks that the traditional K-Medoids clustering algorithm faces when dealing with massive data, the paper proposes a parallel K-Medoids algorithm based on the MapReduce programming model. The algorithm increases the computation granularity and reduces the communication cost ratio under the MapReduce model. The experimental results show that, compared with other algorithms, the improved parallel algorithm achieves much better speedup and operating efficiency.
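How a K-Medoids iteration splits into a map phase and a reduce phase can be shown with a single-process sketch. It does not reproduce the paper's algorithm and uses no Hadoop API: the function names `map_assign`, `reduce_update` and `kmedoids`, the Manhattan distance, and the stopping rule are assumptions chosen for illustration.

```python
from collections import defaultdict


def dist(a, b):
    """Manhattan distance between two points given as tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))


def map_assign(points, medoids):
    """Map phase: emit (index of nearest medoid, point) for every point."""
    for p in points:
        nearest = min(range(len(medoids)), key=lambda i: dist(p, medoids[i]))
        yield nearest, p


def reduce_update(members):
    """Reduce phase: pick the member with the lowest total distance
    to all other members of its cluster as the new medoid."""
    return min(members, key=lambda c: sum(dist(c, m) for m in members))


def kmedoids(points, medoids, max_iter=20):
    """Alternate map and reduce phases until the medoids stop changing."""
    medoids = list(medoids)
    for _ in range(max_iter):
        groups = defaultdict(list)
        for idx, p in map_assign(points, medoids):
            groups[idx].append(p)
        # An empty cluster keeps its previous medoid.
        updated = [
            reduce_update(groups[i]) if groups[i] else medoids[i]
            for i in range(len(medoids))
        ]
        if updated == medoids:
            break
        medoids = updated
    return medoids


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
result = kmedoids(points, [(0, 1), (10, 11)])
```

In a real MapReduce job the map phase would run on data partitions in parallel and the grouping by medoid index would be done by the shuffle step; here both are simulated in one process to keep the example self-contained.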
In this paper, we study a technique based on probability theory and matrix transformation for managing data for processing and analysis. Clustering analysis has a long research history, and over the decades its importance and its cross-disciplinary links with other research directions have been widely recognized. Probability theory and linear algebra serve as powerful tools for analyzing and mining data. The experimental results illustrate the effectiveness of the technique. In the near future, we plan to conduct more theoretical analysis on the topic.
A virtual community for foreign language learning is a comprehensive learning environment that integrates foreign language vocabulary database construction and vocabulary retrieval with virtual reality technology to construct the language environment. Such a community can improve the sense of language authenticity in foreign language learning and improve the quality of foreign language teaching. A method of building a virtual community for foreign language learning is proposed based on data mining technology. A data acquisition and feature preprocessing model for building the semantic vocabulary of foreign language learning is constructed, the linguistic environment characteristics of the semantic vocabulary data are analyzed, and the semantic noumenon structure model is obtained. A fuzzy clustering method is used for vocabulary clustering and comprehensive retrieval in the virtual community, which improves the performance of vocabulary classification. An adaptive semantic information fusion method is used to realize vocabulary data mining in the virtual community, and information retrieval and access scheduling for the virtual community are realized based on the data mining results. The simulation results show that the accuracy of foreign language vocabulary retrieval is good and that the efficiency of foreign language learning is improved.
Funding for the ontology-based document similarity study: the Young Teachers Scientific Research Foundation (YTSRF) of Nanjing University of Science and Technology, 2005-2006.
Funding for the DCAD dual clustering study: the National 973 Program of China (No. 2003CB415205), the National Natural Science Foundation of China (No. 40523005, No. 60573183, No. 60373019), and the Open Research Fund Program of LIESMARS (No. WKL(04)0303).
Funding for the product family design study: the National Natural Science Foundation of China (No. 70471022) and the Joint Research Scheme of the National Natural Science Foundation of China and the Hong Kong Research Grant Council (No. 70418013).
Funding for the quantum information splitting study: the National Natural Science Foundation of China under Grant No. 60807014, the Natural Science Foundation of Jiangxi Province of China under Grant No. 2009GZW0005, the Research Foundation of the State Key Laboratory of Advanced Optical Communication Systems and Networks, and the Research Foundation of the Education Department of Jiangxi Province under Grant No. GJJ09153.
Funding for the foreign fiber detection study: the National Program on Key Basic Research Project of China (973 Program) (No. 2010CB334711).