With this work, we introduce a novel method for the unsupervised learning of conceptual hierarchies, sometimes called concept maps, aimed specifically at literary texts. This distinguishes it from the majority of the research literature on the topic, which focuses primarily on building ontologies from a wide array of structured and unstructured data sources to support various forms of AI, in particular the Semantic Web as envisioned by Tim Berners-Lee. We first elaborate on the mutually informing disciplines of philosophy and computer science, or more specifically the relationship between metaphysics, epistemology, ontology, computing and AI. This is followed by a technically in-depth discussion of DEBRA, our dependency tree based concept hierarchy constructor, which, as its name suggests, constructs a conceptual map in the form of a directed graph. The graph illustrates the concepts, their respective relations, and the implied ontological structure of the concepts as encoded in the text, decoded with standard Python NLP libraries such as spaCy and NLTK.
With this work we hope both to augment the Knowledge Representation literature, offering opportunities for intellectual advancement in AI through more intuitive, less analytical, and well-known forms of knowledge representation from the cognitive science community, and to open up new areas of research between Computer Science and the Humanities by applying the latest NLP tools and techniques to literature of cultural significance. In doing so we shed light on existing methods of computation over documents in semantic space, which allow for, at the very least, the comparison of texts and the study of their evolution through time using vector space math.
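As a rough illustration of how dependency relations can be turned into a directed concept graph, the sketch below links noun concepts through the verbs and modifiers that govern them. It is not DEBRA's actual algorithm, which the abstract does not specify. The sample parse, the `build_concept_graph` helper, and the `has_property` edge label are all hypothetical, and the parse is written by hand so the example runs without a parser; in practice the same fields could be read from a parser such as spaCy.

```python
from collections import defaultdict

# Hypothetical, hand-written dependency parse of
# "The poet praises the eternal soul."
# Each tuple is (token, part_of_speech, dependency_label, head_token).
# With spaCy these fields would correspond to
# token.text, token.pos_, token.dep_ and token.head.text.
PARSE = [
    ("The", "DET", "det", "poet"),
    ("poet", "NOUN", "nsubj", "praises"),
    ("praises", "VERB", "ROOT", "praises"),
    ("the", "DET", "det", "soul"),
    ("eternal", "ADJ", "amod", "soul"),
    ("soul", "NOUN", "dobj", "praises"),
]


def build_concept_graph(parse):
    """Return {source: [(relation, target), ...]} for noun concepts.

    Assumed rules (illustrative only):
      * subject -[verb]-> object for each verb with both a noun subject
        and a noun direct object
      * noun -[has_property]-> adjective for adjectival modifiers
    """
    graph = defaultdict(list)
    subjects = defaultdict(list)  # verb -> noun subjects
    objects = defaultdict(list)   # verb -> noun direct objects

    for text, pos, dep, head in parse:
        if pos == "NOUN" and dep == "nsubj":
            subjects[head].append(text)
        elif pos == "NOUN" and dep == "dobj":
            objects[head].append(text)
        elif pos == "ADJ" and dep == "amod":
            graph[head].append(("has_property", text))

    # Connect every subject of a verb to every object of the same verb.
    for verb, subs in subjects.items():
        for subject in subs:
            for obj in objects.get(verb, []):
                graph[subject].append((verb, obj))

    return dict(graph)


concept_graph = build_concept_graph(PARSE)
for source, edges in concept_graph.items():
    for relation, target in edges:
        print(f"{source} -[{relation}]-> {target}")
```

The adjacency-list dictionary keeps the sketch dependency-free; a graph library could replace it if traversal or visualization were needed.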
An important issue in Knowledge Discovery in Databases (KDD) is to allow the discovered knowledge to be as close as possible to natural language, satisfying user needs with tractability on the one hand and offering KDD systems robustness on the other. At this juncture, this paper describes a new concept of linguistic atoms with three digital characteristics: expected value Ex, entropy En, and deviation D. The mathematical description effectively integrates the fuzziness and randomness of linguistic terms in a unified way. Based on this model, a method of knowledge representation in KDD is developed which bridges the gap between quantitative knowledge and qualitative knowledge. Mapping between quantitative and qualitative representations becomes much easier and interchangeable. In order to discover generalized knowledge from a database, one may use virtual linguistic terms and cloud transforms for the auto-generation of concept hierarchies for attributes. Predictive data mining with the cloud model is given for implementation. This further illustrates the advantages of this linguistic model in KDD.
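To make the three characteristics concrete, the following is a minimal sketch of a forward normal cloud generator, the form in which cloud models are commonly described. The abstract does not give the generation formula, so this assumes that D acts as the standard deviation of the entropy En, and that each "cloud drop" pairs a sampled value with a Gaussian-shaped membership degree. The function name `normal_cloud_drops` and the example parameters are hypothetical.

```python
import math
import random


def normal_cloud_drops(ex, en, d, n, seed=0):
    """Generate n cloud drops (x, membership) for a linguistic term.

    Assumptions (not stated in the abstract):
      * ex is the expected value of the term
      * en is its entropy, controlling the spread of x around ex
      * d is the standard deviation of en, adding randomness to the spread
    """
    rng = random.Random(seed)
    drops = []
    for _ in range(n):
        # Randomness: perturb the entropy for this drop.
        en_i = rng.gauss(en, d)
        # Sample a quantitative value around the expected value.
        x = rng.gauss(ex, abs(en_i))
        # Fuzziness: Gaussian-shaped membership of x in the term.
        if en_i == 0:
            mu = 1.0
        else:
            mu = math.exp(-((x - ex) ** 2) / (2 * en_i ** 2))
        drops.append((x, mu))
    return drops


# Hypothetical term "about 25 years old": Ex = 25, En = 3, D = 0.3.
drops = normal_cloud_drops(ex=25.0, en=3.0, d=0.3, n=2000)
mean_x = sum(x for x, _ in drops) / len(drops)
print(f"mean of sampled values: {mean_x:.2f}")
```

Under these assumptions a single qualitative term maps to a scatter of quantitative values with membership degrees, which is one way to picture the mapping between qualitative and quantitative knowledge described above.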