Traditional information retrieval systems respond to user queries with ranked lists of relevant documents. Because XML (Extensible Markup Language) documents separate content from structure, XML-IR (information retrieval) systems can retrieve only the relevant portions of documents. Users of an XML-IR system can therefore potentially receive highly relevant and precise material. We have developed an XML information retrieval system using MySQL and Sphinx, which we call MEXIR. In our system, XML documents are stored in a single table with a fixed relational schema. The schema is independent of the logical structure of the XML documents. Each node in an XML document is represented by a label that expresses its position in the XML tree, namely the ADXPI scheme. In performance experiments on INEX collections, our system showed an average improvement of up to four seconds over GPX. In addition, it reduced the size of the data by 82.29% compared to the GPX system.
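The single-table storage idea described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the MEXIR system itself: SQLite stands in for MySQL, a plain `LIKE` match stands in for Sphinx full-text search, the table and column names (`node`, `doc_id`, `label`, `tag`, `content`) are hypothetical, and the label is an ordinary absolute path with positional indexes rather than the ADXPI scheme, whose details are not given here.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(doc_id, xml_text):
    """Yield one (doc_id, label, tag, text) row per element node.

    The label is a plain absolute path with positional indexes.
    It is an illustrative stand-in, NOT the ADXPI labeling scheme.
    """
    root = ET.fromstring(xml_text)

    def walk(node, label):
        yield (doc_id, label, node.tag, (node.text or "").strip())
        counts = {}  # per-tag sibling counter for positional indexes
        for child in node:
            counts[child.tag] = counts.get(child.tag, 0) + 1
            yield from walk(child, f"{label}/{child.tag}[{counts[child.tag]}]")

    yield from walk(root, f"/{root.tag}[1]")


def build_store(docs):
    """Load every document into one table whose schema never changes."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.execute(
        "CREATE TABLE node (doc_id INTEGER, label TEXT, tag TEXT, content TEXT)"
    )
    for doc_id, xml_text in docs.items():
        conn.executemany(
            "INSERT INTO node VALUES (?, ?, ?, ?)", shred(doc_id, xml_text)
        )
    conn.commit()
    return conn


def search(conn, term):
    """Return (doc_id, label) for elements whose own text contains term.

    LIKE is a stand-in for a real full-text engine such as Sphinx.
    """
    cur = conn.execute(
        "SELECT doc_id, label FROM node WHERE content LIKE ? "
        "ORDER BY doc_id, label",
        (f"%{term}%",),
    )
    return cur.fetchall()


SAMPLE = {
    1: "<article><title>XML retrieval</title>"
       "<sec><p>Ranked elements</p><p>Sphinx index</p></sec></article>",
}

if __name__ == "__main__":
    store = build_store(SAMPLE)
    print(search(store, "Sphinx"))
```

Because the table schema does not depend on the element names or nesting of any particular document, documents with different logical structures can share the same table, and a query returns element-level labels rather than whole documents.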