Digitization projects should focus on quantity rather than quality. Increasing quantities of information produce qualitatively more valuable services. Online writing and searching are now common, and it is only online reading that still limits our use of online books. New interfaces might increase our willingness to read online, which should be encouraged rather than fought, since it represents an increase in both the amount of information available and the participation of more people in the writing and exchange of information.
Google’s announcement that it intended to digitize all the books in several major research libraries was met with mixed reactions. John Wilkin at the University of Michigan declared “This is the day the world changes,” while Rory Litwin said in Library Juice that the move would “commercialize the great research libraries with a handshake, suddenly and epochally.” The four directors of the Universal Library and Million Book Project have received many questions about the comparative aspects of our work and Google Print. My purpose is to compare the two, talking about their genesis, the realities of collections and logistics, and the worries that arise from these realities.
In the last two decades of the 20th century, there was increasing interest in and emphasis on the study of Hong Kong literature among both academics and the general public in Hong Kong. Recognizing the emerging need for resources on Hong Kong literature, the University Library System of the Chinese University of Hong Kong set up the Hong Kong Literature Database (the “Database”) in 2000, the first Chinese literature database on the Internet. The paper examines how the Database is constructed using XML technology and a metadata schema. The Database also employs Unicode UTF-8 as its internal encoding. A mapping table for traditional and simplified Chinese characters was created based on Unihan and is used behind the scenes, so that a user can input either traditional or simplified Chinese characters and retrieval returns results in both. Currently, 65% of the journals have been processed with OCR so that full-text searching is possible. The Chinese OCR technology is examined in greater detail. Special features of the Database are described, such as the page-by-page browse mode, position highlighting for full-page newspapers, and links to tables of contents and book jackets from the Library catalogue. The paper also raises the problem of massive downloading and compares state-of-the-art technologies and their shortcomings. It shows how the Hong Kong Literature Database facilitates future collaboration and data exchange by using open standards, a shareable structure and the latest technology.
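The script-independent retrieval described in this abstract can be illustrated with a minimal sketch. It is not the Database's actual implementation: the table `TRAD_TO_SIMP` and the functions `normalize` and `search` are hypothetical names, and the handful of character pairs are hand-picked for illustration rather than generated from Unihan. The sketch assumes that folding both the query and the indexed text to simplified characters is an acceptable simplification.

```python
# Hypothetical excerpt of a traditional -> simplified mapping table.
# A real table would be derived from the Unihan database; these pairs
# are hand-picked for illustration only.
TRAD_TO_SIMP = {
    "學": "学",
    "國": "国",
    "書": "书",
    "館": "馆",
    "灣": "湾",
}


def normalize(text: str) -> str:
    """Fold traditional characters to simplified ones; leave others as-is."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)


def search(query: str, documents: list[str]) -> list[str]:
    """Return documents containing the query, ignoring script differences."""
    key = normalize(query)
    return [doc for doc in documents if key in normalize(doc)]


documents = ["香港文學", "香港文学", "中國圖書館"]

# Both spellings of the query retrieve both spellings of the title.
print(search("文学", documents))
print(search("文學", documents))
```

Normalizing both sides to one script keeps the lookup to a single dictionary pass per character. A production table would also need to handle characters with more than one variant, which this sketch ignores.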
This paper starts with a description of the present status of the Digital Library of India Initiative. As part of this initiative, a large corpus of scanned text is available in many Indian languages and has stimulated a vast amount of research in Indian language technology, which is briefly described in this paper. Besides the Digital Library of India Initiative, which is part of the Million Books to the Web Project initiated by Prof Raj Reddy of Carnegie Mellon University, there are a few more initiatives in India aimed at taking the heritage of the country to the Web. This paper presents the future directions for the Digital Library of India Initiative, both in terms of growing the collection and the technical challenges that managing such a large collection poses.