Purpose: This paper deals with the definition of data quality procedures for knowledge organizations such as Higher Education Institutions. The main purpose is to present the flexible approach developed for monitoring the data quality of the European Tertiary Education Register (ETER) database, illustrating how it works and highlighting the main challenges that still have to be faced in this domain.
Design/methodology/approach: The proposed data quality methodology is based on two kinds of checks, one to assess the consistency of cross-sectional data and the other to evaluate the stability of multiannual data. This methodology has an operational and empirical orientation: the proposed checks do not assume any theoretical distribution when determining the threshold parameters that identify potential outliers, inconsistencies, and errors in the data.
Findings: We show that the proposed cross-sectional and multiannual checks help to identify outliers and extreme observations and to detect ontological inconsistencies not described in the available metadata. For this reason, they may be a useful complement to the processing of the available information.
Research limitations: The coverage of the study is limited to European Higher Education Institutions. The cross-sectional and multiannual checks are not yet completely integrated.
Practical implications: Considering the quality of the available data and information is important for supporting data quality-aware empirical investigations, and for highlighting problems and areas in which to invest to improve the coverage and interoperability of data in future data collection initiatives.
Originality/value: The data-driven quality checks proposed in this paper may be useful as a reference for building and monitoring the data quality of new databases, or of existing databases available for other countries or systems characterized by high heterogeneity and complexity of the units of analysis, without relying on pre-specified theoretical distributions.
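The two kinds of checks described in the abstract can be illustrated with a minimal Python sketch. It is not the ETER implementation: the function names, the percentile band, the `max_rel_change` parameter, and the sample figures are all hypothetical, chosen only to show how thresholds can be taken from the data themselves rather than from an assumed theoretical distribution.

```python
from statistics import quantiles


def empirical_bounds(values, lower_pct=5, upper_pct=95):
    """Return (low, high) cut points read from the data's own percentiles."""
    # quantiles(..., n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = quantiles(values, n=100, method="inclusive")
    return cuts[lower_pct - 1], cuts[upper_pct - 1]


def cross_sectional_outliers(ratios, lower_pct=5, upper_pct=95):
    """Flag units whose ratio lies outside the empirical percentile band.

    `ratios` maps a unit identifier to one indicator for a single year
    (hypothetical example: students per academic staff member).
    """
    low, high = empirical_bounds(list(ratios.values()), lower_pct, upper_pct)
    return sorted(unit for unit, r in ratios.items() if r < low or r > high)


def multiannual_jumps(series, max_rel_change):
    """Flag years whose value moves by more than max_rel_change vs. the prior year.

    `series` maps year -> value for one unit. `max_rel_change` is an assumed
    parameter; it could itself be derived from the observed distribution of
    year-on-year changes across all units.
    """
    years = sorted(series)
    flagged = []
    for prev, curr in zip(years, years[1:]):
        base = series[prev]
        # Skip the comparison when the previous value is zero (no relative change defined).
        if base and abs(series[curr] - base) / abs(base) > max_rel_change:
            flagged.append(curr)
    return flagged


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    ratios = {"A": 10, "B": 11, "C": 12, "D": 13, "E": 100}
    print(cross_sectional_outliers(ratios))

    enrolment = {2011: 1000, 2012: 1020, 2013: 2500, 2014: 2550}
    print(multiannual_jumps(enrolment, max_rel_change=0.5))
```

Flagged units and years are candidates for manual review, not confirmed errors: a percentile band always marks the tails of the observed distribution, so the band width is a tuning choice.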
This is the second part of the Journal of Data and Information Science (JDIS) Special Issue on ISSI 2019, the 17th International Conference on Scientometrics and Informetrics (ISSI2019), held in Rome on 2-5 September 2019. It includes 10 additional selected posters presented during the conference and substantially expanded by the authors afterwards.
This volume (Vol. 5, No. 3) of the Journal of Data and Information Science (JDIS) is Part I of the Special Issue on ISSI 2019, the 17th International Conference on Scientometrics and Informetrics (ISSI2019), held in Rome on 2–5 September 2019. It includes the first part of the selected posters presented during the conference and extended by the authors afterward.
Funding: Support of the European Commission ETER Project (No. 934533-2017-AO8-CH) and the H2020 RISIS 2 project (No. 824091).