Abstract: In this article, we propose a novel probabilistic framework to improve the accuracy of a weighted majority voting algorithm. To assign higher weights to classifiers that correctly classify hard-to-classify instances, we introduce the item response theory (IRT) framework to evaluate the samples' difficulty and the classifiers' ability simultaneously. We assign weights to classifiers based on their abilities. Three models are created with different assumptions suitable for different cases. When making an inference, we keep a balance between accuracy and complexity. In our experiment, all the base models are constructed by single trees via bootstrap. To explain the models, we illustrate how the IRT ensemble model constructs the classifying boundary. We also compare their performance with other widely used methods and show that our model performs well on 19 datasets.
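The weighting idea in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes that classifier abilities have already been estimated (the values below are invented), uses the standard one-parameter logistic (Rasch) response function, and, purely as an illustrative assumption, converts abilities into voting weights with a softmax. All function names are hypothetical.

```python
import math
from collections import defaultdict


def rasch_probability(ability, difficulty):
    """P(correct) under the one-parameter logistic (Rasch) IRT model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def ability_weights(abilities):
    """Map abilities to positive weights summing to 1 (softmax; an assumption)."""
    top = max(abilities)  # subtract the maximum for numerical stability
    exps = [math.exp(a - top) for a in abilities]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting classifiers carry the most weight."""
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


# Hypothetical abilities for three classifiers and their votes on one sample.
abilities = [2.0, 0.1, 0.0]
predictions = ["A", "B", "B"]

weights = ability_weights(abilities)
winner = weighted_majority_vote(predictions, weights)
print(winner)
```

In this toy case an unweighted majority would pick "B", while the high-ability classifier carries enough weight for "A" to win, which is the behavior the abstract describes for classifiers that handle difficult instances well.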
Abstract: This paper studies techniques for reducing item exposure by using automatic item generation methods. The known test item calibration method estimates item parameters from statistical data collected during examinees' prior testing. The disadvantage of this calibration method is item exposure, which occurs when test items become familiar to the examinees. To reduce item exposure, an automatic item generation method is used in which item models are constructed from already calibrated test items without losing the already estimated item parameters. A technique for extracting item models from already calibrated, and therefore exposed, test items is described; test item development specialists can use it to integrate automatic item generation principles with existing testing applications.
Abstract: AIM: To examine the links between quality of sleep and the severity of intestinal symptoms in irritable bowel syndrome (IBS). METHODS: One hundred and forty-two outpatients (110 female, 32 male) who met the Rome III criteria for IBS with no psychiatric comorbidity were consecutively enrolled in this study. Data on age, body mass index (BMI), and a set of life-habit variables were recorded, and IBS symptoms and sleep quality were evaluated using the IBS Symptom Severity Score (IBS-SSS) and Pittsburgh Sleep Quality Index (PSQI) questionnaires. The association between severity of IBS and sleep disturbances was evaluated by comparing the global IBS-SSS and PSQI scores (Pearson's correlation and Fisher's exact test) and then analyzing the individual items of the IBS-SSS and PSQI questionnaires with a unitary bowel-sleep model based on item response theory (IRT). RESULTS: IBS-SSS ranged from mild to severe (120-470). The global PSQI score ranged from 1 to 17 (median 5), and 60 patients were found to be poor sleepers (PSQI > 5). The correlation between the global IBS-SSS and PSQI scores indicated a weak association (r = 0.2, 95% CI: -0.03 to 0.35, P < 0.05), which became stronger using our unitary model. Indeed, the severities of IBS and sleep disturbances, estimated as latent variables, showed a significantly high intra-subject correlation (posterior mean of r = 0.45, 95% CI: 0.17 to 0.70, P < 0.05). Moreover, the correlations between patient features (age, sex, BMI, daily coffee and alcohol intake) and IBS and sleep disturbances were also analyzed through our unitary model. Age was a significant regressor, with patients ≤ 50 years old showing more severe bowel disturbances (posterior mean = -0.38, P < 0.05) and less severe sleep disturbances (posterior mean = 0.49, P < 0.05) than older patients. Higher daily coffee intake was correlated with a lower severity of bowel disturbances (posterior mean = -0.31, P < 0.05). Sex (female) and daily alcohol intake (modest) were correlated with less severe sleep disturbances. CONCLUSION: The unitary bowel-sleep model based on IRT revealed a strong positive correlation between the severity of IBS symptoms and sleep disturbances.
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2021YFF0901003, the National Natural Science Foundation of China under Grant Nos. U20A20229, 61922073, and 62106244, and the Natural Science Foundation of Anhui Province of China under Grant No. 2108085QF272.
Abstract: Cognitive diagnosis is an important issue in intelligent education systems; it aims to estimate students' proficiency on specific knowledge concepts. Most existing studies rely on the assumption of static student states and ignore the dynamics of proficiency in the learning process, which makes them unsuitable for online learning scenarios. In this paper, we propose a unified temporal item response theory (UTIRT) framework that incorporates the temporality and randomness of proficiency evolution to obtain both accurate and interpretable diagnosis results. Specifically, we hypothesize that students' proficiency varies as a Wiener process and describe a probabilistic graphical model in UTIRT to account for temporality and randomness factors. Furthermore, based on the relationship between student states and exercise answers, we hypothesize that the answering result at time k contributes most to inferring a student's proficiency at time k, which also reflects the temporality aspect and enables an analytical maximization step (M-step) in the expectation maximization (EM) algorithm when estimating model parameters. UTIRT is a framework containing unified training and inference methods, and it is general enough to cover several typical traditional models such as item response theory (IRT), multidimensional IRT (MIRT), and temporal IRT (TIRT). Extensive experimental results on real-world datasets show the effectiveness of UTIRT and demonstrate its superiority over TIRT in leveraging temporality, both theoretically and practically.
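The Wiener-process hypothesis in this abstract can be pictured with a small simulation. This is only a sketch under stated assumptions, not the UTIRT model or its EM estimation: proficiency is taken to be one-dimensional, increments over a time gap dt are drawn as Gaussian with variance sigma² · dt, and responses follow the standard one-parameter logistic IRT function. The function names, parameter values, and time gaps are hypothetical.

```python
import math
import random


def simulate_proficiency(theta0, sigma, time_gaps, rng):
    """Sample a proficiency path with independent Gaussian increments.

    Each increment has mean 0 and variance sigma**2 * dt, which is the
    defining property of a (scaled) Wiener process observed at discrete times.
    """
    path = [theta0]
    for dt in time_gaps:
        step = rng.gauss(0.0, sigma * math.sqrt(dt))
        path.append(path[-1] + step)
    return path


def correct_probability(theta, difficulty):
    """P(correct answer) under the one-parameter logistic IRT model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
path = simulate_proficiency(theta0=0.0, sigma=0.5, time_gaps=[1.0, 1.0, 2.0], rng=rng)
probs = [correct_probability(theta, difficulty=0.3) for theta in path]

for k, (theta, p) in enumerate(zip(path, probs)):
    print(f"time step {k}: proficiency={theta:.3f}, P(correct)={p:.3f}")
```

Setting sigma to 0 recovers the static-proficiency assumption that the abstract attributes to most existing studies.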
Funding: Supported by the Major State Basic Research Development Program of China (973 Program, No. 2005CB523500) and the Key Project of the National 11th Five Year Research Program of China (No. 2006BAI04A12).
Abstract: Objective: To evaluate a scale of patient-reported outcomes for the assessment of myasthenia gravis patients (MG-PRO) in China. Methods: A total of 100 MG patients were interviewed for the field testing. Another 56 MG patients were selected and assessed with the MG-PRO scale before treatment and at 1, 2, and 4 weeks after treatment. Classical test theory and item response theory (IRT) were used to assess the psychometric characteristics of the MG-PRO scale. Results: The MG-PRO scale included 4 dimensions: physical, psychological, social environment, and treatment. Confirmatory factor analysis showed that each dimension was consistent with the theoretical construct. The scores of the physical and psychological dimensions increased significantly at 1 week after treatment (P < 0.05). All the dimension scores and the MG-PRO score increased significantly at 2 and 4 weeks after treatment (P < 0.05). IRT showed that person separation indices were greater than 0.8, most of the item fit residual statistics were within ±2.5, and no item had uniform or non-uniform differential item functioning (DIF) between gender and age (<40, ≥40) groups. Conclusions: The MG-PRO scale is valid for measuring the quality of life (QOL) of MG patients, with good reliability, validity, responsiveness, and good psychometric characteristics from IRT. It can be applied to evaluate the QOL of MG patients and to assess treatment effects in clinical trials.
Funding: Supported by the National Research Agency, France, within the framework of the CLIMATRisk project (ANR-15-CE03-0002-01).
Abstract: The physical vulnerability of coastal areas to rising sea level and the consequent flooding risk does not guarantee that the inhabitants of these risk zones will implement protective behaviors. This study aims to establish the link between the willingness to carry out protective behaviors and physical and perceived indicators of vulnerability. A typology of coastal flooding vulnerability uses various physical indicators and their perceived counterparts, collected from 490 inhabitants of Cartagena (Colombia, declared a World Heritage site by UNESCO in 1984) who reside in areas at risk of coastal flooding. The item response theory (IRT) approach has been used. The results reveal that the implementation of protective behaviors is more related to perceived indicators, such as distance to the sea, than to actual physical vulnerability. We observe that physical vulnerability is linked to the intention to carry out protective behaviors. The presence of a defensive structure against coastal flooding could be considered a visual cue and a good predictor of the willingness to carry out protective behaviors. In contrast, people in the most vulnerable situation (single-storey houses) do not demonstrate a higher level of willingness to carry out protective behaviors, and participants who live in residential buildings demonstrate a lower level of willingness to carry out such behaviors. Therefore, the vulnerability of the house is not seen as a criterion that encourages participants to better protect themselves.