Abstract: Many diseases carry a risk of death if early measures are not taken, and thyroid cancer is one of them. In the USA, the number of thyroid cancer deaths in 2013 alone shows the need to fight this disease early. This study aims to improve performance in the diagnosis of thyroid cancer using machine learning techniques. The study consists of three phases. In the first phase, the BayesNet, NaiveBayes, SMO, IBk and Random Forest classifiers were trained on a thyroid cancer training dataset. In the second phase, the trained classifiers were tested on a thyroid cancer test dataset and the resulting performance figures were compared. In the third and final phase, the classifiers named above were used as base learners within the AdaBoostM1 algorithm, and the first two phases were repeated, to show how ensemble classifiers differ from conventional individual classifiers. Using the ensemble approach improved performance in the diagnosis of thyroid cancer. The kappa, accuracy and MCC values obtained from these classifier models are reported in tables, and their effect on diagnosis of the disease is shown with ROC curves. All of these operations were carried out with the WEKA data mining program.
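As a reading aid for the reported measures, the sketch below shows how accuracy, Cohen's kappa and the Matthews correlation coefficient (MCC) can be computed from a binary confusion matrix. It is a minimal illustration in Python, not the study's WEKA workflow. The function name `binary_metrics` and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
from math import sqrt


def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, kappa, mcc) for a 2x2 confusion matrix.

    tp, fp, fn, tn are the counts of true positives, false positives,
    false negatives and true negatives.
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n

    # Agreement expected by chance, from the predicted and actual marginals.
    p_pos = ((tp + fp) / n) * ((tp + fn) / n)
    p_neg = ((fn + tn) / n) * ((fp + tn) / n)
    p_e = p_pos + p_neg
    # Kappa is undefined when chance agreement is 1; return 0.0 in that case.
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    # MCC is undefined when any marginal is zero; return 0.0 in that case.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return accuracy, kappa, mcc


# Hypothetical counts, for illustration only (not results from the study).
accuracy, kappa, mcc = binary_metrics(tp=40, fp=10, fn=5, tn=45)
print(f"accuracy={accuracy:.3f} kappa={kappa:.3f} mcc={mcc:.3f}")
```

Kappa and MCC both correct for agreement that would occur by chance, so they are more informative than accuracy alone when the classes are imbalanced, as is common in medical datasets.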