Abstract
This study explores a segmentation algorithm for the text of health examination items. A large amount of historical health examination text data is analysed, especially long single-item entries. Using character statistics, the connection tightness values (TABS) between each pair of adjacent characters are calculated. Three parameters are set: the candidate number N, the best position BP, and the balance weight BW. The total segmentation indexes (SIs) are then calculated, and these determine the segmentation position Pos. The optimal parameter values are determined by an information measurement method. Experimental results show an accuracy of 78.6%, rising to 82.9% for the most frequently occurring text item. The complexity of the algorithm is O(n). Because it uses no existing domain knowledge, the algorithm is simple and fast. Executed repeatedly, it conveniently yields the characteristics of each single item of text data and, furthermore, distinguishes the expression preferences of different physicians for the same item. The results verify the assumption that, without professional domain knowledge, a large amount of historical data can provide valuable clues for text understanding. The results of this research are being applied and verified in follow-up research in the field of health examination.
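The abstract does not give the formulas for the tightness values or the segmentation index, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes tightness is the raw corpus frequency of an adjacent character pair, that the N loosest gaps are taken as candidates, and that BP is a relative position in [0, 1] blended with looseness through BW. The function names `bigram_tightness` and `segment_position` and the scoring formula are hypothetical and are not taken from the paper.

```python
from collections import Counter


def bigram_tightness(corpus):
    """Count how often each pair of adjacent characters occurs in the corpus.

    Assumption: raw pair frequency stands in for the paper's tightness value.
    """
    counts = Counter()
    for text in corpus:
        for a, b in zip(text, text[1:]):
            counts[a + b] += 1
    return counts


def segment_position(text, tightness, n=3, bp=0.5, bw=0.5):
    """Return a cut index p so that text splits into text[:p] and text[p:].

    n  : number of candidate gaps to consider (the loosest ones)
    bp : preferred relative cut position in [0, 1] (assumed meaning of BP)
    bw : weight of looseness versus closeness to bp (assumed meaning of BW)
    Returns None when the text is too short to split.
    """
    if len(text) < 2:
        return None
    # Gap i lies between text[i] and text[i + 1].
    gaps = [(tightness.get(text[i:i + 2], 0), i) for i in range(len(text) - 1)]
    # Keep the n gaps with the lowest tightness as candidates.
    candidates = sorted(gaps)[:n]
    max_t = max(t for t, _ in gaps) or 1  # avoid division by zero
    best_score, best_pos = None, None
    for t, i in candidates:
        looseness = 1 - t / max_t                  # 1 means never seen together
        closeness = 1 - abs((i + 1) / len(text) - bp)
        score = bw * looseness + (1 - bw) * closeness  # hypothetical index
        if best_score is None or score > best_score:
            best_score, best_pos = score, i + 1
    return best_pos
```

With a toy corpus such as `["abcd", "abcd", "abxy", "cdxy"]`, calling `segment_position("abxy", bigram_tightness(corpus))` picks the gap between `b` and `x`, since that pair is rare and sits at the preferred middle position. Each call makes a single pass over the text plus a sort of the gaps, so this sketch is not claimed to match the O(n) bound stated above.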
Authors
Hui An
Dahui Wang
Zhigeng Pan
Meiling Chen
Xinting Wang
Hui An; Dahui Wang; Zhigeng Pan; Meiling Chen; Xinting Wang (Digital Media & Interaction Research Center, Hangzhou Normal University, Wenzhou People's Hospital, Wenzhou 325000, People's Republic of China; Department of Health Examination, Hangzhou Normal University, Hangzhou, People's Republic of China; Institute of Industrial VR, Foshan University, Guangdong, People's Republic of China)