Rare categories become more and more abundant and their characterization has received little attention thus far. Fraudulent banking transactions, network intrusions, and rare diseases are examples of rare classes whos...Rare categories become more and more abundant and their characterization has received little attention thus far. Fraudulent banking transactions, network intrusions, and rare diseases are examples of rare classes whose detection and characterization are of high value. However, accurate char- acterization is challenging due to high-skewness and non- separability from majority classes, e.g., fraudulent transac- tions masquerade as legitimate ones. This paper proposes the RACH algorithm by exploring the compactness property of the rare categories. This algorithm is semi-supervised in na- ture since it uses both labeled and unlabeled data. It is based on an optimization framework which encloses the rare exam- ples by a minimum-radius hyperball. The framework is then converted into a convex optimization problem, which is in turn effectively solved in its dual form by the projected sub- gradient method. RACH can be naturally kernelized. Experi- mental results validate the effectiveness of RACH.展开更多
文摘Rare categories become more and more abundant and their characterization has received little attention thus far. Fraudulent banking transactions, network intrusions, and rare diseases are examples of rare classes whose detection and characterization are of high value. However, accurate char- acterization is challenging due to high-skewness and non- separability from majority classes, e.g., fraudulent transac- tions masquerade as legitimate ones. This paper proposes the RACH algorithm by exploring the compactness property of the rare categories. This algorithm is semi-supervised in na- ture since it uses both labeled and unlabeled data. It is based on an optimization framework which encloses the rare exam- ples by a minimum-radius hyperball. The framework is then converted into a convex optimization problem, which is in turn effectively solved in its dual form by the projected sub- gradient method. RACH can be naturally kernelized. Experi- mental results validate the effectiveness of RACH.