Funding: Supported by the National Natural Science Foundation of China (Grant No. 82220108007).
Abstract:
Background: Infertility is gradually increasing worldwide, and male sperm problems are its main cause, so more and more couples are using computer-assisted sperm analysis (CASA) to assist in the analysis and treatment of infertility. Meanwhile, the rapid development of deep learning (DL) has led to strong results in image classification tasks. However, the classification of sperm images has not been well studied with current deep learning methods, and sperm images are often affected by noise in practical CASA applications. The purpose of this article is to investigate the anti-noise robustness of deep learning classification methods applied to sperm images.
Methods: The SVIA dataset is a publicly available large-scale sperm dataset containing three subsets. In this work, we used subset-C, which provides more than 125,000 independent images of sperm and impurities, including 121,401 sperm images and 4,479 impurity images. To investigate the anti-noise robustness of deep learning classification methods applied to sperm images, we conducted a comprehensive comparative study using many convolutional neural network (CNN) and visual transformer (VT) deep learning methods to find the model with the most stable anti-noise robustness.
Results: This study showed that VT had strong robustness for the classification of tiny-object (sperm and impurity) image datasets under some types of conventional noise and some adversarial attacks. In particular, under the influence of Poisson noise, accuracy changed from 91.45% to 91.08%, impurity precision changed from 92.7% to 91.3%, impurity recall changed from 88.8% to 89.5%, and impurity F1-score changed from 90.7% to 90.4%. Meanwhile, sperm precision changed from 90.9% to 90.5%, sperm recall changed from 92.5% to 93.8%, and sperm F1-score changed from 92.1% to 90.4%.
Conclusion: Sperm image classification may be strongly affected by noise in current deep learning methods. VT methods, which are based on global information, are more robust to noise than CNN methods, which are based on local information, indicating that robustness to noise is reflected mainly in global information.
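The F1-scores quoted in the Results are tied to the quoted precision and recall by the standard definition of F1 as their harmonic mean. The sketch below is an illustration only, not code from the paper: the function name `f1_score` and the variable names are hypothetical, and the inputs are the impurity-class percentages from the abstract written as fractions.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # convention for the degenerate case, assumed here
    return 2 * precision * recall / (precision + recall)


# Impurity-class precision and recall quoted in the abstract.
impurity_clean = f1_score(0.927, 0.888)    # without added noise
impurity_poisson = f1_score(0.913, 0.895)  # under Poisson noise

print(f"impurity F1 without noise:      {impurity_clean:.1%}")
print(f"impurity F1 with Poisson noise: {impurity_poisson:.1%}")
```

The two impurity values round to 90.7% and 90.4%, matching the abstract. Applying the same function to the quoted sperm-class precision and recall gives roughly 91.7% and 92.1%, which differs from the sperm F1 pair printed in the abstract, so those two figures may be worth checking against the full paper.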
Abstract: Most economists approach China's economy from a single perspective, treating it as a special form of transition economy. Building on an analysis of that single perspective, the paper proposes a dual perspective that treats China's economy as having both transition and transformation features, and attempts to study it from this dual perspective.
Funding: Supported by the National Key R&D Program of China under Grant No. 2020AAA0106200 and the National Natural Science Foundation of China under Grant Nos. 61832016 and U20B2070.
Abstract: Transformers, the dominant architecture for natural language processing, have also recently attracted much attention from computational visual media researchers due to their capacity for long-range representation and high performance. Transformers are sequence-to-sequence models that use a self-attention mechanism rather than the sequential structure of an RNN. Thus, such models can be trained in parallel and can represent global information. This study comprehensively surveys recent visual transformer works. We categorize them according to task scenario: backbone design, high-level vision, low-level vision and generation, and multimodal learning. Their key ideas are also analyzed. Differing from previous surveys, we mainly focus on visual transformer methods in low-level vision and generation. The latest works on backbone design are also reviewed in detail. For ease of understanding, we precisely describe the main contributions of the latest works in the form of tables. As well as giving quantitative comparisons, we also present image results for low-level vision and generation tasks. Computational costs and source code links for various important works are also given in this survey to assist further development.
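The claim that self-attention lets a model process all positions in parallel and represent global information can be made concrete with a small sketch. The abstract gives no formulas, so what follows assumes the common single-head, scaled dot-product formulation; the helper names (`softmax`, `matmul`, `self_attention`) and the toy inputs are hypothetical and chosen only for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (assumed formulation).

    x has one row per token; w_q, w_k, w_v are projection matrices.
    Returns the attended outputs and the attention weight matrix.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(k[0]))
    scores = matmul(q, transpose(k))  # every token scored against every token
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


# Toy input: three tokens with two features each; identity projections.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output, weights = self_attention(tokens, identity, identity, identity)

for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` is a distribution over all tokens, so every output position mixes information from the whole sequence. No step depends on the result for a previous token, which is why the computation parallelizes in a way that a recurrent update does not.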