The COVID-19 pandemic has devastated our daily lives, leaving horrific repercussions in its aftermath. Due to its rapid spread, it was quite difficult for medical personnel to diagnose patients in such large numbers. Patients who test positive for COVID-19 are diagnosed via a nasal polymerase chain reaction (PCR) test, but PCR results take anywhere from a few hours to a few days to return. The PCR test is also expensive, although the government may bear the cost in certain places, and subsets of the population resist invasive testing such as swabs. Therefore, chest X-rays or Computed Tomography (CT) scans are preferred in most cases; more importantly, they are non-invasive, inexpensive, and provide a faster response time. Recent advances in Artificial Intelligence (AI), in combination with state-of-the-art methods, have allowed for the diagnosis of COVID-19 using chest X-rays. This article proposes a method for classifying COVID-19 as positive or negative on a decentralized dataset based on the federated learning scheme. To build a progressive global COVID-19 classification model, two edge devices train a 3-layered custom Convolutional Neural Network (CNN), deployable from the server, on their respective localized datasets. The two edge devices then communicate their learned parameters and weights to the server, which aggregates them and updates the global model. The proposed model is trained using a publicly available Kaggle image dataset containing more than 13,000 X-ray images, from which 9,000 normal and COVID-19-positive images are used. Each edge node possesses a different number of images: edge node 1 has 3,200 images, while edge node 2 has 5,800. The datasets of the various nodes in the network are disjoint, so each node trains on a separate image collection with no correlation to the others. The diagnosis of COVID-19 has become considerably more efficient with the deployment of the proposed algorithm and dataset, and the findings we have obtained are quite encouraging.
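As a rough, hedged sketch of the scheme above — two edge nodes training a shared 3-layer CNN and a server averaging their weights — the following PyTorch code illustrates one federated round. The model layout, function names, and hyperparameters are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal federated-averaging sketch for binary chest X-ray classification.
# Assumptions: grayscale 128x128 inputs, 2 classes (normal / COVID-19).
import copy
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """A 3-convolutional-layer CNN, assumed similar in spirit to the paper's model."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(64 * 16 * 16, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

def local_update(model, loader, epochs=1, lr=1e-3, device="cpu"):
    """Train a copy of the global model on one edge node's private data."""
    local = copy.deepcopy(model).to(device)
    opt = torch.optim.Adam(local.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    local.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(local(x.to(device)), y.to(device)).backward()
            opt.step()
    return local.state_dict(), len(loader.dataset)

def fed_avg(states_and_sizes):
    """Server step: average parameters, weighting each node by its dataset size."""
    total = sum(n for _, n in states_and_sizes)
    avg = copy.deepcopy(states_and_sizes[0][0])
    for key in avg:
        avg[key] = sum(s[key].float() * (n / total) for s, n in states_and_sizes)
    return avg
```

A server round would call `local_update` once per node with the current global model and merge the results with `fed_avg`, weighting node 1 and node 2 by their 3,200 and 5,800 images respectively.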
The explosive increase in the number of images on the Internet has brought with it the great challenge of how to effectively index, retrieve, and organize these resources. Assigning proper tags to the visual content is key to the success of many applications such as image retrieval and content mining. Although recent years have witnessed many advances in image tagging, these methods typically depend on high-quality, large-scale training data that are expensive to obtain. In this paper, we propose a novel semantic neighbor learning method based on user-contributed social image datasets that can be acquired from the Web's inexhaustible social image content. In contrast to existing image tagging approaches that rely on high-quality image-tag supervision, we acquire weak supervision for our neighbor learning method by progressive neighborhood retrieval from noisy and diverse user-contributed image collections. The retrieved neighbor images are not only visually alike and partially correlated but also semantically related. We offer a step-by-step and easy-to-use implementation of the proposed method. Extensive experimentation on several datasets demonstrates that the proposed method significantly outperforms others.
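The neighbor-voting idea can be sketched roughly as follows. This is a generic, hedged illustration of propagating tags from retrieved neighbor images, not the paper's actual progressive retrieval procedure; the feature dimension, cosine metric, and all names are assumptions.

```python
# Hedged sketch: score candidate tags by similarity-weighted votes from the
# k nearest images in a bank of (possibly noisy) user-contributed images.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def tag_by_neighbors(query_feat, bank_feats, bank_tags, k=10):
    nn = NearestNeighbors(n_neighbors=k, metric="cosine").fit(bank_feats)
    dist, idx = nn.kneighbors(query_feat.reshape(1, -1))
    scores = {}
    for d, i in zip(dist[0], idx[0]):
        w = 1.0 - d  # cosine similarity as the vote weight
        for tag in bank_tags[i]:
            scores[tag] = scores.get(tag, 0.0) + w
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Example: 1000 bank images with 64-d features and noisy user tag sets.
feats = np.random.rand(1000, 64)
tags = [{"sky", "beach"} if i % 2 else {"city"} for i in range(1000)]
print(tag_by_neighbors(np.random.rand(64), feats, tags, k=5)[:3])
```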
This paper presents a large gathering dataset of images extracted from publicly filmed videos by 24 cameras installed on the premises of Masjid Al-Nabvi, Madinah, Saudi Arabia. The dataset consists of raw and processed images reflecting a highly challenging and unconstrained environment. The methodology for building the dataset consists of four core phases: acquisition of videos, extraction of frames, localization of face regions, and cropping and resizing of detected face regions. The raw images in the dataset consist of a total of 4,613 frames obtained from video sequences. The processed images consist of the face regions of 250 persons extracted from the raw images to ensure the authenticity of the presented data. The dataset further provides 8 images for each of the 250 subjects (persons), for a total of 2,000 images. It portrays a highly unconstrained and challenging environment, with human faces of varying sizes and pixel quality (resolution). Since the face regions in the video sequences are severely degraded by various unavoidable factors, the dataset can serve as a benchmark to test and evaluate face detection and recognition algorithms for research purposes. We have also gathered and displayed temporal records of the presence of subjects appearing in the presented frames. These can be used as a temporal benchmark for tracking, finding persons, activity monitoring, and crowd counting in large crowd scenarios.
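The four-phase pipeline can be sketched with OpenCV as below. The Haar cascade detector, frame sampling step, and output size are stand-in assumptions of ours, since the paper does not specify its exact tooling.

```python
# Hedged sketch of the pipeline: video acquisition -> frame extraction ->
# face localization -> cropping and resizing of detected face regions.
import cv2

def extract_faces(video_path, out_size=(112, 112), frame_step=30):
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    cap = cv2.VideoCapture(video_path)  # phase 1: an acquired video
    faces, i = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i % frame_step == 0:  # phase 2: sample frames from the sequence
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            # phase 3: localize face regions in the sampled frame
            for (x, y, w, h) in detector.detectMultiScale(gray, 1.1, 5):
                # phase 4: crop and resize each detected region
                faces.append(cv2.resize(frame[y:y+h, x:x+w], out_size))
        i += 1
    cap.release()
    return faces
```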
Historically, yarn-dyed plaid fabrics (YDPFs) have enjoyed enduring popularity with many rich plaid patterns, but production data are still classified and searched only according to production parameters. This process does not satisfy the visual needs of sample order production, fabric design, and stock management. This study produced an image dataset for YDPFs, collected from 10,661 fabric samples, which the authors believe will have significant utility in further research into YDPFs. Convolutional neural networks (CNNs), such as VGG, ResNet, and DenseNet, with different hyperparameter groups, seemed the most promising tools for the study. This paper reports the authors' exhaustive evaluation of the YDPF dataset. With an overall accuracy of 88.78%, CNNs proved effective in YDPF image classification, despite low accuracy on Windowpane fabrics, which are often confused with the Prince of Wales pattern. Image classification of traditional patterns is further improved by utilizing a strip pooling model to extract local detail features along the horizontal and vertical directions; the strip pooling model characterizes the horizontal and vertical crisscross patterns of YDPFs with considerable success. The proposed method using the strip pooling model (SPM) improves classification performance on the YDPF dataset by 2.64% for ResNet18, 3.66% for VGG16, and 3.54% for DenseNet121. The results reveal that the SPM significantly improves YDPF classification accuracy and reduces the error rate on Windowpane patterns as well.
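A strip pooling module in the generic sense can be sketched as follows; the channel widths, kernel sizes, and gating fusion are assumptions based on the published strip pooling design, not necessarily the authors' exact SPM configuration.

```python
# Hedged sketch of strip pooling: pool each row and column into 1-pixel-wide
# strips, refine them, expand back, and gate the input with the strip context.
# This matches the horizontal/vertical crisscross structure of plaid patterns.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StripPooling(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # H x 1: vertical strips
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # 1 x W: horizontal strips
        self.conv_h = nn.Conv2d(channels, channels, (3, 1), padding=(1, 0))
        self.conv_w = nn.Conv2d(channels, channels, (1, 3), padding=(0, 1))
        self.fuse = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        _, _, h, w = x.shape
        xh = F.interpolate(self.conv_h(self.pool_h(x)), size=(h, w))
        xw = F.interpolate(self.conv_w(self.pool_w(x)), size=(h, w))
        return x * torch.sigmoid(self.fuse(xh + xw))  # strip-context gating

x = torch.randn(2, 64, 28, 28)
print(StripPooling(64)(x).shape)  # torch.Size([2, 64, 28, 28])
```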
Recently, many researchers have tried to develop a robust, fast, and accurate algorithm for eye tracking and pupil position detection in applications such as head-mounted eye tracking, gaze-based human-computer interaction, medical applications (for example, for deaf and diabetic patients), and attention analysis. Many real-world conditions challenge the eye's appearance, such as illumination, reflections, and occlusions, while individual differences in eye physiology and other sources of noise, such as contact lenses or make-up, add further difficulty. The present work introduces a robust pupil detection algorithm with higher accuracy than previous attempts, suitable for real-time analytics applications. The proposed Circular Hough transform with Morphing Canny Edge detection for Pupillometry (CHMCEP) algorithm can detect pupils even in blurred or noisy images: different filtering methods in the pre-processing phase remove blur and noise, and a second filtering pass before the circular Hough transform refines the center fitting for better accuracy. The performance of the proposed CHMCEP algorithm was tested against recent pupil detection methods. Simulations show that CHMCEP achieved detection rates of 87.11%, 78.54%, 58%, and 78% on the Świrski, ExCuSe, ElSe, and labeled pupils in the wild (LPW) datasets, respectively. These results show that the proposed approach outperforms the other pupil detection methods by a large margin, providing exact and robust pupil positions on challenging ordinary eye images.
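The filter, edge-detect, and circular-Hough chain described above can be sketched with OpenCV as follows; all kernel sizes and Hough parameters are illustrative assumptions rather than the CHMCEP settings.

```python
# Hedged sketch of pre-filtering followed by circular Hough center fitting.
import cv2
import numpy as np

def detect_pupil(gray_eye: np.ndarray):
    # Pre-processing filters to remove blur and noise before edge detection.
    smoothed = cv2.medianBlur(gray_eye, 5)
    smoothed = cv2.GaussianBlur(smoothed, (5, 5), 0)
    # HOUGH_GRADIENT runs a Canny edge pass internally (param1 is its upper
    # threshold), so the Canny -> Hough center-fitting chain is one call here.
    circles = cv2.HoughCircles(
        smoothed, cv2.HOUGH_GRADIENT, dp=1,
        minDist=gray_eye.shape[0] // 2,
        param1=80, param2=20, minRadius=5, maxRadius=60)
    if circles is None:
        return None  # no plausible pupil circle found
    x, y, r = np.round(circles[0, 0]).astype(int)  # first returned circle
    return (int(x), int(y)), int(r)
```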
In the field of satellite imagery, remote sensing image captioning (RSIC) is a hot topic facing the challenges of overfitting and of image-text alignment. To address these issues, this paper proposes a vision-language aligning paradigm for RSIC that jointly represents vision and language. First, a new RSIC dataset, DIOR-Captions, is built by augmenting the object Detection In Optical Remote sensing images (DIOR) dataset with manually annotated Chinese and English captions. Second, a Vision-Language aligning model with Cross-modal Attention (VLCA) is presented to generate accurate and abundant bilingual descriptions for remote sensing images. Third, a cross-modal learning network is introduced to address the problem of visual-lingual alignment. Notably, VLCA is also applied to end-to-end Chinese caption generation by using a Chinese pre-trained language model. Experiments are carried out against various baselines to validate VLCA on the proposed dataset. The results demonstrate that the proposed algorithm produces more descriptive and informative captions than existing algorithms.
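A single cross-modal attention step — caption-token queries attending over projected image-region features — can be sketched as below. The dimensions, head count, and projection layer are assumptions of ours, not the published VLCA architecture.

```python
# Hedged sketch: language queries attend over vision keys/values so that each
# caption token is grounded in the image regions it describes.
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    def __init__(self, text_dim=512, image_dim=768, hidden=512, heads=8):
        super().__init__()
        self.img_proj = nn.Linear(image_dim, hidden)  # align vision to language
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)

    def forward(self, text_tokens, image_regions):
        img = self.img_proj(image_regions)
        fused, weights = self.attn(query=text_tokens, key=img, value=img)
        return fused, weights

txt = torch.randn(4, 20, 512)  # batch of 20-token caption prefixes
img = torch.randn(4, 49, 768)  # 7x7 grid of image-region features
fused, w = CrossModalAttention()(txt, img)
print(fused.shape, w.shape)    # [4, 20, 512] [4, 20, 49]
```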
During emergency evacuation, it is crucial to accurately detect and classify different groups of evacuees based on their behaviours using computer vision. Traditional object detection models trained on standard image databases often fail to recognise individuals in specific groups, such as the elderly, disabled individuals, and pregnant women, who require additional assistance during emergencies. To address this limitation, this study proposes a novel image dataset called the Human Behaviour Detection Dataset (HBDset), specifically collected and annotated for public safety and emergency response purposes. The dataset contains eight categories of human behaviour: normal adult, child, holding a crutch, holding a baby, using a wheelchair, pregnant woman, lugging luggage, and using a mobile phone. It comprises more than 1,500 images collected from various public scenarios, with more than 2,900 bounding box annotations. The images were carefully selected, cleaned, and then manually annotated using the LabelImg tool. To demonstrate the effectiveness of the dataset, classical object detection algorithms were trained and tested on the HBDset, and the average detection accuracy exceeds 90%, highlighting the robustness and universality of the dataset. The developed open HBDset has the potential to enhance public safety, provide early disaster warnings, and prioritise the needs of vulnerable individuals during emergency evacuation.
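As a hedged sketch of how a detector might be fine-tuned on the eight behaviour classes: the abstract names only "classical object detection algorithms", so torchvision's Faster R-CNN is used here purely as a stand-in.

```python
# Hedged sketch: swap the box-predictor head of a pretrained Faster R-CNN so
# it outputs the 8 HBDset behaviour categories (plus background).
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

NUM_CLASSES = 1 + 8  # background + the eight behaviour categories listed above

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)
# The model can now be trained on (image, {"boxes": ..., "labels": ...}) pairs
# exported from the LabelImg annotations.
```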
Images are widely used by companies to advertise their products and promote awareness of their brands. The automatic synthesis of advertising images is challenging because the advertising message must be clearly conveyed while complying with the style required for the product, brand, or target audience. In this study, we propose a data-driven method to capture individual design attributes and the relationships between elements in advertising images, with the aim of automatically synthesizing input elements into an advertising image in a specified style. To achieve this multi-format advertisement design, we created a dataset containing 13,280 advertising images with rich annotations that encompass the outlines and colors of the elements as well as the classes and goals of the advertisements. Using our probabilistic models, users guide the style of synthesized advertisements via additional constraints (e.g., context-based keywords). We applied our method to a variety of design tasks, and the results were evaluated in several perceptual studies, which showed that our method improved users' satisfaction by 7.1% compared with designs generated by nonprofessional students, and that more users preferred the coloring results of our designs to those generated by the color harmony model and Colormind.
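One way to picture the probabilistic, constraint-guided selection is the toy sketch below: candidates are ranked by the joint probability of their design attributes under distributions learned from annotated data, with user keywords acting as hard constraints. All structures and values here are illustrative assumptions, not the paper's model.

```python
# Hedged sketch: rank candidate designs by learned attribute log-probabilities,
# filtering out candidates that violate the user's keyword constraints.
import math

def score(candidate, attr_probs, keywords):
    # Hard constraint: the candidate must carry every user-requested keyword.
    if not keywords.issubset(candidate["tags"]):
        return float("-inf")
    # Soft score: sum of log-probabilities of the candidate's attributes.
    return sum(math.log(attr_probs.get((k, v), 1e-6))
               for k, v in candidate["attrs"].items())

attr_probs = {("dominant_color", "blue"): 0.4, ("layout", "centered"): 0.3}
candidates = [
    {"tags": {"minimal"}, "attrs": {"dominant_color": "blue", "layout": "centered"}},
    {"tags": {"minimal"}, "attrs": {"dominant_color": "red", "layout": "centered"}},
]
best = max(candidates, key=lambda c: score(c, attr_probs, {"minimal"}))
print(best["attrs"])  # the blue centered candidate wins
```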
Offline handwritten formula recognition is a challenging task due to the variety of handwritten symbols and the two-dimensional structure of formulas. Recently, deep neural network recognizers based on the encoder-decoder framework have achieved great improvements on this task. However, unsatisfactory recognition performance on formulas with long LaTeX strings is one shortcoming of the existing work, and the lack of sufficient training data also limits the capability of these recognizers. In this paper, we design a multimodal dependence attention (MDA) module to help the model learn visual and semantic dependencies among symbols in the same formula, improving recognition performance on formulas with long LaTeX strings. To alleviate overfitting and further improve recognition performance, we also propose a new dataset, the Handwritten Formula Image Dataset (HFID), which contains 25,620 handwritten formula images collected from real life. We conduct extensive experiments to demonstrate the effectiveness of the proposed MDA module and HFID dataset, achieving state-of-the-art expression accuracy of 63.79% and 65.24% on CROHME 2014 and CROHME 2016, respectively.
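A loose, hedged sketch of the dependence-attention flavour is given below: at each decoding step the current hidden state attends both to visual features and to embeddings of previously emitted LaTeX symbols, so long-range symbol dependencies inform the next prediction. This mirrors the MDA idea only in spirit; all dimensions and names are assumptions.

```python
# Hedged sketch: fuse a visual attention context with a semantic attention
# context over the history of emitted symbols at one decoder step.
import torch
import torch.nn as nn

class DependenceAttention(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.visual = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.semantic = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.out = nn.Linear(2 * dim, dim)

    def forward(self, h_t, feat_map, past_symbols):
        # h_t: (B, 1, D) current decoder state; feat_map: (B, HW, D) encoder
        # features; past_symbols: (B, T, D) embeddings of emitted symbols.
        v, _ = self.visual(h_t, feat_map, feat_map)
        s, _ = self.semantic(h_t, past_symbols, past_symbols)
        return self.out(torch.cat([v, s], dim=-1))

h = torch.randn(2, 1, 256)
feats = torch.randn(2, 196, 256)
hist = torch.randn(2, 7, 256)
print(DependenceAttention()(h, feats, hist).shape)  # torch.Size([2, 1, 256])
```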
Space object recognition plays an important role in space exploitation and surveillance, and faces two main problems: lack of data and drastic changes in viewpoint. In this article, we first build a three-dimensional (3D) satellite dataset named the BUAA Satellite Image Dataset (BUAA-SID 1.0) to supply data for 3D space object research. Then, based on the dataset, we propose to recognize full-viewpoint 3D space objects using kernel locality preserving projections (KLPP). To obtain a more accurate and separable description of the objects, we first build feature vectors employing moment invariants, Fourier descriptors, region covariance, and histograms of oriented gradients. We then map the features into a kernel space and reduce dimensionality with KLPP to obtain the submanifold of the features. Finally, k-nearest neighbor (kNN) classification is applied. Experimental results show that the proposed approach is well suited to space object recognition under changing viewpoints. Encouraging recognition rates are obtained on images in BUAA-SID 1.0, with the highest recognition result reaching 95.87%.
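The recognition chain — handcrafted features, kernel subspace projection, then kNN — can be sketched as below. scikit-learn provides no KLPP, so KernelPCA stands in for the kernel locality preserving projection; that substitution is ours, and only HOG is shown out of the paper's four feature types.

```python
# Hedged sketch: HOG features -> kernel subspace -> kNN classification,
# with placeholder data standing in for BUAA-SID 1.0 images and labels.
import numpy as np
from skimage.feature import hog
from sklearn.decomposition import KernelPCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

def feature_vector(img):
    return hog(img, orientations=9, pixels_per_cell=(16, 16),
               cells_per_block=(2, 2))

rng = np.random.default_rng(0)
train_images = rng.random((60, 128, 128))   # placeholder satellite renderings
train_labels = rng.integers(0, 4, size=60)  # placeholder class ids

X = np.stack([feature_vector(im) for im in train_images])
clf = make_pipeline(KernelPCA(n_components=20, kernel="rbf"),
                    KNeighborsClassifier(n_neighbors=5))
clf.fit(X, train_labels)
print(clf.predict(X[:3]))
```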
The recent development of imaging and sequencing technologies enables systematic advances in the clinical study of lung cancer. Meanwhile, the human mind is limited in effectively handling and fully utilizing such enormous accumulations of data. Machine learning-based approaches play a critical role in integrating and analyzing these large and complex datasets, and have extensively characterized lung cancer from the different perspectives these accrued data offer. In this review, we provide an overview of machine learning-based approaches that strengthen various aspects of lung cancer diagnosis and therapy, including early detection, auxiliary diagnosis, prognosis prediction, and immunotherapy practice. Moreover, we highlight the challenges and opportunities for future applications of machine learning in lung cancer.
Funding (COVID-19 federated learning classification): supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R66), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Funding (semantic neighbor learning for image tagging): supported in part by the National Natural Science Foundation of China (Nos. 61502094 and 61402099) and the Natural Science Foundation of Heilongjiang Province of China (Nos. F2016002 and F2015020).
Funding (Masjid Al-Nabvi large gathering dataset): this research was supported by the Deanship of Scientific Research, Islamic University of Madinah, Madinah (KSA), under the Tammayuz program, Grant Number 1442/505.
Funding (yarn-dyed plaid fabric classification): this work was supported by the China Social Science Foundation under Grant [17CG209]. The fabric samples were supplied by Jiangsu Sunshine Group and Jiangsu Lianfa Textile Group.
Funding (CHMCEP pupil detection): this research was funded by the Taif University Researchers Supporting Project, grant number TURSP-2020/345, Taif University, Taif, Saudi Arabia.
Funding (VLCA remote sensing image captioning): supported by the National Natural Science Foundation of China (61702528, 61806212).
Funding (HBDset human behaviour detection): funded by the Hong Kong Research Grants Council Theme-based Research Scheme (T22-505/19-N), the National Natural Science Foundation of China (52204232), and the MTR Research Fund (PTU-23005).
Funding (advertising image synthesis): project supported by the National Science and Technology Innovation 2030 Major Project of the Ministry of Science and Technology of China (No. 2018AAA0100700), the National Natural Science Foundation of China (No. 61672451), the Provincial Key Research and Development Plan of Zhejiang Province, China (No. 2019C03137), the China Postdoctoral Science Foundation (No. 2018M630658), and the Alibaba-Zhejiang University Joint Institute of Frontier Technologies.
Funding (handwritten formula recognition): supported by the National Key Research and Development Program of China under Grant No. 2020YFB1313602.
Funding (BUAA-SID 1.0 space object recognition): National Natural Science Foundation of China (60776793, 60802043); National Basic Research Program of China (2010CB327900).
Funding (machine learning in lung cancer review): supported in part by the National Institutes of Health, USA (Grant Nos. U01TR003528 and R01LM013337).