Funding: National Natural Science Foundation of China (NSFC, Nos. 12373086 and 12303082); CAS “Light of West China” Program; Yunnan Revitalization Talent Support Program in Yunnan Province; National Key R&D Program of China, Gravitational Wave Detection Project No. 2022YFC2203800.
Abstract: Attitude is one of the crucial parameters of space objects and plays a vital role in collision prediction and debris removal. Analyzing light curves is the most commonly used method for determining attitude. In photometric observations, the light curves obtained may contain outliers for various reasons, so preprocessing is required to remove these outliers and obtain high-quality light curves. Through statistical analysis, the causes of outliers can be categorized into two main types: first, the brightness of the object increases significantly when a star passes nearby, referred to as “stellar contamination,” and second, the brightness decreases markedly due to cloud cover, referred to as “cloudy contamination.” The traditional approach of manually inspecting images for contamination is time-consuming and labor-intensive, so we propose machine learning methods as a substitute. Convolutional Neural Networks and SVMs are employed to identify cases of stellar contamination and cloudy contamination, achieving F1 scores of 1.00 and 0.98 on a test set, respectively. We also explore other machine learning methods, such as ResNet-18 and Light Gradient Boosting Machine, and conduct comparative analyses of the results.
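The F1 scores quoted in this abstract combine precision and recall into a single number. A minimal sketch of that calculation follows; the function name `f1_score` and the example counts are illustrative assumptions and are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for a contamination classifier, for illustration only.
example_f1 = f1_score(tp=49, fp=1, fn=1)
print(round(example_f1, 2))
```

Because F1 ignores true negatives, it stays informative when contaminated frames are rare compared with clean ones.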
Funding: Science Foundation Ireland (SFI) under Grant Number SFI/16/RC/3918 (Confirm) and Marie Sklodowska Curie Grant agreement No. 847577, co-funded by the European Regional Development Fund. Wasif Afzal has received funding from the European Union’s Horizon 2020 research and innovation program under Grant agreement Nos. 871319 and 957212, and from the ECSEL Joint Undertaking (JU) under Grant agreement No. 101007350. (CMC, 2023, vol.74, no.22767)
Abstract: Big Data is reshaping many industrial domains by providing decision support through the analysis of large data volumes. Big Data testing aims to ensure that Big Data systems run smoothly and error-free while maintaining the performance and quality of data. However, because of the diversity and complexity of data, testing Big Data is challenging. Although numerous research efforts deal with Big Data testing, a comprehensive review of its testing techniques and challenges is not yet available. We have therefore systematically reviewed the evidence on Big Data testing techniques published in the period 2010–2021. This paper discusses the testing of data processing by highlighting the techniques used in every processing phase. Furthermore, we discuss the challenges and future directions. Our findings show that diverse functional, non-functional and combined (functional and non-functional) testing techniques have been used to solve specific problems related to Big Data. At the same time, most of the testing challenges have been faced during the MapReduce validation phase. In addition, combinatorial testing is one of the most frequently applied techniques, used in combination with other techniques (i.e., random testing, mutation testing, input space partitioning and equivalence testing) to find various functional faults through Big Data testing.
Funding: Supported by the National Natural Science Foundation of China.
Abstract: Lunar Penetrating Radar (LPR) is one of the payloads onboard the Chang'e-3 (CE-3) rover, carried to improve our understanding of the formation and evolution of the Moon. This investigation is the first attempt to explore the lunar subsurface structure using high-resolution ground penetrating radar. We have probed the subsurface to a depth of several hundred meters using LPR. In-orbit testing, data processing and the preliminary results are presented. These observations have revealed the configuration of the regolith, whose thickness varies from about 4 m to 6 m. In addition, one layer of lunar rock was detected, which is about 330 m deep and might have accumulated during the depositional hiatus of mare basalts.
Funding: Supported by the National Natural Science Foundation of China (NSFC, Nos. 11988101 and 11833009) and the Key Research Program of the Chinese Academy of Sciences (grant No. QYZDJ-SSW-SLH021).
Abstract: Artificial intelligence methods are indispensable for identifying pulsars among large numbers of candidates. We develop a new pulsar identification system that uses the CoAtNet to score two-dimensional features of candidates, implements a multilayer perceptron to score one-dimensional features, and relies on logistic regression to judge the corresponding scores. In the data preprocessing stage, we perform two feature fusions separately, one for one-dimensional features and the other for two-dimensional features, which are used as inputs for the multilayer perceptron and the CoAtNet, respectively. The newly developed system achieves 98.77% recall, a 1.07% false positive rate (FPR) and 98.85% accuracy on our GPPS test set.
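To make the last two stages of this pipeline concrete, the sketch below combines two model scores with a logistic function and computes recall, FPR and accuracy from confusion counts. The weights, bias, function names and counts are all hypothetical placeholders; the abstract gives neither the fitted coefficients nor the underlying counts.

```python
import math


def combine_scores(score_2d: float, score_1d: float,
                   w_2d: float = 4.0, w_1d: float = 4.0,
                   bias: float = -4.0) -> float:
    """Logistic-regression style combination of two model scores.

    The weights and bias are made-up placeholders; in practice they would be
    fitted on labelled candidates.
    """
    z = w_2d * score_2d + w_1d * score_1d + bias
    return 1.0 / (1.0 + math.exp(-z))


def detection_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Recall, false positive rate and accuracy from confusion counts."""
    return {
        "recall": tp / (tp + fn),
        "fpr": fp / (fp + tn),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Hypothetical counts, not the GPPS test-set numbers.
metrics = detection_metrics(tp=90, fn=10, fp=5, tn=95)
print(metrics)
```

A candidate would typically be kept when `combine_scores(...)` exceeds a chosen threshold such as 0.5; the threshold trades recall against FPR.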
Abstract: Classification of edge-on galaxies is important to astronomical studies because our own Milky Way galaxy is an edge-on galaxy. Edge-on galaxies are difficult to classify because of their lower overall brightness levels and smaller numbers of pixels. In the current work, a novel technique for the classification of edge-on galaxies has been developed. This technique is based on the mathematical treatment of galaxy brightness data from their images. A special treatment of the galaxies’ brightness data is developed to enhance faint galaxies and to eliminate the adverse effects of high-brightness backgrounds and of bright background stars. A novel slimness weighting factor is developed to classify edge-on galaxies based on their slimness. The technique can be optimized for different catalogs with different brightness levels. In the current work, the technique is optimized for the EFIGI catalog and is trained using a set of 1800 galaxies from this catalog. Upon classification of the full set of 4458 galaxies from the EFIGI catalog, an accuracy of 97.5% has been achieved, with an average processing time of about 0.26 seconds per galaxy on an average laptop.
Funding: The GPPS survey project, one of five key projects of FAST, a Chinese national mega-science facility operated by the National Astronomical Observatories, Chinese Academy of Sciences; supported by the National Natural Science Foundation of China (NSFC, Nos. 11988101 and 11833009) and the Key Research Program of the Chinese Academy of Sciences (grant No. QYZDJ-SSW-SLH021).
Abstract: Radio astronomy observations are frequently impacted by radio frequency interference (RFI). We propose a novel method, named 2σCRF, for cleaning RFI in the folded data of pulsar observations, utilizing a Bayesian-based model called conditional random fields (CRFs). This algorithm minimizes the “energy” of every pixel given an initial label. The standard deviations (i.e., rms values) of the folded pulsar data are used as pixels for all subintegrations and channels. Data without obvious interference are treated as “background noise,” while RFI-affected data are assigned to different classes according to their exceptional rms values. This initial labeling can be automated and adapts to the actual data. The CRF algorithm then optimizes the label category of each pixel of the image, using the initial labels as priors. We demonstrate the efficacy of the proposed method on folded pulsar data obtained from Five-hundred-meter Aperture Spherical radio Telescope observations. It can effectively recognize and tag various categories of RFI, including broadband or narrowband, constant or instantaneous, and even weak RFI that is unrecognizable in some pixels but is picked out based on its neighborhood. The results are comparable to those obtained via manual labeling but require no human intervention, saving time and effort.
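The first step described here, building an rms image over subintegrations and channels and attaching initial labels, might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the labeling rule, so the median plus 2σ cut (with σ estimated from the median absolute deviation) is a guess suggested by the method's name, the data layout `folded[sub][chan]` is hypothetical, and the CRF optimization itself is not shown.

```python
import statistics


def rms_map(folded):
    """Standard deviation over phase bins for every (subintegration, channel).

    `folded[sub][chan]` is assumed to be a list of phase-bin intensities.
    """
    return [[statistics.pstdev(profile) for profile in sub] for sub in folded]


def initial_labels(rms, n_sigma=2.0):
    """Label a pixel 1 (RFI candidate) if its rms exceeds median + n_sigma * sigma.

    sigma is estimated robustly from the median absolute deviation (MAD);
    this threshold rule is an assumption, not the paper's stated procedure.
    """
    flat = [v for row in rms for v in row]
    med = statistics.median(flat)
    mad = statistics.median([abs(v - med) for v in flat])
    sigma = 1.4826 * mad
    return [[1 if v > med + n_sigma * sigma else 0 for v in row] for row in rms]


# Toy data: 3 subintegrations x 4 channels x 8 phase bins, one noisy pixel.
base = [0, 1, 0, -1, 0, 1, 0, -1]
folded = [[[(1 + 0.01 * (s + c)) * v for v in base] for c in range(4)]
          for s in range(3)]
folded[1][2] = [10 * v for v in base]  # inject strong interference
labels = initial_labels(rms_map(folded))
print(labels)
```

Using the median and MAD rather than the mean and standard deviation keeps the threshold from being inflated by the very outliers it is meant to catch.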
Funding: Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (grant No. XDB41000000); the National Natural Science Foundation of China (NSFC, Grant Nos. 12233008 and 11973038); the China Manned Space Project (No. CMS-CSST-2021-A07); the Cyrus Chun Ying Tang Foundations; and the Hong Kong Innovation and Technology Fund through the Research Talent Hub program (GSP028).
Abstract: Most existing star-galaxy classifiers depend on reduced information from catalogs, which requires careful data processing and feature extraction. In this study, we employ a supervised machine learning method (GoogLeNet) to automatically classify stars and galaxies in the COSMOS field. Unlike traditional machine learning methods, we introduce several preprocessing techniques, including noise reduction and the unwrapping of denoised images in polar coordinates, applied to our carefully selected samples of stars and galaxies. By dividing the selected samples into training and validation sets in an 8:2 ratio, we evaluate the performance of the GoogLeNet model in distinguishing between stars and galaxies. The results indicate that the GoogLeNet model is highly effective, achieving accuracies of 99.6% and 99.9% for stars and galaxies, respectively. Furthermore, by comparing the results with and without preprocessing, we find that preprocessing can significantly improve classification accuracy (by approximately 2.0% to 6.0%) when the images are rotated. In preparation for the future launch of the China Space Station Telescope (CSST), we also evaluate the performance of the GoogLeNet model on CSST simulation data. These results demonstrate a high level of accuracy (approximately 99.8%), indicating that this model can be used effectively for future observations with the CSST.
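One plausible reason polar unwrapping helps with rotated images is that a rotation about the image centre becomes a cyclic shift along the angle axis of the unwrapped image. The sketch below illustrates this on a toy 5 x 5 image. It assumes nearest-neighbour sampling and hypothetical names (`unwrap_polar`, `blank`); the paper's actual unwrapping procedure is not described in the abstract.

```python
import math


def unwrap_polar(image, n_radius, n_theta):
    """Resample a square image onto a (radius, angle) grid about its centre.

    Nearest-neighbour sampling keeps the sketch short; a real pipeline would
    more likely interpolate.
    """
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    r_max = min(cy, cx)
    out = []
    for i in range(n_radius):
        r = r_max * i / max(n_radius - 1, 1)
        row = []
        for k in range(n_theta):
            theta = 2 * math.pi * k / n_theta
            y = int(round(cy + r * math.sin(theta)))
            x = int(round(cx + r * math.cos(theta)))
            row.append(image[y][x])
        out.append(row)
    return out


def blank(size):
    """Return a size x size image filled with zeros."""
    return [[0] * size for _ in range(size)]


# A bright pixel to the right of centre, and the same image rotated by 90 degrees.
img_a = blank(5)
img_a[2][4] = 1
img_b = blank(5)
img_b[4][2] = 1
polar_a = unwrap_polar(img_a, n_radius=3, n_theta=4)
polar_b = unwrap_polar(img_b, n_radius=3, n_theta=4)
print(polar_a[2], polar_b[2])
```

In the unwrapped images, each row of `polar_b` is the corresponding row of `polar_a` shifted by one angle step, which a convolutional network handles more easily than an arbitrary rotation.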
Funding: Supported by the National Key R&D Program of China (grant No. 2022YFF0503800); the National Natural Science Foundation of China (NSFC) (grant No. 11427901); the Strategic Priority Research Program of the Chinese Academy of Sciences (CAS-SPP) (grant No. XDA15320102); and the Youth Innovation Promotion Association (CAS No. 2022057).
Abstract: The Solar Polar-orbit Observatory (SPO), proposed by Chinese scientists, is designed to observe the solar polar regions in an unprecedented way, with a spacecraft traveling in an orbit with a large solar inclination angle and a small ellipticity. However, one of the most significant challenges lies in ultra-long-distance data transmission, particularly for the Magnetic and Helioseismic Imager (MHI), which is the most important payload and generates the largest volume of data in SPO. In this paper, we propose a tailored lossless data compression method based on the measurement mode and characteristics of MHI data. The background outside the solar disk is removed to reduce the number of pixels in each image to be compressed. Multiple predictive coding methods are combined to eliminate redundancy by exploiting the correlations (in space, spectrum, and polarization) within the data set, improving the compression ratio. Experimental results demonstrate that our method achieves an average compression ratio of 3.67. The compression time is also shorter than the general observation period. The method exhibits strong feasibility and can be easily adapted to MHI.
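The idea behind lossless predictive coding can be illustrated with the simplest predictor, the previous pixel. The sketch below stores prediction residuals, checks that the original samples are recovered exactly, and computes a compression ratio. It is not the paper's method: it uses spatial correlation only, `zlib` stands in for whatever entropy coder the authors use, and the scan line is synthetic.

```python
import zlib
from array import array


def to_residuals(row):
    """Previous-pixel prediction: keep the first sample, then differences."""
    return [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]


def from_residuals(residuals):
    """Invert to_residuals exactly, so the scheme is lossless."""
    out = [residuals[0]]
    for d in residuals[1:]:
        out.append(out[-1] + d)
    return out


def compression_ratio(row):
    """Raw 16-bit size divided by the size of the zlib-coded residuals."""
    raw = array("H", row).tobytes()
    coded = zlib.compress(array("h", to_residuals(row)).tobytes())
    return len(raw) / len(coded)


# Hypothetical smooth scan line; real MHI data are not reproduced here.
row = [1000 + 3 * i for i in range(512)]
restored = from_residuals(to_residuals(row))
ratio = compression_ratio(row)
print(restored == row, ratio > 1)
```

Residuals of a well-predicted signal cluster near zero, so they code into fewer bits than the raw samples; better predictors (for example across wavelength or polarization states) would shrink them further.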
Abstract: Gravity, as a fundamental force, plays a dominant role in the formation and evolution of cosmic objects and leaves its imprint in the emergence of symmetric and asymmetric structures. Analyzing symmetry criteria therefore allows us to uncover the mechanisms behind the gravitational interaction and to understand the underlying physical processes that contribute to the formation of large-scale structures such as galaxies. We segment radio galaxy images using intensity thresholding and the k-means clustering algorithm. We then apply a symmetry criterion and explore the relation between morphological symmetry in radio maps and host galaxy properties. Optical properties (stellar mass, black hole mass, optical size (R_50), concentration, stellar mass surface density (μ_50), and stellar age) and radio properties (radio flux density, radio luminosity, and radio size) are considered. We find a correlation between symmetry and radio size, indicating that larger radio sources have smaller symmetry indices. The size of radio sources should therefore be considered in any investigation of symmetry. Weak correlations are also observed with other properties, such as R_50 for FRI galaxies and stellar age. We compare the symmetry of FRI and FRII radio galaxies. FRII galaxies show higher symmetry in 1.4 GHz and 150 MHz maps. Investigating the influence of radio source size, we find that this result is independent of the sizes of the radio sources. These findings contribute to our understanding of the morphological properties and analyses of radio galaxies.
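The segmentation step, intensity thresholding followed by k-means clustering, can be sketched on a flat list of pixel intensities. This is a minimal illustration under assumptions: the threshold value, the number of clusters, the deterministic initialization and the names `nearest` and `kmeans_1d` are all hypothetical, and the symmetry criterion itself is not implemented because the abstract does not define it.

```python
def nearest(value, centers):
    """Index of the cluster centre closest to value."""
    return min(range(len(centers)), key=lambda c: abs(value - centers[c]))


def kmeans_1d(values, k=2, n_iter=20):
    """Plain k-means on scalar intensities (k >= 2), with evenly spaced
    initial centres between the minimum and maximum value."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v, centers)].append(v)
        # Keep the old centre if a cluster happens to be empty.
        centers = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    labels = [nearest(v, centers) for v in values]
    return centers, labels


# Hypothetical pixel intensities: background, faint lobes, bright core.
pixels = [0.1, 0.2, 0.1, 1.0, 1.2, 0.9, 5.0, 5.5, 4.8]
threshold = 0.5  # hypothetical background cut
foreground = [p for p in pixels if p > threshold]
centers, labels = kmeans_1d(foreground, k=2)
print(centers, labels)
```

Thresholding first removes the background so that the clustering separates emission components from each other rather than from noise.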