Funding: This work was funded by the National Key R&D Program of China (No. 2021YFC3000702), the Special Fund of the Institute of Geophysics, China Earthquake Administration (No. DQJB20B15), the National Natural Science Foundation of China Youth Grant (No. 41804059), the Joint Funds of the National Natural Science Foundation of China (No. U223920029), and the Science for Earthquake Resilience of China Earthquake Administration (No. XH211103).
Abstract: High-quality datasets are critical for the development of advanced machine-learning algorithms in seismology. Here, we present an earthquake dataset based on the ChinArray Phase I records (X1). ChinArray Phase I was deployed in the southern north-south seismic zone (20°N-32°N, 95°E-110°E) in 2011-2013 using 355 portable broadband seismic stations. CREDIT-X1local, the first release of the ChinArray Reference Earthquake Dataset for Innovative Techniques (CREDIT), includes comprehensive information for the 105,455 local events that occurred in the southern north-south seismic zone during array observation, incorporating them into a single HDF5 file. Original 100-Hz sampled three-component waveforms are organized by event for stations within epicentral distances of 1,000 km, and records of ≥200 s are included for each waveform. Two types of phase labels are provided. The first includes manually picked labels for 5,999 events with magnitudes ≥2.0, providing 66,507 Pg, 42,310 Sg, 12,823 Pn, and 546 Sn phases. The second contains automatically labeled phases for 105,442 events with magnitudes from −1.6 to 7.6. These phases were picked using a recurrent neural network phase picker and screened using the corresponding travel-time curves, resulting in 1,179,808 Pg, 884,281 Sg, 176,089 Pn, and 22,986 Sn phases. Additionally, first-motion polarities are included for 31,273 Pg phases. The event and station locations are provided, so that deep-learning networks for both conventional phase picking and phase association can be trained and validated. The CREDIT-X1local dataset is the first million-scale dataset constructed from a dense seismic array, and it is designed to support various multi-station deep-learning methods, high-precision focal mechanism inversion, and seismic tomography studies. Additionally, owing to the high seismicity in the southern north-south seismic zone in China, this dataset has great potential for future scientific discoveries.
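The abstract states that the automatic picks were screened against travel-time curves but does not give the procedure. The following is only a minimal sketch of that general idea, assuming a straight-line curve t = distance / velocity and a fixed residual tolerance. The velocities, the tolerance, and the names `Pick`, `predicted_travel_time`, and `screen_picks` are illustrative assumptions, not values or code from the dataset.

```python
from dataclasses import dataclass

# Illustrative apparent velocities in km/s. These are assumptions for the
# sketch and are NOT taken from the CREDIT-X1local paper.
ASSUMED_VELOCITY_KM_S = {"Pg": 6.0, "Sg": 3.5}


@dataclass
class Pick:
    phase: str            # phase name, e.g. "Pg" or "Sg"
    distance_km: float    # epicentral distance of the station
    travel_time_s: float  # pick time minus event origin time


def predicted_travel_time(phase: str, distance_km: float) -> float:
    """Simplified straight-line travel-time curve: t = distance / velocity."""
    return distance_km / ASSUMED_VELOCITY_KM_S[phase]


def screen_picks(picks, tolerance_s=3.0):
    """Keep picks whose travel time lies within tolerance_s of the curve.

    Picks for phases without an assumed velocity are dropped.
    """
    kept = []
    for pick in picks:
        if pick.phase not in ASSUMED_VELOCITY_KM_S:
            continue
        residual = pick.travel_time_s - predicted_travel_time(
            pick.phase, pick.distance_km
        )
        if abs(residual) <= tolerance_s:
            kept.append(pick)
    return kept
```

For instance, a Pg pick at 60 km with a 10.5 s travel time lies 0.5 s from the assumed curve and is kept, whereas a 20 s travel time at the same distance is rejected. A real workflow would likely use curves fitted to the manual picks or computed from a regional velocity model.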
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 41804047 and 42111540260), the Fundamental Research Funds of the Institute of Geophysics, China Earthquake Administration (No. DQJB19A0114), and the Key Research Program of the Institute of Geology and Geophysics, Chinese Academy of Sciences (No. IGGCAS-201904).
Abstract: In recent years, artificial intelligence technology has exhibited great potential in seismic signal recognition, setting off a new wave of research. Vast amounts of high-quality labeled data are required to develop and apply artificial intelligence in seismology research. In this study, based on the 2013–2020 seismic cataloging reports of the China Earthquake Networks Center, we constructed an artificial intelligence seismological training dataset ("DiTing") with the largest known total time length. Data were recorded using broadband and short-period seismometers. The dataset includes 2,734,748 three-component waveform traces from 787,010 regional seismic events, the corresponding P- and S-phase arrival time labels, and 641,025 P-wave first-motion polarity labels. All waveforms were sampled at 50 Hz and cut to a length of 180 s, starting a random number of seconds before the occurrence of the earthquake. Each three-component waveform is accompanied by a considerable amount of descriptive information, such as the epicentral distance, back azimuth, and signal-to-noise ratios. The event magnitudes, epicentral distances, P-wave signal-to-noise ratios, and S-wave signal-to-noise ratios range from 0 to 7.7, 0 to 330 km, −0.05 to 5.31 dB, and −0.05 to 4.73 dB, respectively. The dataset can serve as a high-quality benchmark for machine-learning model development and data-driven seismological research on earthquake detection, seismic phase picking, first-motion polarity determination, earthquake magnitude prediction, early warning systems, and strong ground-motion prediction. Such research will further promote the development and application of artificial intelligence in seismology.
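The windowing described above (50 Hz sampling, 180 s length, a random start before the event) corresponds to 50 × 180 = 9,000 samples per component. The sketch below shows one way such a cut could be made on a single component. It is an illustration under stated assumptions, not the authors' code: the function name `cut_window`, the 30 s upper bound on the pre-event offset, and the use of whole seconds are all hypothetical.

```python
import random

SAMPLING_RATE_HZ = 50   # stated in the abstract
WINDOW_LENGTH_S = 180   # stated in the abstract
MAX_PRE_EVENT_S = 30    # hypothetical upper bound; not given in the abstract


def cut_window(trace, origin_index, rng, max_pre_event_s=MAX_PRE_EVENT_S):
    """Cut a 180 s window starting a random whole number of seconds
    before the sample at origin_index (the event origin time).

    Returns the window and the pre-event offset in seconds.
    """
    n_samples = SAMPLING_RATE_HZ * WINDOW_LENGTH_S  # 9,000 samples
    pre_event_s = rng.randint(0, max_pre_event_s)   # inclusive bounds
    start = origin_index - pre_event_s * SAMPLING_RATE_HZ
    if start < 0 or start + n_samples > len(trace):
        raise ValueError("trace is too short for the requested window")
    return trace[start:start + n_samples], pre_event_s
```

Passing a seeded `random.Random` instance as `rng` keeps the cuts reproducible. Randomizing the offset means the P arrival does not always fall at the same position in the window, which is commonly done so that a picker does not simply learn a fixed arrival position.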