Funding: Supported by the National Natural Science Foundation of China (grant number 72074214).
Abstract: Both computer science and archival science are concerned with archiving large-scale data, but they have different focuses. Large-scale data archiving in computer science focuses on technical aspects that can reduce the cost of data storage and improve the reliability and efficiency of Big Data management. Its weaknesses lie in inadequate and non-standardized management. Archiving in archival science focuses on the management aspects and neglects the necessary technical considerations, resulting in high storage and retention costs and a poor ability to manage Big Data. Therefore, integrating large-scale data archiving with archival theory can offset the research limitations of the two fields, and it suggests two research topics: archival management of Big Data and large-scale management of archived Big Data.
Abstract: Currently, China has 32 Earth observation satellites in orbit. The satellites provide various types of data, including optical, multispectral, infrared, and radar data. The spatial resolution of China's Earth observation satellites ranges from low to medium to high. The satellites can observe across multiple spectral bands, under all weather conditions, and at all times. Data from China's Earth observation satellites have been widely used in fields such as natural resource detection, environmental monitoring and protection, disaster prevention and reduction, urban planning and mapping, agricultural and forestry surveys, land survey and geological prospecting, and ocean forecasting, achieving huge social benefits. This article introduces the progress of Earth observation satellites in China since 2022, especially satellite operation, data archiving, data distribution, and data coverage.
Funding: Supported by the National Natural Science Foundation of China (No. 10475079).
Abstract: A new candidate data archive format for storing huge amounts of data for the EAST control and data acquisition system is presented. This general-purpose data archive format is network-transparent, i.e., machine-independent, and has been implemented in terms of XDR (eXternal Data Representation). We test this format by using the EFIT (Equilibrium Fitting) code on different operating systems, namely Linux and Windows; on different processors, namely Sun and PC; and in different programming languages, namely Fortran and C. The format can be used easily by different computers and different programming languages.
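As a rough illustration of what "machine-independent" means for an XDR-based format, the sketch below encodes a record by hand following the general XDR rules: big-endian byte order, 4-byte units, and length-prefixed strings padded to a multiple of four bytes. The record layout (a signal name, a shot number, and an array of samples) and all function names are hypothetical; the abstract does not describe the actual EAST archive layout.

```python
import struct


def pack_string(text: str) -> bytes:
    """XDR string: 4-byte big-endian length, the bytes, zero padding to a multiple of 4."""
    data = text.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def pack_record(name: str, shot: int, samples: list) -> bytes:
    """Encode a hypothetical record: name, shot number, variable-length array of doubles."""
    out = pack_string(name)
    out += struct.pack(">i", shot)            # XDR int: 4 bytes, big-endian
    out += struct.pack(">I", len(samples))    # array element count
    for value in samples:
        out += struct.pack(">d", value)       # XDR double: 8 bytes, IEEE 754, big-endian
    return out


def unpack_record(buf: bytes):
    """Decode the hypothetical record written by pack_record."""
    offset = 0
    (length,) = struct.unpack_from(">I", buf, offset)
    offset += 4
    name = buf[offset:offset + length].decode("ascii")
    offset += length + (4 - length % 4) % 4   # skip the padding bytes as well
    (shot,) = struct.unpack_from(">i", buf, offset)
    offset += 4
    (count,) = struct.unpack_from(">I", buf, offset)
    offset += 4
    samples = list(struct.unpack_from(f">{count}d", buf, offset))
    return name, shot, samples


encoded = pack_record("ip", 1234, [0.5, 1.5])
decoded = unpack_record(encoded)
```

Because every field has a fixed byte order and alignment, the same bytes decode identically on any processor or operating system, which is the property the abstract tests across Linux, Windows, Fortran, and C.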
Funding: Supported by the National Key Research and Development Program of China (2020SKA0110300); the Joint Research Fund in Astronomy (U1831204 and U1931141) under cooperative agreement between the National Natural Science Foundation of China (NSFC) and the Chinese Academy of Sciences (CAS); the NSFC (No. 11903009); the Funds for International Cooperation and Exchange of the NSFC (11961141001); the Yunnan Key Research and Development Program (2018IA054); the Key Science and Technology Program of Henan Province (Nos. 202102210152, 212102210611 and 202102210125); and the Research and Cultivation Fund Project of Anyang Normal University (AYNUKPY-2019-24 and AYNUKPY-2020-25). Also supported by the Astronomical Big Data Joint Research Center, co-founded by the National Astronomical Observatories, Chinese Academy of Sciences and Alibaba Cloud.
Abstract: Data archiving is one of the most critical issues for modern astronomical observations. With the development of a new generation of radio telescopes, the transfer and archiving of massive remote data have become urgent problems. Herein, we present a practical and robust file-level flow-control approach, called the Unlimited Sliding-Window (USW), modeled on the classic flow-control method in the TCP protocol. Based on the USW and the Next Generation Archive System (NGAS) developed for the Murchison Widefield Array telescope, we further implemented an enhanced archive system (ENGAS) using ZeroMQ middleware. The ENGAS substantially improves transfer performance and ensures the integrity of transferred files. In the tests, the ENGAS is approximately three to twelve times faster than the NGAS and can fully utilize the bandwidth of network links. Thus, for archiving radio observation data, the ENGAS reduces communication time, improves bandwidth utilization, and solves the problem of remote synchronous archiving of data from observatories such as the Mingantu Spectral Radioheliograph. It also provides a useful reference for the future construction of the Square Kilometer Array (SKA) Science Regional Center.
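To make the idea of file-level flow control concrete, here is a minimal in-memory simulation of a classic sliding window applied to whole files rather than packets. It is not the USW algorithm and does not use ZeroMQ or NGAS, since the abstract gives no details of those; the function names, the SHA-256 integrity check, and the resend-on-mismatch rule are all assumptions made for illustration.

```python
import hashlib
from collections import deque


def checksum(data: bytes) -> str:
    """Digest used by the (simulated) receiver to verify file integrity."""
    return hashlib.sha256(data).hexdigest()


def transfer(files: dict, window: int):
    """Send files while keeping at most `window` of them unacknowledged.

    Returns the receiver's archive and the peak number of files in flight.
    """
    pending = deque(files.items())   # files not yet sent
    in_flight = deque()              # files sent but not yet acknowledged
    archive = {}                     # what the receiver has stored
    peak = 0
    while pending or in_flight:
        # Sender: fill the window with new files.
        while pending and len(in_flight) < window:
            name, data = pending.popleft()
            in_flight.append((name, data, checksum(data)))
        peak = max(peak, len(in_flight))
        # Receiver: verify the oldest in-flight file and acknowledge it,
        # which slides the window forward by one file.
        name, data, digest = in_flight.popleft()
        if checksum(data) == digest:
            archive[name] = data
        else:
            pending.appendleft((name, data))  # integrity failure: queue a resend
    return archive, peak


files = {f"obs_{i}.dat": bytes([i]) * 10 for i in range(5)}
archive, peak = transfer(files, window=3)
```

In this simulation nothing corrupts the data, so the resend branch never fires; over a real network link it would be the path that preserves file integrity.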
Funding: Supported by the National Key Basic Research and Development (973) Program of China (Nos. 2012CB315801 and 2013CB228206); the National Natural Science Foundation of China A3 Program (No. 61140320); the National Natural Science Foundation of China (Nos. 61233016 and 61472200); the National Training Program of Innovation and Entrepreneurship for Undergraduates (Nos. 201410003033 and 201410003031); and Hitachi (China) Research and Development Corporation.
Abstract: With the growing popularity of Internet applications and the widespread use of mobile Internet, Internet traffic has maintained rapid growth over the past two decades. Internet Traffic Archival Systems (ITAS) for packets or flow records are increasingly used in network monitoring, network troubleshooting, and user behavior and experience analysis. Among the three key technologies in ITAS, we focus on bitmap index compression algorithms and give a detailed survey in this paper. The current state-of-the-art bitmap index encoding schemes include BBC, WAH, PLWAH, EWAH, PWAH, CONCISE, COMPAX, VLC, DF-WAH, and VAL-WAH. Based on differences in segmentation, chunking, merge compression, and Near Identical (NI) features, we provide a thorough categorization of the state-of-the-art bitmap index compression algorithms. We also propose some new bitmap index encoding algorithms, such as SECOMPAX, ICX, MASC, and PLWAH+, and present the state diagrams for their encoding algorithms. We then evaluate their CPU and GPU implementations with a real Internet trace from CAIDA. Finally, we summarize and discuss the future direction of bitmap index compression algorithms. Beyond applications in network security and network forensics, bitmap index compression, with its faster bitwise logical operations and reduced search space, is widely used in the analysis of genome data, geographical information systems, graph databases, image retrieval, the Internet of Things, etc. Bitmap index compression, which dates back to the 1980s, is expected to thrive again in the Big Data era.
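As a small illustration of the word-aligned idea behind WAH, one of the schemes listed above, the sketch below splits a bitmap into 31-bit groups and stores each group in a 32-bit word: a literal word (top bit 0) holds a mixed group, and a fill word (top bit 1) holds a run of all-zero or all-one groups together with a run counter. This is a simplified version written for readability, not the surveyed implementations; for instance, it stores even a single uniform group as a fill word, and the function names are hypothetical.

```python
GROUP_BITS = 31
ALL_ONES = (1 << GROUP_BITS) - 1
FILL_FLAG = 1 << 31           # top bit set: this word is a fill word
FILL_VALUE = 1 << 30          # second bit: the bit value being repeated
MAX_RUN = (1 << 30) - 1       # low 30 bits: number of 31-bit groups in the run


def wah_encode(bits: list) -> list:
    """Encode a list of 0/1 values into 32-bit WAH-style words."""
    words = []
    padded = bits + [0] * (-len(bits) % GROUP_BITS)  # pad the last group with zeros
    for start in range(0, len(padded), GROUP_BITS):
        group = 0
        for bit in padded[start:start + GROUP_BITS]:
            group = (group << 1) | bit
        if group in (0, ALL_ONES):
            fill_bit = FILL_VALUE if group else 0
            last = words[-1] if words else 0
            same_fill = (last & FILL_FLAG) and (last & FILL_VALUE) == fill_bit
            if same_fill and (last & MAX_RUN) < MAX_RUN:
                words[-1] = last + 1                 # extend the current run
            else:
                words.append(FILL_FLAG | fill_bit | 1)
        else:
            words.append(group)                      # literal word, top bit is 0
    return words


def wah_decode(words: list, length: int) -> list:
    """Expand WAH-style words back into the first `length` bits."""
    bits = []
    for word in words:
        if word & FILL_FLAG:
            bit = 1 if word & FILL_VALUE else 0
            bits.extend([bit] * (GROUP_BITS * (word & MAX_RUN)))
        else:
            bits.extend((word >> shift) & 1 for shift in range(GROUP_BITS - 1, -1, -1))
    return bits[:length]


bitmap = [0] * 62 + [1, 0, 1] + [0] * 28 + [1] * 62
words = wah_encode(bitmap)
```

Here 155 bits compress to three words. Because every word covers a whole number of 31-bit groups, bitwise AND and OR can run directly on the compressed form, which is the source of the fast logical operations the abstract mentions.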
Funding: Supported by the National Natural Science Foundation of China (Contract No. 11427904).
Abstract: Background: LEAF is a complex, integrated facility that includes several different subsystems. To enable remote control of field equipment and meet the requirements of beam commissioning, a LEAF control system has been designed. The control system includes the following subsystems: timing systems, data archiving systems, personnel safety systems, and machine protection systems. Methods: The control system for LEAF is developed using the EPICS software toolset and a distributed control architecture, organized in a three-layer structure. At the equipment layer, low-level equipment is controlled mainly by various industrial controllers, including programmable logic controllers, controllers for serial devices, and motion controllers based on the EtherCAT fieldbus. At the middle layer, Ethernet switches are used to implement a Gigabit local area network. At the operation layer, high-level application software has been developed for beam commissioning and operation of the accelerator. Results: The designed system provides remote monitoring and control of field devices, synchronous timing and machine protection for key equipment, and automatic archiving of historical data for on-site running equipment. Conclusion: The complete system has a clear structure, operates stably, and has been successfully applied to beam commissioning and operation of the LEAF facility.