This article discusses the current status and development strategies of computer science and technology in the context of big data. First, it explains the relationship between big data and computer science and technology, focusing on how computer science and technology are currently applied to big data, including data storage, data processing, and data analysis. It then proposes development strategies for big data processing. Computer science and technology play a vital role in big data processing by providing strong technical support.
A typical building project spends a long part of its life in the maintenance stage, and the cost of this stage is far higher than that of the planning, design, and construction phases. In the earlier planning or design phase, however, many project participants put little emphasis on maintenance information. As a result, important maintenance data is missing from, or erroneously fed back to, the 3D/BIM model. This research provides a generic process model for maintenance information management for building facilities. The authors identify the most frequently used information areas: checking information, material information, equipment information, supplier information, and maintenance history information. Each information area should be embedded in the BIM model so that it feeds effectively into the operation and maintenance stage of the project. The study therefore proposes a novel data format structure that effectively links the 3D/BIM object with the maintenance data. A demonstration project shows how the data format structure is used. The contribution of this study is step-by-step guidance for project practitioners in handling significant maintenance information in the earlier stage of a construction project.
Because of the speed gap between storage system access and processor computation, end-to-end data processing has become a bottleneck in improving the overall performance of computer systems over the Internet. Based on an analysis of data processing behavior, an adaptive cache organization scheme with fast address calculation is proposed. The scheme makes full use of the characteristics of stack-space data access, adopts a fast address calculation strategy, and reduces the hit time of stack accesses. Adaptively, the stack cache can be turned off from beginning to end when a stack overflow occurs, to avoid the effect of stack switching on processor performance. In addition, based on the instruction cache and the failure behavior of the data cache, a prefetching policy is developed and combined with the data capture of the failover queue state. Finally, the proposed method maintains the order of instruction and data accesses, which facilitates the extraction of prefetching in end-to-end data processing.
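For application scenarios involving large-scale data input and output, a multi-level parallel I/O scheme based on the hierarchical storage format HDF5 (Hierarchical Data Format 5) is proposed. The scheme has two layers, inter-node and intra-node: at the inter-node layer, data I/O is performed with the node as the unit, while the processes inside a node may work cooperatively or independently. Depending on the intra-node working mode, a multi-level parallel I/O algorithm and a multi-level sentinel parallel I/O algorithm are proposed, respectively, to improve I/O efficiency and avoid redundant output files. Considering two typical scenarios, heterogeneous computing and CPU-only computing, several groups of experiments with up to 4096 cores and up to 256G of data were carried out on the Sugon (曙光) platform and an Intel platform. The results show that the multi-level parallel I/O algorithm improved I/O efficiency by a factor of 1.97 to 25.87 and the multi-level sentinel parallel I/O algorithm by a factor of 6.53 to 9.36, while the number of output files was reduced to 1/4 and 1/32 of that of the multi-zone parallel I/O algorithm.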
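The inter-node layer described in this abstract, where the node rather than the process is the unit of I/O, can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes that ranks are placed on nodes contiguously, that each node produces one output file, and that the lowest rank on a node acts as its writer; none of these details is stated in the abstract. The function names and file names are hypothetical, and no real HDF5 or MPI call is made.

```python
from collections import defaultdict


def group_ranks_by_node(num_ranks, ranks_per_node):
    """Map each node index to the process ranks it hosts.

    Assumption: ranks are assigned to nodes contiguously.
    """
    nodes = defaultdict(list)
    for rank in range(num_ranks):
        nodes[rank // ranks_per_node].append(rank)
    return dict(nodes)


def node_level_write_plan(num_ranks, ranks_per_node):
    """Plan one output file per node instead of one per process.

    Assumption: the lowest rank on each node gathers the data of the
    other ranks on that node and performs the write.
    """
    plan = {}
    for node, ranks in group_ranks_by_node(num_ranks, ranks_per_node).items():
        filename = f"node_{node:04d}.h5"  # hypothetical file name
        plan[filename] = {"writer": ranks[0], "sources": ranks}
    return plan


# Toy configuration: 8 processes, 4 per node.
plan = node_level_write_plan(num_ranks=8, ranks_per_node=4)
for filename, entry in plan.items():
    print(filename, entry)
```

In this toy configuration, a one-file-per-process layout would produce 8 files, while the per-node plan produces 2, which shows why aggregating at node level reduces the number of output files.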
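Structured processing of journal articles has gradually become a consensus in the journal community. Journal platforms in China mostly adopt the newer Journal Article Tag Suite (JATS) standard, but JATS only gives recommended values for data attributes and leaves considerable room for custom extension, so actual processing results vary widely and data exchange is difficult. This paper reviews the history of digital processing and standards evolution in China and abroad, as well as the problems in XML structured data processing in China. It further analyzes the characteristics of different subsets such as the Archiving and Interchange Tag Set and the Publishing Tag Set, and proposes a data processing and storage solution that both preserves the original information of an article in full and makes it easy to extract structured information of all kinds. As needed, subtractive conversion can generate processing and storage formats that meet each platform's standard, truly achieving "process once, deliver and disseminate through multiple channels".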
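The growing volume of internal power grid geographic information system (GIS) data poses great difficulties for grid data storage performance. To address this, a distributed storage method for power grid GIS data based on random forest is proposed. Taking as its basis the power grid GIS data obtained from the power grid GIS spatial information service platform via cross-origin resource sharing (CORS), the method selects data features according to class discrimination values, applies the random forest algorithm to classify the data, distributes the classified data appropriately to different servers, and stores it using parallel processing, thereby achieving distributed storage of power grid GIS data. Experimental data show that with the proposed method the classification accuracy of power grid GIS data reached 96.8% and the minimum distributed storage time was 5.2 s, confirming that the proposed method offers better data storage performance.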
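In today's information age, data complexity keeps increasing, and traditional relational databases face challenges in large-scale data storage and processing. Non-relational databases (Not Only SQL, NoSQL), as a new approach to storing and processing data, have attracted wide attention and achieved notable results in the field of distributed storage. The article focuses on a distributed storage method for non-relational databases based on big data technology and evaluates it experimentally, finding that it has advantages in scalability and security and can serve as a reference for related research.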