Current methods for predicting missing values in datasets often rely on simplistic approaches, such as imputing the median value of an attribute, which limits their applicability. Real-world observations can be diverse: taking stock prices as an example, they range from prices just after an IPO to values just before a company's collapse, and some data points are missing altogether due to stock suspension. In this paper, we propose a novel approach using Nonlinear Inductive Matrix Completion (NIMC) and Deep Inductive Matrix Completion (DIMC) to predict associations, and we conduct experiments on financial data relating dates to stocks. Our method leverages various types of stock observations to capture the latent factors that explain the observed date-stock associations. Notably, our approach is nonlinear, making it suitable for datasets with nonlinear structure, such as the Russell 3000. Unlike traditional methods that may suffer from information loss, NIMC and DIMC retain nearly complete information, especially in high-dimensional parameter settings. We compare the state-of-the-art linear method, Inductive Matrix Completion, against Nonlinear Inductive Matrix Completion and Deep Inductive Matrix Completion. Our findings show that nonlinear matrix completion is particularly effective for handling nonlinearly structured data, as exemplified by the Russell 3000. Additionally, we evaluate the information loss of the three methods across different dimensionalities.
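To make the idea concrete, the following is a minimal sketch of nonlinear inductive matrix completion, not the authors' exact model: side features for each date and each stock are passed through small nonlinear networks, and the inner products of the resulting latent factors reconstruct the observed entries of the date-stock matrix. The architecture, dimensions, and synthetic data are illustrative assumptions.

```python
# A minimal NIMC-style sketch in PyTorch; all dimensions and data are synthetic.
import torch
import torch.nn as nn

class NIMC(nn.Module):
    def __init__(self, date_dim, stock_dim, latent_dim=16):
        super().__init__()
        # Nonlinear feature maps replace the linear projections of plain IMC.
        self.f = nn.Sequential(nn.Linear(date_dim, 64), nn.ReLU(),
                               nn.Linear(64, latent_dim))
        self.g = nn.Sequential(nn.Linear(stock_dim, 64), nn.ReLU(),
                               nn.Linear(64, latent_dim))

    def forward(self, x_dates, y_stocks):
        # Predicted matrix: inner products of the learned latent factors.
        return self.f(x_dates) @ self.g(y_stocks).T

# Toy data: 200 dates with 10 features, 50 stocks with 8 features.
torch.manual_seed(0)
x, y = torch.randn(200, 10), torch.randn(50, 8)
target = torch.randn(200, 50)
mask = torch.rand(200, 50) > 0.3            # observed (non-missing) entries

model = NIMC(date_dim=10, stock_dim=8)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    pred = model(x, y)
    loss = ((pred - target)[mask] ** 2).mean()  # fit observed entries only
    loss.backward()
    opt.step()
```

A deep variant (DIMC) would follow the same pattern with deeper feature maps; the key point is that predictions for unseen rows or columns come from their side features rather than from memorized factors.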
In the field of Additive Manufacturing, current research on data processing mainly focuses on slicing large STL files or complicated CAD models. A parallel algorithm has clear advantages for improving efficiency and reducing slicing time, but traditional algorithms cannot make full use of multi-core CPU hardware resources. In this paper, a fast parallel algorithm is presented to speed up data processing. The parallel algorithm is designed around a pipeline mode, and its complexity is analyzed theoretically. To evaluate the performance of the new algorithm, the effects of thread count and layer count are investigated in a series of experiments. The experimental results show that the number of threads and the number of layers are two significant factors in the speedup ratio. Speedup versus thread count shows a positive relationship that agrees well with Amdahl's law, and speedup versus layer count likewise shows a positive relationship that agrees with Gustafson's law. The new algorithm uses topological information to compute contours in parallel. A second parallel algorithm, based on data parallelism, is used in the experiments to show that the pipeline parallel mode is more efficient. A concluding case study shows the outstanding performance of the new parallel algorithm. Compared with the serial slicing algorithm, the new pipeline parallel algorithm makes full use of multi-core CPU hardware and accelerates the slicing process; compared with the data-parallel slicing algorithm, the pipeline parallel model achieves a much higher speedup ratio and efficiency.
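The following is a minimal sketch of the pipeline idea, not the paper's algorithm: one stage computes plane-triangle intersection segments layer by layer while a second stage consumes them concurrently through a bounded queue, so successive layers overlap in time. Stage names, the toy facet, and the layer heights are illustrative assumptions, and a real implementation would exploit topological adjacency to chain segments into closed contours.

```python
# A minimal pipeline-parallel slicing sketch; data and stages are illustrative.
import threading, queue

def intersect(tri, z):
    """Return the segment where triangle `tri` crosses the plane z = const."""
    pts = []
    for (p, q) in ((tri[0], tri[1]), (tri[1], tri[2]), (tri[2], tri[0])):
        if (p[2] - z) * (q[2] - z) < 0:           # edge crosses the plane
            t = (z - p[2]) / (q[2] - p[2])
            pts.append(tuple(p[i] + t * (q[i] - p[i]) for i in range(2)))
    return tuple(pts) if len(pts) == 2 else None

def stage_slice(triangles, layers, out_q):
    # Stage 1: compute raw intersection segments layer by layer.
    for z in layers:
        segs = [s for tri in triangles if (s := intersect(tri, z))]
        out_q.put((z, segs))
    out_q.put(None)                               # end-of-stream marker

def stage_contour(in_q):
    # Stage 2: runs while stage 1 keeps producing (that is the pipelining);
    # a full slicer would chain segments into closed contours here.
    while (item := in_q.get()) is not None:
        z, segs = item
        print(f"layer z={z:.2f}: {len(segs)} segments")

tris = [((0, 0, 0), (1, 0, 1), (0, 1, 1))]        # one toy facet
q = queue.Queue(maxsize=8)                        # bounded buffer between stages
t1 = threading.Thread(target=stage_slice, args=(tris, [0.25, 0.5, 0.75], q))
t2 = threading.Thread(target=stage_contour, args=(q,))
t1.start(); t2.start(); t1.join(); t2.join()
```

The bounded queue is what gives the pipeline its throughput characteristics: a fast producer blocks when the buffer fills, so the two stages proceed at the pace of the slower one rather than serially.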
A profound understanding of the costs of performing condition assessment on buried drinking water pipeline infrastructure is required for enhanced asset management. Toward this end, an automated and uniform method of collecting cost data can give water utilities a means of viewing, understanding, interpreting and visualizing complex geographically referenced cost information to reveal data relationships, patterns and trends. However, there has been no standard data model that allows automated data collection and interoperability across platforms. The primary objective of this research is to develop a standard cost data model for drinking water pipeline condition assessment projects and to conflate disparate datasets from differing utilities. The capabilities of this model are further demonstrated by performing trend analyses. Field mapping files are generated from the standard data model and demonstrated in an interactive web map, created with the Google Maps JavaScript API, that allows the user to toggle project examples and perform regional comparisons. The aggregation of standardized data, and its further use in mapping applications, will provide timely access to condition assessment cost information and resources, leading to enhanced asset management and resource allocation for drinking water utilities.
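As a rough illustration of what such a standardized, geographically referenced cost record might contain, here is a minimal sketch; every field name is hypothetical, not the schema developed in the paper.

```python
# A hypothetical standardized cost record for pipeline condition assessment;
# field names are illustrative assumptions, not the paper's data model.
from dataclasses import dataclass, asdict
import json

@dataclass
class CostRecord:
    utility_id: str          # reporting water utility
    project_id: str
    method: str              # e.g. "acoustic", "CCTV", "electromagnetic"
    pipe_material: str       # e.g. "cast iron", "PVC"
    pipe_diameter_mm: float
    length_assessed_m: float
    total_cost_usd: float
    latitude: float          # geographic reference for web mapping
    longitude: float

    @property
    def unit_cost_usd_per_m(self) -> float:
        # Normalized unit cost enables comparison across utilities and regions.
        return self.total_cost_usd / self.length_assessed_m

rec = CostRecord("UTIL-A", "P-001", "acoustic", "cast iron",
                 300.0, 1500.0, 45000.0, 37.23, -80.41)
print(rec.unit_cost_usd_per_m)      # 30.0 USD per metre
print(json.dumps(asdict(rec)))      # interoperable export, e.g. to a web map layer
```

A uniform record like this is what makes conflation across utilities and automated export to mapping front ends straightforward: every dataset serializes to the same fields regardless of its source platform.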
Current research on Digital Twin (DT) is largely focused on the performance of built assets in their operational phase and on the urban environment; the construction phase has received far less attention. This paper therefore proposes a Digital Twin framework for the construction phase, develops a DT prototype, and tests it on the use case of measuring productivity and monitoring earthwork operations. The DT framework and its prototype are underpinned by the principles of versatility, scalability, usability and automation, enabling the DT to meet the requirements of large earthwork projects and the dynamic nature of their operation. Cloud computing and dashboard visualisation were deployed to enable automated, repeatable data pipelines and data analytics at scale, and to provide insights in near-real time. Testing the DT prototype on a motorway project in the Northeast of England successfully demonstrated its ability to produce key insights: (i) predicting equipment utilisation ratios and productivities; (ii) detecting the percentage of time spent on different tasks (i.e., loading, hauling, dumping, returning or idling), the distance travelled by equipment over time, and the speed distribution; and (iii) visualising certain earthwork operations.
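To illustrate the kind of analytics behind insights (i) and (ii), here is a minimal sketch that derives utilisation ratios and task-time percentages from telemetry already classified into task intervals; the log below is fabricated for illustration and is not output from the prototype.

```python
# A minimal sketch of utilisation analytics; the telemetry log is fabricated.
from collections import defaultdict

# (equipment_id, task, duration_minutes) for one shift
log = [
    ("EXC-01", "loading", 95), ("EXC-01", "idling", 40),
    ("TRK-07", "hauling", 120), ("TRK-07", "dumping", 15),
    ("TRK-07", "returning", 90), ("TRK-07", "idling", 35),
]

totals = defaultdict(lambda: defaultdict(float))
for eq, task, minutes in log:
    totals[eq][task] += minutes

for eq, tasks in totals.items():
    shift = sum(tasks.values())
    productive = shift - tasks.get("idling", 0.0)   # non-idle time
    print(f"{eq}: utilisation {productive / shift:.0%}, "
          + ", ".join(f"{t} {m / shift:.0%}" for t, m in tasks.items()))
```

In a cloud-hosted pipeline of the kind the paper describes, a computation like this would run repeatedly over incoming telemetry and feed a dashboard, which is what makes the insights available in near-real time.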
Funding: This work was supported by UKRI Innovate UK, United Kingdom (project number 44584, project title: "AI-driven and real-time command and control centre for site equipment in infrastructure projects"), under the "Increase productivity, performance and quality in UK construction" competition.