Funding: Supported by the Ministry of Science and Technology, Taiwan (MOST110-2622-E-390-001 and MOST109-2622-E-390-002-CC3).
Abstract: Big data analytics in business intelligence often lacks effective data retrieval methods and job scheduling, which causes inefficient execution and low system throughput. This paper aims to enhance data retrieval and job scheduling in order to speed up big data analytics and overcome these problems. First, a stacked sparse autoencoder is integrated with Elasticsearch indexing to provide fast data searching and distributed indexing, which reduces the search scope of the database and dramatically speeds up data searching. Next, a deep neural network predicts the approximate execution time of each job, enabling prioritized job scheduling based on shortest job first, which reduces the average waiting time of job execution. As a result, the proposed data retrieval approach outperforms the previous method using a deep autoencoder and Solr indexing, improving the speed of data retrieval by up to 53% and increasing system throughput by 53%. In addition, the proposed job scheduling algorithm outperforms both the first-in-first-out and the memory-sensitive heterogeneous early finish time scheduling algorithms, shortening the average waiting time by up to 5% and the average weighted turnaround time by 19%, respectively.
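The effect of shortest-job-first ordering on average waiting time can be illustrated with a minimal Python sketch. The job names and durations are invented for illustration, and the `predict` function is a hypothetical stand-in for the deep neural network's execution-time estimate; it is not the authors' implementation.

```python
def average_waiting_time(durations):
    """Average time a job waits before starting when jobs run back to back."""
    waited, clock = 0.0, 0.0
    for d in durations:
        waited += clock   # this job waited for everything scheduled before it
        clock += d
    return waited / len(durations)


def shortest_job_first(jobs, predict):
    """Order jobs by predicted execution time, shortest first."""
    return sorted(jobs, key=predict)


# Hypothetical jobs as (name, true duration in seconds), in arrival order.
jobs = [("etl", 8.0), ("report", 2.0), ("index", 4.0)]

# Assumption: a perfect predictor. A trained model would return an estimate.
predict = lambda job: job[1]

fifo_wait = average_waiting_time([d for _, d in jobs])
sjf_wait = average_waiting_time([d for _, d in shortest_job_first(jobs, predict)])
print(fifo_wait, sjf_wait)
```

In this toy case the first-in-first-out order makes the two short jobs wait behind the long one, while the shortest-job-first order lets them finish early, so the average waiting time drops.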
Funding: Supported by the National Key R&D Program of China (2020YFB0905900).
Abstract: Operation control of power systems has become challenging as the scale and complexity of power distribution systems increase and renewable energy is connected extensively. Improved capabilities for data-driven operation management, intelligent analysis, and data mining are therefore urgently required. To explore similar regularities in the historical operating sections of the power distribution system, and to help the power grid obtain high-value historical operation and maintenance experience and knowledge in a systematic manner, a neural information retrieval model with an attention mechanism is proposed based on graph data computing technology. A technical framework for neural information retrieval is established following the processing flow of the operating data of the power distribution system. Exploiting the natural graph characteristics of the power distribution system, a unified graph data structure and a data fusion method covering data access, data completion, and multi-source data are constructed. Further, a graph node feature-embedding representation learning algorithm and a neural information retrieval algorithm model are constructed. The neural information retrieval model is trained and tested using the generated set of graph node feature representation vectors, and it is verified on the operating sections of the power distribution system of a provincial grid area. The results show that the proposed method achieves high accuracy in the similarity matching of historical operation characteristics and effectively supports intelligent fault diagnosis and elimination in power distribution systems.
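Similarity matching over feature representation vectors can be sketched as a nearest-neighbour lookup. The Python sketch below is a deliberate simplification: it uses plain cosine similarity in place of the attention-based neural retrieval model described in the abstract, and the section names and three-dimensional vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query, history, k=1):
    """Return the names of the k historical sections closest to the query."""
    ranked = sorted(history.items(),
                    key=lambda item: cosine(query, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical embeddings of historical operating sections.
history = {
    "section_a": [1.0, 0.0, 0.0],
    "section_b": [0.0, 1.0, 0.0],
    "section_c": [0.7, 0.7, 0.0],
}

top = most_similar([0.9, 0.1, 0.0], history, k=2)
print(top)
```

A learned model would replace `cosine` with a trained scoring function, but the retrieval step (score every historical section, keep the best matches) has the same shape.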
Funding: The National Natural Science Foundation of China under contract Nos. 41330960, 41276193, and 41206184.
Abstract: In recent years, the rapid decline of Arctic sea ice area (SIA) and sea ice extent (SIE), especially for multiyear (MY) ice, has had a significant effect on climate change. Accurate retrieval of MY ice concentration is both important and challenging for understanding the ongoing changes. Three MY ice concentration retrieval algorithms were systematically evaluated. These algorithms yielded similar total ice concentrations, while the retrieved MY sea ice concentrations differ from each other. The MY SIA derived from the NASA TEAM algorithm is relatively stable. The other two algorithms produced seasonal fluctuations of MY SIA, particularly in autumn and winter. In this paper, we propose an ice concentration retrieval algorithm that extends the NASA TEAM algorithm by additionally using AMSR-E 6.9 GHz brightness temperature data and sea ice concentration derived from 89.0 GHz data. Comparison with the reference MY SIA from reference MY ice indicates that the mean difference and root mean square (rms) difference of the MY SIA derived from the algorithm of this study are 0.65×10^6 km^2 and 0.69×10^6 km^2 during January to March, and -0.06×10^6 km^2 and 0.14×10^6 km^2 during September to December, respectively. Comparison with the MY SIE obtained from weekly ice age data provided by the University of Colorado shows that the mean difference and rms difference are 0.69×10^6 km^2 and 0.84×10^6 km^2, respectively. The algorithm proposed in this study differs less from the reference MY ice and from the MY SIE derived from ice age data than Wang's, Lomax's, and the NASA TEAM algorithms.
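The two comparison statistics used above, mean difference and rms difference, can be written out as a short Python sketch. The sample values are invented for illustration (in units of 10^6 km^2) and are not data from the study.

```python
import math


def mean_difference(retrieved, reference):
    """Mean of (retrieved - reference); the sign shows over- or under-estimation."""
    diffs = [r - f for r, f in zip(retrieved, reference)]
    return sum(diffs) / len(diffs)


def rms_difference(retrieved, reference):
    """Root mean square of (retrieved - reference); always non-negative."""
    diffs = [r - f for r, f in zip(retrieved, reference)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical MY sea ice areas, in units of 10^6 km^2.
retrieved = [3.0, 2.0, 4.0]
reference = [2.0, 2.0, 2.0]

bias = mean_difference(retrieved, reference)
rms = rms_difference(retrieved, reference)
print(bias, rms)
```

The mean difference can be small or negative while the rms difference stays larger, because positive and negative errors cancel in the mean but not in the rms.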
Funding: This research is supported by the National High Technology Development Project (863) of China (Grant No. 2002AA639500), the Natural Science Foundation of Guangdong Province (Grant No. 032212), the National Basic Research Program of China (973 Program) (No. 2005CB422301), and the Program for New Century Excellent Talents in University (NCET-05-0591).
Abstract: Based on atmospheric horizontal visibility data from forty-seven observational stations along the eastern coast of China near the Taiwan Strait and simultaneous NOAA/AVHRR multichannel satellite data from January 2001 to December 2002, the spectral characteristics associated with visibility were investigated. Visibility was successfully retrieved from multichannel NOAA/AVHRR data using the principal component regression (PCR) method. A sample retrieved visibility distribution is discussed for a sea fog event. The correlation coefficient between the observed and retrieved visibility was about 0.82, which is significant well beyond the 99.9% confidence level according to a statistical test. The rate of successful retrieval is 94.98% of the 458 cases during 2001-2002. The error distribution showed that high visibilities were usually underestimated and low visibilities overestimated, and the relative error between the observed and retrieved visibilities was about 21.4%.
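Principal component regression projects correlated predictors onto their leading principal components and regresses the target on those scores. The Python sketch below is a simplified illustration that keeps only the first component and finds it by power iteration; the two "channel" values and the visibility figures are invented, and the study's actual channels, number of components, and preprocessing are not given in the abstract.

```python
import math


def first_principal_axis(centred, iters=100):
    """Leading eigenvector of the scatter matrix of centred rows (power iteration)."""
    p = len(centred[0])
    scatter = [[sum(r[i] * r[j] for r in centred) for j in range(p)]
               for i in range(p)]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(scatter[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def fit_pcr(X, y):
    """Fit a one-component PCR model and return a prediction function."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    centred = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    axis = first_principal_axis(centred)
    scores = [sum(c * a for c, a in zip(row, axis)) for row in centred]
    # Ordinary least squares of the centred target on the component scores.
    slope = (sum(t * (yi - y_mean) for t, yi in zip(scores, y))
             / sum(t * t for t in scores))

    def predict(x):
        score = sum((xj - mj) * a for xj, mj, a in zip(x, x_mean, axis))
        return y_mean + slope * score

    return predict


# Hypothetical, perfectly collinear channel values and visibilities.
channels = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
visibility = [4.0, 7.0, 10.0, 13.0]

predict = fit_pcr(channels, visibility)
estimate = predict((5.0, 10.0))
print(estimate)
```

Because satellite channels are often strongly correlated, regressing on a few principal components avoids the unstable coefficients that ordinary regression on the raw channels can produce.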
Funding: Supported by the Special Scientific Research Fund of Meteorological Public Welfare Profession of China (GYHY201406028), the Meteorological Open Research Fund for Huaihe River Basin (HRM201407), and the Anhui Meteorological Bureau Science and Technology Development Fund (RC201506).
Abstract: This paper uses satellite channel brightness temperature simulation to study M-estimator variational retrieval. The approach combines the advantages of classical variational inversion and robust M-estimators. Classical variational inversion depends on prior quality control to eliminate outliers, and it assumes that errors follow a Gaussian distribution. We couple M-estimators to the framework of classical variational inversion to obtain an M-estimator variational inversion. The cost function contains the M-estimator, which guarantees robustness to outliers and improves the retrieval results. The experimental evaluation uses Feng Yun-3A (FY-3A) simulated data with added Gaussian and non-Gaussian errors. The variational inversion is used to obtain the inverted brightness temperature, and temperature and humidity data are used for validation. The preliminary results demonstrate the potential of M-estimator variational retrieval.
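How an M-estimator makes a cost function robust to outliers can be shown on a one-dimensional toy problem. The Python sketch below assumes the Huber function as the M-estimator (the abstract does not say which one is used) and estimates a single value from observations by iteratively reweighted averaging; the observations and the threshold `delta` are hypothetical.

```python
def huber_cost(residual, delta=1.0):
    """Huber cost: quadratic near zero, linear for large residuals."""
    a = abs(residual)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)


def huber_location(observations, delta=1.0, iters=50):
    """Minimise the summed Huber cost by iteratively reweighted averaging."""
    x = sum(observations) / len(observations)  # start from the Gaussian estimate
    for _ in range(iters):
        # Observations beyond delta get a weight below 1, limiting their pull.
        weights = [1.0 if abs(o - x) <= delta else delta / abs(o - x)
                   for o in observations]
        x = sum(w * o for w, o in zip(weights, observations)) / sum(weights)
    return x


# Hypothetical brightness-like observations with one gross outlier.
observations = [10.0, 10.2, 9.8, 10.1, 50.0]

gaussian_estimate = sum(observations) / len(observations)
robust_estimate = huber_location(observations)
print(gaussian_estimate, robust_estimate)
```

The quadratic (Gaussian) cost lets the single outlier drag the estimate far from the bulk of the data, whereas the Huber cost bounds the outlier's influence, so no prior quality control step is needed to remove it.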
Funding: Supported by the National Key R&D Plan of China (2016YFB1001501).
Abstract: We develop a data-driven method (a probability model) to construct a composite shape descriptor by combining a pair of scale-based shape descriptors. The selection of the pair is modeled as computing the probability of the union of two events, where each event is retrieving similar shapes using a single scale-based shape descriptor. The pair of scale-based shape descriptors with the highest probability forms the composite shape descriptor. Given a shape database, the composite shape descriptors of the shapes constitute a planar point set. A VoR-Tree of the planar point set is then used as an indexing structure for efficient query operations. Experiments and comparisons show the effectiveness and efficiency of the proposed composite shape descriptor.
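The pair selection can be illustrated with the inclusion-exclusion rule P(A ∪ B) = P(A) + P(B) − P(A ∩ B). The Python sketch below assumes that each descriptor's event is estimated empirically as the set of training queries it answers correctly; the descriptor names, query indices, and this estimation procedure are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def union_probability(hits_a, hits_b, total):
    """P(A or B) by inclusion-exclusion over empirical success sets."""
    p_a = len(hits_a) / total
    p_b = len(hits_b) / total
    p_both = len(hits_a & hits_b) / total
    return p_a + p_b - p_both


def best_pair(hits, total):
    """Pick the descriptor pair whose union of successes is most probable."""
    return max(combinations(sorted(hits), 2),
               key=lambda pair: union_probability(hits[pair[0]],
                                                  hits[pair[1]], total))


# Hypothetical: indices of the 8 training queries each descriptor answers correctly.
hits = {
    "scale_1": {0, 1, 2, 3},
    "scale_2": {0, 1, 4},
    "scale_3": {5, 6},
}
total_queries = 8

pair = best_pair(hits, total_queries)
probability = union_probability(hits[pair[0]], hits[pair[1]], total_queries)
print(pair, probability)
```

The example shows why the best pair is not simply the two individually strongest descriptors: `scale_3` succeeds on few queries, but they are queries the others miss, so it adds the most to the union.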
Funding: Supported by the Korea Health Technology Research and Development Project through the Korea Health Industry Development Institute under Grant No. HI22C1651, the National Research Foundation of Korea (NRF) under Grant No. 2021R1F1A1059554, and the Culture, Sports and Tourism Research and Development Program through the Korea Creative Content Agency Grant funded by the Ministry of Culture, Sports and Tourism of Korea under Grant No. RS-2023-00227648.
Abstract: Direct volume rendering (DVR) is a technique that visually emphasizes structures of interest (SOIs) within a volume while simultaneously depicting adjacent regional information, e.g., the spatial location of a structure relative to its neighbors. In DVR, the transfer function (TF) plays a key role by enabling accurate, interactive identification of SOIs and ensuring their appropriate visibility. TF generation typically involves non-intuitive trial-and-error optimization of rendering parameters, which is time-consuming and inefficient. Attempts to mitigate this manual process have led to approaches that use a knowledge database of TFs pre-designed by domain experts. In these approaches, a user navigates the knowledge database to find the most suitable pre-designed TF for visualizing the SOIs in their input volume. Although these approaches potentially reduce the workload of generating TFs, they still require manual navigation of the knowledge database, and the selected TF is likely to need fine-tuning to suit the input. In this work, we propose a TF design approach, CBR-TF, in which we introduce a new content-based retrieval (CBR) method to navigate the knowledge database automatically. Instead of pre-designed TFs, our knowledge database contains volumes with SOI labels. Given an input volume, our CBR-TF approach retrieves relevant volumes (with SOI labels) from the knowledge database; the retrieved labels are then used to generate and optimize TFs for the input. This approach largely reduces manual TF navigation and fine-tuning. For our CBR-TF approach, we introduce a novel volumetric image feature that includes both a local primitive intensity profile along the SOIs and regional spatial semantics available from the images co-planar to the profile. For the regional spatial semantics, we adopt a convolutional neural network to obtain high-level image feature representations. For the intensity profile, we extend the dynamic time warping technique to address subtle alignment differences between similar profiles (SOIs). Finally, we propose a two-stage CBR scheme that uses these two different feature representations in a complementary manner, thereby improving SOI retrieval performance. We demonstrate the capabilities of our CBR-TF approach through comparison with a conventional visualization approach that uses an intensity profile matching algorithm, and through potential use cases in medical volume visualization.
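Dynamic time warping tolerates small alignment shifts between profiles by allowing samples of one sequence to match several samples of the other. The Python sketch below is the classic textbook algorithm, not the extended variant proposed in the paper, and the intensity profiles are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # a[i-1] repeats a match
                                    cost[i][j - 1],      # b[j-1] repeats a match
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# Hypothetical intensity profiles along a structure of interest.
profile = [0.0, 1.0, 2.0, 1.0, 0.0]
shifted = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]   # same shape, slightly misaligned
different = [0.0, 2.0, 4.0, 2.0, 0.0]      # different intensities

same_shape = dtw_distance(profile, shifted)
other_shape = dtw_distance(profile, different)
print(same_shape, other_shape)
```

A sample-by-sample comparison would penalize the shifted profile even though it has the same shape; the warping path absorbs the shift, so only genuine intensity differences contribute to the distance.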