As smart grid technology rapidly advances, the vast amount of user data collected by smart meters presents significant challenges in data security and privacy protection. Current research emphasizes data security and user privacy concerns within smart grids. However, existing methods struggle with efficiency and security when processing large-scale data. Balancing efficient data processing with stringent privacy protection during data aggregation in smart grids remains an urgent challenge. This paper proposes an AI-based multi-type data aggregation method designed to enhance aggregation efficiency and security by standardizing and normalizing various data modalities. The approach optimizes data preprocessing, integrates Long Short-Term Memory (LSTM) networks for handling time-series data, and employs homomorphic encryption to safeguard user privacy. It also explores the application of Boneh-Lynn-Shacham (BLS) signatures for user authentication. The proposed scheme's efficiency, security, and privacy protection capabilities are validated through rigorous security proofs and experimental analysis.
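The abstract names homomorphic encryption for privacy-preserving aggregation but gives no construction details. Below is a minimal sketch of additive homomorphic aggregation of meter readings, assuming the Paillier cryptosystem via the `phe` (python-paillier) package; the key size, readings, and aggregation flow are illustrative and are not the paper's scheme.

```python
from phe import paillier  # python-paillier: additively homomorphic encryption

# The utility holds the private key; the aggregator sees only ciphertexts.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# Hypothetical per-household consumption readings (kWh).
readings = [3.2, 1.7, 4.5, 2.9]

# Each meter encrypts locally; the aggregator sums without decrypting.
ciphertexts = [public_key.encrypt(r) for r in readings]
encrypted_total = sum(ciphertexts[1:], ciphertexts[0])

# Only the key holder recovers the aggregate, never individual readings.
print(private_key.decrypt(encrypted_total))  # ~12.3
```

Additive homomorphism is what makes the aggregation private here: the sum is computed entirely in the ciphertext domain.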
Many fields, such as neuroscience, are experiencing a vast proliferation of cellular data, underscoring the need for organizing and interpreting large datasets. A popular approach partitions data into manageable subsets via hierarchical clustering, but objective methods to determine the appropriate classification granularity are missing. We recently introduced a technique to systematically identify when to stop subdividing clusters, based on the fundamental principle that cells must differ more between than within clusters. Here we present the corresponding protocol to classify cellular datasets by combining data-driven unsupervised hierarchical clustering with statistical testing. These general-purpose functions are applicable to any cellular dataset that can be organized as two-dimensional matrices of numerical values, including molecular, physiological, and anatomical datasets. We demonstrate the protocol using cellular data from the Janelia MouseLight project to characterize morphological aspects of neurons.
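The protocol's core idea, that a split is kept only when cells differ more between than within clusters, can be illustrated with a small sketch. This is not the published functions, just an assumed realization using SciPy's hierarchical clustering and a Mann-Whitney test on pairwise distances.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cut_tree
from scipy.stats import mannwhitneyu

def split_is_supported(X, labels, alpha=0.05):
    """Keep a 2-way split only if between-cluster pairwise distances
    significantly exceed within-cluster ones (illustrative test)."""
    a, b = X[labels == 0], X[labels == 1]
    within = [np.linalg.norm(p - q) for grp in (a, b)
              for i, p in enumerate(grp) for q in grp[i + 1:]]
    between = [np.linalg.norm(p - q) for p in a for q in b]
    _, p_value = mannwhitneyu(between, within, alternative="greater")
    return p_value < alpha

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 5)), rng.normal(4, 1, (30, 5))])
labels = cut_tree(linkage(X, method="ward"), n_clusters=2).ravel()
print(split_is_supported(X, labels))  # True for well-separated groups
```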
BACKGROUND Dyslipidemia is strongly linked to stroke; however, the relationship between dyslipidemia and its components and ischemic stroke remains unexplained. AIM To investigate the link between longitudinal changes in lipid profiles and dyslipidemia and ischemic stroke in a hypertensive population. METHODS Between 2013 and 2014, 6094 individuals with hypertension were included in this study, and ischemic stroke cases were documented through the end of 2018. Longitudinal changes in lipids were stratified into four groups: (1) normal transformed into normal; (2) abnormal transformed into normal; (3) normal transformed into abnormal; and (4) abnormal transformed into abnormal. To examine the link between longitudinal changes in dyslipidemia and its components and the risk of ischemic stroke, we used multivariate Cox proportional hazards models with hazard ratios (HR) and 95%CIs. RESULTS The average age of the participants was 62.32 ± 13.00 years, with 3291 women making up 54.0% of the sample. Over a mean follow-up of 4.8 years, 143 ischemic strokes occurred. With the normal-to-normal group as the reference, after full adjustment, the HRs for ischemic stroke in the abnormal-to-normal, normal-to-abnormal, and abnormal-to-abnormal groups were 1.089 (95%CI: 0.598-1.982; P = 0.779), 2.369 (95%CI: 1.424-3.941; P < 0.001), and 1.448 (95%CI: 1.002-2.298; P = 0.047), respectively (P for trend = 0.233). CONCLUSION In individuals with hypertension, longitudinal shifts from normal to abnormal in dyslipidemia, particularly in total and low-density lipoprotein cholesterol, were significantly associated with the risk of ischemic stroke.
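A minimal sketch of the kind of Cox proportional hazards fit reported above, assuming the `lifelines` package; the synthetic rows, column names, and dummy-coded lipid-change groups (reference: normal to normal) are hypothetical stand-ins for the study data.

```python
import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical cohort: follow-up time (years), stroke indicator, and
# dummy-coded lipid-change groups; the reference group is normal-to-normal.
df = pd.DataFrame({
    "time":   [4.8, 3.2, 5.0, 2.1, 4.5, 1.9, 4.0, 3.7],
    "stroke": [0, 1, 0, 1, 0, 1, 0, 1],
    "abnormal_to_normal":   [0, 1, 0, 0, 1, 0, 0, 0],
    "normal_to_abnormal":   [0, 0, 1, 1, 0, 0, 0, 0],
    "abnormal_to_abnormal": [0, 0, 0, 0, 0, 1, 1, 0],
    "age":    [62, 70, 58, 66, 61, 73, 68, 59],
})

cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="stroke")
print(cph.summary[["exp(coef)", "p"]])  # exp(coef) is the hazard ratio
```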
Open networks and heterogeneous services in the Internet of Vehicles (IoV) can lead to security and privacy challenges. One key requirement for such systems is the preservation of user privacy, ensuring a seamless experience in driving, navigation, and communication. These privacy needs are influenced by various factors, such as data collected at different intervals, trip durations, and user interactions. To address this, the paper proposes a Support Vector Machine (SVM) model designed to process large amounts of aggregated data and recommend privacy-preserving measures. The model analyzes data based on user demands and interactions with service providers or neighboring infrastructure. It aims to minimize privacy risks while ensuring service continuity and sustainability. The SVM model helps validate the system's reliability by creating a hyperplane that distinguishes between maximum and minimum privacy recommendations. The results demonstrate the effectiveness of the proposed SVM model in enhancing both privacy and service performance.
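A minimal sketch of a linear SVM producing the kind of maximum/minimum privacy recommendation described, assuming scikit-learn; the features, labels, and example trip are hypothetical.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical features per trip: sampling interval (s), trip duration (min),
# interaction count. Label 1 = recommend maximum privacy, 0 = minimum.
X = np.array([[5, 40, 12], [60, 10, 2], [10, 55, 9],
              [120, 8, 1], [15, 35, 14], [90, 12, 3]])
y = np.array([1, 0, 1, 0, 1, 0])

# A linear kernel keeps the decision boundary an explicit hyperplane.
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
clf.fit(X, y)
print(clf.predict([[20, 45, 10]]))  # [1] -> recommend maximum privacy
```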
Iced transmission line galloping poses a significant threat to the safety and reliability of power systems, leading directly to line tripping, disconnections, and power outages. Existing early warning methods for iced transmission line galloping suffer from issues such as reliance on a single data source, neglect of irregular time series, and lack of attention-based closed-loop feedback, resulting in high rates of missed and false alarms. To address these challenges, we propose an Internet of Things (IoT)-empowered early warning method for transmission line galloping that integrates time-series data from optical fiber sensing and weather forecasts. Initially, the method applies a primary adaptive weighted fusion to the IoT-empowered optical fiber real-time sensing data and weather forecast data, followed by a secondary fusion based on a Back Propagation (BP) neural network, and uses the K-medoids algorithm to cluster the fused data. Furthermore, an adaptive irregular time series perception adjustment module is introduced into the traditional Gated Recurrent Unit (GRU) network, and closed-loop feedback based on an attention mechanism is employed to update network parameters through gradient feedback of the loss function, enabling closed-loop training and time-series prediction with the GRU network model. Subsequently, considering the various types of prediction data and the duration of icing, an iced transmission line galloping risk coefficient is established, and warnings are categorized based on this coefficient. Finally, using an IoT-driven realistic dataset of iced transmission line galloping, the effectiveness of the proposed method is validated through multi-dimensional simulation scenarios.
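The primary fusion step is described only as adaptive weighted fusion. One common realization is inverse-variance weighting, sketched below in NumPy; the noise levels, toy signals, and variable names are assumptions rather than the paper's formulation.

```python
import numpy as np

def adaptive_weighted_fusion(sensing, forecast, var_s, var_f):
    """Inverse-variance weighted fusion of two aligned series: the source
    with the smaller error variance receives the larger weight."""
    w_s = (1.0 / var_s) / (1.0 / var_s + 1.0 / var_f)
    return w_s * sensing + (1.0 - w_s) * forecast

rng = np.random.default_rng(1)
truth = np.sin(np.linspace(0, 6, 200))
sensing = truth + rng.normal(0.0, 0.05, 200)   # low-noise fiber sensing
forecast = truth + rng.normal(0.0, 0.20, 200)  # noisier weather forecast

# Error variances would be estimated online; assumed known here.
fused = adaptive_weighted_fusion(sensing, forecast, 0.05**2, 0.20**2)
print(np.std(fused - truth) < np.std(forecast - truth))  # True
```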
This study examines the effectiveness of artificial intelligence techniques in generating high-quality environmental data for species introduction site selection systems. A network framework model (SAE-GAN) combining Strengths, Weaknesses, Opportunities, Threats (SWOT) analysis data with a Variational Autoencoder (VAE) and a Generative Adversarial Network (GAN) is proposed for environmental data reconstruction. The model combines two popular generative models, the GAN and the VAE, to generate features conditional on categorical data embeddings derived from the SWOT analysis. The model is capable of generating features that resemble real feature distributions and of adding sample factors to more accurately track individual sample data. Reconstructed data are used to retain more semantic information when generating features. The model was applied to species in Southern California, USA, using SWOT analysis data to train it. Experiments show that the model can integrate data from more comprehensive analyses than traditional methods and generate high-quality reconstructed data from them, effectively solving the problem of insufficient data collection in development environments. The model is further validated by the Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) classification assessment commonly used in the environmental data domain. This study provides a reliable and rich source of training data for species introduction site selection systems and makes a significant contribution to ecological and sustainable development.
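The SAE-GAN architecture is not specified in the abstract; the sketch below is a generic conditional VAE-GAN loss in PyTorch, shown only to make the VAE-plus-GAN combination concrete. All dimensions, layer sizes, and the conditioning scheme are assumptions, and the discriminator's own training step is omitted.

```python
import torch
import torch.nn as nn

FEAT, COND, LATENT = 16, 4, 8   # assumed feature/condition/latent sizes

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(FEAT + COND, 32), nn.ReLU())
        self.mu = nn.Linear(32, LATENT)
        self.logvar = nn.Linear(32, LATENT)
    def forward(self, x, c):
        h = self.net(torch.cat([x, c], dim=1))
        return self.mu(h), self.logvar(h)

decoder = nn.Sequential(nn.Linear(LATENT + COND, 32), nn.ReLU(),
                        nn.Linear(32, FEAT))       # generator
critic = nn.Sequential(nn.Linear(FEAT + COND, 32), nn.ReLU(),
                       nn.Linear(32, 1))           # discriminator logit

def generator_loss(x, c, encoder, bce=nn.BCEWithLogitsLoss()):
    mu, logvar = encoder(x, c)
    z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
    x_hat = decoder(torch.cat([z, c], dim=1))
    recon = nn.functional.mse_loss(x_hat, x)                 # VAE reconstruction
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    # Adversarial term: make the critic label reconstructions as real.
    adv = bce(critic(torch.cat([x_hat, c], dim=1)), torch.ones(len(x), 1))
    return recon + kl + adv

x = torch.randn(64, FEAT)  # toy environmental feature batch
c = nn.functional.one_hot(torch.randint(0, COND, (64,)), COND).float()
print(generator_loss(x, c, Encoder()))
```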
Urban railways are a vital means of public transportation in Korea. More than 30% of metropolitan residents use the railways, and this proportion is expected to increase. To enhance safety, the government has mandated the installation of closed-circuit televisions in all carriages by 2024. However, the cameras still depend on human monitoring. To address this limitation, we developed a dataset of risk factors and a smart detection system that enables an immediate response to any abnormal behavior and intensive monitoring thereof. We created an innovative learning dataset that takes into account seven unique risk factors specific to Korean railway passengers. Detailed data collection was conducted across the Shinbundang Line of the Incheon Transportation Corporation and the Ui-Shinseol Line. We observed several behavioral characteristics and assigned unique annotations to them. We also considered carriage congestion. Recognition performance was evaluated by camera placement and number, and the camera installation plan was then optimized. The dataset will find immediate applications in domestic railway operations. The artificial intelligence algorithms will be verified shortly.
A remarkable marine heatwave, known as the “Blob”, occurred in the Northeast Pacific Ocean from late 2013 to early 2016, displaying strong warm anomalies extending from the surface to a depth of 300 m. This study employed two assimilation schemes based on the global Climate Forecast System of Nanjing University of Information Science and Technology (NUIST-CFS 1.0) to investigate the impact of ocean data assimilation on the seasonal prediction of this extreme marine heatwave. The sea surface temperature (SST) nudging scheme assimilates SST only, while the deterministic ensemble Kalman filter (EnKF) scheme assimilates observations from the surface to the deep ocean. The latter notably improves the forecasting skill for subsurface temperature anomalies, especially at depths of 100-300 m (the lower layer), outperforming the SST nudging scheme. It excels in predicting both horizontal and vertical heat transport in the lower layer, contributing to improved forecasts of the lower-layer warming during the Blob. These improvements stem from the assimilation of subsurface observational data, which are important in predicting upper-ocean conditions. The results suggest that assimilating ocean data with the EnKF scheme significantly enhances the accuracy of predicting subsurface temperature anomalies during the Blob and offers a better understanding of its underlying mechanisms.
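The EnKF analysis step can be written directly from ensemble statistics. The NumPy sketch below implements a generic stochastic EnKF update for illustration; the paper uses a deterministic EnKF variant, and the toy state, observation operator, and error covariances here are assumptions.

```python
import numpy as np

def enkf_analysis(E, H, y, R, rng):
    """One stochastic EnKF analysis step on ensemble E (n_state x n_members)."""
    n = E.shape[1]
    A = E - E.mean(axis=1, keepdims=True)        # state anomalies
    HE = H @ E
    HA = HE - HE.mean(axis=1, keepdims=True)     # observation-space anomalies
    Pyy = HA @ HA.T / (n - 1) + R                # innovation covariance
    Pxy = A @ HA.T / (n - 1)                     # state-obs cross covariance
    K = Pxy @ np.linalg.inv(Pyy)                 # Kalman gain
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, n).T
    return E + K @ (Y - HE)                      # updated ensemble

rng = np.random.default_rng(0)
E = rng.normal(15.0, 1.0, (3, 20))   # toy 3-level ocean temperature ensemble
H = np.array([[1.0, 0.0, 0.0]])      # observe the surface level only
E_new = enkf_analysis(E, H, np.array([16.0]), np.eye(1) * 0.25, rng)
print(E_new.mean(axis=1))            # ensemble mean pulled toward the obs
```

Assimilating observations below the surface enlarges H to more rows, which is precisely what distinguishes the EnKF scheme from SST-only nudging in the study.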
The deployment of the Internet of Things (IoT) with smart sensors has facilitated the emergence of fog computing as an important technology for delivering services to smart environments such as campuses, smart cities, and smart transportation systems. Fog computing tackles a range of challenges, including processing, storage, bandwidth, latency, and reliability, by locally distributing secure information through end nodes. Consisting of endpoints, fog nodes, and back-end cloud infrastructure, it provides advanced capabilities beyond traditional cloud computing. In smart environments, particularly within smart city transportation systems, the abundance of devices and nodes poses significant challenges related to power consumption and system reliability. To address the challenges of latency, energy consumption, and fault tolerance in these environments, this paper proposes a latency-aware, fault-tolerant framework for resource scheduling and data management, referred to as the FORD framework, for smart cities in fog environments. This framework is designed to meet the demands of time-sensitive applications, such as those in smart transportation systems. The FORD framework incorporates latency-aware resource scheduling to optimize task execution in smart city environments, leveraging resources from both fog and cloud environments. Through simulation-based executions, tasks are allocated to the nearest available nodes with minimum latency. In the event of execution failure, a fault-tolerant mechanism is employed to ensure the successful completion of tasks. Upon successful execution, data are efficiently stored in the cloud data center, ensuring data integrity and reliability within the smart city ecosystem.
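The scheduling behavior described, allocate to the nearest available node with minimum latency and fail over on execution failure, can be sketched in a few lines. The node names, latencies, and failure model below are hypothetical, not part of the FORD framework.

```python
import random
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    latency_ms: float
    available: bool = True

def schedule_with_failover(nodes, execute):
    """Try nodes in order of increasing latency; fail over on error."""
    for node in sorted(nodes, key=lambda n: n.latency_ms):
        if not node.available:
            continue
        try:
            return node.name, execute(node)
        except RuntimeError:      # execution failure: try the next node
            continue
    raise RuntimeError("task failed on all nodes")

def flaky_execute(node):
    if random.random() < 0.3:     # hypothetical 30% failure rate
        raise RuntimeError(f"{node.name} failed")
    return f"result from {node.name}"

nodes = [Node("fog-1", 4.0), Node("fog-2", 7.5), Node("cloud", 45.0)]
print(schedule_with_failover(nodes, flaky_execute))
```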
There is a growing body of clinical research on the utility of synthetic data derivatives, an emerging research tool in medicine. In nephrology, clinicians can use machine learning and artificial intelligence as powerful aids in their clinical decision-making while also preserving patient privacy. This is especially important given the epidemiology of chronic kidney disease, renal oncology, and hypertension worldwide. However, there remains a need to create a framework for guidance regarding how to better utilize synthetic data as a practical application in this research.
In this article, a partially linear single-index model for longitudinal data is investigated. Generalized penalized spline least squares estimates of the unknown parameters are suggested. All parameters can be estimated simultaneously by the proposed method while the features of longitudinal data are taken into account. The existence, strong consistency, and asymptotic normality of the estimators are proved under suitable conditions. A simulation study is conducted to investigate the finite-sample performance of the proposed method. Our approach can also be used to study the pure single-index model for longitudinal data.
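The abstract does not display the model; for reference, a standard form of the partially linear single-index model for longitudinal data, stated here as an assumption, is

```latex
y_{ij} = x_{ij}^{\top}\beta + g\!\left(z_{ij}^{\top}\alpha\right) + \varepsilon_{ij},
\qquad i = 1,\dots,n, \quad j = 1,\dots,m_i,
```

where beta is the linear-part coefficient vector, g(·) is an unknown univariate link function approximated by penalized splines, and the index vector alpha is normalized (||alpha|| = 1) for identifiability.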
In this paper, two tests for varying dispersion of binomial data are discussed in the framework of nonlinear logistic models with random effects, which are widely used in analyzing longitudinal binomial data. The first is an individual test, with power calculation, for varying dispersion through testing the randomness of cluster effects; it extends Dean (1992) and Commenges et al. (1994). The second is a composite test for varying dispersion through simultaneously testing the randomness of cluster effects and the equality of random-effect means. The score test statistics are constructed and expressed in simple, easy-to-use matrix formulas. The authors illustrate their test methods using the insecticide data (Giltinan, Capizzi & Malani, 1988).
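The abstract gives no formulas; a generic version of the model and hypotheses involved, written here as an assumption, is

```latex
\operatorname{logit} p_{ij} = f(x_{ij}, \beta) + b_i, \qquad b_i \sim N(\mu_i, \sigma^2),
```

so the individual test for varying dispersion is a score test of sigma^2 = 0 (no random cluster effect), while the composite test simultaneously tests sigma^2 = 0 and the equality of the random-effect means mu_1 = ... = mu_n.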
The parameter estimation and the coefficient of contamination for regression models with repeated measures are studied when the response variables are contaminated by another random variable sequence. Under suitable conditions, it is proved that the estimators established in the paper are strongly consistent.
In longitudinal data analysis, our primary interest is in the estimation of regression parameters for the marginal expectations of the longitudinal responses; the longitudinal correlation parameters are of secondary interest. The joint likelihood function for longitudinal data is challenging, particularly due to correlated responses. Marginal models, such as generalized estimating equations (GEEs), have received much attention based on assumptions about the first two moments of the data and a working correlation structure. Confidence regions and hypothesis tests are constructed based on asymptotic normality. This approach is sensitive to misspecification of the variance function and the working correlation structure, which may yield inefficient and inconsistent estimates, leading to wrong conclusions. To overcome this problem, we propose an empirical likelihood (EL) procedure based on a set of estimating equations for the parameter of interest and discuss its characteristics and asymptotic properties. We also provide an algorithm based on EL principles for the estimation of the regression parameters and the construction of its confidence region. We have applied the proposed method in two case examples.
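For reference, the GEEs referred to here solve the standard estimating equation

```latex
\sum_{i=1}^{n} D_i^{\top} V_i^{-1}\left(y_i - \mu_i(\beta)\right) = 0,
```

where D_i is the derivative of the mean mu_i with respect to beta and V_i is the working covariance built from the assumed variance function and working correlation. The proposed EL procedure calibrates confidence regions from the empirical likelihood ratio of these same estimating functions instead of relying on asymptotic normality.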
Local arterials can be significantly impacted by diversions from adjacent work zones. These diversions often occur on unofficial detour routes due to guidance received on personal navigation devices. Often, these routes do not have sufficient sensing or communication equipment to obtain infrastructure-based traffic signal performance measures, so other data sources are required to identify locations being significantly affected by diversions. This paper examines the network impact caused by the start of an 18-month closure of the I-65/70 interchange (North Split), which usually serves approximately 214,000 vehicles per day in Indianapolis, IN. In anticipation of some proportion of the public diverting from official detour routes to local streets, a connected vehicle monitoring program was established to provide daily performance measures for over 100 intersections in the area without the need for vehicle sensing equipment. This study reports on 13 of the most impacted signals on an alternative arterial to identify locations and times of day where operations are most degraded, so that decision makers have quantitative information to make informed adjustments to the system. Individual vehicle movements at the studied locations are analyzed to estimate changes in volume, split failures, downstream blockage, arrivals on green, and travel times. Over 130,000 trajectories were analyzed over an 11-week period. Weekly afternoon peak period volumes increased by approximately 455%, split failures increased 3%, downstream blockage increased 10%, arrivals on green decreased 16%, and travel time increased 74%. The analysis performed in this paper will serve as a framework for any agency that wants to assess traffic signal performance at hundreds of locations with little or no existing sensing or communication infrastructure, in order to prioritize tactical retiming and/or longer-term infrastructure investments.
In this paper, we reparameterize covariance structures in longitudinal data analysis through the modified Cholesky decomposition of the covariance matrix itself. Based on this decomposition, the within-subject covariance matrix is factored into a unit lower triangular matrix involving moving average coefficients and a diagonal matrix involving innovation variances, which are modeled as linear functions of covariates. We then propose a penalized maximum likelihood method for variable selection in joint mean and covariance models based on this decomposition. Under certain regularity conditions, we establish the consistency and asymptotic normality of the penalized maximum likelihood estimators of the parameters in the models. Simulation studies are undertaken to assess the finite-sample performance of the proposed variable selection procedure.
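A quick numerical check of the decomposition the paper builds on: the within-subject covariance Sigma factors as Sigma = L D L', with L unit lower triangular (moving-average coefficients) and D diagonal (innovation variances). The AR(1)-style toy matrix below is purely illustrative.

```python
import numpy as np

def modified_cholesky(sigma):
    """Factor sigma = L @ D @ L.T, L unit lower triangular, D diagonal."""
    C = np.linalg.cholesky(sigma)   # sigma = C @ C.T, C lower triangular
    d = np.diag(C)
    L = C / d                       # scale each column by its pivot
    D = np.diag(d ** 2)             # innovation variances
    return L, D

# Toy within-subject covariance for a subject with 4 visits.
t = np.arange(4)
sigma = 0.8 ** np.abs(t[:, None] - t[None, :])
L, D = modified_cholesky(sigma)
print(np.allclose(L @ D @ L.T, sigma))  # True
print(np.diag(D))                       # innovation variances
```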
Logic regression is an adaptive regression method that searches for Boolean (logic) combinations of binary variables that best explain the variability in the outcome, thus revealing interaction effects associated with the response. In this study, we extended logic regression to longitudinal data with a binary response and proposed the Transition Logic Regression Method to find interactions related to the response. In this method, interaction effects over time are found by an annealing algorithm with AIC (Akaike Information Criterion) as the score function of the model. First- and second-order Markov dependence is allowed to capture the correlation among successive observations of the same individual in the longitudinal binary response. Performance of the method was evaluated in a simulation study under various conditions. The proposed method was used to find interactions of SNPs and other risk factors related to low HDL over time in data from 329 participants of the longitudinal TLGS study.
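As an illustration of annealing with AIC as the score function, the sketch below searches Boolean conjunctions of binary predictors for a cross-sectional logistic model using `statsmodels`; it omits the Markov transition structure of the actual method, and all data and names are hypothetical.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.integers(0, 2, (500, 6)).astype(float)   # six binary predictors
y = rng.binomial(1, np.where(X[:, 0] * X[:, 2] == 1, 0.8, 0.2))

def aic_of(term):
    """AIC of a logistic fit using one Boolean conjunction as the predictor."""
    feature = X[:, sorted(term)].prod(axis=1)    # AND of selected variables
    return sm.Logit(y, sm.add_constant(feature)).fit(disp=0).aic

term, score = {0}, aic_of({0})
for step in range(300):                          # simulated annealing
    temperature = 0.98 ** step
    candidate = term ^ {int(rng.integers(6))}    # toggle one variable
    if not candidate:
        continue
    cand_score = aic_of(candidate)
    accept = (cand_score < score or
              rng.random() < np.exp((score - cand_score) / temperature))
    if accept:
        term, score = candidate, cand_score
print(sorted(term), round(score, 1))             # typically recovers [0, 2]
```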
For regression models with longitudinal data, we combine robust estimating equations with the elemental empirical likelihood method and propose an efficient robust estimator, where the robust estimating equation is based on a bounded score function and a covariate-dependent weight function. This method reduces the influence of outliers in the response variables and covariates on parameter estimation, takes into account the correlation within the data, and improves the efficiency of estimation. Simulation results show that the proposed method is robust and efficient.
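The bounded score function is not specified in the abstract; the Huber function is a common choice, sketched below to show how bounding caps an outlier's contribution to the estimating equation. The tuning constant 1.345 is the standard 95%-efficiency value for Gaussian errors.

```python
import numpy as np

def huber_psi(r, c=1.345):
    """Huber score: identity for small residuals, clipped at +/- c for
    large ones, so any single outlier has bounded influence."""
    return np.clip(r, -c, c)

residuals = np.array([0.3, -0.8, 1.1, 12.0])   # last value is an outlier
print(huber_psi(residuals))  # [ 0.3   -0.8    1.1    1.345]
```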
High-dimensional longitudinal data arise frequently in biomedical and genomic research. It is important to select relevant covariates when the dimension of the parameters diverges as the sample size increases. We consider the problem of variable selection in high-dimensional linear models with longitudinal data. A new variable selection procedure is proposed using the smooth-threshold generalized estimating equations and quadratic inference functions (SGEE-QIF) to incorporate correlation information. The proposed procedure automatically eliminates inactive predictors by setting the corresponding parameters to zero and simultaneously estimates the nonzero regression coefficients by solving the SGEE-QIF. The proposed procedure avoids the convex optimization problem and is flexible and easy to implement. We establish the asymptotic properties in a high-dimensional framework where the number of covariates increases as the number of clusters increases. Extensive Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed variable selection procedure.
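The abstract gives no formulas; smooth-threshold estimating equations in the style of Ueki (2009), which this line of work builds on, take roughly the componentwise form

```latex
(1 - \hat{\delta}_j)\, U_j(\beta) - \hat{\delta}_j\, \beta_j = 0,
\qquad \hat{\delta}_j = \min\!\left(1, \; \lambda / |\tilde{\beta}_j|^{1+\tau}\right),
```

so that when the threshold delta_j reaches 1 the j-th equation collapses to beta_j = 0, deleting the predictor without a separate penalized optimization step. This is stated as background, not as the paper's exact SGEE-QIF equations.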