Funding: Supported by the National Natural Science Foundation of China (61890930-5, 61533002, 61603012), the Major Science and Technology Program for Water Pollution Control and Treatment of China (2018ZX07111005), the National Key Research and Development Project (2018YFC1900800-5), and the Beijing Municipal Education Commission Foundation (KM201710005025).
Abstract: The radial basis function neural network (RBFNN) is an effective algorithm for nonlinear system identification, but properly adjusting its structure and parameters is challenging. To solve this problem, this paper proposes a distance concentration immune algorithm (DCIA) that self-organizes the structure and parameters of the RBFNN. First, the distance concentration algorithm, which increases the diversity of antibodies, is used to find the global optimal solution. Second, the information processing strength (IPS) algorithm is used to avoid the instability caused by randomly splitting or deleting hidden-layer neurons. Furthermore, to improve the forecasting accuracy and reduce the computation time, the sample that most frequently produces the maximum error is used to set the parameters of the new neuron. In addition, a convergence proof for the self-organizing RBF neural network based on the distance concentration immune algorithm (DCIA-SORBFNN) is given to guarantee the feasibility of the algorithm. Finally, several nonlinear functions are used to validate the effectiveness of the algorithm. Experimental results show that the proposed DCIA-SORBFNN achieves better nonlinear approximation ability than relevant state-of-the-art competitors.
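As a point of reference for the network being tuned, the sketch below computes the output of a small RBF network. The abstract does not state the basis function, so the Gaussian kernel, the single output, and all names (`gaussian_rbf`, `rbf_forward`, the toy centers, widths, and weights) are assumptions made for illustration. This is not the DCIA-SORBFNN method itself.

```python
import math


def gaussian_rbf(x, center, width):
    """Gaussian activation exp(-||x - c||^2 / (2 * width^2)) of one hidden neuron."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def rbf_forward(x, centers, widths, weights):
    """Single-output RBF network: weighted sum of the hidden activations."""
    return sum(
        w * gaussian_rbf(x, c, s)
        for c, s, w in zip(centers, widths, weights)
    )


# Toy network with two hidden neurons (values are illustrative only).
centers = [[0.0, 0.0], [1.0, 1.0]]
widths = [1.0, 1.0]
weights = [2.0, -1.0]
y = rbf_forward([0.0, 0.0], centers, widths, weights)
```

A self-organizing scheme such as the one described would add or remove entries in `centers`, `widths`, and `weights` during training. Here they are fixed.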
Funding: Funded by the State Grid Anhui Electric Power Co., Ltd. Science and Technology Project (52120021N00L) and the National Key Research and Development Program of China (2022YFB2400015).
Abstract: As an effective approach to achieving the "dual-carbon" goal, the grid-connected capacity of renewable energy is increasing constantly. Photovoltaics are the most widely used renewable energy source and have been applied in many settings. However, the inherent randomness, intermittency, and weak support of grid-connected equipment not only change the original flow characteristics of the grid but also result in complex fault characteristics. Traditional overcurrent and differential protection methods cannot respond accurately because of the effects of unknown renewable energy sources. Therefore, this paper proposes a longitudinal protection method based on virtual measurement of current restraint. The positive sequence current data and the network parameters are used to calculate the virtual measurement current, which compensates for the output current of the photovoltaic (PV) source. The waveform difference between the virtual measurement current and the terminal current for internal and external faults is used to construct the protection method. An improved edit distance algorithm is proposed to measure the similarity between the virtual measurement current and the terminal measurement current. Finally, the feasibility of the protection method is verified through PSCAD simulation.
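The abstract does not describe how its edit distance is improved, so the following is only a minimal sketch of the classic Levenshtein edit distance that such methods start from, applied to sampled waveforms after quantization. The helper `quantize`, its `step` parameter, and the sample values are hypothetical.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance between two sequences (unit costs)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x
                curr[j - 1] + 1,     # insert y
                prev[j - 1] + cost,  # substitute x with y (or match)
            ))
        prev = curr
    return prev[-1]


def quantize(samples, step):
    """Map real-valued samples to integer levels so they can be compared as symbols."""
    return [round(s / step) for s in samples]


# Hypothetical current samples from two measurement points.
virtual = quantize([0.0, 1.0, 2.1, 1.0, 0.0], step=1.0)
terminal = quantize([0.0, 1.0, 2.0, 0.9, 0.1], step=1.0)
distance = edit_distance(virtual, terminal)
```

A small distance means the two waveforms are similar after quantization. A protection criterion would compare the distance with a threshold, which is outside the scope of this sketch.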
Funding: Supported by the National Natural Science Foundation of China (Nos. 61174180, U1433125), the Jiangsu Province Science Foundation (No. BK20141413), and the Chinese Postdoctoral Science Foundation (No. 2014M550291).
Abstract: A high-precision nominal flight profile that reflects controllers' intentions is critical for 4D trajectory estimation in modern automatic air traffic control systems. We proposed a novel method to improve the accuracy of the nominal flight profile, including the nominal altitude profile and the speed profile. First, considering the characteristics of trajectory data, we developed an improved K-means algorithm. The approach measured the similarity between different altitude profiles by integrating the space warp edit distance algorithm, thereby yielding several fitted nominal flight altitude profiles. This approach overcomes the constraints of traditional K-means algorithms. Second, to eliminate the influence of meteorological factors, we introduced historical gridded binary data to determine the en-route wind speed and temperature via inverse distance weighted interpolation. Finally, we used the true airspeed, determined by speed triangle relationships, and the calibrated airspeed, determined by an aircraft data model, to extract a more accurate nominal speed profile from each cluster, so that the airspeed profiles above and below the airspeed transition altitude could be described separately. Our experimental results showed that the proposed method could obtain a highly accurate nominal flight profile that reflects the actual aircraft flight status.
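The inverse distance weighted interpolation mentioned above can be sketched as follows. The power parameter of 2, the two-dimensional grid coordinates, and the wind-speed values are assumptions for illustration. The abstract does not state which settings the authors used.

```python
import math


def idw_interpolate(query, points, values, power=2.0):
    """Inverse distance weighted estimate at `query` from known (point, value) pairs."""
    weighted_sum = 0.0
    weight_total = 0.0
    for point, value in zip(points, values):
        d = math.dist(query, point)
        if d == 0.0:
            return value  # query coincides with a known grid point
        w = 1.0 / d ** power  # closer points get larger weights
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total


# Hypothetical grid points and wind speeds (m/s).
grid_points = [(0.0, 0.0), (2.0, 0.0)]
wind_speeds = [10.0, 20.0]
estimate = idw_interpolate((1.0, 0.0), grid_points, wind_speeds)
```

In this example the query lies midway between the two grid points, so both receive equal weight and the estimate is their mean.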
Funding: Supported by the Shanghai Education Committee (No. 06KZ016).
Abstract: The massive volume of web-based information resources has led to an increasing demand for effective automatic retrieval of target information for web applications. This paper introduces a web-based data extraction tool that deploys various algorithms to locate, extract, and filter tabular data from HTML pages and to transform them into new web-based representations. The tool has been applied in an aquaculture web application platform for extracting and generating aquatic product market information. Results show that the tool is very effective in extracting the required data from web pages.
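The abstract does not name the algorithms the tool uses. As a minimal sketch of the general task of pulling tabular data out of an HTML page, the example below uses Python's standard `html.parser` module. The class `TableExtractor`, the function `extract_tables`, and the sample page are hypothetical, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> as a list of rows, each row a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(html):
    """Parse an HTML string and return all tables found in it."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


page = (
    "<html><body><table>"
    "<tr><th>Product</th><th>Price</th></tr>"
    "<tr><td>Carp</td><td> 12.5 </td></tr>"
    "</table></body></html>"
)
tables = extract_tables(page)
```

Filtering and re-rendering the extracted rows, as the tool is described as doing, would be separate steps applied to `tables`.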
Abstract: With the progress of plant genome research, more than 50 plant metallothionein-like (MT-L) genes have been found, but only a few MT-L proteins have been detected, and no experimental structural information for MT-L proteins has been reported so far. Since detailed knowledge of a protein's tertiary structure is required to understand its biological function, a method is needed to determine the structure of these proteins. In this study, the structural data of known mammalian MT were used to determine the interatomic distance constraints of the CXC and CXXC motifs and the metal-sulfur chelating cluster. Several possible MT conformations were then predicted using a distance geometry algorithm. Statistical analysis was used to select the conformations with much lower target function values and lower conformation energies as the predicted tertiary structural models of the cysteine-rich (CR) domains of these proteins. In this way, a suitable prediction method for modeling the CR domain of plant MT-L proteins was constructed. The accurate prediction obtained for the known structure of an MT protein from blue crab suggests that the method is practicable. The tertiary structure of the CR domains of the rape MT-L protein LSC54 was then modeled with this method.
Abstract: Text categorization (TC) is one of the most widely studied branches of text mining and has many applications in different domains. It tries to automatically assign a text document to one of a set of predefined categories, often by using machine learning (ML) techniques. Choosing the best classifier is the most important step in this task, and k-Nearest Neighbor (KNN) is widely employed alongside other well-known classifiers such as Support Vector Machine, Multinomial Naive Bayes, and Logistic Regression. KNN has been used extensively for TC tasks and is one of the oldest and simplest methods for pattern classification. Its performance relies crucially on the distance metric used to identify the nearest neighbors, because the most frequently observed label among these neighbors is used to classify an unseen test instance. Hence, in this paper, a comparative analysis of the KNN classifier is performed on a subset (i.e., R8) of the Reuters-21578 benchmark dataset for TC. Experimental results are obtained using different distance metrics as well as recently proposed distance learning metrics, under different combinations of feature model and term weighting scheme. Our comparative evaluation of the results shows that Bray-Curtis and Linear Discriminant Analysis (LDA) are often superior to the other metrics and work well with raw term frequency weights.
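To make the role of the distance metric concrete, the sketch below runs a majority-vote KNN classifier with the Bray-Curtis distance on raw term-frequency vectors. The four-word vocabulary, the toy training vectors, and the function names are assumptions for illustration. This is not the experimental setup of the paper.

```python
from collections import Counter


def bray_curtis(u, v):
    """Bray-Curtis distance: sum|u_i - v_i| / sum|u_i + v_i|."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(abs(a + b) for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0


def knn_predict(train_vectors, train_labels, query, k=3, distance=bray_curtis):
    """Return the most frequent label among the k training vectors nearest to query."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: distance(pair[0], query),
    )
    top_labels = [label for _, label in ranked[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Toy raw term-frequency vectors over the vocabulary [oil, price, wheat, crop].
train_vectors = [[3, 1, 0, 0], [2, 2, 0, 0], [0, 0, 3, 1], [0, 1, 2, 2]]
train_labels = ["crude", "crude", "grain", "grain"]
predicted = knn_predict(train_vectors, train_labels, [1, 1, 0, 0], k=3)
```

Swapping the `distance` argument for another metric is enough to repeat the comparison on the same data, which is the kind of experiment the abstract describes.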
Funding: Supported by the National Natural Science Foundation of China (52077120) and the Research Fund for Excellent Dissertation of China Three Gorges University (2021SSPY056).
Abstract: Fault section location for a single-phase grounding fault is affected by the neutral grounding mode of the system, the transition resistance, and the blind zone. A fault section locating method based on an amplitude feature and an intelligent distance algorithm is proposed to eliminate the influence of these factors. After analyzing and comparing the amplitude characteristics of the zero-sequence current transient components at both ends of the healthy section and the faulty section, a distance algorithm with strong immunity to abnormal data is introduced. The matching degree of the amplitude characteristics at both ends of the feeder section is used as the criterion, and the faulty section is determined by comparing this value with a set threshold. Finally, simulations using Matlab/Simulink and PSCAD/EMTDC show that the proposed method locates the faulty section accurately and is not affected by the grounding mode, grounding resistance, or the blind zone.
Funding: This work is partially supported by the National Natural Science Foundation of China (Grant Nos. 71901144, 71771152, 61773248), the Major Program of National Fund of Philosophy and Social Science of China (18ZDA088, 20ZDA060), the Shanghai Planning Office of Philosophy and Social Science Foundation (Grant No. 2019EXW001), the Foundation of University of Finance and Economics (Grant No. 2017110709), and the S-Tech internet communication project (Grant Nos. 2018PHD005 and 2018TECH003).
Abstract: Entity perception in ambiguous user comments is a critical problem of target identification for large volumes of public opinion. In this paper, a Two-Step-Matching method is proposed to identify the precise target entity among the multiple entities mentioned. First, potential entities are extracted from public comments by a BiLSTM-CRF model, and characteristic words by a TF-IDF model. Second, the first matching is performed between the potential entities and an official business directory using the Jaro-Winkler distance algorithm. Then, to find the precise entity, an industry-characteristic dictionary is introduced in the second matching process, and the precise entity is identified according to the number of characteristic words that match the industry-characteristic dictionary. In addition, the associated rate (a global indicator) and the accuracy rate (a sample indicator) are defined to evaluate matching accuracy. The results for three data sets of public opinions about major public health events show that the highest associated rate and accuracy rate reach 0.93 and 0.95, average improvements of 32% and 30% over using the first matching process alone. This framework provides a method for finding the true target entity that a public comment actually refers to.
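The first matching step relies on Jaro-Winkler similarity, which can be sketched as below. The prefix scale of 0.1 and the maximum prefix length of 4 are the commonly used defaults rather than values taken from the paper, and the helper `best_match` and the directory entries are hypothetical. The second, dictionary-based step is not shown.

```python
def jaro(s1, s2):
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)  # how far apart matches may be
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    # Matched characters that appear in a different order, counted in pairs.
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (
        matches / len1 + matches / len2 + (matches - transpositions) / matches
    ) / 3.0


def jaro_winkler(s1, s2, prefix_scale=0.1, max_prefix=4):
    """Jaro similarity boosted for strings that share a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1.0 - base)


def best_match(entity, directory):
    """Return the (name, score) pair of the directory entry most similar to entity."""
    return max(
        ((name, jaro_winkler(entity, name)) for name in directory),
        key=lambda pair: pair[1],
    )


# Hypothetical extracted entity and business directory.
name, score = best_match("Acme Pharma", ["Acme Pharmaceuticals", "Apex Foods"])
```

Because the score rewards shared prefixes, abbreviated company names tend to match their full directory entries, which suits the first matching step described above.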