Funding: Supported by the National Natural Science Foundation of China (No. U21B2062) and the Natural Science Foundation of Hubei Province (No. 2023AFB307).
Abstract: Identifying reservoir types in deep carbonates has always been a great challenge because the heterogeneous scale and distribution of storage spaces produce complex logging responses. Traditional cross-plot analysis and empirical formula methods for identifying reservoir types from geophysical logging data have high uncertainty and low efficiency, and they cannot accurately reflect the nonlinear relationship between reservoir types and logging data. Recently, kernel Fisher discriminant analysis (KFD), a kernel-based machine learning technique, has attracted attention in many fields because of its strong nonlinear processing ability. However, the overall performance of a KFD model may be limited because a single kernel function cannot both extrapolate and interpolate well, especially for highly complex data. To address this issue, a mixed kernel Fisher discriminant analysis (MKFD) model was established in this study and applied to identify reservoir types of the deep Sinian carbonates in the central Sichuan Basin, China. The MKFD model was trained and tested with 453 datasets from 7 coring wells, using GR, CAL, DEN, AC, CNL and RT logs as input variables. Particle swarm optimization (PSO) was adopted for hyper-parameter optimization of the MKFD model. To evaluate model performance, the prediction results of MKFD were compared with those of basic-kernel-based KFD, RF and SVM models. Subsequently, the built MKFD model was applied in a blind well test, and a variable importance analysis was conducted. The comparison and blind test results demonstrated that MKFD outperformed traditional KFD, RF and SVM in identifying reservoir types, providing higher accuracy and stronger generalization. MKFD can therefore be a reliable method for identifying reservoir types of deep carbonates.
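The abstract does not say which kernels MKFD mixes. A common construction, assumed here purely for illustration, is a convex combination of a local kernel (RBF, which interpolates well) and a global kernel (polynomial, which extrapolates better). The function names, the `weight` parameter, and all default values below are hypothetical and are not taken from the paper.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Local kernel: its value decays quickly as the two points move apart."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Global kernel: points far apart can still influence each other."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def mixed_kernel(x, y, weight=0.7, gamma=0.5, degree=2, coef0=1.0):
    """Convex combination of the two kernels (assumed form, not the paper's)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    local_part = rbf_kernel(x, y, gamma)
    global_part = poly_kernel(x, y, degree, coef0)
    return weight * local_part + (1.0 - weight) * global_part


def gram_matrix(samples, **params):
    """Pairwise mixed-kernel values for a list of feature vectors."""
    return [[mixed_kernel(a, b, **params) for b in samples] for a in samples]
```

In a setup like this, `weight`, `gamma`, and `degree` would be the hyper-parameters handed to an optimizer such as PSO. A convex combination of two valid kernels is itself a valid kernel, so the resulting Gram matrix can be used wherever a single-kernel Gram matrix would be.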
Funding: Supported by the Beijing Natural Science Foundation (Grant No. 2232066) and the Open Project Foundation of State Key Laboratory of Solid Lubrication (Grant LSL-2212).
Abstract: To address the problem of identifying multiple types of additives in lubricating oil, a method is proposed that selects mid-infrared spectral bands using the eXtreme Gradient Boosting (XGBoost) algorithm combined with the ant colony optimization (ACO) algorithm. The XGBoost algorithm was trained and tested separately on three additives, T534 (alkyl diphenylamine), T308 (isooctyl acid thiophospholipid octadecylamine), and T306 (trimethylphenol phosphate), in order to screen for the optimal combination of spectral bands for each additive. The ACO algorithm was used to optimize the parameters of the XGBoost algorithm to improve the identification accuracy. For comparison, the support vector machine (SVM) and the hybrid bat algorithm (HBA) were also included, yielding four models: ACO-XGBoost, ACO-SVM, HBA-XGBoost, and HBA-SVM. The results showed that all four models could identify the three additives efficiently, with the ACO-XGBoost model achieving 100% recognition of all three additives. In addition, the generalizability of the ACO-XGBoost model was further demonstrated by predicting a lubricating oil containing the three additives prepared in our laboratory and a collected sample of commercial oil currently in use.
Funding: Supported by the Science and Technology Project of State Grid Corporation of China under Grant 520201210025.
Abstract: The hybrid dc circuit breaker (HCB) offers fast action and low operating loss, making it an ideal means of fault isolation in multi-terminal dc grids. In multi-terminal dc grids that transmit power through overhead lines, HCBs are required to have reclosing capability because the fault probability is high and most faults are temporary. To avoid the secondary fault strike and equipment damage that reclosing the HCB onto a permanent fault may cause, this paper proposes an adaptive reclosing scheme based on traveling wave injection. The scheme injects a traveling wave signal into the faulted dc line through an additionally configured auxiliary discharge branch in the HCB, and then uses the reflection characteristics of the traveling wave signal on the dc line to distinguish temporary from permanent faults, so that reclosing is fast after a temporary fault and reliably avoided after a permanent fault. Test results from a simulation model of a four-terminal dc grid show that the proposed adaptive reclosing scheme can quickly and reliably identify temporary and permanent faults and greatly shorten the power outage time of temporary faults. In addition, it is easy to implement, highly reliable, robust to high-resistance faults, and free of dead zones.
Funding: Supported by the National Natural Science Foundation of China (31302071), the Special Fund for Agro-scientific Research in the Public Interest (201303046), the Independent Innovation Project of Jiangsu Province [CX(13)3065], and the Project of the Fourth Period of the "333" High-level Personnel Training Program of Jiangsu Province (BRA2012194).
Abstract: [Objective] This study aimed to investigate the genetic variation of porcine circovirus type 2 (PCV2) in China. [Method] The strain was isolated from infected samples by cell passage and preliminarily identified by PCR and IFA. The full-length genome of the isolated strain was obtained by specific amplification for homology and phylogenetic analysis. [Result] A PCV2 strain was successfully isolated and named 201105ZJ; it could proliferate in PK15 cell lines. Specific fragments could be amplified by a specific PCR assay, and specific immunofluorescence was observed in the IFA assay. The TCID50 was low (10^2.67). The full-length genome sequence of the isolated strain was 1,768 bp and shared 94.1%-96.8% homology with 13 reference strains; specifically, the isolated strain exhibited the highest homology, 96.8%, with AF055392 (PCV2a). The isolated strain 201105ZJ and the reference strain AF055392 belonged to genotype PCV2a and exhibited a distant genetic relationship with genotype PCV2c. [Conclusion] The characteristics of genetic variation of PCV2 isolate 201105ZJ provide a theoretical basis for vaccine development, investigation of PCV2 pathogenesis, and prevention and control of porcine circovirus-associated diseases (PCVAD) in East China.
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2020YFB2104100, the National Natural Science Foundation of China under Grant Nos. 61972403 and U1711261, the Fundamental Research Funds for the Central Universities of China, the Research Funds of Renmin University of China, and the Tencent Rhino-Bird Joint Research Program.
Abstract: Identifying semantic types for attributes in relations, known as attribute semantic type (AST) identification, plays an important role in many data analysis tasks, such as data cleaning, schema matching, and keyword search in databases. However, due to a lack of unified naming standards across prevalent information systems (a.k.a. information islands), AST identification remains an open problem. To tackle this problem, we propose a context-aware method to determine the ASTs for relations. We transform AST identification into a multi-class classification problem and propose a schema context aware (SCA) model to learn the representation from a collection of relations associated with attribute values and schema context. Based on the learned representation, we predict the AST for a given attribute from an underlying relation, wherein the predicted AST is mapped to one of the labeled ASTs. To improve the performance of AST identification, especially when the predicted semantic types of attributes are not included in the labeled ASTs, we then introduce knowledge base embeddings (a.k.a. KBVec) to enhance the above representation and construct a knowledge-base-enhanced schema context aware model (SCA-KB) to obtain a stable and robust model. Extensive experiments on real datasets demonstrate that our context-aware method outperforms the state-of-the-art approaches by a large margin: by up to 6.14% and 25.17% in macro-average F1 score, and by up to 0.28% and 9.56% in weighted F1 score, over high-quality and low-quality datasets respectively.
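The gap between the macro-average and weighted F1 gains reported above comes from how the two metrics treat rare classes. The following is a minimal sketch of both averages using their standard definitions; the function names and the toy labels are illustrative and do not come from the paper.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """F1 score of each label, computed as 2*TP / (2*TP + FP + FN)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's number of true samples."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Toy example: a frequent semantic type and a rare one (made-up labels).
y_true = ["city"] * 8 + ["zip"] * 2
y_pred = ["city"] * 8 + ["city", "zip"]
print(round(macro_f1(y_true, y_pred), 4), round(weighted_f1(y_true, y_pred), 4))
```

Because macro averaging gives the rare class the same weight as the frequent one, a method that improves on rare or poorly represented semantic types moves the macro score much more than the weighted score.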
Funding: Supported by the Open Fund of the State Key Laboratory of Software Development Environment (BUAA-SKLSDE-09KF-03), the National Basic Research Program of China (2005CB321901, 2009CB320505), the National Natural Science Foundation of China (60973140), the Natural Science Foundation of Jiangsu Province (BK2009425), and the Academic Natural Science Foundation of Jiangsu Province (08KJB520005).
Abstract: Internet application identification is needed in many aspects of network management, such as quality of service (QoS) management, intrusion detection, traffic engineering, and accounting. This article makes an in-depth study of precise identification of Internet applications using flow characteristics instead of well-known port numbers or application signature matching. A novel approach based on the medium mathematics system (MMS) is proposed that identifies the application type of an Internet protocol (IP) flow by finding which known flow it most resembles. The approach differs from previous ones mainly in two aspects. First, it is inherently scalable because it uses the measure of n-dimensional medium truth degree. Second, it recognizes a flow's application type using not only the features of the flow itself but also the association between the flow and the other flows of the same host, as well as the relations among all flows of a host. At present the work concentrates on some popular applications, and up to six application types can be identified with better accuracy. The results of experiments conducted on the Internet show that the proposed methodology is effective and deserves attention.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 41675029) and the National Basic Research Program of China (No. 2013CB430102).
Abstract: Cloud type is a basic property of cloud, and its accurate identification is useful in forecasting the evolution of landfalling typhoons. Millimeter-wave cloud radar is an important means of identifying cloud type. Here, we develop a fuzzy logic algorithm that depends on radar range-height-indicator (RHI) data and takes into account the fundamental physical features of different cloud types. The algorithm is applied to a ground-based Ka-band millimeter-wave cloud radar. The input parameters of the algorithm include average reflectivity factor intensity, ellipse long axis orientation, cloud base height, cloud thickness, presence/absence of precipitation, ratio of horizontal extent to vertical extent, maximum echo intensity, and standard variance of intensities. The identified cloud types are stratus (St), stratocumulus (Sc), cumulus (Cu), cumulonimbus (Cb), nimbostratus (Ns), altostratus (As), altocumulus (Ac) and high cloud. The cloud types identified using the algorithm are in good agreement with those identified by a human observer. As a case study, the algorithm was applied to Typhoon Khanun (1720), which made landfall in south-eastern China in October 2017. Sequential identification results from the algorithm clearly reflected changes in cloud type and provided indicative information for forecasting of the typhoon.
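The abstract lists the inputs and output classes of the fuzzy logic algorithm but not its membership functions. The sketch below shows the general shape of such a classifier using only two of the listed inputs (cloud base height and cloud thickness) and trapezoidal membership functions averaged per class. Every threshold, the generic class names, and the averaging rule are hypothetical placeholders chosen for illustration; none of them are the paper's values.

```python
def trapezoid(x, a, b, c, d):
    """Membership rises from a to b, equals 1 between b and c, falls to 0 at d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


# Hypothetical membership parameters (a, b, c, d), in km. Not from the paper.
HYPOTHETICAL_RULES = {
    "low layer cloud": {
        "base_km": (-1.0, 0.0, 1.5, 2.5),
        "thickness_km": (0.0, 0.2, 1.0, 2.0),
    },
    "deep convective cloud": {
        "base_km": (-1.0, 0.0, 1.5, 2.5),
        "thickness_km": (3.0, 6.0, 15.0, 20.0),
    },
    "high cloud": {
        "base_km": (5.0, 7.0, 15.0, 20.0),
        "thickness_km": (0.0, 0.2, 3.0, 5.0),
    },
}


def classify(features, rules=HYPOTHETICAL_RULES):
    """Return the best-scoring class and the mean membership for every class."""
    scores = {}
    for label, params in rules.items():
        memberships = [trapezoid(features[name], *abcd) for name, abcd in params.items()]
        scores[label] = sum(memberships) / len(memberships)
    best = max(scores, key=scores.get)
    return best, scores


label, scores = classify({"base_km": 0.8, "thickness_km": 0.5})
print(label, scores)
```

A full implementation along the lines of the abstract would add the remaining inputs (reflectivity statistics, ellipse orientation, precipitation flag, aspect ratio) as further membership functions per class, possibly with unequal weights.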
Abstract: The binding between indirubin and calf thymus DNA in vitro has been verified by means of the isotope labelling method, the spectrophotometric method and thermal denaturation measurements. The λ_max of indirubin at 207 nm shifted toward longer wavelength, with a decrease in absorbance, after incubation of indirubin with DNA. The increase in the Tm value of DNA induced by indirubin was about 2.4°C and was reproducible. The binding force between them was rather weak, as indirubin molecules were easily released during precipitation with alcohol or gel filtration. The binding was not affected by sodium chloride even at high concentration but decreased greatly (to 20-30% of the control) in the presence of 8 M urea. These results showed that the binding between indirubin and DNA might be by hydrogen bonding rather than ionic. The amount of bound ³H-indirubin was directly proportional to the concentration of indirubin. However, it increased abruptly when the concentration of indirubin reached 1.5×10^-4 M, which suggested another binding force in the latter instance. Spectrophotometric analysis and a Scatchard plot showed that calf thymus DNA bound 46 indirubin molecules per 1000 nucleotides. The association constant (K) was 5.7×10^6.
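A Scatchard plot rests on the linear relation r/[L] = K(n - r), where r is the amount of bound ligand per nucleotide, [L] is the free ligand concentration, K is the association constant, and n is the number of binding sites per nucleotide. Plotting r/[L] against r gives a line with slope -K and x-intercept n. The sketch below fits that line by ordinary least squares. The data are synthetic, generated from the reported values (n = 0.046, K = 5.7×10^6), and the use of mol/L for concentration is an assumption, since the abstract does not state the units of K.

```python
def simulate_binding(free_conc, k_assoc, n_sites):
    """Synthetic bound-ligand values r for a single class of independent sites."""
    return [n_sites * k_assoc * c / (1.0 + k_assoc * c) for c in free_conc]


def scatchard_fit(bound, free_conc):
    """Least-squares line through (r, r/[L]); returns (K, n)."""
    ys = [r / c for r, c in zip(bound, free_conc)]
    count = len(bound)
    mean_x = sum(bound) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    k_assoc = -slope              # slope of the Scatchard line is -K
    n_sites = intercept / k_assoc  # y-intercept is K * n
    return k_assoc, n_sites


# Synthetic free-ligand concentrations (assumed mol/L), not measured data.
free = [1e-8, 5e-8, 1e-7, 5e-7, 1e-6]
bound = simulate_binding(free, k_assoc=5.7e6, n_sites=0.046)
k_fit, n_fit = scatchard_fit(bound, free)
print(k_fit, n_fit)
```

With real measurements the points scatter around the line, and visible curvature would indicate more than one class of binding site, which is consistent with the abstract's remark about a second binding force at higher indirubin concentration.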
Funding: Supported in part by the National Natural Science Foundation of China (No. U19A2064), the Hunan Provincial Science and Technology Program (No. 2019CB1007), the Fundamental Research Funds for the Central Universities, CSU (No. 2282019SYLB004), and the Fundamental Research Funds for the Central Universities of Central South University (No. 2020zzts593).
Abstract: The recent emergence of single-cell RNA-sequencing (scRNA-seq) technology makes it possible to solve biological problems at single-cell resolution. One of the critical steps in cellular heterogeneity analysis is cell type identification. Diverse scRNA-seq clustering methods have been proposed to partition cells into clusters. Among them, hierarchical clustering and spectral clustering are the most popular approaches in downstream clustering analysis, used with different preprocessing strategies such as similarity learning, dropout imputation, and dimensionality reduction. In this study, we carry out a comprehensive analysis by combining different strategies with these two categories of clustering methods on scRNA-seq datasets under different biological conditions. The results show that methods with spectral clustering tend to perform better on datasets with continuous shapes in two dimensions, while those with hierarchical clustering achieve better results on datasets with obvious boundaries between clusters in two dimensions. Motivated by this finding, a new strategy, called QRS, is developed to quantitatively evaluate the latent representative shape of a dataset and determine whether it has clear boundaries. Finally, a data-driven clustering recommendation method, called DDCR, is proposed to recommend hierarchical clustering or spectral clustering for scRNA-seq data. We apply DDCR to two typical single-cell clustering methods, SC3 and RAFSIL, and the results show that DDCR recommends a more suitable downstream clustering method for different scRNA-seq datasets and obtains more robust and accurate results.