The intricate distribution of oil and water in tight rocks makes pinpointing oil layers challenging. While conventional identification methods offer potential solutions, their limited accuracy precludes them from being effective when applied to unconventional reservoirs. This study employed nuclear magnetic resonance (NMR) spectrum decomposition to dissect the NMR T2 spectrum into multiple sub-spectra, and used laboratory NMR experiments to ascertain the fluid properties of these sub-spectra, aiming to enhance identification accuracy. The findings indicate that fluids of distinct properties overlap in the T2 spectra, with bound water, movable water, bound oil, and movable oil appearing sequentially from the low-value zone to the high-value zone. Consequently, an oil layer classification scheme was proposed that considers the physical properties of reservoirs, oil-bearing capacity, and the characteristics of both mobility and oil-water two-phase flow. When applied to tight oil layer identification, the scheme's outcomes align closely with actual test results. A horizontal well deployed based on these findings has produced high-yield industrial oil flow, underscoring the precision and dependability of this new approach.
We apply stochastic seismic inversion and Bayesian facies classification for porosity modeling and igneous rock identification in the presalt interval of the Santos Basin. This integration of seismic and well-derived information enhances reservoir characterization. Stochastic inversion and Bayesian classification are powerful tools because they permit addressing the uncertainties in the model. We used the ES-MDA algorithm to achieve realizations equivalent to the P10, P50, and P90 percentiles of acoustic impedance, a novel method for acoustic inversion in the presalt. The facies were divided into five classes: reservoir 1, reservoir 2, tight carbonates, clayey rocks, and igneous rocks. To deal with the overlaps in the acoustic impedance values of the facies, we included geological information through a priori probabilities, indicating that structural highs are reservoir-dominated. To illustrate our approach, we conducted porosity modeling using facies-related rock-physics models for rock-physics inversion in an area with a well drilled in a coquina bank, and evaluated the thickness and extension of an igneous intrusion near the carbonate-salt interface. The modeled porosity and the classified seismic facies are in good agreement with those observed in the wells. Notably, the coquina bank shows an improvement in porosity toward the top. The a priori probability model was crucial for limiting the clayey rocks to the structural lows. In Well B, the hit rate of the igneous rock in the three scenarios is higher than 60%, showing excellent thickness-prediction capability.
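The core of Bayesian facies classification with a location-dependent prior can be sketched in a few lines: each facies gets a Gaussian likelihood over acoustic impedance, and the prior shifts depending on structural position. The impedance statistics and prior probabilities below are hypothetical placeholders for illustration, not the study's calibrated values:

```python
import math

def gaussian_pdf(x, mean, std):
    """Likelihood of an impedance value under a facies-specific Gaussian."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

# Hypothetical per-facies acoustic-impedance statistics (mean, std).
facies_stats = {
    "reservoir": (9000.0, 600.0),
    "tight_carbonate": (12000.0, 800.0),
    "clayey_rock": (10000.0, 700.0),
}

def classify(impedance, prior):
    """Return the posterior-maximizing facies and normalized posteriors."""
    posterior = {f: prior[f] * gaussian_pdf(impedance, m, s)
                 for f, (m, s) in facies_stats.items()}
    total = sum(posterior.values())
    best = max(posterior, key=posterior.get)
    return best, {f: p / total for f, p in posterior.items()}

# On a structural high the prior favors reservoir facies; in a low, clayey rocks.
prior_high = {"reservoir": 0.6, "tight_carbonate": 0.3, "clayey_rock": 0.1}
prior_low = {"reservoir": 0.1, "tight_carbonate": 0.3, "clayey_rock": 0.6}

best_high, _ = classify(9800.0, prior_high)
best_low, _ = classify(9800.0, prior_low)
```

The same ambiguous impedance value is classified differently on a high versus a low, which is exactly how the prior resolves overlapping impedance ranges.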
During the transient process of gas drilling conditions, the monitoring data often has obvious nonlinear fluctuation features, which leads to large classification errors and time delays in the commonly used intelligent classification models. Combined with the structural features of data samples obtained from monitoring while drilling, this paper uses a convolution algorithm to extract the correlated features of multiple monitoring-while-drilling parameters changing with time, and applies an RBF network with nonlinear classification ability to classify the features. In the training process, a loss function component based on distance mean square error is used to effectively adjust the best clustering centers in the RBF network. Many field applications show that the recognition accuracy of this nonlinear classification network model for gas production, water production, and drill sticking is 97.32%, 95.25%, and 93.78%, respectively. Compared with the traditional convolutional neural network (CNN) model, the network structure not only improves the classification accuracy of conditions in the transition stage, but also greatly advances the time points of risk identification; the three common risk identification points of gas production, water production, and drill sticking are advanced by 56, 16, and 8 s, respectively. This wins valuable time for the site to take correct risk disposal measures and fully demonstrates the applicability of nonlinear classification neural networks in oil and gas field exploration and development.
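The forward pass of an RBF classifier of the general shape described above (Gaussian kernels around cluster centers, a linear output layer) can be sketched as follows. The centers, widths, weights, and 2-D feature space are hypothetical placeholders, not values from the paper:

```python
import math

def rbf_activations(x, centers, width):
    """Gaussian RBF hidden-layer activations for one input vector."""
    acts = []
    for c in centers:
        dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        acts.append(math.exp(-dist_sq / (2 * width ** 2)))
    return acts

def rbf_classify(x, centers, width, weights, labels):
    """Linear combination of RBF activations; returns the highest-scoring class."""
    acts = rbf_activations(x, centers, width)
    scores = [sum(w * a for w, a in zip(class_w, acts)) for class_w in weights]
    return labels[scores.index(max(scores))]

# Hypothetical 2-D feature space with one trained center per condition class.
centers = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
weights = [[1.0, 0.0, 0.0],   # gas production
           [0.0, 1.0, 0.0],   # water production
           [0.0, 0.0, 1.0]]   # drill sticking
labels = ["gas", "water", "sticking"]

pred = rbf_classify((4.8, 5.2), centers, width=2.0, weights=weights, labels=labels)
```

Training would adjust the centers (here, via the distance-MSE loss component the paper describes) and the output weights; the sketch only shows inference.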
Single-molecule force spectroscopy (SMFS) measurements of the dynamics of biomolecules typically require identifying massive numbers of events and states from large data sets, such as extracting rupture forces from force-extension curves (FECs) in pulling experiments and identifying states from extension-time trajectories (ETTs) in force-clamp experiments. The former is often accomplished manually, and hence is time-consuming and laborious, while the latter is always impeded by the presence of baseline drift. In this study, we attempt to accurately and automatically identify the events and states from SMFS experiments with a machine learning approach that combines clustering and classification for event identification of SMFS (ACCESS). As demonstrated by analysis of a series of data sets, ACCESS can extract the rupture forces from FECs containing multiple unfolding steps and classify the rupture forces into the corresponding conformational transitions. Moreover, ACCESS successfully identifies the unfolded and folded states even though the ETTs display severe nonmonotonic baseline drift. Besides, ACCESS is straightforward to use as it requires only three easy-to-interpret parameters. As such, we anticipate that ACCESS will be a useful, easy-to-implement and high-performance tool for event and state identification across a range of single-molecule experiments.
The KT-II layer in the Zananor Oilfield, Caspian Basin, Kazakhstan, contains carbonate reservoirs of various types. The complex pore structure of the reservoirs has made it difficult to identify watered-out zones with traditional logging interpretation methods. This study classifies the reservoirs on the basis of core analysis and establishes an identification model for watered-out layers in the field to effectively improve the interpretation accuracy. Thin section analysis shows that there are three types of pores in the reservoirs, i.e., matrix pores, fractures, and dissolution vugs. A triple porosity model is used to calculate the porosities of the reservoirs, and the results are combined with core analysis to classify the reservoirs into fractured, matrix pore, fracture-pore, and composite types. A classification standard is also proposed. The resistivity logging responses of the different reservoir types differ before and after watering-out. The pre-watering-out resistivities are reconstructed using a generalized neural network for the different types of reservoirs. The watered-out layers can then be effectively identified according to the difference between the resistivity curves before and after watering-out. The results show that the watered-out layers identified with this method are consistent with measured data, thus serving as a reference for the evaluation of watered-out layers in the study area.
Recently, there have been some attempts to apply Transformers to 3D point cloud classification. In order to reduce computation, most existing methods focus on local spatial attention, but ignore content and fail to establish relationships between distant but relevant points. To overcome the limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short. It exploits the locality of points in the feature space (content-based), clustering the sampled points with similar features into the same class and computing the self-attention within each class, thus enabling an effective trade-off between capturing long-range dependencies and computational complexity. We further introduce an inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in each branch separately. Extensive experiments show that our PointConT model achieves remarkable performance on point cloud shape classification. In particular, our method exhibits 90.3% Top-1 accuracy on the hardest setting of ScanObjectNN. Source code of this paper is available at https://github.com/yahuiliu99/PointConT.
Identification of the ice channel is a basic technology for developing intelligent ships in ice-covered waters, and it is important for ensuring the safety and economy of navigation. In the Arctic, merchant ships with low ice class often navigate in channels opened up by icebreakers. Navigation in the ice channel largely depends on the captain's maneuvering skills and experience; the ship may get stuck if steered into ice fields off the channel. Under these circumstances, it is very important to study how to identify the boundary lines of ice channels with a reliable method. In this paper, a two-stage ice channel identification method is developed based on image segmentation and corner point regression. The first stage employs an image segmentation method to extract channel regions. In the second stage, an intelligent corner regression network is proposed to extract the channel boundary lines from the channel region. A non-intelligent angle-based filtering and clustering method is proposed and compared with the corner point regression network. The training and evaluation of the segmentation method and the corner regression network are carried out on synthetic and real ice channel datasets. The evaluation results show that the accuracy of the method using the corner point regression network in the second stage reaches 73.33% on the synthetic ice channel dataset and 70.66% on the real ice channel dataset, and the processing speed can reach up to 14.58 frames per second.
When building a classification model, the scenario where the samples of one class significantly outnumber those of the other class is called data imbalance. Data imbalance causes the trained classification model to favor the majority class (usually defined as the negative class), which may harm the accuracy of the minority class (usually defined as the positive class) and lead to poor overall performance of the model. A method called MSHR-FCSSVM for solving imbalanced data classification is proposed in this article, which is based on a new hybrid resampling approach (MSHR) and a new fine cost-sensitive support vector machine (CS-SVM) classifier (FCSSVM). The MSHR measures the separability of each negative sample through its Silhouette value, calculated using the Mahalanobis distance between samples; based on this, so-called pseudo-negative samples are screened out to generate new positive samples through linear interpolation (over-sampling step) and are finally deleted (under-sampling step). This approach replaces pseudo-negative samples with generated new positive samples one by one to clear up the inter-class overlap on the borderline, without changing the overall scale of the dataset. The FCSSVM is an improved version of the traditional CS-SVM. It simultaneously considers the influences of both sample-number imbalance and class distribution on classification, and finely tunes the class cost weights using an efficient optimization algorithm based on the physical phenomenon of rime-ice (the RIME algorithm), with cross-validation accuracy as the fitness function, to accurately adjust the classification borderline. To verify the effectiveness of the proposed method, a series of experiments are carried out based on 20 imbalanced datasets, including both mildly and extremely imbalanced ones. The experimental results show that the MSHR-FCSSVM method performs better than the comparison methods in most cases, and that both the MSHR and the FCSSVM play significant roles.
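The over-sampling step described above, generating a new positive sample by linear interpolation between existing positives, can be sketched as follows. This shows only the interpolation; the Mahalanobis-distance Silhouette screening that selects which samples to replace is omitted, and the sample values are hypothetical:

```python
import random

def interpolate(sample_a, sample_b, t):
    """New point on the segment between two positive samples (0 <= t <= 1)."""
    return tuple(a + t * (b - a) for a, b in zip(sample_a, sample_b))

def oversample(positives, n_new, rng):
    """Generate n_new synthetic positives by interpolating random pairs."""
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(positives, 2)
        synthetic.append(interpolate(a, b, rng.random()))
    return synthetic

rng = random.Random(42)
positives = [(1.0, 2.0), (1.5, 2.5), (0.8, 1.9)]
new_points = oversample(positives, 4, rng)
```

Because each synthetic point lies on a segment between two real positives, it stays inside the positive class's region, which is what clears borderline overlap without inflating the dataset's overall scale when paired with the matching deletion step.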
The complex sand-casting process, combined with the interactions between process parameters, makes it difficult to control casting quality, resulting in a high scrap rate. A strategy based on a data-driven model was proposed to reduce casting defects and improve production efficiency, which includes a random forest (RF) classification model, feature importance analysis, and process parameter optimization with Monte Carlo simulation. The collected data, which includes four types of defects and the corresponding process parameters, was used to construct the RF model. Classification results show a recall rate above 90% for all categories. The Gini index was used to assess the importance of the process parameters in the formation of the various defects in the RF model. Finally, the classification model was applied to different production conditions for quality prediction. In the case of process parameter optimization for gas porosity defects, the model serves as the experimental process in the Monte Carlo method to estimate a better temperature distribution. The prediction model, when applied in the factory, greatly improved the efficiency of defect detection. Results show that the scrap rate decreased from 10.16% to 6.68%.
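Gini-based importance, as used above, rests on the impurity decrease produced by each split in the forest; summing those decreases per feature (weighted by node size) gives the feature's importance. A minimal sketch of one split's impurity decrease, with hypothetical defect labels rather than the study's data:

```python
def gini(labels):
    """Gini impurity of a set of class labels."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def gini_decrease(parent, left, right):
    """Impurity decrease of one split; accumulating these over a forest's
    nodes (weighted by node size) yields the Gini importance of the feature."""
    n = len(parent)
    return gini(parent) - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)

# Hypothetical defect labels split by a threshold on, say, pouring temperature.
parent = ["porosity", "porosity", "ok", "ok", "ok", "porosity"]
left = ["porosity", "porosity", "porosity"]   # below threshold
right = ["ok", "ok", "ok"]                    # above threshold
decrease = gini_decrease(parent, left, right)
```

A perfectly separating split like this one drives both child impurities to zero, so its decrease equals the parent impurity (here 0.5); weak splits contribute little, which is why important parameters accumulate large totals.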
Person identification is one of the most vital tasks for network security. People are more concerned about their security due to traditional passwords becoming weaker or leaking in various attacks. In recent decades, fingerprints and faces have been widely used for person identification, which carries the risk of information leakage as a result of reproducing fingers or faces from a snapshot. Recently, researchers have focused on creating an identifiable pattern that cannot be falsely reproduced, by capturing the psychological and behavioral information of a person using vision- and sensor-based techniques. In existing studies, most researchers used very complex patterns in this direction, which need special training and attention to remember and fail to capture the psychological and behavioral information of a person properly. To overcome these problems, this research devised a novel dynamic hand gesture-based person identification system using a Leap Motion sensor. This study developed two hand gesture-based pattern datasets for the experiments, which contained more than 500 samples collected from 25 subjects. Various static and dynamic features were extracted from the hand geometry. Random forest was used to measure feature importance using the Gini index. Finally, a support vector machine was implemented for person identification, and its performance was evaluated using identification accuracy. The experimental results showed that the proposed system produced an identification accuracy of 99.8% for arbitrary hand gesture-based patterns and 99.6% for the same dynamic hand gesture-based patterns. This result indicates that the proposed system can be used for person identification in the field of security.
Network traffic identification is critical for maintaining network security and for meeting the various demands of network applications. However, network traffic data typically possesses high dimensionality and complexity, leading to practical problems in traffic identification data analytics. Since the original Dung Beetle Optimizer (DBO) algorithm, Grey Wolf Optimization (GWO) algorithm, Whale Optimization Algorithm (WOA), and Particle Swarm Optimization (PSO) algorithm suffer from slow convergence and easily fall into local optima, an Improved Dung Beetle Optimizer (IDBO) algorithm is proposed for network traffic identification. Firstly, a Sobol sequence is utilized to initialize the dung beetle population, laying the foundation for finding the global optimum. Next, an integration of Levy flight and the golden sine strategy is suggested to give dung beetles a greater probability of exploring unvisited areas, escaping local optima, and converging more effectively toward a global optimum. Finally, an adaptive weight factor is utilized to enhance the search capability of the original DBO algorithm and accelerate convergence. With the improvements above, the proposed IDBO algorithm is applied to traffic identification data analytics and feature selection, so as to find the optimal subset for K-Nearest Neighbor (KNN) classification. The simulation experiments use the CICIDS2017 dataset to verify the effectiveness of the proposed IDBO algorithm and compare it with the original DBO, GWO, WOA, and PSO algorithms. The experimental results show that, compared with the other algorithms, accuracy and recall are improved by 1.53% and 0.88% in binary classification, and the Distributed Denial of Service (DDoS) class identification is the most effective in multi-classification, with improvements of 5.80% and 0.33% in accuracy and recall, respectively. Therefore, the proposed IDBO algorithm is effective in increasing the efficiency of traffic identification and in solving the original DBO algorithm's problems of slow convergence and falling into local optima when dealing with high-dimensional data analytics and feature selection for network traffic identification.
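The Levy flight ingredient mentioned above is commonly generated with Mantegna's algorithm, whose heavy-tailed steps occasionally jump far from the current position and so help escape local optima. A sketch of one such perturbed move toward a best-known solution follows; the positions, scale, and exponent are hypothetical, and this is not the IDBO's full update rule:

```python
import math
import random

def levy_step(beta, rng):
    """One Levy-flight step length via Mantegna's algorithm."""
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)

def levy_move(position, best, beta, scale, rng):
    """Move a search agent toward the best-known solution with a Levy perturbation."""
    return [x + scale * levy_step(beta, rng) * (b - x)
            for x, b in zip(position, best)]

rng = random.Random(0)
new_pos = levy_move([0.2, 0.8], best=[0.5, 0.5], beta=1.5, scale=0.1, rng=rng)
```

Most steps are small (local exploitation), but the distribution's heavy tail makes rare long jumps, which is the exploration behavior the abstract credits for escaping local optima.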
Taking the Lower Permian Fengcheng Formation shale in the Mahu Sag of the Junggar Basin, NW China, as an example, core observation, test analysis, geological analysis, and numerical simulation were applied to identify the shale oil micro-migration phenomenon. The hydrocarbon micro-migration in shale oil was quantitatively evaluated and verified by a self-created hydrocarbon expulsion potential method, and the petroleum geological significance of shale oil micro-migration evaluation was determined. Results show that significant micro-migration can be recognized between the organic-rich laminae and organic-poor laminae. The organic-rich laminae have strong hydrocarbon generation ability. The heavy components of hydrocarbons were preferentially retained by kerogen swelling or adsorption, while the light components migrated to and accumulated in the interbedded felsic or carbonate organic-poor laminae as free oil. About 69% of the Fengcheng Formation shale samples in Well MY1 exhibit the hydrocarbon charging phenomenon, while 31% exhibit the hydrocarbon expulsion phenomenon. The reliability of the micro-migration evaluation results was verified by combining the group components based on the geochromatography effect, two-dimensional nuclear magnetic resonance analysis, and the geochemical behavior of inorganic manganese elements in the process of hydrocarbon migration. Micro-migration is a bridge connecting the hydrocarbon accumulation elements in shale formations, which reflects the whole process of shale oil generation, expulsion, and accumulation, and controls the content and composition of shale oil. The identification and evaluation of shale oil micro-migration will provide new perspectives for the dynamically differential enrichment mechanism of shale oil and for establishing a "multi-peak model in oil generation" of shale.
The network of Himalayan roadways and highways connects some remote regions of valleys or hill slopes, which is vital for India's socio-economic growth. Due to natural and artificial factors, the frequency of slope instabilities along the networks has been increasing over the last few decades. Assessing the stability of natural and artificial slopes formed by the construction of these connecting road networks is essential for operating the roads safely throughout the year. Several rock mass classification methods are generally used to assess the strength and deformability of rock mass. This study assesses slope stability along NH-1A in the Ramban district of the North Western Himalayas. Various structurally and non-structurally controlled rock mass classification systems have been applied to assess the stability conditions of 14 slopes. For evaluating the stability of these slopes, kinematic analysis was performed along with the geological strength index (GSI), rock mass rating (RMR), continuous slope mass rating (CoSMR), slope mass rating (SMR), and Q-slope. The SMR rates three slopes as completely unstable, while the CoSMR rates four slopes as completely unstable. The stability of all slopes was also analyzed using a design chart under dynamic and static conditions by slope stability rating (SSR) for factors of safety (FoS) of 1.2 and 1, respectively. Q-slope with a probability of failure (PoF) of 1% gives two slopes as stable. Stable slope angles have been determined based on the Q-slope safe angle equation and the SSR design chart based on the FoS. The value ranges given by the different empirical classifications were RMR (37-74), GSI (27.3-58.5), SMR (11-59), and CoSMR (3.39-74.56). Good relationships were found between RMR and SSR and between RMR and GSI, with correlation coefficients (R²) of 0.815 and 0.6866, respectively. Lastly, a comparative stability assessment of all these slopes based on the above classifications has been performed to identify the most critical slope along this road.
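The R² values quoted above are squared Pearson correlation coefficients between pairs of rating series. A minimal sketch of the computation, using hypothetical ratings for a few slopes rather than the study's 14 measured values:

```python
import math

def r_squared(xs, ys):
    """Square of the Pearson correlation coefficient between two rating series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return (cov / (sx * sy)) ** 2

# Hypothetical ratings for five slopes (spanning the reported ranges, not real data).
rmr = [37, 45, 52, 60, 74]
gsi = [27.3, 33.0, 40.1, 47.5, 58.5]
r2 = r_squared(rmr, gsi)
```

An R² near 1 indicates a near-linear relationship between the two rating systems, which is what makes one rating a useful predictor of the other.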
Background: Cavernous transformation of the portal vein (CTPV) due to portal vein obstruction is a rare vascular anomaly defined as the formation of multiple collateral vessels in the hepatic hilum. This study aimed to investigate the imaging features of the intrahepatic portal vein in adult patients with CTPV and establish the relationship between the manifestations of the intrahepatic portal vein and the progression of CTPV. Methods: We retrospectively analyzed 14 CTPV patients at Beijing Tsinghua Changgung Hospital. All patients underwent both direct portal venography (DPV) and computed tomography angiography (CTA) to reveal the manifestations of the portal venous system. The vessels measured included the left portal vein (LPV), right portal vein (RPV), main portal vein (MPV), and the portal vein bifurcation (PVB). Results: Nine males and 5 females, with a median age of 40.5 years, were included in the study. No significant difference was found in the diameters of the LPV or RPV measured by DPV and CTA. The visualization of the LPV, RPV, and PVB was higher with DPV than with CTA. There was a significant association between LPV/RPV and PVB/MPV in terms of visibility revealed with DPV (P = 0.01), while this association was not observed with CTA. According to the imaging features of the portal vein measured by DPV, CTPV was classified into three categories to facilitate diagnosis and treatment. Conclusions: DPV was more accurate than CTA for revealing the course of the intrahepatic portal vein in patients with CTPV. The classification of CTPV, derived from the imaging features of the portal vein revealed by DPV, may provide a new perspective for the diagnosis and treatment of CTPV.
In this study, our aim is to address the problem of gene selection by proposing a hybrid bio-inspired evolutionary algorithm that combines Grey Wolf Optimization (GWO) with Harris Hawks Optimization (HHO) for feature selection. The motivation for utilizing GWO and HHO stems from their bio-inspired nature and their demonstrated success in optimization problems. We aim to leverage the strengths of these algorithms to enhance the effectiveness of feature selection in microarray-based cancer classification. We selected leave-one-out cross-validation (LOOCV) to evaluate the performance of two widely used classifiers, k-nearest neighbors (KNN) and support vector machine (SVM), on high-dimensional cancer microarray data. The proposed method is extensively tested on six publicly available cancer microarray datasets, and a comprehensive comparison with recently published methods is conducted. Our hybrid algorithm demonstrates its effectiveness in improving classification performance, surpassing alternative approaches in terms of precision. The outcomes confirm the capability of our method to substantially improve both the precision and efficiency of cancer classification, thereby advancing the development of more efficient treatment strategies. The proposed hybrid method offers a promising solution to the gene selection problem in microarray-based cancer classification. It improves the accuracy and efficiency of cancer diagnosis and treatment, and its superior performance compared to other methods highlights its potential applicability in real-world cancer classification tasks. By harnessing the complementary search mechanisms of GWO and HHO, we leverage their bio-inspired behavior to identify informative genes relevant to cancer diagnosis and treatment.
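LOOCV with a KNN classifier, the fitness evaluation used above, trains on all samples but one and tests on the held-out sample, repeating for every sample. A minimal 1-NN sketch with hypothetical two-feature expression profiles (real microarray data would have thousands of features, reduced by the selected gene subset):

```python
def euclid_sq(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nn_predict(train, query):
    """1-nearest-neighbor prediction: label of the closest training sample."""
    return min(train, key=lambda row: euclid_sq(row[0], query))[1]

def loocv_accuracy(data):
    """Leave-one-out cross-validation: each sample is tested against the rest."""
    hits = 0
    for i, (features, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += nn_predict(train, features) == label
    return hits / len(data)

# Hypothetical 2-gene expression profiles with binary class labels.
data = [((0.1, 0.2), "A"), ((0.2, 0.1), "A"), ((0.15, 0.25), "A"),
        ((0.9, 0.8), "B"), ((0.8, 0.9), "B"), ((0.85, 0.75), "B")]
acc = loocv_accuracy(data)
```

In a wrapper feature-selection loop, this accuracy would be the fitness score that GWO/HHO maximizes over candidate gene subsets.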
Groundwater is an important source of drinking water. Groundwater pollution severely endangers drinking water safety and sustainable social development. In the case of groundwater pollution, the top priority is to identify pollution sources, as accurate information on pollution sources is the premise of efficient remediation. An appropriate pollution remediation scheme should then be developed according to the information on pollution sources, site conditions, and economic costs. The methods for identifying pollution sources mainly include geophysical exploration, geochemistry, isotopic tracing, and numerical modeling. Among these, only numerical modeling can recognize various kinds of information about pollution sources, while the other methods can each identify only a certain aspect. Groundwater remediation technologies can be divided into in-situ and ex-situ technologies according to the remediation location. In-situ remediation technologies enjoy low costs and a wide remediation range, but their performance is prone to be affected by environmental conditions and they may cause secondary pollution. Ex-situ remediation technologies boast high remediation efficiency, high processing capacity, and high treatment concentration, but suffer from high costs. Different pollution source identification methods and remediation technologies are applicable under different conditions. To achieve the expected identification and remediation results, it is feasible to combine several methods and technologies according to the actual hydrogeological conditions of contaminated sites and the nature of the pollutants. Additionally, detailed knowledge about the hydrogeological conditions and stratigraphic structure of the contaminated site is the basis of all work, regardless of the adopted identification methods or remediation technologies.
Purpose: Many science, technology and innovation (STI) resources are attached with several different labels. To automatically assign the resulting labels to an instance of interest, many approaches with good performance on benchmark datasets have been proposed for the multi-label classification task in the literature. Furthermore, several open-source tools implementing these approaches have also been developed. However, the characteristics of real-world multi-label patent and publication datasets are not completely in line with those of benchmark ones. Therefore, the main purpose of this paper is to comprehensively evaluate seven multi-label classification methods on real-world datasets. Research limitations: The three real-world datasets differ in statement, data quality, and purpose. Additionally, open-source tools designed for multi-label classification have intrinsic differences in their approaches to data processing and feature selection, which in turn impact the performance of a multi-label classification approach. In the near future, we will enhance experimental precision and reinforce the validity of the conclusions by employing more rigorous control over variables through expanded parameter settings. Practical implications: The observed Macro F1 and Micro F1 scores on real-world datasets typically fall short of those achieved on benchmark datasets, underscoring the complexity of real-world multi-label classification tasks. Approaches leveraging deep learning techniques offer promising solutions by accommodating the hierarchical relationships and interdependencies among labels. With ongoing enhancements in deep learning algorithms and large-scale models, the efficacy of multi-label classification tasks is expected to improve significantly, reaching a level of practical utility in the foreseeable future. Originality/value: (1) Seven multi-label classification methods are comprehensively compared on three real-world datasets. (2) The TextCNN and TextRCNN models perform better on small-scale datasets with a more complex hierarchical label structure and a more balanced document-label distribution. (3) The MLkNN method works better on the larger-scale dataset with a more unbalanced document-label distribution.
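The Macro F1 and Micro F1 metrics discussed above differ in how they aggregate per-label errors: macro averages each label's F1 equally, while micro pools the counts first and so is dominated by frequent labels. A minimal sketch with hypothetical per-label counts (not the paper's results):

```python
def f1(tp, fp, fn):
    """F1 score from true positives, false positives, and false negatives."""
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def macro_micro_f1(counts):
    """counts: list of per-label (tp, fp, fn) triples.
    Macro averages the per-label F1 scores; micro pools counts into one F1."""
    macro = sum(f1(*c) for c in counts) / len(counts)
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    return macro, f1(tp, fp, fn)

# Hypothetical counts for three labels with an unbalanced distribution:
# the first label is frequent and well-predicted, the last is rare and poorly predicted.
counts = [(90, 10, 10), (5, 5, 5), (1, 4, 4)]
macro, micro = macro_micro_f1(counts)
```

With an unbalanced label distribution like this, micro F1 exceeds macro F1 because the frequent label dominates the pooled counts, which is why the two metrics are reported together in multi-label evaluation.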
Currently, telecom fraud is expanding from the traditional telephone network to the Internet, and identifying fraudulent IPs is of great significance for reducing Internet telecom fraud and protecting consumer rights. However, existing telecom fraud identification methods based on blacklists, reputation, content, and behavioral characteristics perform well in the telephone network but are difficult to apply on the Internet, where IP (Internet Protocol) addresses change dynamically. To address this issue, we propose a fraudulent IP identification method based on homology detection and DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering (DC-FIPD). First, we analyze the geographic aggregation of fraudulent IPs and the homology of IP addresses. Next, the collected fraudulent IPs are clustered geographically to obtain their regional distribution. Then, we construct a fraudulent IP feature set, use a genetic optimization algorithm to determine the weights of the features, and design a method for calculating an IP risk value, which yields a risk-value threshold for fraudulent IPs. Finally, the risk value of a target IP is calculated and the IP is identified based on the threshold. Experimental results on a real-world telecom fraud detection dataset show that the DC-FIPD method achieves an average identification accuracy of 86.64% for fraudulent IPs, with a precision of 86.08%, a recall of 45.24%, and an F1-score of 59.31%, offering a comprehensive evaluation of its performance. These results highlight the DC-FIPD method's effectiveness in addressing the challenges of fraudulent IP identification.
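The geographic-clustering step described above can be sketched with scikit-learn's DBSCAN. The coordinates and the `eps`/`min_samples` values below are illustrative assumptions, not the paper's data or settings.

```python
# Minimal sketch: cluster made-up fraudulent-IP coordinates with DBSCAN.
import numpy as np
from sklearn.cluster import DBSCAN

# (latitude, longitude) of flagged IPs: two dense regions plus one outlier.
coords = np.array([
    [22.54, 114.05], [22.55, 114.06], [22.53, 114.04],   # region A
    [39.90, 116.40], [39.91, 116.41], [39.89, 116.39],   # region B
    [1.35, 103.82],                                       # isolated IP
])

# eps is the neighborhood radius in degrees (0.05 deg of latitude is roughly
# 5 km); min_samples is the smallest group DBSCAN will treat as a cluster.
# Points that belong to no cluster are labeled -1 (noise).
labels = DBSCAN(eps=0.05, min_samples=2).fit_predict(coords)
print(labels)
```

DBSCAN needs no preset cluster count and marks isolated IPs as noise, both of which fit the irregular regional distribution of fraud sources described in the abstract.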
The tell tail is usually placed on a triangular sail to display the state of the airflow on the sail surface. Judging the drift of the tell tail accurately during sailing is of great significance for achieving the best sailing effect. It is normally difficult for sailors, affected by strong sunlight and visual fatigue, to keep an eye on the tell tail for long periods and judge its changes accurately. We therefore adopt computer vision technology to help sailors judge the changes of the tell tail with ease. This paper proposes, for the first time, a method to classify sailboat tell tails based on deep learning and an expert guidance system, supported by a sailboat tell tail classification dataset annotated under expert guidance for interpreting tell tail states in different sea wind conditions. Considering that the expressive capability of computed features varies across visual tasks, the paper examines five tell tail features, which are re-encoded by an autoencoder and classified by an SVM classifier. All experimental samples were randomly divided into five groups; four groups were used as the training set to train the classifier, and the remaining group was used as the test set. The highest accuracy, 80.26%, was achieved with the deep features obtained through the ResNet network. The method can be used to assist sailors in making better judgments about tell tail changes during sailing.
Lung cancer is a leading cause of global mortality. Early detection of pulmonary tumors can significantly enhance patients' survival rate. Recently, various Computer-Aided Diagnostic (CAD) methods have been developed to detect pulmonary nodules with high accuracy. Nevertheless, the existing methodologies cannot achieve a high level of specificity and sensitivity. The present study introduces a novel model for Lung Cancer Segmentation and Classification (LCSC), which incorporates two improved architectures, namely an improved U-Net architecture and an improved AlexNet architecture. The LCSC model comprises two distinct stages. The first stage uses the improved U-Net architecture to segment candidate nodules extracted from the lung lobes. Subsequently, the improved AlexNet architecture is employed to classify lung cancer. In the first stage, the proposed model demonstrates a Dice accuracy of 0.855, a precision of 0.933, and a recall of 0.789 for the segmentation of candidate nodules. The improved AlexNet architecture attains 97.06% accuracy, a true positive rate of 96.36%, a true negative rate of 97.77%, a positive predictive value of 97.74%, and a negative predictive value of 96.41% for classifying pulmonary cancer as either benign or malignant. The proposed LCSC model is tested and evaluated on the publicly available dataset furnished by the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI), and exhibits remarkable performance compared with existing methods across various evaluation parameters.
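The metrics quoted above (Dice for segmentation; true positive and true negative rate for classification) can be made concrete with a short pure-Python sketch. The masks and confusion counts are invented illustrations, not the LIDC-IDRI results.

```python
# Segmentation metric: Dice coefficient between a predicted and a true mask.

def dice(pred, truth):
    """Dice coefficient: 2|A∩B| / (|A|+|B|) over flat 0/1 masks."""
    inter = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2 * inter / total if total else 1.0

pred_mask  = [0, 1, 1, 1, 0, 0, 1, 0]
truth_mask = [0, 1, 1, 0, 0, 1, 1, 0]
print(round(dice(pred_mask, truth_mask), 3))

# Classification metrics: sensitivity (true positive rate) and specificity
# (true negative rate) from a hypothetical benign/malignant confusion matrix.
tp, fn, tn, fp = 53, 2, 44, 1
sensitivity = tp / (tp + fn)   # fraction of malignant cases caught
specificity = tn / (tn + fp)   # fraction of benign cases correctly cleared
print(round(sensitivity, 3), round(specificity, 3))
```

Reporting both rates matters because, on imbalanced medical data, plain accuracy can look high while one of the two error types remains unacceptably frequent.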
Funding: funded by a major special project of PetroChina Company Limited (Nos. 2021DJ1003 and 2023ZZ2).
Abstract: The intricate distribution of oil and water in tight rocks makes pinpointing oil layers challenging. While conventional identification methods offer potential solutions, their limited accuracy precludes them from being effective when applied to unconventional reservoirs. This study employed nuclear magnetic resonance (NMR) spectrum decomposition to dissect the NMR T2 spectrum into multiple sub-spectra. Furthermore, it employed laboratory NMR experiments to ascertain the fluid properties of these sub-spectra, aiming to enhance identification accuracy. The findings indicate that fluids of distinct properties overlap in the T2 spectra, with bound water, movable water, bound oil, and movable oil appearing sequentially from the low-value zone to the high-value zone. Consequently, an oil layer classification scheme was proposed, which considers the physical properties of reservoirs, oil-bearing capacity, and the characteristics of both mobility and the oil-water two-phase flow. When applied to tight oil layer identification, the scheme's outcomes align closely with actual test results. A horizontal well, deployed based on these findings, has produced high-yield industrial oil flow, underscoring the precision and dependability of this new approach.
Funding: Equinor for financing the R&D project; the Institute of Science and Technology of Petroleum Geophysics of Brazil for supporting this research.
Abstract: We apply stochastic seismic inversion and Bayesian facies classification for porosity modeling and igneous rock identification in the presalt interval of the Santos Basin. This integration of seismic and well-derived information enhances reservoir characterization. Stochastic inversion and Bayesian classification are powerful tools because they permit addressing the uncertainties in the model. We used the ES-MDA algorithm to achieve the realizations equivalent to the percentiles P10, P50, and P90 of acoustic impedance, a novel method for acoustic inversion in presalt. The facies were divided into five: reservoir 1, reservoir 2, tight carbonates, clayey rocks, and igneous rocks. To deal with the overlaps in acoustic impedance values of facies, we included geological information using a priori probability, indicating that structural highs are reservoir-dominated. To illustrate our approach, we conducted porosity modeling using facies-related rock-physics models for rock-physics inversion in an area with a well drilled in a coquina bank, and evaluated the thickness and extension of an igneous intrusion near the carbonate-salt interface. The modeled porosity and the classified seismic facies are in good agreement with those observed in the wells. Notably, the coquina bank shows an improvement in porosity towards the top. The a priori probability model was crucial for limiting the clayey rocks to the structural lows. In Well B, the hit rate of the igneous rock in the three scenarios is higher than 60%, showing an excellent thickness-prediction capability.
Funding: supported by the National Key R&D Program of China (2019YFA0708303), the Sichuan Science and Technology Program (2021YFG0318), the Engineering Technology Joint Research Institute Project of CCDC-SWPU (CQXN-2021-03), the PetroChina Innovation Foundation (2020D-5007-0312), and the Key Projects of NSFC (61731016).
Abstract: During transient gas drilling conditions, the monitoring data often show obvious nonlinear fluctuations, which leads to large classification errors and time delays in commonly used intelligent classification models. Based on the structural features of data samples obtained from monitoring while drilling, this paper uses a convolution algorithm to extract the correlated features of multiple monitoring-while-drilling parameters changing over time, and applies an RBF network with nonlinear classification ability to classify the features. In the training process, a loss function component based on distance mean square error is used to effectively adjust the best clustering centers in the RBF network. Many field applications show that the recognition accuracy of this nonlinear classification network model for gas production, water production, and drill sticking is 97.32%, 95.25%, and 93.78%, respectively. Compared with the traditional convolutional neural network (CNN) model, the network structure not only improves the classification accuracy of conditions in the transition stage, but also greatly advances the time of risk identification; for the three common risks of gas production, water production, and drill sticking, identification is advanced by 56, 16, and 8 s, respectively. This gains valuable time for field crews to take correct risk-disposal measures, and fully demonstrates the applicability of the nonlinear classification neural network in oil and gas field exploration and development.
Funding: the support from the Physical Research Platform in the School of Physics of Sun Yat-sen University (PRPSP, SYSU); the National Natural Science Foundation of China (Grant No. 12074445); the Open Fund of the State Key Laboratory of Optoelectronic Materials and Technologies of Sun Yat-sen University (Grant No. OEMT-2022-ZTS-05).
Abstract: Single-molecule force spectroscopy (SMFS) measurements of the dynamics of biomolecules typically require identifying massive numbers of events and states from large data sets, such as extracting rupture forces from force-extension curves (FECs) in pulling experiments and identifying states from extension-time trajectories (ETTs) in force-clamp experiments. The former is often accomplished manually and hence is time-consuming and laborious, while the latter is always impeded by baseline drift. In this study, we attempt to identify events and states from SMFS experiments accurately and automatically with a machine learning approach that combines clustering and classification for event identification in SMFS (ACCESS). As demonstrated by analysis of a series of data sets, ACCESS can extract the rupture forces from FECs containing multiple unfolding steps and classify the rupture forces into the corresponding conformational transitions. Moreover, ACCESS successfully identifies the unfolded and folded states even when the ETTs display severe non-monotonic baseline drift. Besides, ACCESS is straightforward to use, as it requires only three easy-to-interpret parameters. As such, we anticipate that ACCESS will be a useful, easy-to-implement and high-performance tool for event and state identification across a range of single-molecule experiments.
Abstract: The KT-II layer in the Zananor Oilfield, Caspian Basin, Kazakhstan, contains carbonate reservoirs of various types. The complex pore structure of the reservoirs has made it difficult to identify watered-out zones with traditional logging interpretation methods. This study classifies the reservoirs on the basis of core analysis and establishes an identification model for watered-out layers in the field to effectively improve interpretation accuracy. Thin section analysis shows that there are three types of pores in the reservoirs, i.e., matrix pores, fractures, and dissolution vugs. A triple porosity model is used to calculate the porosities of the reservoirs, and the results are combined with core analysis to classify the reservoirs into fractured, matrix-pore, fracture-pore, and composite types; a classification standard is also proposed. Reservoirs of different types show different resistivity logging responses before and after watering-out. The pre-watering-out resistivities are reconstructed using a generalized neural network for the different reservoir types, and watered-out layers can then be effectively identified from the difference between the resistivity curves before and after watering-out. The results show that the watered-out layers identified with this method are consistent with measured data, thus serving as a reference for the evaluation of watered-out layers in the study area.
Funding: supported in part by the National Natural Science Foundation of China (61876011), the National Key Research and Development Program of China (2022YFB4703700), the Key Research and Development Program 2020 of Guangzhou (202007050002), and the Key-Area Research and Development Program of Guangdong Province (2020B090921003).
Abstract: Recently, there have been some attempts to use Transformers in 3D point cloud classification. In order to reduce computation, most existing methods focus on local spatial attention, but ignore point content and fail to establish relationships between distant but relevant points. To overcome the limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short. It exploits the locality of points in the feature space (content-based), clustering the sampled points with similar features into the same class and computing the self-attention within each class, thus enabling an effective trade-off between capturing long-range dependencies and computational complexity. We further introduce an inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in each branch separately. Extensive experiments show that our PointConT model achieves remarkable performance on point cloud shape classification. In particular, our method exhibits 90.3% Top-1 accuracy on the hardest setting of ScanObjectNN. Source code of this paper is available at https://github.com/yahuiliu99/PointConT.
Funding: financially supported by the National Key Research and Development Program (Grant No. 2022YFE0107000), the General Projects of the National Natural Science Foundation of China (Grant No. 52171259), and the High-Tech Ship Research Project of the Ministry of Industry and Information Technology (Grant No. [2021]342).
Abstract: Identification of the ice channel is a basic technology for developing intelligent ships in ice-covered waters, and is important to ensure the safety and economy of navigation. In the Arctic, merchant ships with low ice class often navigate in channels opened up by icebreakers. Navigation in the ice channel largely depends on the captain's maneuvering skills and experience, and the ship may get stuck if steered into ice fields off the channel. Under these circumstances, it is very important to study how to identify the boundary lines of ice channels with a reliable method. In this paper, a two-stage ice channel identification method is developed based on image segmentation and corner point regression. The first stage employs an image segmentation method to extract channel regions. In the second stage, an intelligent corner regression network is proposed to extract the channel boundary lines from the channel region. A non-intelligent angle-based filtering and clustering method is proposed and compared with the corner point regression network. The training and evaluation of the segmentation method and corner regression network are carried out on synthetic and real ice channel datasets. The evaluation results show that the method using the corner point regression network in the second stage achieves an accuracy of 73.33% on the synthetic ice channel dataset and 70.66% on the real ice channel dataset, and the processing speed can reach 14.58 frames per second.
Funding: supported by the Yunnan Major Scientific and Technological Projects (Grant No. 202302AD080001) and the National Natural Science Foundation of China (No. 52065033).
Abstract: When building a classification model, the scenario where the samples of one class significantly outnumber those of the other class is called data imbalance. Data imbalance causes the trained classification model to favor the majority class (usually defined as the negative class), which may harm the accuracy of the minority class (usually defined as the positive class) and lead to poor overall performance of the model. This article proposes a method called MSHR-FCSSVM for imbalanced data classification, based on a new hybrid resampling approach (MSHR) and a new fine cost-sensitive support vector machine (CS-SVM) classifier (FCSSVM). The MSHR measures the separability of each negative sample through its Silhouette value, calculated using the Mahalanobis distance between samples; on this basis, so-called pseudo-negative samples are screened out to generate new positive samples through linear interpolation (over-sampling step) and are finally deleted (under-sampling step). This approach replaces pseudo-negative samples with generated new positive samples one by one to clear up the inter-class overlap on the borderline, without changing the overall scale of the dataset. The FCSSVM is an improved version of the traditional CS-SVM. It simultaneously considers the influences of both the imbalance in sample numbers and the class distribution on classification, and finely tunes the class cost weights with the rime-ice (RIME) optimization algorithm, using cross-validation accuracy as the fitness function, to accurately adjust the classification borderline. To verify the effectiveness of the proposed method, a series of experiments are carried out on 20 imbalanced datasets, including both mildly and extremely imbalanced ones. The experimental results show that the MSHR-FCSSVM method performs better than the comparison methods in most cases, and that both the MSHR and the FCSSVM play significant roles.
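The over-sampling step above generates synthetic positives by linear interpolation between existing samples. A minimal SMOTE-style sketch follows; it is an illustration of the interpolation idea only, not the full MSHR procedure (which also screens pseudo-negatives via Mahalanobis-distance Silhouette values), and the sample values are invented.

```python
# SMOTE-style over-sampling: new positives on segments between existing ones.
import random

def interpolate(a, b, t):
    """Point at fraction t along the segment from sample a to sample b."""
    return [x + t * (y - x) for x, y in zip(a, b)]

random.seed(0)
positives = [[1.0, 2.0], [1.5, 2.5], [2.0, 1.8]]   # minority-class samples
synthetic = []
for _ in range(3):
    a, b = random.sample(positives, 2)             # pick two distinct positives
    synthetic.append(interpolate(a, b, random.random()))
print(synthetic)
```

Because every synthetic point lies on a segment between two real positives, the new samples stay inside the minority-class region rather than drifting into majority-class territory.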
基金financially supported by the National Key Research and Development Program of China(2022YFB3706800,2020YFB1710100)the National Natural Science Foundation of China(51821001,52090042,52074183)。
Abstract: The complexity of the sand-casting process, combined with interactions between process parameters, makes it difficult to control casting quality, resulting in a high scrap rate. A strategy based on a data-driven model was proposed to reduce casting defects and improve production efficiency; it includes a random forest (RF) classification model, feature importance analysis, and process parameter optimization with Monte Carlo simulation. The collected data, covering four types of defects and the corresponding process parameters, was used to construct the RF model. Classification results show a recall rate above 90% for all categories. The Gini index was used to assess the importance of the process parameters in the formation of each defect in the RF model. Finally, the classification model was applied to different production conditions for quality prediction. In the case of process parameter optimization for gas porosity defects, the model serves as the experimental process in the Monte Carlo method to estimate a better temperature distribution. The prediction model, when applied in the factory, greatly improved the efficiency of defect detection: the scrap rate decreased from 10.16% to 6.68%.
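The RF-plus-Gini-importance pipeline above can be sketched with scikit-learn on synthetic data. The feature names and the relationship driving the label are invented for illustration; the paper's process parameters and defect data are not public.

```python
# Random-forest defect classifier with Gini-based feature ranking (synthetic).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
n = 400
# Invented process parameters: pouring temperature, mold humidity, gas content.
X = rng.normal(size=(n, 3))
# Make "gas content" (column 2) drive a synthetic porosity-defect label.
y = (X[:, 2] + 0.1 * rng.normal(size=n) > 0.5).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
# feature_importances_ is the normalized mean decrease in Gini impurity.
for name, imp in zip(["pour_temp", "mold_humidity", "gas_content"],
                     clf.feature_importances_):
    print(f"{name}: {imp:.3f}")
```

On this construction the ranking correctly singles out the column that generates the label, which is exactly how the paper uses the Gini index to attribute defects to process parameters.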
Funding: the Competitive Research Fund of the University of Aizu, Japan.
Abstract: Person identification is one of the most vital tasks for network security. People are more concerned about their security due to traditional passwords becoming weaker or leaking in various attacks. In recent decades, fingerprints and faces have been widely used for person identification, which carries the risk of information leakage as a result of reproducing fingers or faces from a snapshot. Recently, research has focused on creating an identifiable pattern that cannot be falsely reproduced, by capturing psychological and behavioral information about a person using vision- and sensor-based techniques. Most existing studies used very complex patterns, which need special training and attention to remember and fail to capture a person's psychological and behavioral information properly. To overcome these problems, this research devised a novel dynamic hand gesture-based person identification system using a Leap Motion sensor. The study developed two hand gesture-based pattern datasets for the experiments, containing more than 500 samples collected from 25 subjects. Various static and dynamic features were extracted from the hand geometry. Random forest was used to measure feature importance via the Gini index. Finally, a support vector machine was implemented for person identification, and its performance was evaluated using identification accuracy. The experimental results showed that the proposed system produced an identification accuracy of 99.8% for arbitrary hand gesture-based patterns and 99.6% for the same dynamic hand gesture-based patterns. These results indicate that the proposed system can be used for person identification in the security field.
Funding: supported by the National Natural Science Foundation of China under Grant 61602162 and the Hubei Provincial Science and Technology Plan Project under Grant 2023BCB041.
Abstract: Network traffic identification is critical for maintaining network security and meeting the various demands of network applications. However, network traffic data typically possesses high dimensionality and complexity, leading to practical problems in traffic identification data analytics. Since the original Dung Beetle Optimizer (DBO) algorithm, Grey Wolf Optimization (GWO) algorithm, Whale Optimization Algorithm (WOA), and Particle Swarm Optimization (PSO) algorithm suffer from slow convergence and easily fall into local optima, an Improved Dung Beetle Optimizer (IDBO) algorithm is proposed for network traffic identification. First, a Sobol sequence is utilized to initialize the dung beetle population, laying the foundation for finding the global optimal solution. Next, an integration of Levy flight and a golden sine strategy is suggested to give dung beetles a greater probability of exploring unvisited areas, escaping local optima, and converging more effectively towards a global optimum. Finally, an adaptive weight factor is utilized to enhance the search capabilities of the original DBO algorithm and accelerate convergence. With the improvements above, the proposed IDBO algorithm is applied to traffic identification data analytics and feature selection, so as to find the optimal subset for K-Nearest Neighbor (KNN) classification. Simulation experiments on the CICIDS2017 dataset verify the effectiveness of the proposed IDBO algorithm against the original DBO, GWO, WOA, and PSO algorithms. The results show that, compared with the other algorithms, accuracy and recall are improved by 1.53% and 0.88% in binary classification; in multi-classification, identification of the Distributed Denial of Service (DDoS) class is the most effective, with improvements of 5.80% and 0.33% in accuracy and recall, respectively. The proposed IDBO algorithm is therefore effective in increasing the efficiency of traffic identification and in overcoming the slow convergence and local-optimum trapping of the original DBO algorithm when dealing with high-dimensional data analytics and feature selection for network traffic identification.
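The Sobol-sequence population initialization described above can be sketched with SciPy's quasi-Monte Carlo module. The dimension and search bounds below are illustrative assumptions, not the paper's settings.

```python
# Sobol-sequence initialization of a swarm population within given bounds.
import numpy as np
from scipy.stats import qmc

dim = 5                                    # number of decision variables
sampler = qmc.Sobol(d=dim, scramble=True, seed=0)
unit = sampler.random_base2(m=4)           # 2**4 = 16 points in [0, 1)^dim
lo, hi = np.zeros(dim), np.ones(dim) * 10
population = qmc.scale(unit, lo, hi)       # stretch to the search bounds

# Low-discrepancy points cover the search space more evenly than uniform
# random draws, giving the optimizer a better-spread starting population.
print(population.shape)
```

Drawing a power-of-two number of points via `random_base2` preserves the balance properties of the Sobol sequence, which is why 16 rather than an arbitrary count is used here.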
Funding: supported by the National Natural Science Foundation of China (42202133, 42072174, 42130803, 41872148), the PetroChina Science and Technology Innovation Fund (2023DQ02-0106), and the PetroChina Basic Technology Project (2021DJ0101).
Abstract: Taking the Lower Permian Fengcheng Formation shale in the Mahu Sag of the Junggar Basin, NW China, as an example, core observation, test analysis, geological analysis, and numerical simulation were applied to identify shale oil micro-migration. Hydrocarbon micro-migration in shale oil was quantitatively evaluated and verified with a self-created hydrocarbon expulsion potential method, and the petroleum geological significance of shale oil micro-migration evaluation was determined. Results show that significant micro-migration can be recognized between organic-rich and organic-poor laminae. The organic-rich laminae have strong hydrocarbon generation ability. The heavy hydrocarbon components were preferentially retained by kerogen swelling or adsorption, while the light components migrated to, and accumulated as free oil in, the interbedded felsic or carbonate organic-poor laminae. About 69% of the Fengcheng Formation shale samples in Well MY1 exhibit hydrocarbon charging, while 31% exhibit hydrocarbon expulsion. The reliability of the micro-migration evaluation was verified by combining group components based on the geochromatography effect, two-dimensional nuclear magnetic resonance analysis, and the geochemical behavior of inorganic manganese during hydrocarbon migration. Micro-migration is a bridge connecting the hydrocarbon accumulation elements in shale formations: it reflects the whole process of shale oil generation, expulsion, and accumulation, and controls the content and composition of shale oil. The identification and evaluation of shale oil micro-migration will provide new perspectives on the dynamically differential enrichment mechanism of shale oil and on establishing a "multi-peak model" of oil generation in shale.
Abstract: The network of Himalayan roadways and highways connects some remote regions of valleys and hill slopes, which is vital for India's socio-economic growth. Due to natural and artificial factors, the frequency of slope instabilities along these networks has been increasing over the last few decades. Assessing the stability of natural and artificial slopes along these connecting road networks is essential for operating the roads safely throughout the year. Several rock mass classification methods are generally used to assess the strength and deformability of rock mass. This study assesses slope stability along NH-1A in the Ramban district of the North Western Himalayas. Various structurally and non-structurally controlled rock mass classification systems have been applied to assess the stability conditions of 14 slopes. Kinematic analysis was performed along with the geological strength index (GSI), rock mass rating (RMR), continuous slope mass rating (CoSMR), slope mass rating (SMR), and Q-slope. The SMR rates three slopes as completely unstable, while CoSMR rates four slopes as completely unstable. The stability of all slopes was also analyzed using a design chart under dynamic and static conditions by slope stability rating (SSR) for factors of safety (FoS) of 1.2 and 1, respectively. Q-slope with a probability of failure (PoF) of 1% gives two slopes as stable. Stable slope angles have been determined from the Q-slope safe-angle equation and the SSR design chart based on the FoS. The value ranges given by the different empirical classifications were RMR (37-74), GSI (27.3-58.5), SMR (11-59), and CoSMR (3.39-74.56). Good relationships were found between RMR and SSR and between RMR and GSI, with correlation coefficients (R^2) of 0.815 and 0.6866, respectively. Lastly, a comparative stability assessment of all these slopes based on the above classifications was performed to identify the most critical slope along this road.
Abstract: Background: Cavernous transformation of the portal vein (CTPV) due to portal vein obstruction is a rare vascular anomaly defined as the formation of multiple collateral vessels in the hepatic hilum. This study aimed to investigate the imaging features of the intrahepatic portal vein in adult patients with CTPV and establish the relationship between the manifestations of the intrahepatic portal vein and the progression of CTPV. Methods: We retrospectively analyzed 14 CTPV patients in Beijing Tsinghua Changgung Hospital. All patients underwent both direct portal venography (DPV) and computed tomography angiography (CTA) to reveal the manifestations of the portal venous system. The vessels measured included the left portal vein (LPV), right portal vein (RPV), main portal vein (MPV) and the portal vein bifurcation (PVB). Results: Nine males and 5 females, with a median age of 40.5 years, were included in the study. No significant difference was found in the diameters of the LPV or RPV measured by DPV and CTA. The visualization in terms of LPV, RPV and PVB measured by DPV was higher than that by CTA. There was a significant association between LPV/RPV and PVB/MPV in terms of visibility revealed with DPV (P = 0.01), while this association was not observed with CTA. According to the imaging features of the portal vein measured by DPV, CTPV was classified into three categories to facilitate diagnosis and treatment. Conclusions: DPV was more accurate than CTA for revealing the course of the intrahepatic portal vein in patients with CTPV. The classification of CTPV, which originated from the imaging features of the portal vein revealed by DPV, may provide a new perspective for the diagnosis and treatment of CTPV.
Funding: the Deputyship for Research and Innovation, "Ministry of Education" in Saudi Arabia, for funding this research (IFKSUOR3-014-3).
Abstract: In this study, our aim is to address the problem of gene selection by proposing a hybrid bio-inspired evolutionary algorithm that combines Grey Wolf Optimization (GWO) with Harris Hawks Optimization (HHO) for feature selection. The motivation for utilizing GWO and HHO stems from their bio-inspired nature and their demonstrated success in optimization problems. We aim to leverage the strengths of these algorithms to enhance the effectiveness of feature selection in microarray-based cancer classification. We selected leave-one-out cross-validation (LOOCV) to evaluate the performance of two widely used classifiers, k-nearest neighbors (KNN) and support vector machine (SVM), on high-dimensional cancer microarray data. The proposed method is extensively tested on six publicly available cancer microarray datasets, and a comprehensive comparison with recently published methods is conducted. Our hybrid algorithm demonstrates its effectiveness in improving classification performance, surpassing alternative approaches in terms of precision. The outcomes confirm the capability of our method to substantially improve both the precision and efficiency of cancer classification, thereby advancing the development of more efficient treatment strategies. The proposed hybrid method offers a promising solution to the gene selection problem in microarray-based cancer classification. It improves the accuracy and efficiency of cancer diagnosis and treatment, and its superior performance compared to other methods highlights its potential applicability in real-world cancer classification tasks. By harnessing the complementary search mechanisms of GWO and HHO, we leverage their bio-inspired behavior to identify informative genes relevant to cancer diagnosis and treatment.
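The LOOCV protocol above can be sketched with scikit-learn: every sample is held out exactly once, which suits the small-sample, high-dimensional microarray setting. The data here is synthetic and the GWO/HHO gene-selection wrappers are omitted; only the evaluation step is shown.

```python
# Leave-one-out cross-validation of a KNN classifier on a candidate gene subset.
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
n, genes = 30, 200                       # 30 samples, 200 synthetic "genes"
X = rng.normal(size=(n, genes))
y = (X[:, 0] > 0).astype(int)            # make gene 0 informative

# Score a candidate subset (here simply the first five columns); a wrapper
# such as GWO/HHO would propose subsets and use this score as its fitness.
scores = cross_val_score(KNeighborsClassifier(n_neighbors=3),
                         X[:, :5], y, cv=LeaveOneOut())
print(f"LOOCV accuracy = {scores.mean():.3f}")
```

With only tens of samples, LOOCV wastes no data on a fixed holdout split, at the cost of fitting one model per sample.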
Funding: Funded by the National Natural Science Foundation of China (41907175), the Open Fund of Key Laboratory (WSRCR-2023-01), and a project of the China Geological Survey (DD20230459).
Abstract: Groundwater is an important source of drinking water, and groundwater pollution severely endangers drinking water safety and sustainable social development. In the event of groundwater pollution, the top priority is to identify the pollution sources, as accurate information on them is the premise of efficient remediation. An appropriate remediation scheme should then be developed according to the information on pollution sources, site conditions, and economic costs. The methods for identifying pollution sources mainly include geophysical exploration, geochemistry, isotopic tracing, and numerical modeling. Among these, only numerical modeling can recover multiple kinds of information on pollution sources, while the other methods can each identify only a certain aspect of them. Groundwater remediation technologies can be divided into in-situ and ex-situ technologies according to the remediation location. In-situ technologies offer low costs and a wide remediation range, but their performance is easily affected by environmental conditions and they may cause secondary pollution. Ex-situ technologies provide high remediation efficiency, high processing capacity, and high treatable concentrations, but at high cost. Different source-identification methods and remediation technologies suit different conditions. To achieve the expected identification and remediation results, several methods and technologies can be combined according to the actual hydrogeological conditions of contaminated sites and the nature of the pollutants. Additionally, detailed knowledge of the hydrogeological conditions and stratigraphic structure of the contaminated site is the basis of all work, regardless of the identification methods or remediation technologies adopted.
Funding: Supported by the National Natural Science Foundation of China (Grant Numbers 72074014 and 72004012).
Abstract: Purpose: Many science, technology and innovation (STI) resources carry several different labels. To assign such labels automatically to a given instance, many approaches with good performance on benchmark datasets have been proposed for the multi-label classification task, and several open-source tools implementing these approaches have been developed. However, the characteristics of real-world multi-label patent and publication datasets are not completely in line with those of benchmark ones. Therefore, the main purpose of this paper is to comprehensively evaluate seven multi-label classification methods on real-world datasets. Research limitations: The three real-world datasets differ in statement, data quality, and purpose. Additionally, open-source tools designed for multi-label classification differ intrinsically in their data processing and feature selection, which in turn affects the performance of a multi-label classification approach. In the near future, we will enhance experimental precision and reinforce the validity of conclusions by exercising more rigorous control over variables through expanded parameter settings. Practical implications: The Macro F1 and Micro F1 scores observed on real-world datasets typically fall short of those achieved on benchmark datasets, underscoring the complexity of real-world multi-label classification tasks. Approaches leveraging deep learning techniques offer promising solutions by accommodating the hierarchical relationships and interdependencies among labels. With ongoing enhancements in deep learning algorithms and large-scale models, the efficacy of multi-label classification is expected to improve significantly, reaching a level of practical utility in the foreseeable future. Originality/value: (1) Seven multi-label classification methods are comprehensively compared on three real-world datasets. (2) The TextCNN and TextRCNN models perform better on small-scale datasets with a more complex hierarchical label structure and a more balanced document-label distribution. (3) The MLkNN method works better on the larger-scale dataset with a more unbalanced document-label distribution.
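The Macro F1 and Micro F1 metrics used above differ in how they aggregate per-label errors: Macro averages the F1 of each label equally, while Micro pools true/false positives and negatives across all labels, so frequent labels dominate. A minimal multi-label illustration with scikit-learn, on invented labels and predictions:

```python
# Macro vs. Micro F1 on a toy multi-label prediction (3 labels, 4 docs).
import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]])
y_pred = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]])

macro = f1_score(y_true, y_pred, average="macro")  # mean of per-label F1
micro = f1_score(y_true, y_pred, average="micro")  # pooled TP/FP/FN counts
print(f"Macro F1 = {macro:.3f}, Micro F1 = {micro:.3f}")
```

Here label 0 is predicted perfectly (F1 = 1) while labels 1 and 2 each miss one positive (F1 = 2/3), so Macro F1 = 7/9 while Micro F1 = 0.8; on unbalanced label distributions the gap between the two can be much larger.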
Funding: Funded by the National Natural Science Foundation of China under Grant No. 62002103, the Henan Province Science Foundation for Youths (No. 222300420058), the Henan Province Science and Technology Research Project (No. 232102321064), and the Teacher Education Curriculum Reform Research Priority Project (No. 2023-JSJYZD-011).
Abstract: Telecom fraud is currently expanding from the traditional telephone network to the Internet, and identifying fraudulent IPs is of great significance for reducing Internet telecom fraud and protecting consumer rights. However, existing telecom fraud identification methods based on blacklists, reputation, content, and behavioral characteristics perform well in the telephone network but are difficult to apply on the Internet, where IP (Internet Protocol) addresses change dynamically. To address this issue, we propose a fraudulent IP identification method based on homology detection and DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering (DC-FIPD). First, we analyze the geographic aggregation of fraudulent IPs and the homology of IP addresses. Next, the collected fraudulent IPs are clustered geographically to obtain their regional distribution. Then, we construct a fraudulent IP feature set, use a genetic optimization algorithm to determine the weights of the features, and design a method for calculating an IP's risk value, which yields a risk-value threshold for fraudulent IPs. Finally, the risk value of a target IP is calculated, and the IP is identified by comparison against the threshold. Experimental results on a real-world telecom fraud detection dataset show that the DC-FIPD method achieves an average identification accuracy of 86.64% for fraudulent IPs, with a precision of 86.08%, a recall of 45.24%, and an F1-score of 59.31%, offering a comprehensive evaluation of its performance in fraud detection. These results highlight the DC-FIPD method's effectiveness in addressing the challenges of fraudulent IP identification.
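The geographic clustering step can be sketched with scikit-learn's DBSCAN: points within `eps` of at least `min_samples - 1` neighbors form dense regions, and isolated points are labeled noise (`-1`). The coordinates and parameters below are illustrative placeholders, not those used in the DC-FIPD experiments.

```python
# DBSCAN over (latitude, longitude) of flagged IPs to find regional
# concentrations of fraudulent activity; -1 marks noise.
import numpy as np
from sklearn.cluster import DBSCAN

coords = np.array([
    [22.54, 114.06], [22.55, 114.05], [22.53, 114.07],  # dense region A
    [39.90, 116.40], [39.91, 116.41],                    # dense region B
    [1.35, 103.82],                                      # isolated point
])
labels = DBSCAN(eps=0.1, min_samples=2).fit_predict(coords)
print(labels)  # -> [ 0  0  0  1  1 -1]
```

Note that Euclidean distance on raw degrees is only a rough proxy at this scale; for production use, a haversine metric on radians (also supported by DBSCAN) would be the more careful choice.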
Funding: Supported by the Shandong Provincial Key Research Project of Undergraduate Teaching Reform (No. Z2022218), the Fundamental Research Funds for the Central Universities (No. 202113028), the Graduate Education Promotion Program of Ocean University of China (No. HDJG20006), and the Sailing Laboratory of Ocean University of China.
Abstract: The tell tail is usually placed on the triangular sail to display the running state of the airflow over the sail surface. Judging the drift of the tell tail accurately during sailing is of great significance for achieving the best sailing effect. Normally, it is difficult for sailors to keep an eye on the tell tail for a long time and judge its changes accurately, affected by strong sunlight and visual fatigue. We therefore adopt computer vision technology to help sailors judge the changes of the tell tail with ease. This paper proposes, for the first time, a method to classify sailboat tell tails based on deep learning and an expert guidance system, supported by a sailboat tell tail classification dataset built on expert guidance for interpreting tell tail states under different sea wind conditions, and evaluates feature extraction performance. Considering that the expressive capability of computational features varies across visual tasks, the paper focuses on five tell tail computing features, which are recoded by an autoencoder and classified by an SVM classifier. All experimental samples were randomly divided into five groups; four groups were selected as the training set to train the classifier, and the remaining group was used as the test set. The highest accuracy, achieved on the basis of deep features extracted by the ResNet network, was 80.26%. The method can be used to assist sailors in making better judgements about tell tail changes during sailing.
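The classification stage described above can be approximated as follows: features (random placeholders below, standing in for the ResNet/autoencoder features) are split into five groups, with four used to train an SVM and one held out for testing, rotating so every group is tested once. This is a sketch of the protocol only, on synthetic data.

```python
# 5-fold train/test protocol: 4 groups train an RBF-kernel SVM,
# the 5th is held out, rotating across all folds.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 32))      # placeholder deep features
y = rng.integers(0, 2, size=100)    # placeholder tell-tail states

accs = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True,
                                 random_state=0).split(X):
    clf = SVC(kernel="rbf").fit(X[train_idx], y[train_idx])
    accs.append(clf.score(X[test_idx], y[test_idx]))
print(f"mean accuracy over 5 folds: {np.mean(accs):.3f}")
```

On real features, reporting the mean and spread across the five folds gives a more stable estimate than the single split implied by "the remaining one group".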
Funding: Supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) (Grant Number IMSIU-RP23044).
Abstract: Lung cancer is a leading cause of mortality worldwide. Early detection of pulmonary tumors can significantly enhance patients' survival rate. Recently, various Computer-Aided Diagnostic (CAD) methods have been developed to detect pulmonary nodules with high accuracy. Nevertheless, existing methodologies do not achieve a high level of specificity and sensitivity. The present study introduces a novel model for Lung Cancer Segmentation and Classification (LCSC), which incorporates two improved architectures, namely an improved U-Net architecture and an improved AlexNet architecture. The LCSC model comprises two distinct stages. The first stage uses the improved U-Net architecture to segment candidate nodules extracted from the lung lobes. Subsequently, the improved AlexNet architecture is employed to classify lung cancer. In the first stage, the proposed model achieves a Dice score of 0.855, a precision of 0.933, and a recall of 0.789 for the segmentation of candidate nodules. The improved AlexNet architecture attains 97.06% accuracy, a true positive rate of 96.36%, a true negative rate of 97.77%, a positive predictive value of 97.74%, and a negative predictive value of 96.41% for classifying pulmonary cancer as either benign or malignant. The proposed LCSC model is tested and evaluated on the publicly available dataset furnished by the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI). Across various evaluation parameters, the proposed technique exhibits remarkable performance compared with existing methods.
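The Dice score reported for the segmentation stage measures the overlap between a predicted mask and the ground-truth mask as 2*|A and B| / (|A| + |B|). A minimal computation on toy binary masks (not data from the study):

```python
# Dice coefficient between two binary segmentation masks.
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    """2 * |intersection| / (|pred| + |truth|) over boolean masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum())

pred  = np.array([[0, 1, 1], [0, 1, 0]])
truth = np.array([[0, 1, 0], [1, 1, 0]])
print(dice(pred, truth))  # 2*2 / (3+3) -> 0.666...
```

Unlike pixel accuracy, Dice ignores the (usually vast) true-negative background, which is why it is the standard metric for small structures such as lung nodules.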