Although objectivity is mainly accounted for in terms of linguistic thought and communication, in this article I aim to show that at least one condition of possibility for our understanding of objectivity is grounded on a prepredicative, i.e. pre-linguistic and pre-communicative, level. I endorse a Husserlian viewpoint on the issue, and I try to develop some aspects of the Husserlian account of three-dimensional thing-perception, by means of which I show how prepredicative experience can actually offer us a fundamental element of our common understanding of objectivity. In doing this, it will be necessary to acknowledge thing-perception as being primarily intertwined with indeterminacy. I claim that only on the basis of such an intuitive and prepredicative access to things as partially indeterminate, first, and as determinable, second, is it possible to have an understanding of the world as something (at least partially) independent from the intuition(s) all subjects can have of it. By adding a consciousness of the thing as accessible to other subjects, one achieves a vision of the thing as fully determinate in itself. This "vision", however, makes one aware of the determination of the thing as lying beyond any intuitive grasp of it. The result is thus that the prepredicative constitution of our basic sense of objectivity leads us to intend the world as something which should be accounted for (also) by means of sources different from intuition.
For a physically possible deformation field of a continuum, the deformation gradient F can be decomposed into the direct sum of a symmetric tensor S and an orthogonal tensor R; this is called the S-R decomposition theorem. In this paper, the existence and uniqueness of the S-R decomposition are proved by employing matrix and tensor methods. A brief proof of its objectivity is also given.
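The additive split described in the abstract above can be illustrated in two dimensions. The sketch below is not the paper's proof or construction, which the abstract does not give. It assumes that the rotation angle is fixed by the antisymmetric part of F through sin(theta) = (F21 - F12) / 2 and that the branch |theta| <= pi/2 is taken. The function name `sr_decompose_2d` and the sample matrix are hypothetical.

```python
import math

def sr_decompose_2d(F):
    """Split a 2x2 deformation gradient F into S (symmetric) + R (orthogonal).

    Assumption of this sketch: theta is determined by the antisymmetric
    part of F, sin(theta) = (F[1][0] - F[0][1]) / 2, with |theta| <= pi/2.
    """
    sin_t = (F[1][0] - F[0][1]) / 2.0
    if abs(sin_t) > 1.0:
        # No real rotation angle exists; F is outside the admissible range.
        raise ValueError("antisymmetric part too large for a real rotation angle")
    cos_t = math.sqrt(1.0 - sin_t * sin_t)
    # Plane rotation by theta; its antisymmetric part equals that of F.
    R = [[cos_t, -sin_t], [sin_t, cos_t]]
    # What remains after removing R is symmetric by construction.
    S = [[F[i][j] - R[i][j] for j in range(2)] for i in range(2)]
    return S, R

# Hypothetical deformation gradient used only for illustration.
F = [[1.10, -0.30], [0.50, 0.95]]
S, R = sr_decompose_2d(F)
print("S =", S)
print("R =", R)
```

Because R carries the entire antisymmetric part of F, the remainder S = F - R has equal off-diagonal entries, which is the property the theorem asserts in general.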
This paper presents an analysis of how objectivity is discursively constructed in journalistic narratives by drawing on the theories of viewpoint and mental space in Cognitive Linguistics. It is posited that at least three mental spaces are projected by a narrative discourse, i.e., a narrated event space, a narrating space, and a basic space, and that the distance between the first two spaces determines the degree of objectivity in the narrative discourse. A schema which represents the configuration of the different spaces is proposed and applied in the analysis of journalistic narratives to explore the strategies of objectivity construction. The analysis reveals that what the different journalistic narratives have in common in the construction of objectivity is that they distance the narrated event space from the narrating space, with the former being foregrounded in the viewpoint arrangement.
The world is facing a once-in-a-lifetime situation: the COVID-19 pandemic. During the pandemic, the World Health Organization announced an infodemic as well. This infodemic caused infollution and sparked many controversies. Pandemics as extraordinary occurrences are always attractive to historians. However, infodemics and biased information threaten objective history-writing. Objectivity as it regards historians is already a much-discussed subject. In this commentary, the fundamental theories about objectivity are first delineated. Second, the relationship between the infodemic and the COVID-19 pandemic is explained. Lastly, the problems regarding objectivity in the historiography of the COVID-19 pandemic are explored.
China English, as one of the English varieties, is an objective reality. It is different from Chinglish, which is an interlanguage of Chinese learners of English. This paper presents the definition of China English, its objectivity, and its manifestations in terms of pronunciation, vocabulary, syntax and text.
The history of science and medicine has long been steeped in the notion that they are objective (untainted by the philosophical and ideological ebbs and flows of society) and utilitarian (doing what is best for the greater good). Because of this, scientific and medical epistemologies and praxis are often held in an esteem that is unquestioned, celebrated, and occasionally unchecked. A closer look at the history of science and medicine, however, readily reveals the extent to which the milieu of society has informed scientific and medical endeavors. As such, an understanding of how the subjectivities of scientific and medical endeavors are situated within our contemporary disciplines and practices is significant to one's ability to truly understand said disciplines. Likewise, such an evaluation will provide insight into our role in perpetuating the illusion of objectivity in these fields. With this in mind, this paper provides a philosophical and historical examination of the concept of objectivity (in contrast to subjectivity) in science and medicine.
Optical image-based ship detection can ensure the safety of ships and promote the orderly management of ships in offshore waters. Current deep learning research on optical image-based ship detection mainly focuses on improving one-stage detectors for real-time ship detection but sacrifices detection accuracy. To solve this problem, we present a hybrid ship detection framework named EfficientShip. The core parts of EfficientShip are DLA-backboned object location (DBOL) and CascadeRCNN-guided object classification (CROC). The DBOL is responsible for finding potential ship objects, and the CROC is used to categorize them. We also design a pixel-spatial-level data augmentation (PSDA) to reduce the risk of detection model overfitting. We compare the proposed EfficientShip with state-of-the-art (SOTA) approaches from the literature on a ship detection dataset called Seaships. Experiments show that our ship detection framework achieves 99.63% mAP at 45 fps, which is much better than 8 SOTA approaches in detection accuracy and also meets the requirements of real-time application scenarios.
Road traffic monitoring is an important topic widely discussed among researchers. Systems used to monitor traffic frequently rely on cameras mounted on bridges or roadsides. However, aerial images provide the flexibility to use mobile platforms to detect the location and motion of vehicles over a larger area. To this end, different models have shown the ability to recognize and track vehicles. However, these methods are not mature enough to produce accurate results in complex road scenes. Therefore, this paper presents an algorithm that combines state-of-the-art techniques for identifying and tracking vehicles in conjunction with image bursts. The extracted frames were converted to grayscale, followed by the application of a georeferencing algorithm to embed coordinate information into the images. A masking technique eliminated irrelevant data and reduced the computational cost of the overall monitoring system. Next, Sobel edge detection combined with Canny edge detection and the Hough line transform was applied for noise reduction. After preprocessing, a blob detection algorithm detected the vehicles. Vehicles of varying sizes were detected by implementing a dynamic thresholding scheme. Detection was done on the first image of every burst. Then, to track vehicles, the model of each vehicle was matched in the succeeding images using a template matching algorithm. To further improve the tracking accuracy by incorporating motion information, Scale Invariant Feature Transform (SIFT) features were used to find the best possible match among multiple matches. An accuracy of 87% for detection and 80% for tracking was achieved on the A1 Motorway Netherland dataset. For the Vehicle Aerial Imaging from Drone (VAID) dataset, an accuracy of 86% for detection and 78% for tracking was achieved.
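The tracking step in the abstract above relies on template matching: a vehicle patch cut from the first frame of a burst is searched for in later frames. The following is a minimal pure-Python sketch of that idea using the sum of squared differences as the matching score. The score choice, the function name `match_template`, and the tiny synthetic frames are assumptions made for illustration; the abstract does not state which matching criterion the authors use.

```python
def match_template(image, template):
    """Return ((row, col), score) of the best match by sum of squared differences."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, float("inf")
    # Slide the template over every position where it fits entirely.
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            score = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(th)
                for j in range(tw)
            )
            if score < best_score:  # lower score means a closer match
                best_pos, best_score = (r, c), score
    return best_pos, best_score

# Synthetic grayscale frames: a 2x2 "vehicle" moves one row down, one column right.
frame1 = [
    [0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0],
    [0, 7, 9, 0, 0],
    [0, 0, 0, 0, 0],
]
frame2 = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 9, 0],
]
template = [row[1:3] for row in frame1[1:3]]  # patch detected in the first frame
pos, score = match_template(frame2, template)
print(pos, score)
```

When several positions score similarly, a second cue is needed to pick among them, which is the role the abstract assigns to SIFT features.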
Identifying workers' construction activities or behaviors can enable managers to better monitor labor efficiency and construction progress. However, current activity analysis methods for construction workers rely solely on manual observations and recordings, which consume considerable time and incur high labor costs. Researchers have focused on monitoring the on-site construction activities of workers. However, when multiple workers are working together, current research cannot accurately and automatically identify the construction activity. This research proposes a deep learning framework for the automated analysis of the construction activities of multiple workers. In this framework, multiple deep neural network models are designed and used to complete worker key point extraction, worker tracking, and worker construction activity analysis. The designed framework was tested at an actual construction site, and activity recognition for multiple workers was performed, indicating the feasibility of the framework for the automated monitoring of work efficiency.
Computer vision (CV) was developed for computers and other systems to act or make recommendations based on visual inputs, such as digital photos, movies, and other media. Deep learning (DL) methods are more successful than other traditional machine learning (ML) methods in CV. DL techniques can produce state-of-the-art results for difficult CV problems like picture categorization, object detection, and face recognition. In this review, a structured discussion on the history, methods, and applications of DL methods to CV problems is presented. The sector-wise presentation of applications in this paper may be particularly useful for researchers in niche fields who have limited or introductory knowledge of DL methods and CV. This review will provide readers with context and examples of how these techniques can be applied to specific areas. A curated list of popular datasets and a brief description of them are also included for the benefit of readers.
In clinical practice, the microscopic examination of urine sediment is considered an important in vitro examination with many broad applications. Measuring the amount of each type of urine sediment allows for screening, diagnosis and evaluation of kidney and urinary tract disease, providing insight into the specific type and severity. However, manual urine sediment examination is labor-intensive, time-consuming, and subjective. Traditional machine learning based object detection methods require hand-crafted features for localization and classification; they generalize poorly and struggle to count urine sediments quickly and accurately. Deep learning based object detection methods have the potential to address the challenges mentioned above, but these methods require access to large urine sediment image datasets. Unfortunately, only a limited number of urine sediment datasets are currently publicly available. To alleviate the lack of urine sediment datasets in medical image analysis, we propose a new dataset named UriSed2K, which contains 2465 high-quality images annotated with expert guidance. Two main challenges are associated with our dataset: a large number of small objects and the occlusion between these small objects. Our manuscript focuses on applying deep learning object detection methods to the urine sediment dataset and addressing the challenges presented by this dataset. Specifically, our goal is to improve the accuracy and efficiency of the detection algorithm and, in doing so, provide medical professionals with an automatic detector that saves time and effort. We propose an improved lightweight one-stage object detection algorithm called Discriminatory-YOLO. The proposed algorithm comprises a local context attention module and a global background suppression module, which aid the detector in distinguishing urine sediment features in the image. The local context attention module captures context information beyond the object region, while the global background suppression module emphasizes objects in uninformative backgrounds. We comprehensively evaluate our method on the UriSed2K dataset, which includes seven categories of urine sediments, namely erythrocytes (red blood cells), leukocytes (white blood cells), epithelial cells, crystals, mycetes, broken erythrocytes, and broken leukocytes, achieving the best average precision (AP) of 95.3% while taking only 10 ms per image. The source code and dataset are available at https://github.com/binghuiwu98/discriminatoryyolov5.
Humans can perceive our complex world through multi-sensory fusion. Under limited visual conditions, people can sense a variety of tactile signals to identify objects accurately and rapidly. However, replicating this unique capability in robots remains a significant challenge. Here, we present a new form of ultralight multifunctional tactile nano-layered carbon aerogel sensor that provides pressure, temperature, material recognition and 3D location capabilities, and that is combined with multimodal supervised learning algorithms for object recognition. The sensor exhibits human-like pressure (0.04–100 kPa) and temperature (21.5–66.2 °C) detection, millisecond response times (11 ms), a pressure sensitivity of 92.22 kPa^(−1) and triboelectric durability of over 6000 cycles. The devised algorithm is general and can accommodate a range of application scenarios. The tactile system can identify common foods in a kitchen scene with 94.63% accuracy and explore the topographic and geomorphic features of a Mars scene with 100% accuracy. This sensing approach empowers robots with versatile tactile perception to advance future society toward heightened sensing, recognition and intelligence.
This study presents a general optimal trajectory planning (GOTP) framework for autonomous vehicles (AVs) that can effectively avoid obstacles and guide AVs to complete driving tasks safely and efficiently. Firstly, we employ a fifth-order Bezier curve to generate and smooth the reference path along the road centerline. Cartesian coordinates are then transformed to achieve the curvature continuity of the generated curve. Considering the road constraints and vehicle dynamics, a limited set of polynomial candidate trajectories is generated and smoothed in a curvilinear coordinate system. Furthermore, in selecting the optimal trajectory, we develop a unified and auto-tuning objective function based on the principle of least action by employing AVs to simulate drivers' behavior and summarizing their manipulation characteristics of "seeking benefits and avoiding losses." Finally, by integrating the idea of receding-horizon optimization, the proposed framework is realized by considering dynamic multi-performance objectives and selecting trajectories that satisfy feasibility, optimality, and adaptability. Extensive simulations and experiments are performed, and the results demonstrate the framework's feasibility and effectiveness: it avoids both dynamic and static obstacles and applies to various scenarios with multi-source interactive traffic participants. Moreover, we prove that the proposed method can guarantee real-time planning and safety requirements compared to drivers' manipulation.
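The reference-path step in the abstract above uses a fifth-order Bezier curve, which is defined by six control points blended with Bernstein polynomials. Below is a minimal sketch of evaluating such a curve. The control points and the function name `bezier_point` are hypothetical; the abstract does not specify how the authors choose control points along the road centerline.

```python
from math import comb

def bezier_point(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1] using Bernstein polynomials."""
    n = len(control_points) - 1  # curve order; 5 for six control points
    x = y = 0.0
    for i, (px, py) in enumerate(control_points):
        # Bernstein basis: C(n, i) * (1 - t)^(n - i) * t^i
        b = comb(n, i) * (1 - t) ** (n - i) * t ** i
        x += b * px
        y += b * py
    return x, y

# Hypothetical control points (metres) sketching a gentle lane shift.
ctrl = [(0, 0), (10, 0), (20, 0), (30, 2), (40, 4), (50, 4)]
path = [bezier_point(ctrl, k / 20) for k in range(21)]
print(path[0], path[10], path[-1])
```

The curve passes through the first and last control points and is tangent to the first and last control-polygon segments there, which is why collinear points at each end give a path that leaves and rejoins the lane direction smoothly.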
The Segment Anything Model (SAM) is a cutting-edge model that has shown impressive performance in general object segmentation. The birth of Segment Anything is a groundbreaking step towards creating a universal intelligent model. Due to its superior performance in general object segmentation, it quickly gained attention and interest. This makes SAM particularly attractive for industrial surface defect segmentation, especially for complex industrial scenes with limited training data. However, its segmentation ability for specific industrial scenes remains unknown. Therefore, in this work, we select three representative and complex industrial surface defect detection scenarios, namely strip steel surface defects, tile surface defects, and rail surface defects, to evaluate the segmentation performance of SAM. Our results show that although SAM has great potential in general object segmentation, it cannot achieve satisfactory performance in complex industrial scenes. Our test results are available at: https://github.com/VDT-2048/SAM-IS.
The data analysis of blasting sites has long been a research goal in the field. The rise of mobile blasting robots has aroused many researchers' interest in machine learning methods for target detection in the field of blasting. Serverless computing can provide a variety of computing services for people without a hardware foundation or rich software development experience, which has aroused interest in how to use it in the field of machine learning. In this paper, we design a distributed machine learning training application based on the AWS Lambda platform. Based on data parallelism, data aggregation and training synchronization in Function as a Service (FaaS) are effectively realized. The application also encrypts the data set, effectively reducing the risk of data leakage. We rent a cloud server and a Lambda, and then conduct experiments to evaluate our application. Our results indicate the effectiveness, rapidity, and economy of distributed training on FaaS.
In recent years, there has been extensive research on object detection methods applied to optical remote sensing images utilizing convolutional neural networks. Despite these efforts, the detection of small objects in remote sensing remains a formidable challenge. Deep network structures cause a loss of object features, nearly eliminating some subtle features associated with small objects in deep layers. Additionally, the features of small objects are susceptible to interference from background features contained within the image, leading to a decline in detection accuracy. Moreover, the sensitivity of small objects to bounding box perturbation further increases the detection difficulty. In this paper, we introduce a novel approach, Cross-Layer Fusion and Weighted Receptive Field-based YOLO (CAW-YOLO), specifically designed for small object detection in remote sensing. To address feature loss in deep layers, we have devised a cross-layer attention fusion module. Background noise is effectively filtered through the incorporation of Bi-Level Routing Attention (BRA). To enhance the model's capacity to perceive multi-scale objects, particularly small-scale objects, we introduce a weighted multi-receptive field atrous spatial pyramid pooling module. Furthermore, we mitigate the sensitivity arising from bounding box perturbation by incorporating the joint Normalized Wasserstein Distance (NWD) and Efficient Intersection over Union (EIoU) losses. The efficacy of the proposed model in detecting small objects in remote sensing has been validated through experiments conducted on three publicly available datasets. The experimental results demonstrate the model's pronounced advantages in small object detection for remote sensing, surpassing the performance of current mainstream models.
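The joint NWD and EIoU loss mentioned in the abstract above can be sketched with the commonly used definitions of the two terms. NWD models each box as a 2D Gaussian and maps the Wasserstein distance between them into (0, 1]; EIoU adds centre, width and height penalties to 1 - IoU. The mixing weight `alpha`, the constant `c = 12.8`, and the function names are assumptions of this sketch; the abstract does not say how the two losses are combined or which constants the authors use. Boxes are given as (cx, cy, w, h).

```python
import math

def to_corners(box):
    """Convert (cx, cy, w, h) to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2

def eiou_loss(pred, target, eps=1e-9):
    """EIoU loss: 1 - IoU plus normalised centre, width and height penalties."""
    px1, py1, px2, py2 = to_corners(pred)
    tx1, ty1, tx2, ty2 = to_corners(target)
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pred[2] * pred[3] + target[2] * target[3] - inter
    iou = inter / (union + eps)
    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    centre = ((pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2) / (cw ** 2 + ch ** 2 + eps)
    width = (pred[2] - target[2]) ** 2 / (cw ** 2 + eps)
    height = (pred[3] - target[3]) ** 2 / (ch ** 2 + eps)
    return 1.0 - iou + centre + width + height

def nwd(pred, target, c=12.8):
    """Normalized Wasserstein Distance between boxes modelled as Gaussians.

    c is a dataset-dependent constant; 12.8 is a placeholder assumption.
    """
    w2 = math.sqrt(
        (pred[0] - target[0]) ** 2
        + (pred[1] - target[1]) ** 2
        + ((pred[2] - target[2]) / 2) ** 2
        + ((pred[3] - target[3]) / 2) ** 2
    )
    return math.exp(-w2 / c)

def joint_loss(pred, target, alpha=0.5, c=12.8):
    """Hypothetical weighted sum of the NWD loss and the EIoU loss."""
    return alpha * (1.0 - nwd(pred, target, c)) + (1.0 - alpha) * eiou_loss(pred, target)

# Two small, non-overlapping boxes: IoU is zero, yet NWD still varies smoothly.
a = (10.0, 10.0, 4.0, 4.0)
b = (15.0, 10.0, 4.0, 4.0)
print(nwd(a, b), eiou_loss(a, b), joint_loss(a, b))
```

For tiny boxes a shift of a few pixels can drive IoU to zero, so an IoU-only loss gives little signal; the NWD term keeps decreasing smoothly as the boxes approach each other, which is the motivation for pairing the two.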
BACKGROUND: Deep learning provides an efficient automatic image recognition method for small bowel (SB) capsule endoscopy (CE) that can assist physicians in diagnosis. However, the existing deep learning models present some unresolved challenges. AIM: To propose a novel and effective classification and detection model to automatically identify various SB lesions and their bleeding risks, and label the lesions accurately so as to enhance the diagnostic efficiency of physicians and the ability to identify high-risk bleeding groups. METHODS: The proposed model is a two-stage method that combines image classification with object detection. First, we utilized the improved ResNet-50 classification model to classify endoscopic images into SB lesion images, normal SB mucosa images, and invalid images. Then, the improved YOLO-V5 detection model was utilized to detect the type of lesion and its risk of bleeding, and the location of the lesion was marked. We constructed training and testing sets and compared model-assisted reading with physician reading. RESULTS: The accuracy of the model constructed in this study reached 98.96%, which was higher than the accuracy of other systems using only a single module. The sensitivity, specificity, and accuracy of the model-assisted reading detection of all images were 99.17%, 99.92%, and 99.86%, respectively, which were significantly higher than those of the endoscopists' diagnoses. The image processing time of the model was 48 ms/image, and the image processing time of the physicians was 0.40±0.24 s/image (P<0.001). CONCLUSION: The deep learning model of image classification combined with object detection exhibits a satisfactory diagnostic effect on a variety of SB lesions and their bleeding risks in CE images, which enhances the diagnostic efficiency of physicians and improves the ability of physicians to identify high-risk bleeding groups.
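The sensitivity, specificity, and accuracy reported in the abstract above follow the standard confusion-matrix definitions. A minimal sketch of how they are computed is given below; the counts are hypothetical and are not taken from the study, and the function name `diagnostic_metrics` is introduced only for illustration.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of lesion images correctly flagged
    specificity = tn / (tn + fp)  # share of normal images correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy

# Hypothetical counts, for illustration only.
sens, spec, acc = diagnostic_metrics(tp=90, fp=5, tn=95, fn=10)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

When normal images far outnumber lesion images, accuracy is dominated by specificity, so the three figures are best read together.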
Monocular 3D object detection is challenging due to the lack of accurate depth information. Some methods estimate pixel-wise depth maps from off-the-shelf depth estimators and then use them as an additional input to augment the RGB images. Depth-based methods attempt to convert estimated depth maps to pseudo-LiDAR and then use LiDAR-based object detectors, or focus on the perspective of image and depth fusion learning. However, they demonstrate limited performance and efficiency as a result of depth inaccuracy and complex convolution-based fusion. Different from these approaches, our proposed depth-guided vision transformer with normalizing flows (NF-DVT) network uses normalizing flows to build priors in depth maps to achieve more accurate depth information. We then develop a novel Swin-Transformer-based backbone with a fusion module that processes RGB image patches and depth map patches in two separate branches and fuses them using cross-attention to exchange information with each other. Furthermore, with the help of pixel-wise relative depth values in depth maps, we develop new relative position embeddings in the cross-attention mechanism to capture more accurate sequence ordering of input tokens. Our method is the first Swin-Transformer-based backbone architecture for monocular 3D object detection. The experimental results on the KITTI and the challenging Waymo Open datasets show the effectiveness of our proposed method and its superior performance over previous counterparts.
Funding (EfficientShip ship detection paper): This work was supported by the Outstanding Youth Science and Technology Innovation Team Project of Colleges and Universities in Hubei Province (Grant No. T201923); the Key Science and Technology Project of Jingmen (Grant Nos. 2021ZDYF024, 2022ZDYF019); the LIAS Pioneering Partnerships Award, UK (Grant No. P202ED10); the Data Science Enhancement Fund, UK (Grant No. P202RE237); and the Cultivation Project of Jingchu University of Technology (Grant No. PY201904).
Funding: Supported by a grant from the Basic Science Research Program through the National Research Foundation (NRF) (2021R1F1A1063634) funded by the Ministry of Science and ICT (MSIT), Republic of Korea. The authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the Research Group Funding Program Grant Code (NU/RG/SERC/13/40). Also, the authors are thankful to Prince Satam bin Abdulaziz University for supporting this study via funding from Prince Satam bin Abdulaziz University project number (PSAU/2024/R/1445). This work was also supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2023R54), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Road traffic monitoring is an important topic widely discussed among researchers. Systems used to monitor traffic frequently rely on cameras mounted on bridges or roadsides. However, aerial images provide the flexibility to use mobile platforms to detect the location and motion of vehicles over a larger area. To this end, different models have shown the ability to recognize and track vehicles. However, these methods are not mature enough to produce accurate results in complex road scenes. Therefore, this paper presents an algorithm that combines state-of-the-art techniques for identifying and tracking vehicles in conjunction with image bursts. The extracted frames were converted to grayscale, followed by the application of a georeferencing algorithm to embed coordinate information into the images. The masking technique eliminated irrelevant data and reduced the computational cost of the overall monitoring system. Next, Sobel edge detection combined with Canny edge detection and the Hough line transform was applied for noise reduction. After preprocessing, the blob detection algorithm helped detect the vehicles. Vehicles of varying sizes were detected by implementing a dynamic thresholding scheme. Detection was done on the first image of every burst. Then, to track vehicles, the model of each vehicle was matched against the succeeding images using the template matching algorithm. To further improve the tracking accuracy by incorporating motion information, Scale Invariant Feature Transform (SIFT) features were used to find the best possible match among multiple matches. An accuracy of 87% for detection and 80% for tracking was achieved on the A1 Motorway Netherland dataset. For the Vehicle Aerial Imaging from Drone (VAID) dataset, an accuracy of 86% for detection and 78% for tracking was achieved.
Funding: Supported by the National Natural Science Foundation of China (52130801, U20A20312, 52178271, and 52077213); the National Key Research and Development Program of China (2021YFF0500903).
Abstract: Identifying workers' construction activities or behaviors can enable managers to better monitor labor efficiency and construction progress. However, current activity analysis methods for construction workers rely solely on manual observations and recordings, which consume considerable time and incur high labor costs. Researchers have focused on monitoring on-site construction activities of workers. However, when multiple workers are working together, current research cannot accurately and automatically identify the construction activity. This research proposes a deep learning framework for the automated analysis of the construction activities of multiple workers. In this framework, multiple deep neural network models are designed and used to complete worker key point extraction, worker tracking, and worker construction activity analysis. The designed framework was tested at an actual construction site, and activity recognition for multiple workers was performed, indicating the feasibility of the framework for the automated monitoring of work efficiency.
Funding: Supported by the Project SP2023/074, Application of Machine and Process Control Advanced Methods, supported by the Ministry of Education, Youth and Sports, Czech Republic.
Abstract: Computer vision (CV) was developed for computers and other systems to act or make recommendations based on visual inputs, such as digital photos, movies, and other media. Deep learning (DL) methods are more successful than traditional machine learning (ML) methods in CV. DL techniques can produce state-of-the-art results for difficult CV problems like picture categorization, object detection, and face recognition. In this review, a structured discussion on the history, methods, and applications of DL methods to CV problems is presented. The sector-wise presentation of applications in this paper may be particularly useful for researchers in niche fields who have limited or introductory knowledge of DL methods and CV. This review will provide readers with context and examples of how these techniques can be applied to specific areas. A curated list of popular datasets and a brief description of them are also included for the benefit of readers.
Funding: This work was partially supported by the National Natural Science Foundation of China (Grant Nos. 61906168, U20A20171); Zhejiang Provincial Natural Science Foundation of China (Grant Nos. LY23F020023, LY21F020027); Construction of Hubei Provincial Key Laboratory for Intelligent Visual Monitoring of Hydropower Projects (Grant No. 2022SDSJ01).
Abstract: In clinical practice, the microscopic examination of urine sediment is considered an important in vitro examination with many broad applications. Measuring the amount of each type of urine sediment allows for screening, diagnosis and evaluation of kidney and urinary tract disease, providing insight into the specific type and severity. However, manual urine sediment examination is labor-intensive, time-consuming, and subjective. Traditional machine learning based object detection methods require hand-crafted features for localization and classification; they generalize poorly and struggle to count urine sediments quickly and accurately. Deep learning based object detection methods have the potential to address the challenges mentioned above, but these methods require access to large urine sediment image datasets. Unfortunately, only a limited number of urine sediment datasets are currently publicly available. To alleviate the lack of urine sediment datasets in medical image analysis, we propose a new dataset named UriSed2K, which contains 2465 high-quality images annotated with expert guidance. Two main challenges are associated with our dataset: a large number of small objects and the occlusion between these small objects. Our manuscript focuses on applying deep learning object detection methods to the urine sediment dataset and addressing the challenges presented by this dataset. Specifically, our goal is to improve the accuracy and efficiency of the detection algorithm and, in doing so, provide medical professionals with an automatic detector that saves time and effort. We propose an improved lightweight one-stage object detection algorithm called Discriminatory-YOLO. The proposed algorithm comprises a local context attention module and a global background suppression module, which aid the detector in distinguishing urine sediment features in the image. The local context attention module captures context information beyond the object region, while the global background suppression module emphasizes objects in uninformative backgrounds. We comprehensively evaluate our method on the UriSed2K dataset, which includes seven categories of urine sediments, namely erythrocytes (red blood cells), leukocytes (white blood cells), epithelial cells, crystals, mycetes, broken erythrocytes, and broken leukocytes, achieving the best average precision (AP) of 95.3% while taking only 10 ms per image. The source code and dataset are available at https://github.com/binghuiwu98/discriminatoryyolov5.
Funding: The National Natural Science Foundation of China (Grant No. 52072041); the Beijing Natural Science Foundation (Grant No. JQ21007); the University of Chinese Academy of Sciences (Grant No. Y8540XX2D2); the Robotics Rhino-Bird Focused Research Project (No. 2020-01-002); the Tencent Robotics X Laboratory.
Abstract: Humans can perceive our complex world through multi-sensory fusion. Under limited visual conditions, people can sense a variety of tactile signals to identify objects accurately and rapidly. However, replicating this unique capability in robots remains a significant challenge. Here, we present a new form of ultralight multifunctional tactile nano-layered carbon aerogel sensor that provides pressure, temperature, material recognition and 3D location capabilities, which is combined with multimodal supervised learning algorithms for object recognition. The sensor exhibits human-like pressure (0.04–100 kPa) and temperature (21.5–66.2 °C) detection, millisecond response times (11 ms), a pressure sensitivity of 92.22 kPa⁻¹ and triboelectric durability of over 6000 cycles. The devised algorithm has universality and can accommodate a range of application scenarios. The tactile system can identify common foods in a kitchen scene with 94.63% accuracy and explore the topographic and geomorphic features of a Mars scene with 100% accuracy. This sensing approach empowers robots with versatile tactile perception to advance future society toward heightened sensing, recognition and intelligence.
Funding: Supported by the National Natural Science Foundation of China (the Key Project, 52131201; Science Fund for Creative Research Groups, 52221005); the China Scholarship Council; the Joint Laboratory for Internet of Vehicles, Ministry of Education–China MOBILE Communications Corporation.
Abstract: This study presents a general optimal trajectory planning (GOTP) framework for autonomous vehicles (AVs) that can effectively avoid obstacles and guide AVs to complete driving tasks safely and efficiently. First, we employ the fifth-order Bezier curve to generate and smooth the reference path along the road centerline. Cartesian coordinates are then transformed to achieve the curvature continuity of the generated curve. Considering the road constraints and vehicle dynamics, limited polynomial candidate trajectories are generated and smoothed in a curvilinear coordinate system. Furthermore, in selecting the optimal trajectory, we develop a unified and auto-tuned objective function based on the principle of least action by employing AVs to simulate drivers' behavior and summarizing their manipulation characteristics of “seeking benefits and avoiding losses.” Finally, by integrating the idea of receding-horizon optimization, the proposed framework is achieved by considering dynamic multi-performance objectives and selecting trajectories that satisfy feasibility, optimality, and adaptability. Extensive simulations and experiments are performed, and the results demonstrate the framework's feasibility and effectiveness: it avoids both dynamic and static obstacles and applies to various scenarios with multi-source interactive traffic participants. Moreover, we prove that the proposed method can guarantee real-time planning and safety requirements compared to drivers' manipulation.
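The fifth-order Bezier reference path mentioned in this abstract can be illustrated with a minimal sketch. It only evaluates a quintic Bezier curve from six control points using the Bernstein form; the control points (a hypothetical lane-change shape), the function names, and the sampling density are assumptions for illustration and are not taken from the paper.

```python
from math import comb


def bezier_point(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1] (Bernstein form)."""
    n = len(control_points) - 1  # curve order; 5 for six control points
    x = y = 0.0
    for i, (px, py) in enumerate(control_points):
        weight = comb(n, i) * (1 - t) ** (n - i) * t ** i
        x += weight * px
        y += weight * py
    return x, y


def sample_path(control_points, num=11):
    """Sample num evenly spaced parameter values along the curve."""
    return [bezier_point(control_points, k / (num - 1)) for k in range(num)]


# Hypothetical control points (metres): straight, shift 3.5 m laterally, straight.
control = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0),
           (30.0, 3.5), (40.0, 3.5), (50.0, 3.5)]
path = sample_path(control)
```

Repeating the first three and last three control points along straight lines keeps the heading and curvature at the two ends of the curve at zero, which is one common reason for choosing a quintic curve for path smoothing.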
Funding: Supported by the National Natural Science Foundation of China (51805078); Project of National Key Laboratory of Advanced Casting Technologies (CAT2023-002); the 111 Project (B16009).
Abstract: The Segment Anything Model (SAM) is a cutting-edge model that has shown impressive performance in general object segmentation. Its introduction is a groundbreaking step towards creating a universal intelligent model. Due to its superior performance in general object segmentation, it quickly gained attention and interest. This makes SAM particularly attractive for industrial surface defect segmentation, especially for complex industrial scenes with limited training data. However, its segmentation ability for specific industrial scenes remains unknown. Therefore, in this work, we select three representative and complex industrial surface defect detection scenarios, namely strip steel surface defects, tile surface defects, and rail surface defects, to evaluate the segmentation performance of SAM. Our results show that although SAM has great potential in general object segmentation, it cannot achieve satisfactory performance in complex industrial scenes. Our test results are available at: https://github.com/VDT-2048/SAM-IS.
Abstract: The data analysis of blasting sites has always been a research goal in the field. The rise of mobile blasting robots has aroused many researchers' interest in machine learning methods for target detection in the field of blasting. Serverless computing can provide a variety of computing services for people without hardware foundations or rich software development experience, which has aroused interest in how to use it in the field of machine learning. In this paper, we design a distributed machine learning training application based on the AWS Lambda platform. Based on data parallelism, data aggregation and training synchronization in Function as a Service (FaaS) are effectively realized. The application also encrypts the data set, effectively reducing the risk of data leakage. We rent a cloud server and a Lambda, and then we conduct experiments to evaluate our application. Our results indicate the effectiveness, rapidity, and economy of distributed training on FaaS.
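The data-parallel training with aggregation and synchronization described in this abstract can be sketched in a few lines. The example below is only a local simulation under stated assumptions: each "worker" is a plain function call rather than an AWS Lambda invocation, the model is a one-parameter linear fit, and all names (`gradient`, `data_parallel_step`) are hypothetical. It shows the general idea of averaging per-shard gradients before a synchronized update, not the paper's implementation.

```python
def gradient(w, shard):
    """Gradient of the mean squared error for the model y ≈ w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def data_parallel_step(w, shards, lr):
    """One synchronized step: every worker computes a gradient on its own
    shard, the gradients are averaged (aggregation), and all workers
    continue from the same updated weight (synchronization)."""
    grads = [gradient(w, shard) for shard in shards]  # one entry per worker
    avg_grad = sum(grads) / len(grads)
    return w - lr * avg_grad


# Toy data generated from y = 3x, split into two equal shards.
data = [(x, 3.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]
shards = [data[:2], data[2:]]

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards, lr=0.01)
```

With equally sized shards, the averaged gradient equals the gradient over the full data set, so the simulated workers follow the same trajectory as single-machine training.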
Funding: Supported in part by the National Natural Science Foundation of China under Grant 62006071 and in part by the Science and Technology Research Project of Henan Province under Grant 232103810086.
Abstract: In recent years, there has been extensive research on object detection methods applied to optical remote sensing images utilizing convolutional neural networks. Despite these efforts, the detection of small objects in remote sensing remains a formidable challenge. The deep network structure causes a loss of object features, nearly eliminating some subtle features associated with small objects in deep layers. Additionally, the features of small objects are susceptible to interference from background features contained within the image, leading to a decline in detection accuracy. Moreover, the sensitivity of small objects to bounding box perturbation further increases the detection difficulty. In this paper, we introduce a novel approach, Cross-Layer Fusion and Weighted Receptive Field-based YOLO (CAW-YOLO), specifically designed for small object detection in remote sensing. To address feature loss in deep layers, we have devised a cross-layer attention fusion module. Background noise is effectively filtered through the incorporation of Bi-Level Routing Attention (BRA). To enhance the model's capacity to perceive multi-scale objects, particularly small-scale objects, we introduce a weighted multi-receptive field atrous spatial pyramid pooling module. Furthermore, we mitigate the sensitivity arising from bounding box perturbation by incorporating the joint Normalized Wasserstein Distance (NWD) and Efficient Intersection over Union (EIoU) losses. The efficacy of the proposed model in detecting small objects in remote sensing has been validated through experiments conducted on three publicly available datasets. The experimental results demonstrate the model's pronounced advantages in small object detection for remote sensing, surpassing the performance of current mainstream models.
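The sensitivity of small objects to bounding box perturbation, and why a Wasserstein-based measure is less affected, can be shown with a small numeric sketch. It assumes the commonly used formulation in which each box `(cx, cy, w, h)` is modeled as a 2D Gaussian and the similarity is `exp(-distance / c)`; the constant `c` is dataset-dependent, and the value below is an arbitrary placeholder. This is not the paper's joint NWD and EIoU loss, only an illustration of the two underlying measures.

```python
import math


def iou(a, b):
    """Intersection over Union for boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union


def nwd(a, b, c=12.8):
    """Normalized Wasserstein distance between two boxes modeled as Gaussians.
    c is an assumed, dataset-dependent normalizing constant."""
    dist = math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
                     + ((a[2] - b[2]) / 2) ** 2 + ((a[3] - b[3]) / 2) ** 2)
    return math.exp(-dist / c)


# The same 2-pixel horizontal shift applied to a small and a large box.
small, small_shifted = (10.0, 10.0, 6.0, 6.0), (12.0, 10.0, 6.0, 6.0)
large, large_shifted = (100.0, 100.0, 60.0, 60.0), (102.0, 100.0, 60.0, 60.0)

iou_small = iou(small, small_shifted)
iou_large = iou(large, large_shifted)
nwd_small = nwd(small, small_shifted)
nwd_large = nwd(large, large_shifted)
```

In this toy case the 2-pixel shift halves the IoU of the 6×6 box while leaving the IoU of the 60×60 box above 0.9, whereas the NWD value is identical for both because it depends on the shift rather than on the overlap area.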
Funding: The Shanxi Provincial Administration of Traditional Chinese Medicine, No. 2023ZYYDA2005.
Abstract: BACKGROUND: Deep learning provides an efficient automatic image recognition method for small bowel (SB) capsule endoscopy (CE) that can assist physicians in diagnosis. However, the existing deep learning models present some unresolved challenges. AIM: To propose a novel and effective classification and detection model to automatically identify various SB lesions and their bleeding risks, and label the lesions accurately so as to enhance the diagnostic efficiency of physicians and the ability to identify high-risk bleeding groups. METHODS: The proposed model is a two-stage method that combines image classification with object detection. First, we utilized the improved ResNet-50 classification model to classify endoscopic images into SB lesion images, normal SB mucosa images, and invalid images. Then, the improved YOLO-V5 detection model was utilized to detect the type of lesion and its risk of bleeding, and the location of the lesion was marked. We constructed training and testing sets and compared model-assisted reading with physician reading. RESULTS: The accuracy of the model constructed in this study reached 98.96%, which was higher than the accuracy of other systems using only a single module. The sensitivity, specificity, and accuracy of the model-assisted reading detection of all images were 99.17%, 99.92%, and 99.86%, which were significantly higher than those of the endoscopists' diagnoses. The image processing time of the model was 48 ms/image, and the image processing time of the physicians was 0.40 ± 0.24 s/image (P < 0.001). CONCLUSION: The deep learning model of image classification combined with object detection exhibits a satisfactory diagnostic effect on a variety of SB lesions and their bleeding risks in CE images, which enhances the diagnostic efficiency of physicians and improves the ability of physicians to identify high-risk bleeding groups.
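The sensitivity, specificity, and accuracy figures reported in this abstract follow the standard confusion-matrix definitions, which can be sketched as below. The counts are made-up placeholders chosen for easy arithmetic; they are not the study's data, and the function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # share of lesion images correctly flagged
    specificity = tn / (tn + fp)          # share of non-lesion images correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 100 lesion images and 900 non-lesion images.
sens, spec, acc = diagnostic_metrics(tp=90, fn=10, tn=880, fp=20)
```

Because non-lesion images usually dominate capsule endoscopy recordings, accuracy alone can look high even when sensitivity is modest, which is why the three values are typically reported together.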
Funding: Supported in part by the Major Project for New Generation of AI (2018AAA0100400); the National Natural Science Foundation of China (61836014, U21B2042, 62072457, 62006231); the InnoHK Program.
Abstract: Monocular 3D object detection is challenging due to the lack of accurate depth information. Some methods estimate pixel-wise depth maps with off-the-shelf depth estimators and then use them as an additional input to augment the RGB images. Depth-based methods attempt to convert estimated depth maps to pseudo-LiDAR and then use LiDAR-based object detectors, or focus on image and depth fusion learning. However, they demonstrate limited performance and efficiency as a result of depth inaccuracy and complex convolution-based fusion. Different from these approaches, our proposed depth-guided vision transformer with normalizing flows (NF-DVT) network uses normalizing flows to build priors in depth maps to achieve more accurate depth information. Then we develop a novel Swin-Transformer-based backbone with a fusion module that processes RGB image patches and depth map patches in two separate branches and fuses them using cross-attention so that the branches exchange information. Furthermore, with the help of pixel-wise relative depth values in depth maps, we develop new relative position embeddings in the cross-attention mechanism to capture more accurate sequence ordering of input tokens. Our method is the first Swin-Transformer-based backbone architecture for monocular 3D object detection. The experimental results on the KITTI and the challenging Waymo Open datasets show the effectiveness of our proposed method and its superior performance over previous counterparts.