In this paper, we study autonomous landing scene recognition with knowledge transfer for drones. Aerial remote sensing poses particular difficulties: some scenes are extremely similar, and the same scene looks different at different altitudes. To address this, we employ a deep convolutional neural network (CNN) based on knowledge transfer and fine-tuning. We establish the LandingScenes-7 dataset, which is divided into seven classes. The classifier also faces a novelty detection problem, which we address by thresholding in the prediction stage to exclude other landing scenes. We employ transfer learning based on a ResNeXt-50 backbone with the adaptive moment estimation (ADAM) optimization algorithm, and we also compare a ResNet-50 backbone and the momentum stochastic gradient descent (SGD) optimizer. Experimental results show that ResNeXt-50 with the ADAM optimizer performs better. With a pre-trained model and fine-tuning, it achieves 97.8450% top-1 accuracy on the LandingScenes-7 dataset, paving the way for drones to autonomously learn landing scenes.
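The thresholding idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the threshold value of 0.8, and the use of the maximum softmax probability as the confidence score are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_rejection(logits, class_names, threshold=0.8):
    """Return the top class, or "other" when the top probability is too low.

    The threshold is a hypothetical value; in practice it would be tuned
    on held-out data.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "other", probs[best]
    return class_names[best], probs[best]

# Hypothetical names for the seven classes; the abstract does not list them.
CLASS_NAMES = [f"class_{i}" for i in range(7)]

confident = predict_with_rejection([5.0, 0, 0, 0, 0, 0, 0], CLASS_NAMES)
uncertain = predict_with_rejection([0.0] * 7, CLASS_NAMES)
```

A confident prediction keeps its class label, while a flat probability distribution falls below the threshold and is reported as "other".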
Scene text detection and recognition is an important research area in image processing. Scene text can appear in different languages, fonts, sizes, colours, orientations, and structures. Moreover, the aspect ratios and layouts of scene text may differ significantly. All these variations pose significant challenges for algorithms that detect and recognize text in natural scenes. In this paper, a new intelligent method is proposed for detecting and recognizing text in natural scenes by applying a newly proposed Convolutional Neural Network that incorporates Conditional Random Field-based fuzzy rules (CR-CNN). Moreover, we propose a new text detection method for detecting the exact text in the input natural scene images. To enhance the detection process, image pre-processing steps such as edge detection and color modeling are applied in this work. In addition, we have generated new fuzzy rules for making effective decisions in the text detection and recognition processes. The experiments were conducted on standard benchmark datasets, namely ICDAR 2003, ICDAR 2011, ICDAR 2005, and SVT, and achieved better accuracy in text detection and recognition. Using these datasets, five different experiments were conducted to evaluate the proposed model. We also compared the proposed system with other classifiers such as the SVM, the MLP, and the CNN. In these comparisons, the proposed model achieved better classification accuracy than the existing works.
Traffic scene captioning technology automatically generates one or more sentences that describe a traffic scene by analyzing the content of the input traffic scene images, supporting road safety while providing an important decision-making function for sustainable transportation. To provide a comprehensive and reasonable description of complex traffic scenes, this paper proposes a traffic scene semantic captioning model with multi-stage feature enhancement. The model follows an encoder-decoder structure. First, multilevel granularity visual features are used for feature enhancement during encoding, which enables the model to learn more detailed content in the traffic scene image. Second, a scene knowledge graph is applied during decoding, and the semantic features it provides are used to enhance the features learned by the decoder again, so that the model can learn the attributes of objects in the traffic scene and the relationships between objects and thus generate more reasonable captions. This paper reports extensive experiments on the challenging MS-COCO dataset, evaluated with five standard automatic evaluation metrics. The results show that the proposed model improves significantly on all metrics compared with state-of-the-art methods, in particular achieving a score of 129.0 on the CIDEr-D metric, which indicates that the proposed model can provide a more reasonable and comprehensive description of the traffic scene.
NPC deputies working on behalf of the people they represent are making a difference back home. One of the true signs of spring is the arrival of swallows. But in Beijing, another sign heralds the return of springtime: thousands of deputies come to town to attend a series of two-week-long meetings, where a range of diverse issues are ...
With the intelligent development of road traffic control and management, higher requirements have been placed on the accuracy and effectiveness of traffic data. The question of how to collect and integrate data for traffic scenes has gained importance in this field as various processing technologies have emerged. A great deal of research has been carried out, ranging from theory to engineering application.
In recent years, many visual positioning algorithms based on computer vision have been proposed, and they have achieved good results. However, these algorithms serve a single function, cannot perceive the environment, have poor versatility, and suffer from a degree of mismatching, which affects positioning accuracy. Therefore, this paper proposes a localization algorithm that combines a target recognition algorithm with a depth feature matching algorithm to solve the problem of unmanned aerial vehicle (UAV) environment perception and multi-modal image-matching fusion localization. The algorithm is based on the single-shot object detector based on multi-level feature pyramid network (M2Det) and replaces the original visual geometry group (VGG) feature extraction network with ResNet-101 to improve the feature extraction capability of the network model. By introducing a depth feature matching algorithm, the algorithm shares neural network weights and realizes both UAV target recognition and multi-modal image-matching fusion positioning. When the reference image and the real-time image are mismatched, the dynamic adaptive proportional constraint and the random sample consensus algorithm (DAPC-RANSAC) are used to optimize the matching results and improve the rate of correct matches for the target. The proposed algorithm was compared and analyzed on a multi-modal registration data set to verify its superiority and feasibility. The results show that the algorithm can effectively handle matching between multi-modal images (visible-infrared, infrared-satellite, and visible-satellite) and remains stable and robust under changes in contrast, scale, brightness, blur, deformation, and other factors. Finally, the effectiveness and practicability of the algorithm were verified in an aerial test scene with an S1000 six-rotor UAV.
In recent years, with the continuous deepening of smart city construction, there have been significant changes and improvements in the field of intelligent transportation. The semantic segmentation of road scenes has important practical significance for automatic driving, transportation planning, and intelligent transportation systems. However, current mainstream lightweight semantic segmentation models face problems in road scene segmentation, such as poor segmentation of small targets and insufficiently refined segmentation edges. Therefore, this article proposes a lightweight semantic segmentation model that improves on the LiteSeg model to address these issues. The model uses the lightweight backbone network MobileNet instead of the LiteSeg backbone to reduce network parameters and computation, and combines the Coordinate Attention (CA) mechanism to help the network capture long-distance dependencies. At the same time, by combining the dependencies of spatial information and channel information, the Spatial and Channel Network (SCNet) attention mechanism is proposed to improve the feature extraction ability of the model. Finally, a multiscale transposed attention encoding (MTAE) module is proposed to obtain features of different resolutions and perform feature fusion. The proposed model is verified on the Cityscapes dataset. The experimental results show that adding the SCNet and MTAE modules increases the mean Intersection over Union (mIoU) of the original LiteSeg model by 4.69%. On this basis, the backbone network is replaced with MobileNet and the CA mechanism is added; at the cost of a minimal increase in model parameters and computation, the mIoU of the original LiteSeg model is increased by 2.46%. This article also compares the proposed model with several current lightweight semantic segmentation models, and the experiments show that the proposed model has the best overall performance, with especially strong results in small object segmentation. Finally, generalization testing of the proposed model on the KITTI dataset shows that the proposed algorithm generalizes to a certain degree.
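For readers unfamiliar with the mean Intersection over Union (mIoU) metric reported in this abstract, the following is a minimal sketch of how it is commonly computed from flattened per-pixel labels. The function name and the choice to skip classes absent from both prediction and ground truth are assumptions; benchmark toolkits may handle ignored labels differently.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for two equal-length sequences of class labels.

    For each class c, IoU = |pred==c AND target==c| / |pred==c OR target==c|.
    Classes that appear in neither sequence are skipped (an assumption).
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

# Tiny example with four "pixels" and two classes.
# Class 0: intersection 1, union 2 -> 0.5; class 1: intersection 2, union 3.
score = mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 1], num_classes=2)
```

In this toy case the per-class IoUs are 1/2 and 2/3, so the mean is 7/12.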
In the past, many Chinese artists have been afraid they would be criticized for lagging behind the world in their conceptions. They are eager to do something unconventional or unorthodox. Their actions are very similar to the trend Liu ...
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62005049), the Natural Science Foundation of Fujian Province (Grant Nos. 2020J01451, 2022J05113), and the Education and Scientific Research Program for Young and Middle-aged Teachers in Fujian Province (Grant No. JAT210035).
Abstract: Camouflaged people are highly skilled at actively concealing themselves by making effective use of cover and the surrounding environment. Despite advances in optical detection through imaging systems, including spectral, polarization, and infrared technologies, there is still no effective real-time method for accurately detecting small, effectively camouflaged people in complex real-world scenes. This study proposes a snapshot multispectral image-based camouflage detection model, multispectral YOLO (MS-YOLO), which uses the SPD-Conv and SimAM modules to represent targets effectively and suppress background interference by exploiting spatial-spectral target information. The study also constructs the first real-shot multispectral camouflaged people dataset (MSCPD), which covers diverse scenes, target scales, and attitudes. To minimize information redundancy, MS-YOLO selects an optimal subset of 12 bands with strong feature representation and minimal inter-band correlation as input. In experiments on the MSCPD, MS-YOLO achieves a mean Average Precision of 94.31% and real-time detection at 65 frames per second, which confirms the effectiveness and efficiency of the method in detecting camouflaged people in typical desert and forest scenes. The approach offers valuable support for improving the perception capabilities of unmanned aerial vehicles in detecting enemy forces and rescuing personnel on the battlefield.
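The abstract does not describe how the band subset is chosen, so the following is only an illustrative sketch of one simple way to pick bands that are individually informative and weakly correlated with each other. It assumes variance as a stand-in for "strong feature representation", a greedy selection order, and a hypothetical correlation cap of 0.9; none of these come from the paper.

```python
import math

def variance(x):
    """Population variance, used here as a crude informativeness score."""
    m = sum(x) / len(x)
    return sum((a - m) ** 2 for a in x) / len(x)

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant band."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def select_bands(bands, k, max_corr=0.9):
    """Greedily pick up to k band indices, highest variance first,
    skipping any band too correlated with one already chosen."""
    order = sorted(range(len(bands)), key=lambda i: variance(bands[i]), reverse=True)
    chosen = []
    for i in order:
        if len(chosen) == k:
            break
        if all(abs(pearson(bands[i], bands[j])) <= max_corr for j in chosen):
            chosen.append(i)
    return chosen

# Each inner list is one band flattened to pixel values (toy data).
toy_bands = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 1, 3, 2], [1, 1, 1, 1]]
picked = select_bands(toy_bands, k=2)
```

Here band 1 is chosen first for its variance, band 0 is rejected because it is perfectly correlated with band 1, and band 2 is accepted.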
Abstract: Automatic control technology is the basis for improving road construction robots. According to the characteristics and functions of the construction equipment, this research takes positioning acquisition and real-world monitoring as its perception inputs. The process uses RTK-GNSS positioning, projects the geodetic coordinates with the Gauss-Krueger projection method, and then converts them to Cartesian coordinates according to the characteristics of the drawing. The steering control system is the core of the electric-drive unmanned module. Based on an analysis of the composition of the steering system of unmanned engineering vehicles, models of key steering components such as the direction sensor, torque sensor, and drive motor are established, a joint simulation model of the unmanned engineering vehicle is built, and the steering controller is designed using the PID method. The simulation results show that the control method can meet the construction path requirements for automatic steering. Path planning first defines the construction area with preset values and corrects the steering angle during driving with the PID algorithm, and then realizes construction-based path planning. The results show that the method keeps the error within 10 cm on straight paths and within 20 cm on curves. With the collaboration of the various modules, the automatic construction simulation results of this robot show that the designed path and control method are effective.
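The PID steering correction mentioned in this abstract can be sketched in a few lines. This is a generic textbook PID loop, not the authors' controller: the gains, the time step, and the toy plant (the control output is applied directly as a yaw rate) are all assumptions chosen for illustration.

```python
class PID:
    """Textbook PID controller with a running integral and a backward-difference derivative."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        # No derivative term on the first call, since there is no previous error yet.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def simulate_heading(initial_heading, target_heading=0.0, steps=200, dt=0.05):
    """Drive a heading angle (radians) toward the target.

    Toy plant: the controller output is treated as the yaw rate.
    The gains below are hypothetical values, not taken from the paper.
    """
    pid = PID(kp=2.0, ki=0.1, kd=0.05)
    heading = initial_heading
    for _ in range(steps):
        yaw_rate = pid.update(target_heading - heading, dt)
        heading += yaw_rate * dt
    return heading

final_heading = simulate_heading(initial_heading=0.5)
```

Starting 0.5 rad off the target, the heading error shrinks to a small fraction of its initial value within the simulated 10 seconds. A real steering loop would add actuator limits and a vehicle model.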
Funding: Supported by the China National Key Research and Development Program (No. 2016YFC0802904), the National Natural Science Foundation of China (61671470), and the 62nd batch of funded projects of the China Postdoctoral Science Foundation (No. 2017M623423).
Abstract: Detecting highly overlapped objects in crowded scenes remains a challenging problem, especially for one-stage detectors. In this paper, we extricate YOLOv4 from this dilemma by fine-tuning its detection scheme; the resulting detector is named YOLO-CS. Specifically, we give YOLOv4 the power to detect multiple objects in one cell. Central to our method is a carefully designed joint prediction scheme, which is executed through an assignment of bounding boxes and a joint loss. Equipped with the derived joint-object augmentation (DJA), refined regression loss (RL), and Score-NMS (SN), YOLO-CS achieves competitive detection performance on the CrowdHuman and CityPersons benchmarks compared with state-of-the-art detectors, at little cost in time. Furthermore, on the widely used general benchmark COCO, YOLO-CS still performs well, indicating its robustness across various scenes.
Funding: This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. 2021R1I1A1A01055652).
Abstract: The analysis of overcrowded areas is essential for flow monitoring, assembly control, and security. The primary goal of crowd counting is to estimate the population in a given region, which requires real-time analysis of congested scenes so that prompt action can be taken. Crowds are unpredictable, and the available benchmark datasets vary widely, which limits the performance of trained models on unseen test data. In this paper, we propose an end-to-end deep neural network that takes an input image and generates a density map of a crowd scene. The proposed model consists of encoder and decoder networks comprising batch-free normalization layers known as evolving normalization (EvoNorm). This allows our network to generalize to unseen data, because EvoNorm does not use statistics from the training samples. The decoder network uses dilated 2D convolutional layers to provide large receptive fields with fewer parameters, which enables real-time processing and addresses the density drift problem. Five benchmark datasets are used in this study to assess the proposed model, and the results show that it outperforms conventional models.
Funding: Supported by the Fundamental Research Funds for the Central Universities under Grant NS2020045. Y.L.G received the grant.
Abstract: Weather is a key factor affecting air traffic control. Accurate recognition and classification of similar weather scenes in the terminal area supports rapid decision-making in air traffic flow management. Current research mostly uses traditional machine learning methods to extract features of weather scenes and clustering algorithms to group similar scenes. Inspired by the excellent performance of deep learning in image recognition, this paper proposes a method for classifying similar weather scenes in the terminal area based on improved deep convolution embedded clustering (IDCEC). The method combines the encoding and decoding layers to reduce the dimensionality of the weather image while retaining as much useful information as possible, and then combines the pre-trained encoding layer with a clustering layer to train the clustering model of similar scenes in the terminal area. Finally, the terminal area of Guangzhou Airport is selected as the research object; the proposed method is used to classify historical weather data into similar scenes, and its performance is compared with other state-of-the-art methods. The experimental results show that IDCEC identifies similar scenes more accurately based on the spatial distribution characteristics and severity of weather. In addition, when compared against the actual flight volume in the Guangzhou terminal area, IDCEC's recognition results for similar weather scenes are consistent with those of experts in the field.
Abstract: In European thought and culture, there exists a group of passionate artists who are fascinated by the intention, passion, and richness of artistic expression. They strive to establish connections between different art forms. Musicians not only attempt to represent masterpieces through the language of music but also aim to convey subjective experiences of emotion and personal imagination to listeners by adding titles to their musical works. This study examines two pieces, "Scenes of Childhood" and "Children's Garden", and analyzes the different approaches the composers employ in portraying similar content.
Funding: Supported by the Natural Science Foundation of Hubei Province (2004ABA174).
Abstract: In digital video analysis, browsing, retrieval, and querying, the shot alone cannot meet users' needs. A scene is a cluster of a series of shots and partially meets these demands. In this paper, an algorithm for clustering video scenes based on shot key frame sets is proposed. We use χ² histogram matching and twin histogram comparison for shot detection. A method is presented for extracting key frame sets based on the distance between non-adjacent frames. Furthermore, the minimum distance between key frame sets is computed as the distance between shots, and scenes are then clustered according to the distance between shots. Experiments show that the algorithm performs satisfactorily in correctness and computing speed.
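As an illustration of the χ² histogram comparison used for shot detection, the sketch below flags a hard cut whenever the distance between consecutive frame histograms exceeds a threshold. It is a simplification under stated assumptions: it uses one common symmetric form of the χ² distance, a single threshold rather than the two thresholds of twin comparison, four grey-level bins, and toy frames given as flat lists of pixel values.

```python
def histogram(frame, bins=4, max_value=256):
    """Grey-level histogram of a frame given as a flat list of pixel values."""
    counts = [0] * bins
    for v in frame:
        counts[v * bins // max_value] += 1
    return counts

def chi_square_distance(h1, h2):
    """Symmetric chi-square distance: sum of (a - b)^2 / (a + b) over non-empty bins."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

def detect_shot_boundaries(frames, threshold):
    """Return indices i where frame i starts a new shot (hard cuts only)."""
    hists = [histogram(f) for f in frames]
    return [
        i
        for i in range(1, len(hists))
        if chi_square_distance(hists[i - 1], hists[i]) > threshold
    ]

# Two dark frames followed by two bright frames (toy data).
toy_frames = [
    [10, 20, 30, 40],
    [12, 22, 28, 41],
    [200, 210, 220, 250],
    [205, 215, 225, 245],
]
cuts = detect_shot_boundaries(toy_frames, threshold=1.0)
```

The only large histogram change is between the second and third frames, so a single boundary is reported at index 2. The threshold value here is hypothetical and would need tuning on real video.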
Abstract: With the publication of The Catcher in the Rye, J. D. Salinger gained immediate popularity in his writing career. His symbolic use of language has been thoroughly researched, but the symbolic scenes that make up the stage of Holden's life, especially the symbolic connotations of ironic resting places in the novel such as the bed, couch, and bedroom, have not received much attention. This paper analyses four scenes: on the bed of Holden's history teacher, on the hotel bed with a prostitute, in his sister's bedroom, and on his English teacher's couch. It aims to uncover his spiritual chaos as well as his adolescent desires in the real world, demonstrating that there is no place for the adolescent Holden to rest after he chooses his own stage of scenes in his life.
Abstract: An automatic approach is presented to track a wide screen in a multipurpose hall video scene. Once the screen is located, the system also computes its temporal rate of change using an edge-detection-based method. Our approach adopts a scene segmentation algorithm that exploits visual features (texture) and depth information to perform efficient screen localization. The cropped region corresponding to the wide screen undergoes salient visual cue extraction to retrieve the emphasized changes required in the rate-of-change computation. In addition to video document indexing and retrieval, this work can improve machine vision capability in behavior analysis and pattern recognition.
Abstract: A method for encrypting and decrypting three-dimensional objects using computer-generated holograms is described, and an encoding stage is suggested. The amplitude and phase information of a three-dimensional object is obtained mathematically through transforms and stored on a digital computer. Different three-dimensional images are restored, and the system is developed to extend to three-dimensional scenes and camera movement parameters. This article discusses digital image processing algorithms of this kind, namely the reconstruction of a three-dimensional model of a scene. At present, many such algorithms need to be improved, and this paper proposes one option for improving the accuracy of such reconstruction.
Funding: Supported by the National Natural Science Foundation of China (62103104) and the China Postdoctoral Science Foundation (2021M690615).
Abstract: In this paper, we study autonomous landing scene recognition with knowledge transfer for drones. Considering the difficulties in aerial remote sensing, especially that some scenes are extremely similar or that the same scene looks different at different altitudes, we employ a deep convolutional neural network (CNN) based on knowledge transfer and fine-tuning to solve the problem. The LandingScenes-7 dataset is then established and divided into seven classes. Moreover, the classifier still faces a novelty detection problem, which we address by excluding other landing scenes through thresholding in the prediction stage. We employ a transfer learning method based on the ResNeXt-50 backbone with the adaptive moment estimation (Adam) optimization algorithm. We also compare it with the ResNet-50 backbone and the momentum stochastic gradient descent (SGD) optimizer. Experimental results show that ResNeXt-50 with the Adam optimization algorithm performs better. With a pre-trained model and fine-tuning, it achieves 97.8450% top-1 accuracy on the LandingScenes-7 dataset, paving the way for drones to autonomously learn landing scenes.
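The thresholding step mentioned in the abstract above can be pictured with a small sketch: a prediction is accepted only when the top softmax probability clears a threshold, and otherwise the input is treated as an unknown scene. The threshold value, the placeholder class names, and the use of the maximum softmax probability as the confidence score are assumptions made here; the abstract does not state which score or threshold the authors use.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_rejection(logits, class_names, threshold=0.9):
    """Return (label, confidence); label is 'unknown' below the threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unknown", probs[best]
    return class_names[best], probs[best]

# Seven placeholder class names (hypothetical; the real labels are not given).
CLASS_NAMES = [f"class_{i}" for i in range(7)]

confident = predict_with_rejection([8.0, 0, 0, 0, 0, 0, 0], CLASS_NAMES)
uncertain = predict_with_rejection([0.0] * 7, CLASS_NAMES)
```

A higher threshold rejects more unfamiliar scenes at the price of also rejecting some correct but low-confidence predictions, so in practice it would be tuned on held-out data.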
Abstract: Scene text detection and recognition is an important research area in image processing. Scene text can appear in different languages, fonts, sizes, colours, orientations and structures. Moreover, the aspect ratios and layouts of scene text may differ significantly. All these variations pose significant challenges for algorithms that detect and recognize text in natural scenes. In this paper, a new intelligent text detection and recognition method is proposed that detects text in natural scenes and recognizes it by applying the newly proposed Conditional Random Field-based fuzzy rules incorporated Convolutional Neural Network (CR-CNN). Moreover, we recommend a new text detection method for detecting the exact text in the input natural scene images. To enhance the edge detection process, image pre-processing activities such as edge detection and color modeling are applied in this work. In addition, we generate new fuzzy rules for making effective decisions in the text detection and recognition processes. The experiments were conducted on standard benchmark datasets such as ICDAR 2003, ICDAR 2011, ICDAR 2005 and SVT, and achieved better accuracy in text detection and recognition. Using these datasets, five different experiments were conducted to evaluate the proposed model. We also compared the proposed system with other classifiers such as the SVM, the MLP and the CNN. In these comparisons, the proposed model achieved better classification accuracy than the existing works.
Funding: Funded by (i) the Natural Science Foundation of China (NSFC) under Grant Nos. 61402397, 61263043, 61562093 and 61663046; (ii) the Open Foundation of Key Laboratory in Software Engineering of Yunnan Province, No. 2020SE304; (iii) the Practical Innovation Project of Yunnan University, Project Nos. 2021z34, 2021y128 and 2021y129.
Abstract: Traffic scene captioning technology automatically generates one or more sentences describing the content of traffic scenes by analyzing the input traffic scene images, supporting road safety while providing an important decision-making function for sustainable transportation. To provide a comprehensive and reasonable description of complex traffic scenes, this paper proposes a traffic scene semantic captioning model with multi-stage feature enhancement. In general, the model follows an encoder-decoder structure. First, multi-level granularity visual features are used for feature enhancement during the encoding process, which enables the model to learn more detailed content in the traffic scene image. Second, a scene knowledge graph is applied to the decoding process, and the semantic features it provides are used to enhance the features learned by the decoder again, so that the model can learn the attributes of objects in the traffic scene and the relationships between objects and thereby generate more reasonable captions. This paper reports extensive experiments on the challenging MS-COCO dataset, evaluated with five standard automatic evaluation metrics. The results show that the proposed model improves significantly on all metrics compared with state-of-the-art methods, notably achieving a score of 129.0 on the CIDEr-D metric, which also indicates that the proposed model can provide a more reasonable and comprehensive description of the traffic scene.
Abstract: NPC deputies working on behalf of the people they represent are making a difference back home. One of the true signs of spring is the arrival of swallows. But in Beijing, another sign heralds the return of springtime: thousands of deputies come to town to attend a series of two-week-long meetings, where a range of diverse issues are
Abstract: With the intelligent development of road traffic control and management, higher requirements have been placed on the accuracy and effectiveness of traffic data. The question of how to collect and integrate data for traffic scenes has gained importance in this field as various processing technologies have emerged. A great deal of research has been carried out, ranging from theory to engineering application.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 62276274, in part by the Natural Science Foundation of Shaanxi Province under Grant 2020JM-537, and in part by the Aeronautical Science Fund under Grant 201851U8012 (corresponding author: Xiaogang Yang).
Abstract: In recent years, many visual positioning algorithms based on computer vision have been proposed, and they have achieved good results. However, these algorithms serve a single function, cannot perceive the environment, have poor versatility, and suffer from a certain amount of mismatching, which affects positioning accuracy. Therefore, this paper proposes a location algorithm that combines a target recognition algorithm with a depth feature matching algorithm to solve the problem of unmanned aerial vehicle (UAV) environment perception and multi-modal image-matching fusion location. The algorithm is based on the single-shot object detector based on multi-level feature pyramid network (M2Det) algorithm and replaces the original visual geometry group (VGG) feature extraction network with the ResNet-101 network to improve the feature extraction capability of the network model. By introducing a depth feature matching algorithm, the algorithm shares neural network weights and realizes the design of UAV target recognition and a multi-modal image-matching fusion positioning algorithm. When the reference image and the real-time image are mismatched, the dynamic adaptive proportional constraint and random sample consensus (DAPC-RANSAC) algorithm is used to optimize the matching results and improve the correct matching efficiency of the target. Using a multi-modal registration data set, the proposed algorithm was compared and analyzed to verify its superiority and feasibility. The results show that the proposed algorithm can effectively handle matching between multi-modal images (visible image–infrared image, infrared image–satellite image, visible image–satellite image) and remains stable and robust under changes in contrast, scale, brightness, ambiguity, deformation, and other factors. Finally, the effectiveness and practicability of the proposed algorithm were verified in an aerial test scene with an S1000 six-rotor UAV.
Funding: Supported by the National Natural Science Foundation of China (No. 62063006), the Natural Science Foundation of Guangxi Province (No. 2023GXNSFAA026025), the Innovation Fund of Chinese Universities Industry-University-Research (ID: 2021RYC06005), the Research Project for Young and Middle-Aged Teachers in Guangxi Universities (ID: 2020KY15013), and the Special Research Project of Hechi University (ID: 2021GCC028); also supported by the Project of Outstanding Thousand Young Teachers' Training in Higher Education Institutions of Guangxi, Guangxi Colleges and Universities Key Laboratory of AI and Information Processing (Hechi University), Education Department of Guangxi Zhuang Autonomous Region.
Abstract: In recent years, as smart city construction has continued to deepen, the field of intelligent transportation has seen significant changes and improvements. The semantic segmentation of road scenes has important practical significance in automatic driving, transportation planning, and intelligent transportation systems. However, current mainstream lightweight semantic segmentation models face problems in road scene segmentation such as poor segmentation of small targets and insufficiently refined segmentation edges. Therefore, this article proposes a lightweight semantic segmentation model that improves on the LiteSeg model to address these issues. The model uses the lightweight backbone network MobileNet instead of the LiteSeg backbone network to reduce network parameters and computation, and combines it with the Coordinate Attention (CA) mechanism to help the network capture long-distance dependencies. At the same time, by combining the dependencies of spatial information and channel information, the Spatial and Channel Network (SCNet) attention mechanism is proposed to improve the feature extraction ability of the model. Finally, a multiscale transposed attention encoding (MTAE) module is proposed to obtain features at different resolutions and perform feature fusion. The proposed model is verified on the Cityscapes dataset. The experimental results show that adding the SCNet and MTAE modules increases the mean Intersection over Union (mIoU) of the original LiteSeg model by 4.69%. On this basis, the backbone network is replaced with MobileNet and the CA module is added; at the cost of a minimal increase in model parameters and computation, the mIoU of the original LiteSeg model is increased by 2.46%. This article also compares the proposed model with several current lightweight semantic segmentation models, and the experiments show that the proposed model has the best overall performance, with especially strong results in small object segmentation. Finally, generalization testing of the proposed model is conducted on the KITTI dataset, and the experimental results show that the proposed algorithm has a certain degree of generalization.
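Since the results above are reported as mean Intersection over Union, a short sketch of how mIoU is typically computed from a confusion matrix may help in reading the numbers. This is the generic definition, not the authors' evaluation code; the toy labels and the convention of skipping classes that appear in neither ground truth nor prediction are assumptions.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] counts pixels with ground-truth class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def mean_iou(cm):
    """Average of TP / (TP + FP + FN) over classes that occur at least once."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0

# Flattened toy label maps with two classes.
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
cm = confusion_matrix(y_true, y_pred, num_classes=2)
miou = mean_iou(cm)
```

Here class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. Because every class counts equally regardless of pixel count, small objects influence the score as much as large ones, which is why gains on small targets show up in mIoU.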
Abstract: In the past, many Chinese artists have been afraid they would be criticized for lagging behind the world in their conceptions. They are eager to do something unconventional or unorthodox. Their actions are very similar to the trend Liu