With the wide application of location-based social networks(LBSNs),personalized point of interest(POI)recommendation becomes popular,especially in the commercial field.Unfortunately,it is challenging to accurately rec...With the wide application of location-based social networks(LBSNs),personalized point of interest(POI)recommendation becomes popular,especially in the commercial field.Unfortunately,it is challenging to accurately recommend POIs to users because the user-POI matrix is extremely sparse.In addition,a user's check-in activities are affected by many influential factors.However,most of existing studies capture only few influential factors.It is hard for them to be extended to incorporate other heterogeneous information in a unified way.To address these problems,we propose a meta-path-based deep representation learning(MPDRL)model for personalized POI recommendation.In this model,we design eight types of meta-paths to fully utilize the rich heterogeneous information in LBSNs for the representations of users and POIs,and deeply mine the correlations between users and POIs.To further improve the recommendation performance,we design an attention-based long short-term memory(LSTM)network to learn the importance of different influential factors on a user's specific check-in activity.To verify the effectiveness of our proposed method,we conduct extensive experiments on a real-world dataset,Foursquare.Experimental results show that the MPDRL model improves at least 16.97%and 23.55%over all comparison methods in terms of the metric Precision@N(Pre@N)and Recall@N(Rec@N)respectively.展开更多
Biological slices are an effective tool for studying the physiological structure and evolutionmechanism of biological systems.However,due to the complexity of preparation technology and the presence of many uncontroll...Biological slices are an effective tool for studying the physiological structure and evolutionmechanism of biological systems.However,due to the complexity of preparation technology and the presence of many uncontrollable factors during the preparation processing,leads to problems such as difficulty in preparing slice images and breakage of slice images.Therefore,we proposed a biological slice image small-scale corruption inpainting algorithm with interpretability based on multi-layer deep sparse representation,achieving the high-fidelity reconstruction of slice images.We further discussed the relationship between deep convolutional neural networks and sparse representation,ensuring the high-fidelity characteristic of the algorithm first.A novel deep wavelet dictionary is proposed that can better obtain image prior and possess learnable feature.And multi-layer deep sparse representation is used to implement dictionary learning,acquiring better signal expression.Compared with methods such as NLABH,Shearlet,Partial Differential Equation(PDE),K-Singular Value Decomposition(K-SVD),Convolutional Sparse Coding,and Deep Image Prior,the proposed algorithm has better subjective reconstruction and objective evaluation with small-scale image data,which realized high-fidelity inpainting,under the condition of small-scale image data.And theOn2-level time complexitymakes the proposed algorithm practical.The proposed algorithm can be effectively extended to other cross-sectional image inpainting problems,such as magnetic resonance images,and computed tomography images.展开更多
Spectrogram representations of acoustic scenes have achieved competitive performance for acoustic scene classification. Yet, the spectrogram alone does not take into account a substantial amount of time-frequency info...Spectrogram representations of acoustic scenes have achieved competitive performance for acoustic scene classification. Yet, the spectrogram alone does not take into account a substantial amount of time-frequency information. In this study, we present an approach for exploring the benefits of deep scalogram representations, extracted in segments from an audio stream. The approach presented firstly transforms the segmented acoustic scenes into bump and morse scalograms, as well as spectrograms; secondly, the spectrograms or scalograms are sent into pre-trained convolutional neural networks; thirdly,the features extracted from a subsequent fully connected layer are fed into(bidirectional) gated recurrent neural networks, which are followed by a single highway layer and a softmax layer;finally, predictions from these three systems are fused by a margin sampling value strategy. We then evaluate the proposed approach using the acoustic scene classification data set of 2017 IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events(DCASE). On the evaluation set, an accuracy of 64.0 % from bidirectional gated recurrent neural networks is obtained when fusing the spectrogram and the bump scalogram, which is an improvement on the 61.0 % baseline result provided by the DCASE 2017 organisers. This result shows that extracted bump scalograms are capable of improving the classification accuracy,when fusing with a spectrogram-based system.展开更多
Visual localization is a crucial component in the application of mobile robot and autonomous driving.Image retrieval is an efficient and effective technique in image-based localization methods.Due to the drastic varia...Visual localization is a crucial component in the application of mobile robot and autonomous driving.Image retrieval is an efficient and effective technique in image-based localization methods.Due to the drastic variability of environmental conditions,e.g.,illumination changes,retrievalbased visual localization is severely affected and becomes a challenging problem.In this work,a general architecture is first formulated probabilistically to extract domain-invariant features through multi-domain image translation.Then,a novel gradientweighted similarity activation mapping loss(Grad-SAM)is incorporated for finer localization with high accuracy.We also propose a new adaptive triplet loss to boost the contrastive learning of the embedding in a self-supervised manner.The final coarse-to-fine image retrieval pipeline is implemented as the sequential combination of models with and without Grad-SAM loss.Extensive experiments have been conducted to validate the effectiveness of the proposed approach on the CMU-Seasons dataset.The strong generalization ability of our approach is verified with the RobotCar dataset using models pre-trained on urban parts of the CMU-Seasons dataset.Our performance is on par with or even outperforms the state-of-the-art image-based localization baselines in medium or high precision,especially under challenging environments with illumination variance,vegetation,and night-time images.Moreover,real-site experiments have been conducted to validate the efficiency and effectiveness of the coarse-to-fine strategy for localization.展开更多
In this paper,we propose a novel and effective approach,namely GridNet,to hierarchically learn deep representation of 3D point clouds.It incorporates the ability of regular holistic description and fast data processin...In this paper,we propose a novel and effective approach,namely GridNet,to hierarchically learn deep representation of 3D point clouds.It incorporates the ability of regular holistic description and fast data processing in a single framework,which is able to abstract powerful features progressively in an efficient way.Moreover,to capture more accurate internal geometry attributes,anchors are inferred within local neighborhoods,in contrast to the fixed or the sampled ones used in existing methods,and the learned features are thus more representative and discriminative to local point distribution.GridNet delivers very competitive results compared with the state of the art methods in both the object classification and segmentation tasks.展开更多
基金National Natural Science Foundation of China(No.61972080)Shanghai Rising-Star Program,China(No.19QA1400300)。
文摘With the wide application of location-based social networks(LBSNs),personalized point of interest(POI)recommendation becomes popular,especially in the commercial field.Unfortunately,it is challenging to accurately recommend POIs to users because the user-POI matrix is extremely sparse.In addition,a user's check-in activities are affected by many influential factors.However,most of existing studies capture only few influential factors.It is hard for them to be extended to incorporate other heterogeneous information in a unified way.To address these problems,we propose a meta-path-based deep representation learning(MPDRL)model for personalized POI recommendation.In this model,we design eight types of meta-paths to fully utilize the rich heterogeneous information in LBSNs for the representations of users and POIs,and deeply mine the correlations between users and POIs.To further improve the recommendation performance,we design an attention-based long short-term memory(LSTM)network to learn the importance of different influential factors on a user's specific check-in activity.To verify the effectiveness of our proposed method,we conduct extensive experiments on a real-world dataset,Foursquare.Experimental results show that the MPDRL model improves at least 16.97%and 23.55%over all comparison methods in terms of the metric Precision@N(Pre@N)and Recall@N(Rec@N)respectively.
基金supported by the National Natural Science Foundation of China(Grant No.61871380)the Shandong Provincial Natural Science Foundation(Grant No.ZR2020MF019)Beijing Natural Science Foundation(Grant No.4172034).
文摘Biological slices are an effective tool for studying the physiological structure and evolutionmechanism of biological systems.However,due to the complexity of preparation technology and the presence of many uncontrollable factors during the preparation processing,leads to problems such as difficulty in preparing slice images and breakage of slice images.Therefore,we proposed a biological slice image small-scale corruption inpainting algorithm with interpretability based on multi-layer deep sparse representation,achieving the high-fidelity reconstruction of slice images.We further discussed the relationship between deep convolutional neural networks and sparse representation,ensuring the high-fidelity characteristic of the algorithm first.A novel deep wavelet dictionary is proposed that can better obtain image prior and possess learnable feature.And multi-layer deep sparse representation is used to implement dictionary learning,acquiring better signal expression.Compared with methods such as NLABH,Shearlet,Partial Differential Equation(PDE),K-Singular Value Decomposition(K-SVD),Convolutional Sparse Coding,and Deep Image Prior,the proposed algorithm has better subjective reconstruction and objective evaluation with small-scale image data,which realized high-fidelity inpainting,under the condition of small-scale image data.And theOn2-level time complexitymakes the proposed algorithm practical.The proposed algorithm can be effectively extended to other cross-sectional image inpainting problems,such as magnetic resonance images,and computed tomography images.
基金supported by the German National BMBF IKT2020-Grant(16SV7213)(EmotAsS)the European-Unions Horizon 2020 Research and Innovation Programme(688835)(DE-ENIGMA)the China Scholarship Council(CSC)
文摘Spectrogram representations of acoustic scenes have achieved competitive performance for acoustic scene classification. Yet, the spectrogram alone does not take into account a substantial amount of time-frequency information. In this study, we present an approach for exploring the benefits of deep scalogram representations, extracted in segments from an audio stream. The approach presented firstly transforms the segmented acoustic scenes into bump and morse scalograms, as well as spectrograms; secondly, the spectrograms or scalograms are sent into pre-trained convolutional neural networks; thirdly,the features extracted from a subsequent fully connected layer are fed into(bidirectional) gated recurrent neural networks, which are followed by a single highway layer and a softmax layer;finally, predictions from these three systems are fused by a margin sampling value strategy. We then evaluate the proposed approach using the acoustic scene classification data set of 2017 IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events(DCASE). On the evaluation set, an accuracy of 64.0 % from bidirectional gated recurrent neural networks is obtained when fusing the spectrogram and the bump scalogram, which is an improvement on the 61.0 % baseline result provided by the DCASE 2017 organisers. This result shows that extracted bump scalograms are capable of improving the classification accuracy,when fusing with a spectrogram-based system.
文摘Visual localization is a crucial component in the application of mobile robot and autonomous driving.Image retrieval is an efficient and effective technique in image-based localization methods.Due to the drastic variability of environmental conditions,e.g.,illumination changes,retrievalbased visual localization is severely affected and becomes a challenging problem.In this work,a general architecture is first formulated probabilistically to extract domain-invariant features through multi-domain image translation.Then,a novel gradientweighted similarity activation mapping loss(Grad-SAM)is incorporated for finer localization with high accuracy.We also propose a new adaptive triplet loss to boost the contrastive learning of the embedding in a self-supervised manner.The final coarse-to-fine image retrieval pipeline is implemented as the sequential combination of models with and without Grad-SAM loss.Extensive experiments have been conducted to validate the effectiveness of the proposed approach on the CMU-Seasons dataset.The strong generalization ability of our approach is verified with the RobotCar dataset using models pre-trained on urban parts of the CMU-Seasons dataset.Our performance is on par with or even outperforms the state-of-the-art image-based localization baselines in medium or high precision,especially under challenging environments with illumination variance,vegetation,and night-time images.Moreover,real-site experiments have been conducted to validate the efficiency and effectiveness of the coarse-to-fine strategy for localization.
基金This work was supported by the National Natural Science Foundation of China(Grant No.61673033).
文摘In this paper,we propose a novel and effective approach,namely GridNet,to hierarchically learn deep representation of 3D point clouds.It incorporates the ability of regular holistic description and fast data processing in a single framework,which is able to abstract powerful features progressively in an efficient way.Moreover,to capture more accurate internal geometry attributes,anchors are inferred within local neighborhoods,in contrast to the fixed or the sampled ones used in existing methods,and the learned features are thus more representative and discriminative to local point distribution.GridNet delivers very competitive results compared with the state of the art methods in both the object classification and segmentation tasks.