Abstract: Support vector machines (SVMs) have met with significant success in the information retrieval field, especially in text classification tasks. Although various performance estimators for SVMs have been proposed, they focus only on accuracy, which is estimated with the leave-one-out cross-validation procedure. Information-retrieval-related performance measures are usually neglected in kernel learning methodology. In this paper, we propose a set of information-retrieval-oriented performance estimators for SVMs, which are based on the span bound of the leave-one-out procedure. Experiments show that the proposed estimators are both effective and stable.
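To make the contrast between accuracy and information-retrieval-oriented measures concrete, the following is a minimal sketch of how precision, recall and F1 can be computed from leave-one-out predictions. It does not implement the span-bound estimators described in the abstract; the function name `ir_measures` and the +1/-1 label encoding are assumptions made for illustration.

```python
def ir_measures(labels, loo_predictions):
    """Return (precision, recall, f1) for the positive class.

    labels and loo_predictions are sequences of +1 / -1 values
    (an assumed encoding); loo_predictions[i] is the prediction made
    for example i by a model trained without example i.
    """
    pairs = list(zip(labels, loo_predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)    # true positives
    fp = sum(1 for y, p in pairs if y == -1 and p == 1)   # false positives
    fn = sum(1 for y, p in pairs if y == 1 and p == -1)   # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

On imbalanced text collections these measures can differ markedly from accuracy, which is one reason an estimator tuned only for accuracy may be a poor guide for retrieval tasks.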
Funding: The National Natural Science Foundation of China (No. 61170116, 61375010, 60973064)
Abstract: An effective shape signature, namely multi-level included angle functions (MIAFs), is proposed to describe the hierarchy information ranging from global information to local variations of shape. Invariance to rotation, translation and scaling is an intrinsic property of the MIAFs. For each contour point, the multi-level included angles are obtained from the paired line segments derived from unequal-arc-length partitions of the contour. A Fourier descriptor derived from the multi-level included angle functions (MIAFD) is presented for efficient shape retrieval. The proposed descriptor is evaluated with the standard performance evaluation method on three shape image databases: the MPEG-7 database, the Kimia-99 database and the Swedish leaf database. The experimental results of shape retrieval indicate that the MIAFD outperforms the existing Fourier descriptors and has low computational complexity. The comparison of the MIAFD with other shape description methods also shows that the proposed descriptor has the highest precision at the same recall value, which verifies its effectiveness.
Funding: The National Basic Research Program (973) of China (No. 2004CB719401); the National Research Foundation for the Doctoral Program of Higher Education of China (No. 20060003060)
Abstract: In this paper, we present a novel Support Vector Machine active learning algorithm for effective 3D model retrieval using the concept of relevance feedback. The proposed method learns from the most informative objects, which are marked by the user, and then creates a boundary separating the relevant models from the irrelevant ones. It needs only a small number of 3D models labelled by the user, and it can grasp the user's semantic knowledge rapidly and accurately. Experimental results show that the proposed algorithm significantly improves retrieval effectiveness. Compared with four state-of-the-art query refinement schemes for 3D model retrieval, it provides superior retrieval performance after no more than two rounds of relevance feedback.
Funding: Sponsored by the National Natural Science Foundation of China (Grants: 62002200, 61772319) and the Shandong Natural Science Foundation of China (Grant: ZR2020QF012).
Abstract: Given one specific image, it would be quite significant if one could simply retrieve all the pictures that fall into a similar category of images. However, traditional methods tend to achieve high-quality retrieval by relying on adequate learning instances while ignoring the extraction of the image’s essential information, which makes it difficult to retrieve similar-category images using just one reference image. To solve this problem, we propose in this paper a refined sparse-representation-based similar-category image retrieval model. On the one hand, saliency detection and multi-level decomposition help take salient and spatial information more fully into consideration. On the other hand, the cross mutual sparse coding model aims to extract the image’s essential features to the maximum extent possible. Finally, we set up a database containing a large number of multi-source images. Extensive comparative experiments show that our method retrieves similar-category images effectively. Moreover, extensive ablation experiments show that nearly all of the procedures contribute to the result.
Abstract: The number of blogs and other forms of opinionated online content has increased dramatically in recent years. Many fields, including academia and national security, place an emphasis on automated political article orientation detection. Political articles (especially in the Arab world) differ from other articles in their subjectivity: the author’s beliefs and political affiliation may have a significant influence on the article. With categories representing the main political ideologies, this problem may be thought of as a subset of text categorization (classification). In general, the performance of machine learning models for text classification is sensitive to hyperparameter settings. Furthermore, the feature vector used to represent a document must capture, to some extent, the complex semantics of natural language. To this end, this paper presents an intelligent system to detect political Arabic article orientation that adapts the categorical boosting (CatBoost) method combined with a multi-level feature concept. Extracting features at multiple levels can enhance the model’s ability to discriminate between different classes or patterns. Each level may capture different aspects of the input data, contributing to a more comprehensive representation. CatBoost, a robust and efficient gradient-boosting algorithm, is used to learn and predict the complex relationships between these features and the political orientation labels associated with the articles. A dataset of political Arabic texts collected from diverse sources, including postings and articles, is used to assess the suggested technique. Conservative, reform, and revolutionary are the three subcategories of these opinions. The results of this study demonstrate that, compared to other frequently used machine learning models for text classification, the CatBoost method using multi-level features performs better, with an accuracy of 98.14%.
Abstract: This paper describes a new method for active learning in content-based image retrieval. The proposed method first uses support vector machine (SVM) classifiers to learn an initial query concept. The proposed active learning scheme then employs a similarity measure to check the current version space and selects the images with maximum expected information gain to solicit the user's labels. Finally, the learned query is refined based on the user's further feedback. By combining the SVM classifier with the similarity measure, the proposed method can alleviate the model bias present in each of them. Our experiments on several query concepts show that the proposed method can learn the user's query concept quickly and effectively within only a few iterations.
Abstract: A new information search model is reported, and the design and implementation of a system based on intelligent agents are presented. The system is an assistant information retrieval system that helps users search for what they need. It consists of four main components: an interface agent, an information retrieval agent, a broker agent and a learning agent, which collaborate to implement the system functions. The agents apply learning mechanisms based on an improved ID3 algorithm.
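The abstract does not say how its ID3 variant is improved, so the sketch below shows only the attribute-selection step of standard ID3 (choosing the attribute with the highest information gain). The attribute names and the "relevant"/"irrelevant" labels in the usage note are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one attribute.

    rows is a list of dicts mapping attribute names to discrete values.
    """
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    total = len(labels)
    remainder = sum(len(part) / total * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Standard ID3 choice: the attribute with the largest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))
```

For example, with four documents described by hypothetical attributes `topic` and `length`, where relevance depends only on `topic`, `best_attribute` selects `topic` because splitting on it removes all label uncertainty.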
Funding: Supported by CNPq-Brazil, Grants 306193/2007-8, 471518/2007-7, 307373/2006-1 and 484893/2007-6, by FAPEMIG, Grant PPM 347/08, and by CAPES. The IRMA project is funded by the German Research Foundation (DFG), Le 1108/4 and Le 1108/9.
Abstract: AIM: To present a content-based image retrieval (CBIR) system that supports the classification of breast tissue density and can be used in the processing chain to adapt parameters for lesion segmentation and classification. METHODS: Breast density is characterized by image texture using singular value decomposition (SVD) and histograms. Pattern similarity is computed by a support vector machine (SVM) to separate the four BI-RADS tissue categories. The crucial number of remaining singular values is varied (SVD), and linear, radial, and polynomial kernels are investigated (SVM). The system is supported by a large reference database for training and evaluation. Experiments are based on 5-fold cross-validation. RESULTS: Adopted from the DDSM, MIAS, LLNL, and RWTH datasets, the reference database is composed of over 10000 various mammograms with unified and reliable ground truth. An average precision of 82.14% is obtained using 25 singular values (SVD), a polynomial kernel and the one-against-one approach (SVM). CONCLUSION: Breast density characterization using SVD allied with SVM for image retrieval enables the development of a CBIR system that can effectively aid radiologists in their diagnosis.
Funding: This work was supported by the National Key Basic Research and Development Plan (973 Plan) of China (No. 2007CB310900) and the National Natural Science Foundation of China (No. 90612018, 90715030 and 60970008).
Abstract: At present, there are few security models that control the communication between virtual machines (VMs). Moreover, these models are not applicable to multi-level security (MLS). To implement mandatory access control (MAC) and MLS in a virtual machine system, this paper designs the Virt-BLP model, which is based on the BLP model. To account for the differences between a virtual machine system and a non-virtualized system, we build the elements and security axioms of the Virt-BLP model by modifying those of BLP. Moreover, compared with BLP, the number of state transition rules in Virt-BLP is reduced, and some rules can only be enforced by a trusted subject. As a result, the Virt-BLP model supports MAC and partial discretionary access control (DAC), and satisfies the requirements of MLS in a virtual machine system well. As space is limited, the implementation of our MAC framework will be shown in a continuation.
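As background for the model this abstract builds on, here is a minimal sketch of the two mandatory rules of the classic BLP (Bell-LaPadula) model: the simple security property (no read up) and the *-property (no write down). It is not the Virt-BLP rule set, which the abstract does not spell out; the `Label` class and the function names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """Security label: a numeric level plus a set of categories (assumed encoding)."""
    level: int
    categories: frozenset = frozenset()

    def dominates(self, other):
        # A dominates B when A's level is at least B's and A holds all of B's categories.
        return self.level >= other.level and self.categories >= other.categories


def can_read(subject, obj):
    """Simple security property: a subject may read only what its label dominates."""
    return subject.dominates(obj)


def can_write(subject, obj):
    """*-property: a subject may write only to objects whose label dominates its own."""
    return obj.dominates(subject)
```

Under these rules, an entity labelled at level 2 with a hypothetical category `finance` may read from an unclassified one but may not write to it, so information cannot flow downward.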
Funding: Supported by the High Technology Research and Development Program of China (No. 2006AA01Z150); the Key Project of the National Natural Science Foundation of China (No. 60373101); the Natural Science Foundation of Heilongjiang Province (No. F2007-14); the Project of Heilongjiang Outstanding Young University Teacher (No. 1151G037).
Abstract: This letter presents a new discriminative model for Information Retrieval (IR), referred to as the Ordinal Regression Model (ORM). ORM differs from most existing models in that it views IR as an ordinal regression problem (i.e., a ranking problem) instead of binary classification. Since the task of IR is to rank documents according to the user's information need, IR can be viewed as an ordinal regression problem. Two parameter learning algorithms for ORM are presented: one is a perceptron-based algorithm, and the other is the ranking Support Vector Machine (SVM). The effectiveness of the proposed approach has been evaluated on the task of ad hoc retrieval using three English Text REtrieval Conference (TREC) sets and two Chinese TREC sets. Results show that ORM significantly outperforms the state-of-the-art language model approaches and the OKAPI system on all test sets, and that it is more appropriate to view IR as ordinal regression rather than binary classification.
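To illustrate what treating retrieval as ranking rather than binary classification means in practice, the sketch below trains a generic pairwise ranking perceptron: whenever a more relevant document fails to outscore a less relevant one for the same query, the weight vector is moved toward their feature difference. This is an assumed, simplified formulation and not necessarily the perceptron-based algorithm of the letter; the feature vectors and relevance grades are hypothetical.

```python
def score(w, x):
    """Linear ranking score of a document feature vector x."""
    return sum(wk * xk for wk, xk in zip(w, x))


def train_ranking_perceptron(queries, epochs=10):
    """Learn a weight vector from per-query lists of (features, grade) pairs.

    For every pair in the same query where document i has a higher grade
    than document j, require score(i) > score(j); on a violation, add the
    feature difference x_i - x_j to the weights.
    """
    dim = len(queries[0][0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        for docs in queries:
            for xi, yi in docs:
                for xj, yj in docs:
                    if yi > yj:
                        diff = [a - b for a, b in zip(xi, xj)]
                        if score(w, diff) <= 0:  # pair is mis-ordered or tied
                            w = [wk + dk for wk, dk in zip(w, diff)]
    return w
```

Because the update depends only on the order of the grades within one query, documents are never compared across queries, which matches the ranking view of retrieval.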
Abstract: Information access concerns the rich data available for information retrieval, and has evolved to provide principled approaches and strategies for searching. In building a successful web retrieval search engine model, a number of prospects arise at different levels, where techniques such as Usenet and support vector machines are employed to significant effect. The present investigation explores a number of problems, identified by level, that relate to finding information on the web. The authors examine the issues and prospects by applying different methods, such as web graph analysis, the retrieval and analysis of newsgroup postings, and statistical methods for inferring meaning in text. The proposed model thus assists users in finding the existing data they need. The study proposes three heuristic models to characterize the balance between query and feedback information for adaptive relevance feedback. The authors also discuss the parameters that are responsible for efficient searching. These parameters can be taken into account in the future extension or development of search engines.
Funding: This work is supported by the National Natural Science Foundation of China (No. 61472161, 61133011, 61402195, 61502198, 61303132, 61202308) and the Science & Technology Development Project of Jilin Province (No. 20140101201JC).
Abstract: Much attention has been paid to relevance feedback in intelligent computation for social computing, especially in content-based image retrieval based on the WeChat platform for medical assistance. Relevance feedback is effective in reducing the semantic gap between the high-level and low-level semantics of images. There are many kinds of support vector machine (SVM) based relevance feedback methods in image retrieval, but all of them may encounter problems such as a small sample size, an imbalance between positive and negative samples, and a long feedback cycle. To deal with these problems, an improved asymmetric bagging (IAB) relevance feedback algorithm is proposed. Furthermore, we apply a new fuzzy support vector machine (FSVM) in cooperation with IAB. To address the over-fitting and real-time problems, we use modified local binary patterns (MLBP) as image features. Finally, experimental results demonstrate that our method outperforms other methods in both retrieval precision and retrieval efficiency.
Abstract: This paper aims to develop Machine Learning algorithms to classify electronic articles related to this phenomenon by retrieving information and topic modelling. The methodology of this study is divided into three phases: the Text Classification Approach (TCA), the Proposed Algorithms Interpretation (PAI), and finally, the Information Retrieval Approach (IRA). The TCA covers the text preprocessing pipeline that produces a clean corpus. The Global Vectors for Word Representation (GloVe) pre-trained model, FastText, Term Frequency-Inverse Document Frequency (TF-IDF), and Bag-of-Words (BOW) are examined as feature extractors in this research. The PAI presents the Bidirectional Long Short-Term Memory (Bi-LSTM) and Convolutional Neural Network (CNN) models used to classify the COVID-19 news. The IRA explains the mathematical interpretation of Latent Dirichlet Allocation (LDA), used for topic modelling in Information Retrieval (IR). In this study, 99% accuracy was obtained by performing K-fold cross-validation on Bi-LSTM with GloVe. A comparative analysis between Deep Learning and Machine Learning, based on feature extraction and computational complexity, has been performed in this research. Furthermore, some text analyses and the most influential aspects of each document have been explored. We also used Bidirectional Encoder Representations from Transformers (BERT) as a Deep Learning mechanism in our model training, but the results were not satisfactory. However, the proposed system can be adapted to the real-time news classification of COVID-19.