Abstract: The aim of this study was to discriminate organic from conventional orange juice based on chemical elements and data mining applications. A comprehensive sampling of organic and conventional oranges was carried out in Borborema, state of Sao Paulo, Brazil. Fruits of the variety Valencia (Citrus sinensis (L.) Osbeck) budded on Rangpur lime (Citrus limonia Osbeck) were analyzed. Eleven chemical elements were determined in 57 orange samples grown in organic and conventional systems. To classify these samples, data mining techniques (Support Vector Machine (SVM) and Multilayer Perceptron (MLP)) were combined with feature selection (F-score and chi-squared). SVM with chi-squared feature selection outperformed the other combinations, reaching 93.00% accuracy with only seven chemical elements (Cu, Cs, Zn, Al, Mn, Rb and Sr) and correctly classifying 96.73% of the samples grown in an organic system.
Funding: The authors are grateful to Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) for financial support.
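The feature-selection step in the orange juice study can be illustrated with a minimal sketch. It assumes the F-score is the Fisher-style criterion often paired with SVMs (separation of the class means divided by the within-class variances); the abstract does not give the formula, so this definition is an assumption. The function names and the element values below are hypothetical and invented for illustration only; they are not data from the study.

```python
from statistics import mean, variance

def f_score(pos, neg):
    """Fisher-style F-score of one feature for a two-class problem.

    pos, neg: lists of feature values for each class (at least 2 values each).
    Assumed definition: squared deviations of the two class means from the
    overall mean, divided by the sum of the two class sample variances.
    """
    overall = mean(pos + neg)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)
    return between / within

def rank_features(organic, conventional):
    """Return feature names sorted from most to least discriminative."""
    scores = {name: f_score(organic[name], conventional[name]) for name in organic}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical concentrations (arbitrary units), NOT measurements from the study.
organic = {"Rb": [5.1, 5.4, 5.0, 5.3], "Zn": [1.0, 1.4, 0.9, 1.3]}
conventional = {"Rb": [3.0, 3.2, 2.9, 3.1], "Zn": [1.1, 1.2, 1.0, 1.5]}

ranking = rank_features(organic, conventional)
print(ranking)
```

In a workflow like the one described, the top-ranked elements would then be passed to the classifier, and the subset size would be chosen by comparing accuracies across subsets.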
Abstract: Wines with a clear geographical origin are of interest to consumers and the food industry. This paper presents a data mining study of Merlot wines from South America to identify the fingerprint of their geographical origin. Samples from Argentina (n=17), Brazil (n=12), Chile (n=48), and Uruguay (n=6) were analyzed. Twenty chemical compounds were determined by high-performance liquid chromatography (HPLC). These include antioxidant activity, total polyphenols, total anthocyanins, individual anthocyanins and color. Four binary classification problems were set up (Brazil versus non-Brazil, Argentina versus non-Argentina, Chile versus non-Chile, and Uruguay versus non-Uruguay) to investigate the geographic characteristics of each country. Evaluating these binary classifications made it possible to identify the main variables (chemical compounds) that discriminate between the countries. We used the following algorithms: the Synthetic Minority Over-sampling Technique (SMOTE) and under-sampling to balance the dataset of each classification problem, the Relief algorithm to obtain a variable importance ranking, and the classifiers Support Vector Machine (SVM), Multilayer Perceptron and Radial Basis Function Network with dynamic decay adjustment. The SVM model obtained the highest performance among the classifiers for each dataset (93.73% accuracy for Brazil versus non-Brazil, 91.18% for Argentina versus non-Argentina, 79.16% for Chile versus non-Chile, and 91.67% for Uruguay versus non-Uruguay). These accuracies were achieved by searching the variable subsets ranked by Relief for each classification problem. We found that variables such as DPPH, wine color and individual anthocyanins are among the most important in the characterization of Merlot wines.
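The one-versus-rest setup of the Merlot study can be sketched from the sample counts alone. The snippet below uses only the per-country counts reported in the abstract and shows how skewed each binary problem is, which is presumably the reason a balancing step was applied. The function name `one_vs_rest` is hypothetical, and the code does not reproduce the authors' balancing or classification procedure.

```python
# Sample counts per country as reported in the abstract.
counts = {"Argentina": 17, "Brazil": 12, "Chile": 48, "Uruguay": 6}

def one_vs_rest(counts):
    """For each country, return (positives, negatives, imbalance ratio).

    The ratio is the size of the larger class divided by the smaller one,
    rounded to two decimals.
    """
    total = sum(counts.values())
    problems = {}
    for country, n_pos in counts.items():
        n_neg = total - n_pos
        ratio = max(n_pos, n_neg) / min(n_pos, n_neg)
        problems[country] = (n_pos, n_neg, round(ratio, 2))
    return problems

problems = one_vs_rest(counts)
for country, (n_pos, n_neg, ratio) in problems.items():
    print(f"{country} vs non-{country}: {n_pos} vs {n_neg} (ratio {ratio}:1)")
```

With 83 samples in total, Uruguay versus non-Uruguay is the most skewed problem (6 against 77), while Chile versus non-Chile is the closest to balanced (48 against 35).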