430 articles found
1. Correspondence Analysis on a Space-Time Data Set for Multiple Environmental Variables
Author: Palma Monica. International Journal of Geosciences, 2015, No. 10, pp. 1154-1165 (12 pages)
Applications of the multivariate technique called correspondence analysis to environmental studies are relatively new and are limited to spatial multivariate data sets. In this paper, a procedure for applying correspondence analysis to a large space-time data set for multiple environmental variables is shown. In particular, nitrogen dioxide and carbon monoxide hourly concentrations measured during January 1999 at several monitoring stations in a district of Northern Italy are analyzed. The procedure consists of transforming the continuous variables into categorical ones by means of appropriate indicator variables, generating special contingency tables, and applying correspondence analysis. The use of this classical multivariate technique allows the identification of important relationships among pollution levels and monitoring stations and/or relationships among pollution levels and observation times.
Keywords: space-time data; indicator transform; correspondence analysis
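The indicator-transform step described in the abstract above can be illustrated with a minimal sketch. It is not the author's code: the station names, concentration values, and class cut-offs are hypothetical, and the correspondence analysis itself (a decomposition of the resulting table) is not shown.

```python
import bisect

def indicator_contingency(records, thresholds):
    """Build a station-by-class contingency table.

    records    : iterable of (station, concentration) pairs (hypothetical data)
    thresholds : ascending cut-offs (assumed); a value v falls in class k,
                 where k is the number of cut-offs that are <= v
    Returns {station: [count in class 0, count in class 1, ...]}.
    """
    n_classes = len(thresholds) + 1
    table = {}
    for station, value in records:
        k = bisect.bisect_right(thresholds, value)  # class index via indicator coding
        row = table.setdefault(station, [0] * n_classes)
        row[k] += 1
    return table

# Hypothetical hourly readings at two stations, with cut-offs at 50 and 100
readings = [("A", 10), ("A", 55), ("B", 120), ("B", 30), ("A", 200)]
table = indicator_contingency(readings, [50, 100])
```

Each row of `table` counts how often a station fell into each concentration class, which is the kind of table correspondence analysis takes as input.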
2. Exploratory Data Analysis Applied in Mapping Multi-element Soil Geochemical Anomalies for Drill Target Definition: A Case Study from the Unpha Layered Non-magmatic Hydrothermal Pb-Zn Deposit, DPR Korea
Authors: JANG Gwang-Hyok; WON Hyon-Chol; HWANG Bo-Hyon; CHOI Chol-Man. Acta Geologica Sinica (English Edition) (indexed in SCIE, CAS, CSCD), 2021, No. 4, pp. 1357-1365 (9 pages)
A factor analysis was applied to soil geochemical data to define anomalies related to buried Pb-Zn mineralization. A favorable main factor with a strong association of the elements Zn, Cu and Pb, related to mineralization, was selected for interpretation. The median + 2 MAD (median absolute deviation) method of exploratory data analysis (EDA) and C-A (concentration-area) fractal modeling were then applied to the Mahalanobis distance, as defined by Zn, Cu and Pb from the factor analysis, to set the thresholds for defining multi-element anomalies. As a result, the median + 2 MAD method identified the Pb-Zn mineralization more successfully than the C-A fractal model. The soil anomaly identified by the median + 2 MAD method on the Mahalanobis distances defined by three principal elements (Zn, Cu and Pb), rather than thirteen elements (Co, Zn, Cu, V, Mo, Ni, Cr, Mn, Pb, Ba, Sr, Zr and Ti), was the more favorable reflection of the ore body. The identified soil geochemical anomalies were compared with the in situ economic Pb-Zn ore bodies for validation. The results showed that the median + 2 MAD approach is capable of mapping both strong and weak geochemical anomalies related to buried Pb-Zn mineralization, and is therefore useful at the reconnaissance drilling stage.
Keywords: factor analysis; exploratory data analysis; Mahalanobis distance; multi-element; Unpha
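The median + 2 MAD threshold named in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the MAD is left unscaled (no 1.4826 normal-consistency factor), and the input values are hypothetical stand-ins for Mahalanobis distances.

```python
from statistics import median

def median_mad_threshold(values, k=2.0):
    """Return median + k * MAD, with MAD = median(|x - median(x)|).

    Assumption: unscaled MAD; the abstract does not say whether a
    scaling constant is applied.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return med + k * mad

def flag_anomalies(values, k=2.0):
    """Return the values strictly above the median + k * MAD threshold."""
    threshold = median_mad_threshold(values, k)
    return [v for v in values if v > threshold]

# Hypothetical distances: median 3, MAD 1, so the threshold is 5
distances = [1, 2, 3, 4, 100]
threshold = median_mad_threshold(distances)
anomalous = flag_anomalies(distances)
```

Because both the median and the MAD are robust statistics, the single extreme value does not inflate the threshold the way a mean + 2 standard deviations rule would.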
3. Clustering Structure Analysis in Time-Series Data With Density-Based Clusterability Measure (cited by: 6)
Authors: Juho Jokinen; Tomi Raty; Timo Lintonen. IEEE/CAA Journal of Automatica Sinica (indexed in SCIE, EI, CSCD), 2019, No. 6, pp. 1332-1343 (12 pages)
Clustering is used to gain an intuition of the structures in the data. Most current clustering algorithms produce a clustering structure even on data that do not possess such structure. In these cases, the algorithms force a structure onto the data instead of discovering one. To avoid false structures in the relations of data, a novel clusterability assessment method called the density-based clusterability measure is proposed in this paper. It measures the prominence of clustering structure in the data to evaluate whether a cluster analysis could produce meaningful insight into the relationships in the data. This is especially useful for time-series data, since visualizing the structure in time-series data is hard. The performance of the clusterability measure is evaluated against several synthetic data sets and time-series data sets, which illustrate that the density-based clusterability measure can successfully indicate the clustering structure of time-series data.
Keywords: clustering; exploratory data analysis; time-series; unsupervised learning
4. ST-Map: an Interactive Map for Discovering Spatial and Temporal Patterns in Bibliographic Data
Authors: ZUO Chenyu; XU Yifan; DING Lingfang; MENG Liqiu. Journal of Geodesy and Geoinformation Science (indexed in CSCD), 2024, No. 1, pp. 3-15 (13 pages)
Getting insight into the spatiotemporal distribution patterns of knowledge innovation is receiving increasing attention from policymakers and economic research organizations. Many studies use bibliometric data to analyze the popularity of certain research topics, well-adopted methodologies, influential authors, and the interrelationships among research disciplines. However, the visual exploration of the patterns of research topics with an emphasis on their spatial and temporal distribution remains challenging. This study combined a Space-Time Cube (STC) and a 3D glyph to represent the complex multivariate bibliographic data. We further implemented a visual design by developing an interactive interface. The effectiveness, understandability, and engagement of ST-Map were evaluated by seven experts in geovisualization. The results suggest that it is promising to use three-dimensional visualization to show the overview and on-demand details on a single screen.
Keywords: space-time cube; bibliographic data; spatiotemporal analysis; user study; interactive map
5. Method of Phase Diagrams for the Analysis of Seism-Acoustical Spatial-Time Monitoring Data in Oil Wells
Authors: Olga Hachay; Oleg Khachay. Open Journal of Geology, 2018, No. 9, pp. 874-882 (9 pages)
Experimental and theoretical studies of the mechanisms of vibration stimulation of oil recovery in watered fields lead to the conclusion that resonance oscillations develop in fractured-block formations. These oscillations, caused by weak but long-lasting and frequency-stable influences, create the conditions for the generation of ultrasonic waves in the layers, which are capable of destroying thickened oil membranes in reservoir cracks. For fractured-porous reservoirs exploited by high-pressure water displacement of oil, the possibility of intensifying ultrasonic vibrations can have important technological significance. Even a very weak ultrasound can destroy, over a long period of time, the viscous oil membranes formed in the cracks between the blocks, which can be the reason for lowering the permeability of the layers and increasing the oil recovery. To describe these effects, it is necessary to consider the wave process in a hierarchically blocky environment and to simulate theoretically the mechanism by which self-oscillations appear under the action of relaxation shear stresses. For the analysis of the seism-acoustic response over time at fixed intervals along the borehole, an algorithm of phase diagrams of the state of a many-phase medium is suggested.
Keywords: phase diagrams method of analysis; space-time monitoring data; oil wells; state of the two-component medium
6. Spatio-temporal evolution and factor explanatory power analysis of urban resilience in the Yangtze River Economic Belt (cited by: 2)
Authors: Changsheng Ye; Mengshan Hu; Lei Lu; Qian Dong; Moli Gu. Geography and Sustainability, 2022, No. 4, pp. 299-311 (13 pages)
Urban resilience assesses a city’s ability to withstand unknown risks. Existing assessments of urban resilience are not comprehensive and lack consideration of population resilience. This study took 110 prefecture-level cities in the Yangtze River Economic Belt (YREB) as study areas. We calculated the YREB’s level of urban resilience based on the aspects of “economy-society-population-ecology-infrastructure”, which ensured that the comprehensive evaluation of urban resilience is complete and sufficient. The spatio-temporal evolution of urban resilience was analyzed using exploratory spatial data analysis. Geodetectors were used to investigate the impact of several indicators, focusing on economic, social, population, ecological, and infrastructure factors, on urban resilience. The results showed that the urban resilience of the YREB maintained a slow upward trend from 2005 to 2018, with the average urban resilience rising from 0.2442 to 0.2560. The resilience gap between cities in the study region increased initially and then decreased. The dominant factor in the spatial differentiation of urban resilience was the economic factor, followed by the population factor. Urban resilience has been clarified and an evaluation index system constructed, which can provide an effective reference for evaluating urban resilience in countries around the world. On this basis, the factors that optimize urban resilience can be configured, and regional and national sustainable development can be promoted.
Keywords: urban resilience; spatial-temporal differentiation; geographical detector; exploratory spatial data analysis; the Yangtze River Economic Belt
7. Geographical Analysis of Lung Cancer Mortality Rate and PM2.5 Using Global Annual Average PM2.5 Grids from MODIS and MISR Aerosol Optical Depth
Authors: Zhiyong Hu; Ethan Baker. Journal of Geoscience and Environment Protection, 2017, No. 6, pp. 183-197 (15 pages)
Exposure to particulate matter with an aerodynamic diameter of less than 2.5 μm (PM2.5) may increase the risk of lung cancer. The repetitive and broad-area coverage of satellites may allow atmospheric remote sensing to offer a unique opportunity to monitor air quality and help fill air pollution data gaps that hinder efforts to study air pollution and protect public health. This geographical study explores whether there is an association between PM2.5 and lung cancer mortality rate in the conterminous USA. Lung cancer (ICD-10 codes C34- C34) death counts and population at risk by county were extracted for the period from 2001 to 2010 from the U.S. CDC WONDER online database. The 2001-2010 Global Annual Average PM2.5 Grids from MODIS and MISR Aerosol Optical Depth dataset was used to calculate a 10-year average PM2.5 pollution level. Exploratory spatial data analyses, spatial regression (a spatial lag and a spatial error model), and spatially extended Bayesian Monte Carlo Markov Chain simulation found a significant positive association between lung cancer mortality rate and PM2.5. The association would justify further toxicological investigation of the biological mechanism behind the adverse effect of PM2.5 pollution on lung cancer. The Global Annual Average PM2.5 Grids from MODIS and MISR Aerosol Optical Depth dataset provides a continuous surface of PM2.5 concentrations and is a useful data source for environmental health research.
Keywords: lung cancer; PM2.5; remote sensing; GIS; exploratory spatial data analysis; spatial regression; Bayesian MCMC simulation
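8. Research on Quality Assessment Methods for Bridge Health Monitoring Data (cited by: 1)
Authors: YIN Pengcheng (殷鹏程); LONG Qingchun (龙清春); SHAN Deshan (单德山); CAO Yangmei (曹阳梅). Highway Engineering (公路工程), 2024, No. 2, pp. 1-6, 45 (7 pages)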
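Mining and analysis of bridge health monitoring data can guarantee the accuracy of subsequent tasks such as modal parameter identification, damage identification, and condition assessment only when carried out on valid data whose overall quality meets basic requirements. Therefore, based on a quantitatively improved exploratory data analysis (EDA) method and correlation analysis, a quality assessment method for static and dynamic bridge health monitoring data is established from the perspectives of data completeness, accuracy, and consistency. The static and dynamic data from the health monitoring system of a long-span cable-stayed bridge are assessed. The correctness of the proposed method is verified by comparing temperature data and static deflection data of different assessed quality, and by comparing the stabilization diagrams of modal parameter identification for main-girder vertical acceleration signals of different assessed quality. The results show that the proposed method can quickly and effectively judge data quality, thereby ensuring the accuracy of service performance assessment and prediction for bridge structures and improving the usability and effectiveness of health monitoring data.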
Keywords: health monitoring; data quality assessment; exploratory data analysis; modal parameter identification
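9. Quality Assessment of Bridge Dynamic Monitoring Data Based on Quantification of EDA Statistical Graphs
Authors: YIN Pengcheng (殷鹏程); TAN Manlisha (谭曼丽莎); CAO Yangmei (曹阳梅); SHAN Deshan (单德山). Journal of Chongqing Jiaotong University (Natural Science) (重庆交通大学学报(自然科学版)) (indexed in CAS, CSCD, PKU Core), 2024, No. 5, pp. 9-16 (8 pages)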
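Exploratory data analysis (EDA) statistical graphs are already widely used in assessing the quality of dynamic data from bridge health monitoring. To reduce the subjectivity of manually inspecting such graphs, similarity measures are used to quantify them, yielding several indicators for rapid quality assessment of monitoring data. Under operational ambient excitation, the dynamic response of a bridge structure is short-term linear and stationary and approximately follows a normal distribution. Taking vibration data from a long-span cable-stayed bridge as the object of study, histograms and Q-Q plots of the sample data are first drawn, and data quality is judged in advance from the observed distribution characteristics, giving three grades: excellent, good, and poor. The histograms and Q-Q plots are then quantified with two similarity measures, KL divergence and cosine similarity, to obtain indicators of how closely the data follow a normal distribution. Boxplots are used to detect global outliers in the sample data, giving the proportion of normal data. Statistical analysis yields the correspondence between the quantified values and the prior quality grades, establishing an assessment standard based mainly on the KL divergence and cosine similarity of the histogram and supplemented by the boxplot proportion of normal data. Finally, part of the data is taken as a validation set to further verify the soundness of each indicator, and the results of applying the method in actual engineering are given.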
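Keywords: bridge engineering; bridge structural health monitoring; data quality assessment; exploratory data analysis; KL divergence; cosine similarity; boxplot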
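The two similarity measures named in the abstract above, KL divergence and cosine similarity between a histogram and a normal reference, can be sketched in plain Python. This is an illustrative sketch only: the bin edges, the smoothing constant `eps`, and the way the reference distribution is built are assumptions, not details taken from the paper.

```python
import math

def normal_bin_probs(edges, mu, sigma):
    """Probability mass of N(mu, sigma^2) in each bin, renormalised over the bins."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
    probs = [cdf(hi) - cdf(lo) for lo, hi in zip(edges, edges[1:])]
    total = sum(probs)
    return [p / total for p in probs]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions; eps (assumed) guards empty bins in q."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def cosine_similarity(p, q):
    """Cosine of the angle between two histogram vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)

# Hypothetical relative-frequency histogram compared with a standard normal reference
edges = [-3.0, -1.0, 1.0, 3.0]
observed = [0.2, 0.6, 0.2]
reference = normal_bin_probs(edges, mu=0.0, sigma=1.0)
kl = kl_divergence(observed, reference)
cos = cosine_similarity(observed, reference)
```

A KL divergence near 0 and a cosine similarity near 1 both indicate a histogram close to the normal reference; how such values map to quality grades is defined by the paper's own calibration and is not reproduced here.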
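10. Spatiotemporal Differentiation and Influencing Factors of Online Attention to the Sports Lottery Industry
Authors: SHI Wenwen (史文文); YANG Jiandong (杨舰东); YI Zhe (伊哲); LIU Shiwen (刘世文); ZHANG Rui (张锐). Journal of Harbin Sport University (哈尔滨体育学院学报), 2024, No. 4, pp. 10-20 (11 pages)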
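Online attention reflects people's actual interest in and potential demand for something, and offers a new data reference for studying development trends in the big data era. Using index measurement, exploratory spatial data analysis (ESDA), and the geographical detector, this study analyzes the spatiotemporal differentiation and influencing factors of online attention to sports lotteries in 31 provinces (municipalities and autonomous regions) of China from 2011 to 2021. The results show: (1) Over time, online attention to China's sports lottery from 2011 to 2021 first rose, then fell, and then stabilized. (2) Spatially, online attention showed an overall pattern of clustered, then random, then clustered again. Spatial heterogeneity and spatial dependence coexist among provinces. Hot and cold spots of online attention are distributed on either side of the Hu Huanyong Line, with a polarization effect of more in the south than in the north and dense in the east but sparse in the west. (3) Regional GDP, the young and middle-aged population, the population with junior-college education or above, the number of broadband internet subscribers, and sports lottery sales are the core factors behind the spatial differentiation of online attention, and the strength of each factor varies across geographic space.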
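Keywords: China sports lottery; online attention; spatiotemporal differentiation; exploratory spatial data analysis; geographical detector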
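11. Coupling Coordination among County-Level Cropland Non-Agriculturalization, Non-Grain Production, and Rural Population Hollowing: A Case Study of the Yangtze River Economic Belt (cited by: 1)
Authors: CUI Jiaxing (崔家兴); JIN Han (靳涵); LUO Yingyuan (罗滢渊); LIN Yong (林勇); TONG Xin (童新); ZHU Yuanyuan (朱媛媛). Acta Ecologica Sinica (生态学报) (indexed in CAS, CSCD, PKU Core), 2024, No. 5, pp. 1822-1836 (15 pages)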
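The continued expansion of cropland non-agriculturalization and non-grain production seriously threatens China's food security, and its evolution mechanisms and control measures urgently need study. Based on land cover data and agricultural labor data for 1990, 2000, 2010, and 2020, this study uses a coupling coordination degree model and exploratory spatio-temporal data analysis (ESTDA) to analyze the spatiotemporal evolution of county-level cropland non-agriculturalization, non-grain production, and rural population hollowing in the Yangtze River Economic Belt, together with their coupling coordination relationship. The results show: (1) Non-agriculturalization is high in the east and low in the west, with high values clustering around central cities. Non-grain production broadly tends to be high in the west and low in the east, and is higher in remote counties far from large cities. Over the study period, non-agriculturalization and non-grain production intensified overall, and rural population hollowing increased markedly. (2) The three processes show strong coupling, and the area of imbalance gradually expanded. (3) Their coupling coordination degree shows clear spatial clustering that strengthened over time. High-value clusters are mainly in the upper reaches and decreased in number, while low-value clusters are relatively scattered. (4) The coupling coordination degree has strong spatiotemporal dynamics, but a high proportion of counties change together with their neighbors, indicating strong local integration.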
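Keywords: non-agriculturalization; non-grain production; rural population hollowing; exploratory spatio-temporal data analysis; Yangtze River Economic Belt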
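12. Spatiotemporal Evolution and Influencing Factors of Provincial Sports Venues and Facilities in China (cited by: 1)
Authors: LIU Yan (刘艳); WANG Zhankun (王占坤); TANG Wenjie (唐闻捷). Zhejiang Sport Science (浙江体育科学), 2024, No. 1, pp. 19-27 (9 pages)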
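To accurately grasp the development trajectory and influencing mechanisms of sports venues and facilities in China, this article takes as its research object the data on number of sports venues and per capita sports venue area for China's 31 provinces (municipalities and autonomous regions) at three stages, the fifth, sixth, and seventh censuses (五普, 六普, 七普), and applies exploratory spatial data analysis, Spearman correlation analysis, and principal component analysis to explore their spatiotemporal evolution and influencing factors. The study shows the following. Overall, the spatial distribution of the number of sports venues is divided mainly by the Hu Huanyong Line, evolving from "single core" to "dual core" to "multiple cores" and forming a spatial pattern of "more in the east, fewer in the west" and "tight inside, loose outside". In terms of global spatial distribution, spatial autocorrelation of sports venues and facilities has always been present, but the distribution patterns for number and area are not entirely alike across stages. In terms of local spatial distribution, the hot-spot clustering range for the number of venues "increased year by year", with hot spots mainly in Jiangxi, Hunan, and other provinces in and near the Jiangsu-Zhejiang-Shanghai region, and cold spots in Qinghai, Sichuan, Gansu, and similar areas. The spatial distribution of venue numbers has consistently been "more in the east, fewer in the west", with little change in trend. The spatial layout of per capita sports venue area, however, is highly unstable, moving from "north higher than south" to "east better than west". Principal component analysis indicates that the spatial layout of sports venues and facilities results from many factors acting together, whose explanatory power ranks, from largest to smallest: policy environment, economic conditions, population development level, education funding, and living standards.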
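Keywords: sports venues and facilities; spatiotemporal evolution; influencing factors; exploratory spatial data analysis; principal component analysis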
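13. Spatial Morphological Characteristics of Villages in the Lower Yellow River Basin: A Case Study of the Dongping Lake to Luokou Section
Authors: FAN Dezheng (樊德正); KONG Yawei (孔亚暐); MA Yongdong (马永东). Urbanism and Architecture (城市建筑), 2024, No. 3, pp. 90-93 (4 pages)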
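Taking rural settlements along the Dongping Lake to Luokou section of the lower Yellow River as the research object, 106 typical villages within the basin were selected, and a combination of qualitative and quantitative analysis was used to study village spatial morphology under the influence of the Yellow River basin environment. Indicators affecting village spatial morphology were extracted, and exploratory data analysis was used to find associations among them in order to analyze the spatial structure and morphological characteristics of villages in the basin. The study finds that village boundaries correlate strongly with the road network, and that the course of the Yellow River, the distance between a village and the river, and the farmland, woodland, and ponds around a village all affect village form. The work offers theoretical and practical reference for the study, protection, and development of the spatial morphology of traditional villages along this section, and may inform studies of village spatial morphology in other river basins.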
Keywords: lower Yellow River; village space; morphological characteristics; exploratory data analysis
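14. Spatiotemporal Evolution of Fire-Economy-Environment Based on Coupling Coordination Analysis
Authors: XIANG Yue (向月); LUO Xin (骆鑫); QIN Yi (秦毅); QIAN Yinuo (钱一诺). China Safety Science Journal (中国安全科学学报) (indexed in CAS, CSCD, PKU Core), 2024, No. 2, pp. 103-109 (7 pages)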
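To explore the spatiotemporal evolution of fire, economy, and environment, a fire-economy-environment coupling coordination model was built from China's fire, economic, and environmental data for 2000-2019 to study the coupled development level of the three systems. Exploratory spatial data analysis (ESDA) was used to study the spatiotemporal evolution and clustering of each subsystem across China's 31 provinces, and Moran's I tests were used to analyze the spatial correlation of fire, economy, and environment. The results show that the comprehensive development of fire, economy, and environment was well coupled over 2000-2019. In the time dimension, the coupling degree and coordination degree both rose over 2000-2019; the coupling degree was affected by the comprehensive fire score, and the coordination degree by the economic subsystem. In the spatial dimension, economic development was the main factor affecting the coupling coordination degree of the 31 provinces. In terms of the spatiotemporal evolution of the coupling relationship, the provincial coupling degree became clustered over time, while the coupling coordination degree was spatially random. The differing effects of fire, economy, and environment on coupling coordination lead to different coordination types and spatial clustering states.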
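Keywords: fire-economy-environment; coupling coordination degree; spatiotemporal evolution; exploratory spatial data analysis (ESDA); entropy method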
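The abstract above does not state the formulas behind its coupling coordination model. The sketch below uses one commonly cited form as an assumption: coupling degree C = n·(∏Uᵢ)^(1/n) / ΣUᵢ, comprehensive index T = Σwᵢ·Uᵢ, and coupling coordination degree D = √(C·T). The subsystem scores and equal weights are hypothetical.

```python
import math

def coupling_coordination(scores, weights=None):
    """Return (C, T, D) for subsystem scores in (0, 1].

    Assumed form (not confirmed by the source):
      C = n * (prod U_i)^(1/n) / sum U_i   coupling degree
      T = sum w_i * U_i                    comprehensive development index
      D = sqrt(C * T)                      coupling coordination degree
    """
    n = len(scores)
    if weights is None:
        weights = [1.0 / n] * n  # equal weights, an assumption
    c = n * math.prod(scores) ** (1.0 / n) / sum(scores)
    t = sum(w * u for w, u in zip(weights, scores))
    return c, t, math.sqrt(c * t)

# Hypothetical fire, economy, and environment scores for one province and year
c, t, d = coupling_coordination([0.5, 0.5, 0.5])
```

With equal scores the coupling degree reaches its maximum of 1, so D depends only on the overall development level T; unequal scores pull C, and therefore D, downward.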
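15. Automatic Generation of Exploratory Data Sessions Based on Deep Reinforcement Learning
Author: WANG Yang (汪洋). Modern Information Technology (现代信息科技), 2024, No. 4, pp. 66-73, 78 (9 pages)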
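Exploratory data analysis (EDA) is a data analysis approach that aims to reveal the structure, patterns, and relationships in a dataset through visualization, summary statistics, and similar means. Analysts can interactively explore an unfamiliar dataset through a series of operations and provide users with preliminary insights. Deep reinforcement learning (DRL) has been shown to solve many otherwise intractable artificial intelligence challenges. Combining EDA with DRL, this paper proposes a system named AEDAS. The system models EDA as a control decision problem and uses a novel DRL architecture to automatically generate convincing exploratory sessions, presented in the form of EDA notebooks. Experiments show that the EDA notebooks generated by the system give users practical and effective preliminary insights.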
Keywords: exploratory data analysis; deep reinforcement learning framework; control problem; exploratory session; EDA notebook
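16. Spatiotemporal Pattern Evolution and Influencing Factors of the Carbon Footprint in the Chengdu-Chongqing Region
Authors: YING Yu (应钰); WU Jifeng (吴纪凤); CHENG Huaying (程华英). Heilongjiang Environmental Journal (黑龙江环境通报), 2024, No. 4, pp. 19-24 (6 pages)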
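The Chengdu-Chongqing region is an important economic area in western China, and how to control carbon emissions effectively while sustaining rapid development is a pressing problem. Taking the city-level administrative units of the region as research units, this paper measures and analyzes their carbon footprint from the perspective of the nature-society dual system, explores the spatiotemporal pattern evolution and influencing factors of the carbon footprint, and offers policy recommendations for achieving "carbon peaking". The main research content includes: 1) carbon footprint measurement from the dual perspective of natural ecosystems and socioeconomic systems; 2) comprehensive analysis of the spatiotemporal pattern evolution of the carbon footprint using exploratory spatio-temporal data analysis; 3) carbon footprint decomposition and influencing factor analysis based on an improved Kaya model. The paper aims to reveal the main causes of rising carbon footprint levels in the region, identify what is key to achieving "carbon peaking", and provide a scientific basis for regionally differentiated carbon reduction measures.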
Keywords: carbon footprint; nature-society dual system; exploratory spatio-temporal data analysis; Kaya model
17. Predicting the Subcellular Localization of Human Proteins Using Machine Learning and Exploratory Data Analysis (cited by: 1)
Authors: George K. Acquaah-Mensah; Sonia M. Leach; Chittibabu Guda. Genomics, Proteomics & Bioinformatics (indexed in SCIE, CAS, CSCD), 2006, No. 2, pp. 120-133 (14 pages)
Identifying the subcellular localization of proteins is particularly helpful in the functional annotation of gene products. In this study, we use Machine Learning and Exploratory Data Analysis (EDA) techniques to examine and characterize amino acid sequences of human proteins localized in nine cellular compartments. A dataset of 3,749 protein sequences representing human proteins was extracted from the SWISS-PROT database. Feature vectors were created to capture specific amino acid sequence characteristics. Relative to a Support Vector Machine, a Multi-layer Perceptron, and a Naive Bayes classifier, the C4.5 Decision Tree algorithm was the most consistent performer across all nine compartments in reliably predicting the subcellular localization of proteins based on their amino acid sequences (average Precision = 0.88; average Sensitivity = 0.86). Furthermore, EDA graphics characterized essential features of proteins in each compartment. As examples, proteins localized to the plasma membrane had higher proportions of hydrophobic amino acids; cytoplasmic proteins had higher proportions of neutral amino acids; and mitochondrial proteins had higher proportions of neutral amino acids and lower proportions of polar amino acids. These data showed that the C4.5 classifier and EDA tools can be effective for characterizing and predicting the subcellular localization of human proteins based on their amino acid sequences.
Keywords: subcellular localization; machine learning; exploratory data analysis; decision tree
18. A comprehensive framework for exploratory spatial data analysis: Moran location and variance scatterplots (cited by: 2)
Authors: J.G. Negreiros; M.T. Painho; F.J. Aguilar; M.A. Aguilar. International Journal of Digital Earth (indexed in SCIE), 2010, No. 2, pp. 157-186 (30 pages)
A significant Geographic Information Science (GIS) issue is closely related to spatial autocorrelation, a burning question in the phase of information extraction from the statistical analysis of georeferenced data. At present, spatial autocorrelation presents two types of measures: continuous and discrete. Is it possible to use Moran’s I and the Moran scatterplot with continuous data? Is it possible to use the same methodology with discrete data? A particular and cumbersome problem is the choice of the spatial-neighborhood matrix (W) for point data. This paper addresses these issues by introducing the concept of covariogram contiguity, where each weight is based on the variogram model for that particular dataset: (1) the variogram, whose range equals the distance with the highest Moran I value, defines the weights for points separated by less than the estimated range, and (2) weights equal zero for points separated by more than the variogram range considered. After the W matrix is computed, the Moran location scatterplot is created in an iterative process. In accordance with various lag distances, Moran’s I is presented as a good search factor for the optimal neighborhood area. Uncertainty/transition regions are also emphasized. At the same time, a new Exploratory Spatial Data Analysis (ESDA) tool is developed, the Moran variance scatterplot, since the conventional Moran scatterplot is not sensitive to neighbor variance. This computer-mapping framework allows the study of spatial patterns, outliers, changeover areas, and trends in an ESDA process. All these tools were implemented in a free web e-Learning program for quantitative geographers called SAKWeb (or, in the near future, myGeooffice.org).
Keywords: geocomputation; exploratory spatial data analysis; spatial autocorrelation; Moran scatterplot; Moran’s I; variography
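The global Moran's I statistic that the abstract above builds on can be sketched as I = (n / S₀) · Σᵢⱼ wᵢⱼ·zᵢ·zⱼ / Σᵢ zᵢ², where z is the deviation from the mean and S₀ is the sum of all weights. The example below assumes a simple binary adjacency matrix on four hypothetical points along a line; the variogram-based weights that the paper proposes are not reproduced.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n weight matrix.

    weights[i][j] is the spatial weight between locations i and j
    (here a hypothetical binary adjacency, not the paper's W matrix).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * (cross / sum(d * d for d in dev))

# Four points on a line; each point neighbours the adjacent ones
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
trend = morans_i([1, 2, 3, 4], W)          # smooth gradient: positive autocorrelation
alternating = morans_i([1, -1, 1, -1], W)  # checkerboard: negative autocorrelation
```

Swapping in a different `W`, for example weights that decay with distance up to a chosen range, changes the statistic, which is why the choice of neighborhood matrix is central to the paper.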
19. Geo-Data Science: Leveraging Geoscience Research with Geoinformatics, Semantics and Open Data
Author: MA Xiaogang. Acta Geologica Sinica (English Edition) (indexed in SCIE, CAS, CSCD), 2019, No. S01, pp. 44-47 (4 pages)
1 Key concepts underpinning geo-data science. Geoinformatics and Geomathematics. Computers have been used for data collection, management, analysis, and transmission in geoscience for about 70 years since the 1950s (Merriam, 2001; 2004). The term geoinformatics is widely used to describe such activities. In real-world practice, researchers in both geography and geoscience are using the term geoinformatics.
Keywords: geo-data science; cyberinfrastructure; data interoperability; exploratory data analysis
20. A comprehensive review of tools for exploratory analysis of tabular industrial datasets
Authors: Aindrila Ghosh; Mona Nashaat; James Miller; Shaikh Quader; Chad Marston. Visual Informatics (indexed in EI), 2018, No. 4, pp. 235-253 (19 pages)
Exploratory data analysis plays a major role in obtaining insights from data. Over the last two decades, researchers have proposed several visual data exploration tools that can assist with each step of the analysis process. Nevertheless, in recent years, data analysis requirements have changed significantly. With the constantly increasing size and variety of data to be analyzed, scalability and analysis duration are now among the primary concerns of researchers. Moreover, in order to minimize the analysis cost, businesses need data analysis tools that can be used with limited analytical knowledge. To address these challenges, traditional data exploration tools have evolved within the last few years. In this paper, with an in-depth analysis of an industrial tabular dataset, we identify a set of additional exploratory requirements for large datasets. We then present a comprehensive survey of the recent advancements in the emerging field of exploratory data analysis. We investigate 50 academic and non-academic visual data exploration tools with respect to their utility in the six fundamental steps of the exploratory data analysis process. We also examine the extent to which these modern data exploration tools fulfill the additional requirements for analyzing large datasets. Finally, we identify and present a set of research opportunities in the field of visual exploratory data analysis.
Keywords: exploratory data analysis; industrial tabular data; interactive visualization; systematic literature review; research opportunities