280 articles found
Effect Modeling of Count Data Using Logistic Regression with Qualitative Predictors
1
Authors: Haeil Ahn. Engineering (Scientific Research Publishing), 2014, No. 12, pp. 758-772 (15 pages)
We modeled binary count data with categorical predictors, using logistic regression to develop a statistical method. We found that ANOVA-type analyses often performed unsatisfactorily, even when using different transformations. The logistic transformation of fraction data could be an alternative, but it is not desirable in the statistical sense. We concluded that such methods are not appropriate, especially in cases where the fractions were close to 0 or 1. The major purpose of this paper is to demonstrate that logistic regression with an ANOVA-model-like parameterization aids our understanding and provides a somewhat different, but sound, statistical background. We examined a simple real-world example to show that we can efficiently test the significance of regression parameters, look for interactions, estimate related confidence intervals, and calculate the difference between the mean values of the referent and experimental subgroups. This paper demonstrates that precise confidence interval estimates can be obtained using the proposed ANOVA-model-like approach. The method discussed here can be extended to any type of experimental fraction data analysis, particularly for experimental design.
Keywords: logistic regression; logit; logistic response; categorical; binary; count data
Accuracy Assessment and Guidelines for Manual Traffic Counts from Pre-Recorded Video Data
2
Authors: Mishuk Majumder, Chester Wilmot. Journal of Transportation Technologies, 2023, No. 4, pp. 497-523 (27 pages)
Traffic count is the fundamental data source for transportation planning, management, design, and effectiveness evaluation. Recording traffic flow and counting from the recorded videos are increasingly used due to convenience, high accuracy, and cost-effectiveness. Manual counting from pre-recorded video footage can be prone to inconsistencies and errors, leading to inaccurate counts. In addition, there are no standard guidelines for collecting video data and conducting manual counts from the recorded videos. This paper aims to comprehensively assess the accuracy of manual counts from pre-recorded videos and introduces guidelines for efficiently collecting video data and conducting manual counts by trained individuals. The accuracy assessment of the manual counts was conducted based on repeated counts, and the guidelines were derived from the experience of conducting a traffic survey on forty strip mall access points in Baton Rouge, Louisiana, USA. The percentages of total error, classification error, and interval error were found to be 1.05 percent, 1.08 percent, and 1.29 percent, respectively. The corresponding percent root mean square errors (RMSE) were found to be 1.13 percent, 1.21 percent, and 1.48 percent, respectively. Guidelines were provided for selecting survey sites, instruments and timeframe, fieldwork, and manual counts for an efficient traffic data collection survey.
Keywords: traffic survey; counting error; transportation planning; total error; collecting video data; classification error; standard guidelines; repeated counts; interval error
Robust Estimation of Semiparametric Transformation Model for Panel Count Data (Cited by: 1)
3
Authors: FENG Yan, WANG Yijun, WANG Weiwei, CHEN Zhuo. Journal of Systems Science & Complexity (SCIE, EI, CSCD), 2021, No. 6, pp. 2334-2356 (23 pages)
Panel count data are frequently encountered when study subjects are under discrete observations. However, limited literature has been found on variable selection for panel count data. In this paper, without considering the model assumption of the observation process, a more general semiparametric transformation model for panel count data with an informative observation process is developed. A penalized estimation procedure based on the quantile regression function is proposed for simultaneous variable selection and parameter estimation. The consistency and oracle properties of the estimators are established under some mild conditions. Some simulations and an application are reported to evaluate the proposed approach.
Keywords: B-spline function; panel count data; quantile regression; semiparametric transformation model; variable selection
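Empirical Likelihood Inference for the Parameters of a Panel Count Data Model
4
Authors: 胡宏昌, 崔恒建. Journal of Applied Statistics and Management (《数理统计与管理》) (CSSCI, PKU Core), 2014, No. 4, pp. 647-654 (8 pages)
The treatment of panel count data has attracted increasing attention. Sun and Wei [1-2] proposed a regression analysis of panel count data based on a simple semiparametric model and gave estimating equations for the parameters. Based on the idea of empirical likelihood, this paper discusses the construction of confidence regions for the parameters of that panel count data model. In particular, an estimate of the variance matrix of the parameter estimator is obtained solely from the empirical likelihood confidence region, and the 1/n consistency of the estimator is proved. Using the data given by Sun and Wei, the concrete procedure and results of constructing the parameter confidence regions are presented. A graphical comparison shows that the empirical likelihood confidence region outperforms the confidence region constructed from asymptotic normality. We also estimate the variance matrix of the parameter estimator from the empirical likelihood confidence region, and the result is close to the matrix obtained from conventional asymptotic normality.
Keywords: panel count data; empirical likelihood; confidence region; covariance matrix estimation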
Bayesian Computation for the Parameters of a Zero-Inflated Cosine Geometric Distribution with Application to COVID-19 Pandemic Data
5
Authors: Sunisa Junnumtuam, Sa-Aat Niwitpong, Suparat Niwitpong. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, No. 5, pp. 1229-1254 (26 pages)
A new three-parameter discrete distribution called the zero-inflated cosine geometric (ZICG) distribution is proposed for the first time herein. It can be used to analyze over-dispersed count data with excess zeros. The basic statistical properties of the new distribution, such as the moment generating function, mean, and variance, are presented. Furthermore, confidence intervals are constructed by using the Wald, Bayesian, and highest posterior density (HPD) methods to estimate the true confidence intervals for the parameters of the ZICG distribution. Their efficacies were investigated by using both simulation and real-world data comprising the number of daily COVID-19 positive cases at the Olympic Games in Tokyo 2020. The results show that the HPD interval performed better than the other methods in terms of coverage probability and average length in most cases studied.
Keywords: Bayesian analysis; confidence interval; Gibbs sampling; random-walk Metropolis; zero-inflated count data
Some Additional Moment Conditions for a Dynamic Count Panel Data Model with Predetermined Explanatory Variables
6
Authors: Yoshitsugu Kitazawa. Open Journal of Statistics, 2013, No. 5, pp. 319-333 (15 pages)
This paper proposes some additional moment conditions for the linear feedback model with predetermined explanatory variables, which was proposed by [1] for the purpose of dealing with count panel data. The newly proposed moment conditions include those associated with equidispersion, the Negbin I-type model, and stationarity. GMM estimators are constructed incorporating the additional moment conditions. Some Monte Carlo experiments indicate that the GMM estimators incorporating the additional moment conditions perform well compared with those using only the conventional moment conditions proposed by [2,3].
Keywords: count panel data; linear feedback model; moment conditions; GMM; Monte Carlo experiments
Dynamically Computing Approximate Frequency Counts in Sliding Window over Data Stream (Cited by: 1)
7
Authors: NIE Guo-liang, LU Zheng-ding. Wuhan University Journal of Natural Sciences (EI, CAS), 2006, No. 1, pp. 283-288 (6 pages)
This paper presents two one-pass algorithms for dynamically computing frequency counts in a sliding window over a data stream, that is, computing the frequency counts that exceed a user-specified threshold ε. The first algorithm constructs sub-windows and periodically deletes expired sub-windows in the sliding window, and each sub-window maintains a summary data structure. The first algorithm outputs at most 1/ε + 1 elements for frequency queries over the most recent N elements. The second algorithm adopts a multi-level method to deal with the data stream. Once the sketch of the most recent N elements has been constructed, the second algorithm can provide the answers to frequency queries over the most recent n (n ≤ N) elements. The second algorithm outputs at most 1/ε + 2 elements. The analytical and experimental results show that the algorithms are accurate and effective.
Keywords: data stream; sliding window; approximation algorithm; frequency counts
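The sub-window idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it keeps an exact counter per sub-window instead of a compact summary, so it does not reproduce the 1/ε + 1 output bound. The class name SubWindowCounter and the parameters window and sub_size are hypothetical, and sub_size is assumed to be no larger than window.

```python
from collections import Counter, deque


class SubWindowCounter:
    """Frequency counts over roughly the most recent `window` items.

    Illustrative only: items are grouped into fixed-size sub-windows and a
    whole sub-window is dropped once the covered span would exceed `window`,
    so the covered span can fall short of `window` by up to one sub-window.
    """

    def __init__(self, window, sub_size):
        self.window = window
        self.sub_size = sub_size
        self.subs = deque()        # closed sub-windows, oldest first
        self.current = Counter()   # sub-window still being filled
        self.filled = 0            # number of items in `current`

    def add(self, item):
        self.current[item] += 1
        self.filled += 1
        # Expire the oldest sub-window if we now cover more than `window` items.
        if self.subs and len(self.subs) * self.sub_size + self.filled > self.window:
            self.subs.popleft()
        # Close the current sub-window once it is full.
        if self.filled == self.sub_size:
            self.subs.append(self.current)
            self.current = Counter()
            self.filled = 0

    def frequent(self, eps):
        """Return items whose count exceeds eps times the covered span."""
        total = Counter(self.current)
        for sub in self.subs:
            total.update(sub)
        n = sum(total.values())
        return {x: c for x, c in total.items() if c > eps * n}


counter = SubWindowCounter(window=4, sub_size=2)
for item in ["a", "a", "b", "b", "c", "c"]:
    counter.add(item)
result = counter.frequent(0.4)
```

Dropping whole sub-windows keeps expiry cheap at the cost of a slightly fuzzy window boundary; a production version would replace each exact counter with a bounded summary.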
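A Study of Forest Tree Mortality Based on Generalized Linear Mixed-Effects Models
8
Authors: 闫明, 陈艳梅, 闫静, 奚为民. Acta Ecologica Sinica (《生态学报》) (CAS, CSCD, PKU Core), 2024, No. 6, pp. 2420-2436 (17 pages)
Using count-model methods and accounting for plot-level random effects, we built a stand-level mortality model to explore the factors that influence tree mortality, with the aim of providing a reference for monitoring and managing forest resources. Plot data from the continuous forest inventory of East Texas, USA, served as the data source and were randomly split 4:1 into training and validation sets. Site, stand, and climate factors were the independent variables and the number of dead trees was the dependent variable; count models and mixed-effects models were used to build the models and to analyze the factors affecting the number of dead trees. Goodness of fit was compared using three criteria, the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and -2 times the log-likelihood (-2logL); predictive performance was assessed using the mean absolute error (MAE) and the root mean square error (RMSE), in order to select the best stand-level mortality model. The results were as follows. Among site factors, the number of dead trees showed a significant negative effect of elevation (P<0.01) and a significant positive effect of slope (P<0.05); that is, it decreased with increasing elevation and increased with increasing slope. Among stand factors, it showed significant positive effects of stand age (P<0.001) and tree basal area (P<0.001) and significant negative effects of stand quadratic mean diameter at breast height (P<0.001) and stand density (P<0.05); that is, it increased with stand age and basal area and decreased with increasing quadratic mean diameter and stand density. Among climate factors, it showed significant negative effects of SPEI (P<0.05), drought length (P<0.001), mean annual temperature (P<0.001), and mean summer rainfall (P<0.05), and a significant positive effect of mean summer temperature (P<0.001); that is, it increased with drought intensity and mean summer temperature and decreased with increasing drought length, mean annual temperature, and mean summer rainfall. Among the basic count models, the zero-inflated negative binomial (ZINB) model fit best, and adding plot random effects clearly improved the fitting accuracy of the mixed-effects models. Comparing the results of all models, the ZINB-mixed model was the best stand-level mortality model for the forests of East Texas.
Keywords: tree mortality; count model; mixed-effects model; influencing factors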
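A Real-Time Denoising Method for Spaceborne Photon-Counting Laser Ranging Lidar
9
Authors: 谭崇涛, 于文博, 向雨琰, 李少辉, 余婧, 王倩莹, 李松. Journal of Infrared and Millimeter Waves (《红外与毫米波学报》) (SCIE, EI, CAS, CSCD, PKU Core), 2024, No. 2, pp. 242-253 (12 pages)
Spaceborne photon-counting laser ranging lidar systems offer notable advantages such as high repetition frequency and high precision, but they also produce large volumes of raw data in which noise makes up too large a share. To match the transmission capacity of the on-board data channel, the raw data volume must be compressed while preserving the recall of signal photons, so a real-time denoising algorithm implemented mainly in hardware is needed. This paper proposes a fast coarse-to-fine denoising algorithm. Coarse denoising first removes part of the noise photons using the laser's emitted pulse width, the system noise rate, the target characteristics, and the local density of received photon events. Fine denoising then applies histogram statistics to the retained photon events to determine the signal photon interval and the final signal photons with their time information. The algorithm was validated with Monte Carlo simulation and ICESat-2 measured data. The tests show recall above 94%, precision above 93%, and a harmonic mean above 94%, with running efficiency improved by 10%. The algorithm achieves fast real-time denoising of photon events and provides a theoretical basis for real-time on-board hardware denoising.
Keywords: photon counting; laser ranging; coarse-to-fine denoising; data density; histogram statistics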
Using Statistical Learning to Treat Missing Data: A Case of HIV/TB Co-Infection in Kenya
10
Authors: Joshua O. Mwaro, Linda Chaba, Collins Odhiambo. Journal of Data Analysis and Information Processing, 2020, No. 3, pp. 110-133 (24 pages)
In this study, we investigate the effects of missing data when estimating HIV/TB co-infection. We revisit the concept of missing data and examine three available approaches for dealing with missingness. The main objective is to identify the best method for correcting missing data in the TB/HIV co-infection setting. We employ both empirical data analysis and an extensive simulation study to examine the effects of missing data and the accuracy, sensitivity, specificity, and train and test error of the different approaches. The novelty of this work hinges on the use of a modern statistical learning algorithm when treating missingness. In the empirical analysis, both HIV data and TB-HIV co-infection data imputations were performed, and the missing values were imputed using different approaches. In the simulation study, sets of 0% (complete case), 10%, 30%, 50% and 80% of the data were drawn randomly and replaced with missing values. Results show that complete cases only had a co-infection rate (95% confidence interval band) of 29% (25%, 33%), the weighted method 27% (23%, 31%), the likelihood-based approach 26% (24%, 28%), and the multiple imputation (MI) approach 21% (20%, 22%). In conclusion, MI remains the best approach for dealing with missing data, and failure to apply it results in overestimation of the HIV/TB co-infection rate by 8%.
Keywords: missing data; HIV/TB co-infection; imputation; missing at random; count data
Challenges Analyzing RNA-Seq Gene Expression Data
11
Authors: Liliana López-Kleine, Cristian González-Prieto. Open Journal of Statistics, 2016, No. 4, pp. 628-636 (9 pages)
The analysis of messenger ribonucleic acid data obtained through sequencing techniques (RNA-sequencing) is very challenging. Once technical difficulties have been sorted out, an important choice has to be made during pre-processing between two paths: transform RNA-sequencing count data to a continuous variable, or continue to work with count data. For each data type, analysis tools have been developed and seem appropriate at first sight, but a deeper analysis of data distribution and structure is worth discussing. In this review, open questions regarding the nature of RNA-sequencing data are discussed and highlighted, indicating important future research topics in statistics that should be addressed for a better analysis of already available and newly appearing gene expression data. Moreover, a comparative analysis of RNA-seq count and transformed data is presented. This comparison indicates that transforming RNA-seq count data seems appropriate, at least for differential expression detection.
Keywords: RNA-Seq analysis; count data; preprocessing; differential expression; gene co-expression network
Modelling Fertility: An Application of Count Regression Models
12
Authors: Ranjita Pandey, Charanjit Kaur. Chinese Journal of Population, Resources and Environment, 2015, No. 4, pp. 349-357 (9 pages)
Lifecycle data often occur as counts of vital events and are recorded as integers. The purpose of this article is to model fertility behavior based on religious, educational, economic, and occupational characteristics. The responses of groups classified according to these determinants are examined for significant influence on fertility using a Poisson regression model (PRM) based on the National Family Health Survey-3 dataset. The observed and predicted probabilities under PRM indicate a modal value of two children for the Poisson distribution modeled data. The dominance of two children in the data motivates the authors to adopt a multinomial regression model (MRM) in order to link fertility with the various socioeconomic indicators responsible for fertility variation. The choice of explanatory factors is limited by the availability of data. Trends and patterns of preference for birth counts suggest that religion, caste, wealth, female education, and occupation are the dominant factors shaping the observed birth process. Empirical analysis suggests that both models used in the study perform similarly on the sample data. However, fitting the MRM with a birth count of two as the comparison category shows improved Akaike information criterion and consistent Akaike information criterion values. The current work contributes to the existing literature by attempting to provide more insight into the determinants of Indian fertility using Poisson and MRM.
Keywords: count data; fertility; Poisson model; multinomial regression models
Determinants of Antenatal Health Care Utilization in Egypt (2000-2014) Using Binary and Count Outcomes
13
Authors: Hassan H. M. Zaky, Dina M. Armanious, Mohamed Ali Hussein. Health, 2019, No. 1, pp. 25-39 (15 pages)
Aim: This study seeks to investigate the factors determining the utilization of antenatal care services, the frequency of that use, and the timing of receiving antenatal care among Egyptian women, using nationally representative data from the Egypt Demographic and Health Surveys (EDHS) in 2000 and 2014. Methods: The paper estimates the logistic regression model, the zero-inflated negative binomial model (ZINB), and the negative binomial regression model (NB) to identify the most important determinants of antenatal health care utilization. Results: The findings indicate that the period 2000-2014 experienced a significant increase in the use of antenatal health care services. The use of public sector antenatal care services relative to that of the private sector has been decreasing over time. Moreover, wealth index, women's education, and quality of health services play significant roles in increasing accessibility of antenatal health care services. On the other hand, women's empowerment showed a positive effect in 2000 only. Conclusion: The study highlights the most vulnerable groups that are less likely to have access to antenatal health care services, mainly women who are less educated, poor, and living in rural areas, especially Upper Egypt. This certainly requires a more targeted health strategy with an equity lens.
Keywords: antenatal health care services; binary and count data; negative binomial regression; determinants; Egypt
Comparative Assessment of Zero-Inflated Models with Application to HIV Exposed Infants Data
14
Authors: Faith Nekesa, Collins Odhiambo, Linda Chaba. Open Journal of Statistics, 2019, No. 6, pp. 664-685 (22 pages)
In a typical Kenyan HIV clinical setting, there is a likelihood of registering many zeros during the routine monthly data collection of new HIV infections among HIV exposed infants (HEI). This is attributed to the implementation of the prevention of mother-to-child transmission (PMTCT) policies. However, even though the PMTCT policy is implemented uniformly across all public health facilities, implementation naturally differs from facility to facility due to differences in health systems and infrastructure. This leads to structured zeros among reported positive HEI (where PMTCT implementation is optimum) and non-structured zeros among reported positive HEI (where PMTCT implementation is not optimum). Hence the classical zero-inflated and hurdle models that do not account for the abundance of structured and non-structured zeros in the data can give misleading results. The purpose of this study is to systematically compare the performance of the various zero-inflated models with an application to HIV exposed infants (HEI) in the context of structured and non-structured zeros. We revisit zero-inflated models, hurdle models, and Poisson and negative binomial count models, and conduct simulations by varying the sample size and the level of zero abundance. Results from the simulation study and the real data analysis of exposed infant diagnosis show the negative binomial emerging as the best-performing model when fitting data with both structured and non-structured zeros under various settings.
Keywords: zero-inflated models; HIV exposed infants; structured zeroes; mother-to-child transmission; count data
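To make the distinction between structured and sampling zeros concrete, here is a minimal Python sketch of the standard zero-inflated Poisson probability mass function, in which a zero arises either from a structural source with probability pi or from the Poisson count itself. The function name zip_pmf and the symbols lam and pi are illustrative assumptions, not notation taken from the paper, which also compares hurdle and negative binomial variants.

```python
import math


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson pmf.

    lam: mean of the Poisson component (assumed > 0)
    pi:  probability of a structural zero (assumed in [0, 1))
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero comes from either the structural source or the Poisson part.
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson


# Illustrative values only: 30% structural zeros, Poisson mean of 2.
probs = [zip_pmf(k, lam=2.0, pi=0.3) for k in range(60)]
total_mass = sum(probs)
mean = sum(k * p for k, p in enumerate(probs))
```

The mean of this distribution is (1 - pi) * lam, which is lower than the Poisson mean, while the probability of a zero is higher than the plain Poisson value exp(-lam).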
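The Impact of the Ten-Thousand Enterprises Energy Conservation and Emission Reduction Policy on Corporate Green Technology Innovation and Its Underlying Mechanism (Cited by: 3)
15
Authors: 赖小东, 詹伟灵. China Population, Resources and Environment (《中国人口·资源与环境》) (CSCD, PKU Core), 2023, No. 4, pp. 104-114 (11 pages)
The Ten-Thousand Enterprises energy conservation and emission reduction policy was an important support and guarantee for meeting the binding energy consumption and carbon reduction targets of the 12th Five-Year Plan, and it is a large-scale energy conservation and emission reduction policy with Chinese characteristics. Green technology innovation is an important driver of low-carbon development, and enterprises are its main actors. Treating the Ten-Thousand Enterprises policy as an exogenous shock, this paper uses green patent panel data for Chinese A-share listed companies from 2004 to 2018 and applies difference-in-differences and count data models to study, at the micro level, the policy's effect on corporate green technology innovation. It further uses a mediation effect model to test the mediating role of local government behavior and a triple-difference model to analyze heterogeneity. The results show that: (1) the policy significantly raised firms' level of green technology innovation; (2) allowing for overdispersion and zero inflation in the data, regressions with count data models further support this conclusion; (3) the conclusion still holds after a series of robustness checks, including alternative variable measures, placebo tests, and post-matching regressions; (4) the binding performance assessment attached to the policy influenced local governments' subsidies to policy-covered firms, which in turn promoted green technology innovation; (5) the effect is dynamic over time, marginal in innovation level, and heterogeneous in ownership: it is significant during the policy period and insignificant after the policy ended; it is more pronounced in regions with a higher city innovation index than in those with a lower one; and it is more pronounced in state-owned enterprises than in private ones. Accordingly, policy design should be optimized to actively guide firms to raise their level of green technology innovation, the green-development-oriented assessment system for local governments should be improved, and environmental regulation and technology innovation incentive policies should be tailored to the firm.
Keywords: Ten-Thousand Enterprises policy; green technology innovation; difference-in-differences; count data model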
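RTDMiner: A Data-Mining-Based Method for Detecting Reference Count Update Bugs
16
Authors: 边攀, 梁彬, 黄建军, 游伟, 石文昌, 张健. Journal of Software (《软件学报》) (EI, CSCD, PKU Core), 2023, No. 10, pp. 4724-4742 (19 pages)
Reference counts are widely used to manage shared resources in large low-level systems such as the Linux kernel. A reference count must match the number of objects that reference the resource; otherwise an improper reference count update bug may arise, so that the resource is never released or is released prematurely. To detect such bugs, existing static detection methods usually need to know which functions increment reference counts and which decrement them, but obtaining this prior knowledge manually is time-consuming and may miss cases. Mining-based bug detection methods reduce the reliance on prior knowledge but struggle to detect path-sensitive bugs such as improper reference count updates. This paper therefore proposes RTDMiner, a method for detecting improper reference count update bugs that tightly integrates data mining with static analysis. First, following general regularities of reference counting, data mining is used to automatically identify, from large-scale code, the functions that increment or decrement reference counts. Then a path-sensitive static analysis detects buggy paths on which a reference count is incremented but not decremented. To reduce false positives, data mining is used again in the detection stage to identify exception patterns. Experiments on the Linux kernel show that the method automatically identifies functions that increment or decrement reference counts with an accuracy of nearly 90%, and that 24 of the top 50 suspected bugs reported by RTDMiner have been confirmed as real bugs by kernel maintainers.
Keywords: reference counting; bug detection; data mining; static analysis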
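A Study of the Reference Blow Count in the Standard-Penetration-Test-Based Soil Liquefaction Evaluation Formula
17
Authors: 贾端阳, 陈龙伟, 谢旺青, 李鑫洋. Rock and Soil Mechanics (《岩土力学》) (EI, CAS, CSCD, PKU Core), 2023, No. 10, pp. 3031-3038 (8 pages)
The sand liquefaction evaluation formula in China's seismic design code, based on standard penetration test (SPT) blow counts, was proposed by Chinese scientists; it suits national conditions, has Chinese characteristics, and is the most widely used and most authoritative liquefaction evaluation formula in engineering practice in China. Its basic principle is to obtain the critical blow count by correcting a reference SPT blow count for the site groundwater level and the burial depth of the saturated sand layer. In this method the reference SPT blow count depends on liquefaction data, yet the database used to build the code formula comes mainly from post-earthquake investigation and test data and damage experience from several earthquakes in China in the 1960s and 1970s, and it has never been systematically updated. By collecting, collating, and incorporating liquefaction data from recent earthquakes in China, we have greatly expanded China's liquefaction database. Following the theoretical framework of the code method, we derive reference SPT blow counts for different seismic intensities through data analysis and construct a new liquefaction evaluation formula. Back-checking the liquefaction data with the new formula shows that it discriminates liquefied from non-liquefied cases well, with a high success rate that is balanced between the two. The results can support improvement of the liquefaction evaluation method in China's code.
Keywords: standard penetration test; liquefaction evaluation formula; reference blow count; liquefaction data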
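Medication Patterns of Professor Wang Xiaomin (王笑民) in Treating Non-Small Cell Lung Cancer Based on Data Mining (Cited by: 1)
18
Authors: 丁彤晶, 念家云, 张佳慧, 陈宇晗, 王秀慧, 于明薇, 王笑民. World Chinese Medicine (《世界中医药》) (CAS), 2023, No. 17, pp. 2524-2530 (7 pages)
Objective: To explore, using data mining, the medication patterns of Professor Wang Xiaomin in treating non-small cell lung cancer (NSCLC). Methods: Medical records of Professor Wang Xiaomin's traditional Chinese medicine (TCM) treatment of NSCLC were collected into a database; Excel, SPSS Statistics 25.0, and SPSS Modeler 18.0 were used for frequency, cluster, and association rule analyses to summarize the medication patterns. Results: The study included 168 cases (82 male, 86 female) aged 33 to 83 years. Common symptoms were cough, expectoration, and poor sleep; the most frequent TCM syndrome patterns were stasis-toxin (瘀毒), hyperactivity of ministerial fire (相火妄动), and spleen deficiency (脾虚). A total of 424 herbal prescriptions involving 181 herbs were included, 29 of them high-frequency herbs. The herbs mainly entered the lung, liver, and spleen meridians; their nature was mainly cold, slightly cold, or warm; and their flavor was mostly sweet, bitter, or pungent. Cluster analysis yielded 5 herb groups. Association rule analysis yielded 28 rules; 白花蛇舌草-龙葵-白英-半枝莲-车前子, 柴胡-法半夏-白芍, and 生黄芪-防风-炒白术 were often used together. Symptom-syndrome-herb association analysis showed that the stasis-toxin pattern was the most common in NSCLC patients, for which 半枝莲, 白花蛇舌草, 白英, 龙葵, and 车前子 were commonly used. Conclusion: In treating NSCLC, Professor Wang Xiaomin focuses on the lung, liver, and spleen, treating by clearing the lung and resolving toxin, activating blood and relieving stagnation, and soothing the liver and invigorating the spleen, while clarifying the relationship between pathogenic factors and healthy qi and the stage of disease in order to choose suitable herbs.
Keywords: non-small cell lung cancer; traditional Chinese medicine; data mining; frequency statistics; cluster analysis; association rules; medication patterns; @王笑民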
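Analysis of Factors Influencing the Monthly Distribution of Fire Alarm Counts in a Megacity
19
Authors: 陈永胜, 钱顾荣, 施楠, 钟兆宁. Fire Science and Technology (《消防科学与技术》) (CAS, PKU Core), 2023, No. 4, pp. 571-574 (4 pages)
Based on monthly fire alarm data for a megacity, combined with climate, economic, and other related data, a model of the factors influencing the monthly distribution of fire alarm counts is built, and negative binomial regression for count data is used to analyze how well climate and economic variables explain fire alarm occurrence. The results show that, controlling for calendar factors, the combination of number of rainy days and gross industrial output explains monthly fire alarm counts well, whereas temperature, electricity consumption, and similar data are of low importance. Holding other variables constant, each additional rainy day above the monthly average reduces the number of fire alarms by 2.1%; each additional 100 million yuan of gross industrial output above the monthly average increases it by 0.033%; and each additional holiday above the monthly average increases it by 2.4%.
Keywords: number of fire alarms; monthly distribution; economic factors; climatic factors; negative binomial regression; count data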
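The percentage effects quoted in this abstract can be related to regression coefficients with a short Python sketch. It assumes the usual log link for negative binomial regression, under which a coefficient beta corresponds to a multiplicative change of exp(beta) in the expected count per unit of the predictor; the abstract does not state the link function, so this is an assumption, and the function names are hypothetical.

```python
import math


def pct_change_to_coef(pct):
    """Coefficient implied by a percent change per unit, under a log link."""
    return math.log(1 + pct / 100)


def coef_to_pct_change(beta, delta=1.0):
    """Percent change in the expected count for a change of `delta` units."""
    return (math.exp(beta * delta) - 1) * 100


# Per-unit effects reported in the abstract.
beta_rain = pct_change_to_coef(-2.1)      # one additional rainy day
beta_holiday = pct_change_to_coef(2.4)    # one additional holiday

# Effects compound multiplicatively rather than adding up linearly.
five_rainy_days = coef_to_pct_change(beta_rain, delta=5)
```

Under this assumption, five additional rainy days imply a reduction of a little over 10%, slightly less than five times 2.1%, because the effect compounds on the count scale.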
Empirical Bayesian Approach to Testing Homogeneity of Several Means of Inflated Poisson Distributions (IPD)
20
Authors: Mohamed M. Shoukri, Maha Aleid. Open Journal of Statistics, 2023, No. 3, pp. 285-299 (15 pages)
Objectives: We introduce a special form of the Generalized Poisson Distribution. The distribution has one parameter, yet it has a variance that is larger than the mean, a phenomenon known as "overdispersion". We discuss potential applications of the distribution as a model of counts, and under the assumption of independence we perform statistical inference on the ratio of two means, with generalization to testing the homogeneity of several means. Methods: Bayesian methods depend on the choice of the prior distributions of the population parameters. In this paper, we describe a Bayesian approach for estimation and inference on the parameters of several independent Inflated Poisson distributions (IPD) with two possible priors: the first is the reciprocal of the square root of the Poisson parameter and the other is a conjugate Gamma prior. The parameters of the Gamma distribution are estimated in the empirical Bayesian framework using the maximum likelihood (ML) solution obtained with the nonlinear mixed model procedure (NLMIXED) in SAS. With these priors we construct the highest posterior confidence intervals on the ratio of two IPD parameters and test the homogeneity of several populations. Results: We encountered a convergence problem in estimating the hyperparameters of the posterior distribution using NLMIXED. However, direct maximization of the predictive density produced solutions to the maximum likelihood equations. We apply the methodologies to RNA-Seq read count data of gene expression values.
Keywords: distributions of over-dispersed counts; Lagrange class of distributions; knowledge transfer; Gamma prior; posterior inference; Wilson-Hilferty transformation; RNA-Seq read counts data