If a Statistician Says Your Data Are Not Normally Distributed...

Administrator
Published 2025-03-26

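From September 2008 through 2015, the British Medical Journal (BMJ) ran an uninterrupted series of more than 300 "Statistical Question" columns written by two experts in epidemiology and statistics. In each installment the two authors posed a multiple-choice statistics question, gave the answer, and explained it. I have selected 300 of these Statistical Questions and prepared Chinese versions of them; interested readers are invited to try answering.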

Statistical Question (18): Non-Normal Distributions with Fat Tails

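If a statistician tells you that your data are not normally distributed and that the distribution has "fat tails," which of the following statements, if any, is true?

A. Extreme values may occur more frequently than in a normal distribution

B. 95% confidence intervals cannot be computed

C. Parametric methods cannot be used for the statistical analysis

D. 95% confidence intervals will be narrower than under a normal distribution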

Question

If a statistician tells you that your data are not normally distributed and the distribution has “fat tails,” which if any of the following will be true?

A. Extreme values may be more frequent than in a normal distribution

B. 95% confidence intervals cannot be computed

C. Parametric statistics cannot be used

D.95% confidence intervals will be narrower than those suggested by a normal distribution.

Answer

The tails of a distribution are the regions furthest from the middle or main body of the distribution. Many things measured in medicine do look like normal distributions, but not all of them do. When extreme values happen more often than expected in a normal distribution this is described as a distribution with fat tails. As more observations lie outside the 95% confidence limits of a normal distribution, the true 95% confidence interval for a fat tailed distribution will be wider than expected on the basis of a normal distribution.

The distribution of stock market values has distinctly fat tails. More large falls and gains occur than expected. In general, when multiple correlated factors influence an outcome then extreme values are more likely to occur. Cardiovascular risk can follow this pattern.

Lack of conformity to a normal distribution does not mean one has to immediately give up all hope of using parametric statistics. A transformation using logarithms, square root or other suitable procedure may produce a normal fit. Cardiovascular risk is not always normally distributed and the Framingham model uses a different statistical parameter (the Weibull distribution) as the basis for its predictions. So long as a suitable statistical distribution can be found, 95% confidence intervals may still be computed.

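Explanation (from the Chinese version):

The tails of a distribution are the regions farthest from its middle, or main body. Many quantities measured in medicine do look normally distributed, though not all of them. When extreme values occur more often than a normal distribution predicts, the distribution is described as having fat tails. Because more observations fall outside the 95% confidence limits of a normal distribution, the true 95% confidence interval for a fat-tailed distribution is wider than the one expected under normality.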
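The widening described above can be checked with a small calculation. The sketch below is only an illustration under stated assumptions: the fat-tailed distribution is a hypothetical mixture (90% of values from N(0, 1), 10% from a wider N(0, 3²)) that is not taken from the article, the reference normal is the one that fits the main body of the data, and "95% limits" means the limits that enclose the central 95% of observations. The names `normal_tail`, `mixture_tail`, and `mixture_limit` are made up for this example.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

# Hypothetical fat-tailed model (an assumption, not from the article):
# 90% of values come from N(0, 1) and 10% from a wider N(0, 3^2).
WIDE_WEIGHT = 0.10
WIDE_SD = 3.0


def normal_tail(limit: float, sd: float = 1.0) -> float:
    """Return P(|X| > limit) for X ~ N(0, sd^2)."""
    return 2.0 * (1.0 - STD_NORMAL.cdf(limit / sd))


def mixture_tail(limit: float) -> float:
    """Return P(|X| > limit) under the two-component mixture."""
    return ((1.0 - WIDE_WEIGHT) * normal_tail(limit)
            + WIDE_WEIGHT * normal_tail(limit, WIDE_SD))


def mixture_limit(tail_prob: float = 0.05) -> float:
    """Find the limit whose two-sided mixture tail equals tail_prob (bisection)."""
    lo, hi = 0.0, 50.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if mixture_tail(mid) > tail_prob:
            lo = mid  # too much mass outside: the limit must move outward
        else:
            hi = mid
    return (lo + hi) / 2.0


normal_limit = STD_NORMAL.inv_cdf(0.975)  # about 1.96
print(f"normal 95% limit: +/-{normal_limit:.2f}")
print(f"mixture mass outside that limit: {mixture_tail(normal_limit):.3f}")
print(f"mixture 95% limit: +/-{mixture_limit():.2f}")
```

Under these assumptions, more than 5% of the mixture lies beyond the normal limits, so the limits that truly enclose 95% of the values have to sit further out. If the reference normal were instead matched to the mixture's overall standard deviation, the excess would show up mainly at more extreme cut-offs, so the comparison depends on how the normal reference is chosen.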

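The distribution of stock market values has distinctly fat tails: large falls and large gains both occur more often than expected. In general, when several correlated factors jointly influence an outcome, extreme values become more likely. Cardiovascular risk can follow this pattern.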
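The point about correlated factors can be illustrated with a toy calculation. This is a sketch under invented assumptions, not a model from the article: ten binary risk factors are either independent (each present with probability 0.5) or tied together by a hypothetical shared switch that sets every factor's probability to 0.8 or 0.2 with equal chance, which keeps each factor's marginal probability at 0.5. An "extreme" outcome is defined here as at most one or at least nine factors present.

```python
from math import comb

N_FACTORS = 10  # hypothetical number of binary risk factors


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def extreme_prob(p: float, n: int = N_FACTORS) -> float:
    """P(total <= 1 or total >= n - 1) when each factor has probability p."""
    return sum(binom_pmf(k, n, p) for k in (0, 1, n - 1, n))


# Independent factors: each present with probability 0.5.
independent = extreme_prob(0.5)

# Correlated factors: a shared latent switch (an assumption) sets every
# factor's probability to 0.8 or 0.2 with equal chance, so each factor is
# still present with marginal probability 0.5.
correlated = 0.5 * extreme_prob(0.8) + 0.5 * extreme_prob(0.2)

print(f"independent factors, P(extreme total): {independent:.4f}")
print(f"correlated factors,  P(extreme total): {correlated:.4f}")
```

Each factor is equally common in both scenarios, yet extreme totals are far more likely once the factors move together.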

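A lack of fit to the normal distribution does not mean giving up all hope of using parametric statistics. Transforming the data with logarithms, square roots, or another suitable method may yield approximately normal data. Cardiovascular risk is not always normally distributed, and the Framingham cardiovascular risk model bases its predictions on a different distribution (the Weibull distribution). As long as a suitable statistical distribution can be found, 95% confidence intervals can still be computed.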
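As a rough illustration of the transformation step, the sketch below generates hypothetical right-skewed measurements (log-normal by construction, so the logarithm is known in advance to work), compares the skewness before and after a log transform, and then computes a normal-theory 95% confidence interval on the log scale, back-transformed to the original scale. The data, seed, and parameter values are assumptions made for the example; real data would need the transformation checked, not assumed.

```python
import math
import random
import statistics


def skewness(values):
    """Sample skewness: the mean cubed z-score (near 0 for a symmetric sample)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])


random.seed(0)  # fixed seed so the sketch is reproducible
# Hypothetical right-skewed measurements (log-normal by construction).
raw = [random.lognormvariate(1.0, 0.8) for _ in range(5000)]
logged = [math.log(v) for v in raw]

# Normal-theory 95% CI for the mean on the log scale, back-transformed
# to give an interval for the geometric mean on the original scale.
z = statistics.NormalDist().inv_cdf(0.975)
log_mean = statistics.fmean(logged)
log_se = statistics.stdev(logged) / math.sqrt(len(logged))
ci = (math.exp(log_mean - z * log_se), math.exp(log_mean + z * log_se))

print(f"skewness raw: {skewness(raw):.2f}, after log: {skewness(logged):.2f}")
print(f"95% CI for the geometric mean: {ci[0]:.2f} to {ci[1]:.2f}")
```

Note that the back-transformed interval describes the geometric mean, not the arithmetic mean, of the original measurements.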

So the answer is A.

