PMF

Suppose that a sample of size $n$ is to be chosen randomly (without replacement) from an urn containing $N$ balls, of which $m$ are white and $N-m$ are black. If we let $X$ denote the number of white balls selected, then $$f(x; N, m, n) = \Pr(X = x) = {{m\choose x}{N-m\choose n-x}\over {N\choose n}}$$ for $x= 0, 1, 2, \cdots, n$.
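
As a quick check of the formula, the PMF can be written directly with `choose()` and compared with R's built-in `dhyper()`. This is only a sketch: the helper name `hyper_pmf` and the parameter values are made up for illustration. Note that `dhyper(x, m, n, k)` expects the number of white balls, the number of black balls, and the sample size, so the black-ball count `N - m` is passed as the third argument.

```r
# PMF of the hypergeometric distribution in the notation of this post:
# N balls in total, m white, sample size n, x white balls in the sample.
# (hyper_pmf is a hypothetical helper, not part of base R.)
hyper_pmf <- function(x, N, m, n) {
  choose(m, x) * choose(N - m, n - x) / choose(N, n)
}

# Illustrative (assumed) parameters
N <- 20; m <- 8; n <- 5
x <- 0:n

p_formula <- hyper_pmf(x, N, m, n)
p_builtin <- dhyper(x, m, N - m, n)  # dhyper(x, white, black, draws)

all.equal(p_formula, p_builtin)
```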

Proof:

To see that these probabilities sum to 1, we use Vandermonde's identity: $${m+n\choose r} = \sum_{k=0}^{r}{m\choose k}{n\choose r-k}$$ where $m$, $n$, $k$, $r\in \mathbb{N}_0$ (in this identity $m$ and $n$ denote arbitrary non-negative integers, not the urn parameters). The identity follows by comparing the coefficients of $x^r$ in $$ \begin{align*} \sum_{r=0}^{m+n}{m+n\choose r}x^r &= (1+x)^{m+n} \quad\quad\quad\quad\quad\quad\quad\quad \mbox{(binomial theorem)}\\ &= (1+x)^m(1+x)^n\\ &= \left(\sum_{i=0}^{m}{m\choose i}x^{i}\right)\left(\sum_{j=0}^{n}{n\choose j}x^{j}\right)\\ &= \sum_{r=0}^{m+n}\left(\sum_{k=0}^{r}{m\choose k}{n\choose r-k}\right)x^r \quad\quad\mbox{(product of two polynomials)} \end{align*} $$ The last step uses the general formula for the product of two polynomials, with the convention that $a_k = 0$ for $k > m$ and $b_j = 0$ for $j > n$: $$ \begin{eqnarray*} \left(\sum_{i=0}^{m}a_i x^i\right)\left(\sum_{j=0}^{n}b_j x^j\right) &=& \left(a_0+a_1x+\cdots + a_mx^m\right)\left(b_0+b_1x+\cdots + b_nx^n\right)\\ &=& a_0b_0 + a_0b_1x +a_1b_0x +\cdots +a_0b_2x^2 + a_1b_1x^2 + a_2b_0x^2 +\\ & &\cdots + a_mb_nx^{m+n}\\ &=& \sum_{r=0}^{m+n}\left(\sum_{k=0}^{r}a_{k}b_{r-k}\right)x^{r} \end{eqnarray*} $$ Hence $$ \begin{eqnarray*} & &\sum_{r=0}^{m+n}{m+n\choose r}x^r = \sum_{r=0}^{m+n}\left(\sum_{k=0}^{r}{m\choose k}{n\choose r-k}\right)x^r\\ &\implies& {m+n\choose r} = \sum_{k=0}^{r}{m\choose k}{n\choose r-k}\\ & \implies& \sum_{k=0}^{r}{{m\choose k}{n\choose r-k}\over {m+n\choose r}} = 1 \end{eqnarray*} $$ Replacing $m$, $n$, $r$, $k$ in the last line by $m$, $N-m$, $n$, $x$ respectively gives $\sum_{x=0}^{n} f(x; N, m, n) = 1$.
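
The identity can also be checked numerically. The following is a minimal sketch with arbitrary illustrative values; the names `a`, `b`, and `r` stand for the two upper indices and the lower index of Vandermonde's identity and are chosen here only to avoid confusion with the urn parameters.

```r
# Numerical check of Vandermonde's identity for assumed values a, b, r
a <- 6; b <- 9; r <- 7
k <- 0:r

lhs <- choose(a + b, r)
rhs <- sum(choose(a, k) * choose(b, r - k))  # choose(a, k) is 0 when k > a
c(lhs, rhs)

# Consequence used in the proof: the hypergeometric probabilities sum to 1
# (this corresponds to an urn with N = a + b balls, a white, sample size r)
total <- sum(choose(a, k) * choose(b, r - k) / choose(a + b, r))
total
```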

Mean

The expected value is $$\mu = E[X] = {nm\over N}$$

Proof:

For any integer $k\geq 1$ (so that the $x=0$ term of the sum vanishes), $$ \begin{eqnarray*} E[X^k] &=& \sum_{x=0}^{n}x^kf(x; N, m, n)\\ &=& \sum_{x=0}^{n}x^k{{m\choose x}{N-m\choose n-x}\over {N\choose n}}\\ &=& {nm\over N}\sum_{x=1}^{n} x^{k-1} {{m-1 \choose x-1}{N-m\choose n-x}\over {N-1 \choose n-1}}\\ & & (\mbox{identities:}\ x{m\choose x} = m{m-1\choose x-1},\ n{N\choose n} = N{N-1\choose n-1})\\ &=& {nm\over N}\sum_{y=0}^{n-1} (y+1)^{k-1} {{m-1 \choose y}{(N-1) - (m - 1)\choose (n-1)-y}\over {N-1 \choose n-1}}\quad\quad(\mbox{setting}\ y=x-1)\\ &=& {nm\over N}E\left[(Y+1)^{k-1}\right] \quad\quad\quad \quad\quad \quad\quad\quad\quad (\mbox{since}\ Y\sim f(y; N-1, m-1, n-1)) \end{eqnarray*} $$ Hence, setting $k=1$ we have $$E[X] = {nm\over N}$$ Note that this has the same form as the mean of the binomial distribution, $\mu = np$, with $p = {m\over N}$.
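
Both the closed form for the mean and the moment recursion can be verified numerically. The sketch below uses assumed parameter values and R's `dhyper()`; it is a sanity check, not part of the proof.

```r
# Assumed illustrative parameters: N balls, m white, sample size n
N <- 50; m <- 12; n <- 10
x <- 0:n
f <- dhyper(x, m, N - m, n)   # dhyper(x, white, black, draws)

mean_by_sum  <- sum(x * f)    # E[X] from the definition
mean_formula <- n * m / N     # closed form n m / N
c(mean_by_sum, mean_formula)

# Recursion E[X^k] = (n m / N) E[(Y + 1)^(k - 1)], checked for k = 3.
# Y counts white balls when n - 1 balls are drawn from an urn
# with N - 1 balls, of which m - 1 are white (so N - m are black).
y <- 0:(n - 1)
g <- dhyper(y, m - 1, N - m, n - 1)
lhs <- sum(x^3 * f)
rhs <- n * m / N * sum((y + 1)^2 * g)
c(lhs, rhs)
```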

Variance

The variance is $$\sigma^2 = \mbox{Var}(X) = np(1-p)\left(1 - {n-1 \over N-1}\right)$$ where $p = {m\over N}$.

Proof:

$$ \begin{align*} E[X^2] &= {nm\over N}E[Y+1] \quad\quad\quad \quad\quad\quad \quad (\mbox{setting}\ k=2)\\ &= {nm\over N}\left(E[Y] + 1\right)\\ & = {nm\over N}\left[{(n-1) (m-1) \over N-1}+1\right] \end{align*} $$ Hence the variance is $$ \begin{align*} \mbox{Var}(X) &= E\left[X^2\right] - \left(E[X]\right)^2\\ &= {mn\over N}\left[{(n-1) (m-1) \over N-1}+1 - {nm\over N}\right]\\ &= np \left[ (n-1) \cdot {pN-1\over N-1}+1-np\right] \quad\quad \quad \quad \quad\quad(\mbox{setting}\ p={m\over N})\\ &= np\left[(n-1)\cdot {p(N-1) + p -1 \over N-1} + 1 -np\right]\\ &= np\left[(n-1)p + (n-1)\cdot{p-1 \over N-1} + 1-np\right]\\ &= np\left[1-p - (1-p)\cdot {n-1\over N-1}\right] \\ &= np(1-p)\left(1 - {n-1 \over N-1}\right) \end{align*} $$ Note that the factor $1 - {n-1\over N-1}$ is approximately equal to 1 when $N$ is sufficiently large relative to $n$ (i.e. ${n-1\over N-1}\rightarrow 0$ as $N\rightarrow +\infty$). The variance is then the same as that of the binomial distribution, $\sigma^2 = np(1-p)$, where $p = {m\over N}$.
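
The variance formula and the behaviour of the factor $1 - {n-1\over N-1}$ can be illustrated numerically. This is a minimal sketch with assumed parameter values, again relying on `dhyper()`.

```r
# Assumed illustrative parameters
N <- 50; m <- 12; n <- 10
p <- m / N
x <- 0:n
f <- dhyper(x, m, N - m, n)

var_by_sum  <- sum(x^2 * f) - sum(x * f)^2                    # E[X^2] - (E[X])^2
var_formula <- n * p * (1 - p) * (1 - (n - 1) / (N - 1))      # closed form
c(var_by_sum, var_formula)

# The factor 1 - (n - 1)/(N - 1) approaches 1 as N grows with n fixed,
# so the variance approaches the binomial variance n p (1 - p).
N_big <- c(50, 500, 5000, 50000)
fpc <- 1 - (n - 1) / (N_big - 1)
fpc
```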

Examples

1. In a lotto game, seven balls are drawn randomly from an urn containing 37 balls numbered from 0 to 36. Calculate the probability $P(X = k)$ of drawing exactly $k$ balls with an even number, for $k=0, 1, \cdots, 7$.

Solution:

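There are 19 even numbers ($0, 2, \cdots, 36$) and 18 odd numbers among the 37 balls, so $N = 37$, $m = 19$, $n = 7$ and $$P(X = k) = {{19\choose k}{18\choose 7-k}\over {37 \choose 7}}$$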

# P(X = k) for k = 0, ..., 7: 19 even balls, 18 odd balls, 7 drawn
k = 0:7; p = numeric(length(k))
for (i in k){
  p[i+1] = round(choose(19, i) * choose(18, 7-i)
                 / choose(37, 7), 3)
}
p
# [1] 0.003 0.034 0.142 0.288 0.307 0.173 0.047 0.005
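
The same probabilities can also be obtained without a loop from R's built-in `dhyper()`, which is vectorized over its first argument. The sketch below should agree with the `choose()` formula above; the argument order is (value, white balls, black balls, draws).

```r
k <- 0:7
p <- dhyper(k, 19, 18, 7)   # 19 even balls, 18 odd balls, 7 drawn
round(p, 3)

# The eight probabilities cover all possible outcomes
sum(p)
```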

2. Determine the same probabilities as in the previous problem, this time using the normal approximation.

Solution:

The mean is $$\mu = {nm\over N} = {7\times19\over 37} = 3.594595$$ and the standard deviation is $$\sigma = \sqrt{{nm\over N}\left(1-{m\over N}\right)\left(1 - {n-1\over N-1}\right)} = \sqrt{{7\times19\over 37}\left(1 - {19\over 37}\right) \left(1 - {7-1\over 37-1}\right)} = 1.207174$$ Evaluating the normal density with this mean and standard deviation at each $k$ gives the approximate probabilities:

# Normal density with the hypergeometric mean and standard deviation
k = 0:7; p = numeric(length(k))
mu = 7 * 19 / 37
s = sqrt(7 * 19 / 37 * (1 - 19/37) * (1 - 6/36))
for (i in k){
  p[i+1] = round(dnorm(i, mu, s), 3)
}
p
# [1] 0.004 0.033 0.138 0.293 0.312 0.168 0.045 0.006
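
For comparison, the exact probabilities and the density-based approximation can be put side by side. The sketch below also adds a variant that is not used in the solution above: the continuity-corrected approximation $P(k - 0.5 < Z \leq k + 0.5)$ computed with `pnorm()`. It is included only as an assumption-labelled alternative for illustration.

```r
k  <- 0:7
mu <- 7 * 19 / 37
s  <- sqrt(7 * 19 / 37 * (1 - 19 / 37) * (1 - 6 / 36))

exact <- dhyper(k, 19, 18, 7)                          # exact hypergeometric
dens  <- dnorm(k, mu, s)                               # density at k, as in the solution
cc    <- pnorm(k + 0.5, mu, s) - pnorm(k - 0.5, mu, s) # continuity correction (alternative)

round(cbind(k, exact, dens, cc), 3)

# Largest absolute error of each approximation
err_dens <- max(abs(dens - exact))
err_cc   <- max(abs(cc - exact))
c(err_dens, err_cc)
```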

Reference

  1. Ross, S. (2010). A First Course in Probability (8th Edition). Chapter 4. Pearson. ISBN: 978-0-13-603313-4.
  2. Brink, D. (2010). Essentials of Statistics: Exercises. Chapter 11. ISBN: 978-87-7681-409-0.
