# Poisson Distribution

### General description

The Poisson distribution is a discrete probability distribution

- A discrete probability distribution models variables whose values are countable
- E.g. the number of e-mails received in a day
- Note that such count variables can only take whole-number values (receiving 5.5 e-mails in a day does not make sense)

- This contrasts with a continuous probability distribution like the normal distribution


The Poisson distribution models the probability of observing a particular count of a discrete variable over a fixed interval, given the average count of that variable over the same interval

- This model is represented as \(f(x) = \frac{\lambda^x e^{-\lambda}}{x!}\)
- \(f(x) =\) probability that the discrete variable takes the value \(x\) (where \(x = 0, 1, 2, \ldots\))
- \(\lambda =\) mean of the discrete variable being modeled

- For the Poisson distribution, the variance (\(\sigma^2\)) is equal to the mean (\(\lambda\) or \(\mu\))
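
As a minimal sketch of the formula above, the probability mass function can be written out by hand in R and checked against the built-in `dpois()`. The helper name `poisson_pmf` and the choice of \(\lambda = 4.6\) are illustrative assumptions, and the infinite sums for the mean and variance are truncated at 100, where the remaining probability is negligible.

```r
# Hand-written Poisson PMF (hypothetical helper, for illustration only)
poisson_pmf <- function(x, lambda) {
  lambda^x * exp(-lambda) / factorial(x)
}

lambda <- 4.6          # illustrative mean
x <- 0:100             # truncated support; the tail beyond 100 is negligible here

p_manual <- poisson_pmf(x, lambda)
p_builtin <- dpois(x, lambda)

# The two should agree up to floating-point error
all.equal(p_manual, p_builtin)

# Mean and variance computed from the PMF; both should be close to lambda
pois_mean <- sum(x * p_manual)
pois_var <- sum((x - pois_mean)^2 * p_manual)
c(mean = pois_mean, variance = pois_var)
```

Both values come out at approximately 4.6, which is the "variance equals mean" property stated above.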

### Examples

Radioactive decay

- For Pu-239, with an average of 2.3 decays per second, what is the probability of observing 3 decays over a period of two seconds?
- \(\lambda = (2.3 \text{ decays/second}) \times (2 \text{ seconds}) = 4.6 \text{ decays}\)
- \(x = 3\)
- \(f(3) = \frac{4.6^3 e^{-4.6}}{3!} \approx 0.163\)

With R:

`dpois(x = 3, lambda = 4.6) #Probability of observing EXACTLY 3 events`

`## [1] 0.1630676`

`ppois(q = 3, lambda = 4.6) #Probability of observing 3 events OR LESS`

`## [1] 0.3257063`

`ppois(q = 3, lambda = 4.6, lower.tail = FALSE) #Probability of observing MORE THAN 3 events`

`## [1] 0.6742937`
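
The three calls above are related: `ppois()` is the running sum of `dpois()`, and the `lower.tail = FALSE` result is its complement. A short check, using the same \(\lambda = 4.6\) (the variable names are only illustrative):

```r
lambda <- 4.6

# P(X <= 3) as the sum of the exact probabilities for 0, 1, 2 and 3 events
p_at_most_3 <- sum(dpois(0:3, lambda = lambda))

# P(X > 3) as the complement of P(X <= 3)
p_more_than_3 <- 1 - p_at_most_3

# Both comparisons should return TRUE
all.equal(p_at_most_3, ppois(q = 3, lambda = lambda))
all.equal(p_more_than_3, ppois(q = 3, lambda = lambda, lower.tail = FALSE))
```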

Plotting the distribution in R

`N <- 10000 #Arbitrarily large number of values R should generate from the modeled distribution`

`x <- rpois(N, lambda = 4.6) #Generates N values from the indicated Poisson distribution`

`hist(x, xlim = c(min(x), max(x)), probability = TRUE, nclass = max(x) - min(x) + 1, col = 'lightblue', main = 'Poisson distribution, lambda = 4.6')`

`lines(density(x, bw = 1), col = 'red', lwd = 3) #Smoothed density estimate of the simulated values`
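
The red density line is a smoothed estimate. For a discrete variable it can also help to compare the simulated relative frequencies directly with the exact probabilities from `dpois()`. A sketch, assuming a fixed seed (`set.seed(1)`, chosen arbitrarily) so that the run is reproducible:

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
N <- 10000
lambda <- 4.6
x <- rpois(N, lambda = lambda)

counts <- 0:max(x)
# Relative frequency of each count in the simulated sample
observed <- as.numeric(table(factor(x, levels = counts))) / N
# Exact probability of each count under the Poisson model
expected <- dpois(counts, lambda = lambda)

comparison <- data.frame(count = counts, observed = observed, expected = expected)
print(round(comparison, 4))
```

With a sample this large, the `observed` and `expected` columns should differ only by small sampling noise.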

### Poisson vs. binomial distribution

The binomial distribution

- The binomial distribution models the probability of observing a certain number of outcomes of interest, given the probability of that outcome on each trial and the number of trials
- Represented as \(f(k) = \binom{n}{k} p^k(1-p)^{n-k}\)
- \(n =\) number of total observations (“trials”)
- \(k =\) number of outcomes of interest (“successes”)
- \(\binom{n}{k} = n\) choose \(k =\) the number of ways to choose \(k\) items out of \(n\)
- E.g. \(\binom{3}{2} = 3\), since for the \(n = 3\) items \(\{1,2,3\}\) the selections of size \(k=2\) are \(\{1,2\},\{1,3\},\{2,3\}\) (3 outcomes)
- Calculated as \(\binom{n}{k} = \frac{n!}{k!(n-k)!}\) or in R as
`choose(n,k)`

- \(p =\) probability of the outcome of interest (“probability of a success”)
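
To make the pieces of the formula concrete, the sketch below enumerates the \(k = 2\) selections from \(\{1,2,3\}\) with `combn()`, checks `choose()` against the factorial formula, and assembles the binomial probability by hand. The helper name `binomial_pmf` and the values \(n = 20\), \(k = 1\), \(p = 0.05\) are illustrative assumptions.

```r
# All ways of choosing 2 items out of {1, 2, 3}; each column is one selection
selections <- combn(3, 2)
ncol(selections)                 # number of selections

# choose() against the factorial formula n! / (k! (n - k)!)
n <- 20
k <- 1
manual_choose <- factorial(n) / (factorial(k) * factorial(n - k))
all.equal(manual_choose, choose(n, k))

# Hand-written binomial PMF (hypothetical helper, for illustration only)
binomial_pmf <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

p <- 0.05
all.equal(binomial_pmf(k, n, p), dbinom(x = k, size = n, prob = p))
```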

- Example - “What is the probability of obtaining exactly one p-value below 0.05 in 20 comparisons of random data?”
- With a significance threshold of 0.05, each comparison of random data has a 1-in-20 chance of a “success” (a false positive), so:
- \(n = 20, k = 1, p = 0.05\)

`dbinom(x = 1, size = 20, prob = 0.05) #Probability of obtaining EXACTLY 1 significant p-value from 20 random comparisons`

`## [1] 0.3773536`

`pbinom(q = 1 - 1, size = 20, prob = 0.05, lower.tail = FALSE) #Probability of obtaining AT LEAST 1 significant p-value from 20 random comparisons ("-1", since lower.tail = FALSE gives P(X > q), which excludes q itself)`

`## [1] 0.6415141`
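
Two checks tie this example back to the rest of the section. The "at least 1" probability is the complement of observing zero successes, \(1 - (1-p)^n\). In addition, when \(n\) is large and \(p\) is small, a Poisson distribution with \(\lambda = np\) is commonly used as an approximation to the binomial. The sketch below shows how close the two are for this example under that assumption (with \(n = 20\) the agreement is only rough).

```r
n <- 20
p <- 0.05

# P(at least 1 success) as the complement of P(0 successes)
p_at_least_1 <- 1 - (1 - p)^n
all.equal(p_at_least_1, pbinom(q = 0, size = n, prob = p, lower.tail = FALSE))

# Poisson approximation with lambda = n * p (an approximation, not an identity)
lambda <- n * p
k <- 0:5
approx_table <- data.frame(
  k = k,
  binomial = dbinom(k, size = n, prob = p),
  poisson = dpois(k, lambda = lambda)
)
print(round(approx_table, 4))
```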