### Entropy vs. diversity

• Entropy is a combined measure of the total number of species and the evenness of their distribution
• Entropy can also be defined as the uncertainty in the species identity of a randomly sampled individual
• Shannon entropy was originally used to quantify the uncertainty in predicting the next character in a text string. The more types of characters available (diversity), the higher the uncertainty (entropy) in predicting the correct one.
• Entropies are reasonable indices of diversity, but their values do not behave the way an intuitive understanding of diversity suggests
• Example: population A = 16 equally common species; population B = 8 equally common species
• Intuitively, population A is twice as diverse as population B
• However, this is not true of their entropies

library(vegan)
A <- rep(1/16, 16)
B <- rep(1/8, 8)
data.frame("Shannon A" = diversity(A, base = 2), "Shannon B" = diversity(B, base = 2)) #Base = 2 used to match up with Ref.1
##   Shannon.A Shannon.B
## 1         4         3
• Note that population A’s Shannon entropy is not twice that of population B
• Effective number of species (diversity)
• All communities that share the same entropy value have the same diversity
• For every type of entropy (e.g. Shannon), every possible entropy value has a corresponding community in which all species are equally common
• The effective number of species is the number of species in this equivalent equally-common community
• Thus, the effective number of species, $$D$$, for a community can be found by setting its entropy value equal to the entropy equation applied to a population of $$D$$ species, each with frequency $$1/D$$, and solving for $$D$$
• Interpretation for ENS/diversity - For any order $$q$$ (see below), ENS/diversity is the number of species in an equally common community that has the same entropy value as the actual community at that order of $$q$$
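
As a worked instance of this recipe, take Shannon entropy with the natural logarithm (Shannon is chosen here only for illustration; $$H$$ denotes the observed entropy). An equally common community of $$D$$ species has entropy $$\ln D$$, so setting that equal to $$H$$ and solving gives $$D = e^H$$:

$$
\begin{aligned}
H &= -\sum_{i=1}^{D}\frac{1}{D}\ln\frac{1}{D} = -D\cdot\frac{1}{D}\ln\frac{1}{D} = \ln D \\
D &= e^{H}
\end{aligned}
$$

The same steps with base-2 logarithms give $$D = 2^H$$, so the 16-species population above, with $$H = 4$$, has $$D = 2^4 = 16$$.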

library(vegan)
A <- c(1,43,40,5,1)
exp(renyi(A))  #See "Hill numbers, order, and Renyi entropy" below for explanation
##        0     0.25      0.5        1        2        4        8       16
## 5.000000 3.914042 3.255867 2.648191 2.330265 2.222133 2.182638 2.158958
##       32       64      Inf
## 2.136983 2.117379 2.093023
## attr(,"class")
## [1] "renyi"   "numeric"
• For order = 1 (exponential Shannon) - “at order = 1, population A is as diverse as a community with 2.648191 equally common species”
• For order = 0 (species richness/total species) - “at order = 0, population A is as diverse as a community with 5 equally common species”
• See the Hill/Renyi section for a fuller explanation of order and an updated interpretation that takes it into account
• Proof that diversity $$^qD = (\sum\limits_{i=1}^{S}p_i{}^q)^{(1/(1-q))}$$ for any particular entropy index
• Variables
• $$H()$$ - any specific entropy
• $$D$$ - “diversity” or effective number of species, the number of equally common species
• $$S$$ - the total number of species in an actual sample
• $$q$$ - order
• $$p_i$$ - the frequency of a given species in a community
• Given
1. Entropy can be generalized as $$H(\sum\limits_{i=1}^{S}(p_{i})^{q})$$
2. $$H(\sum\limits_{i=1}^{D}(\frac{1}{D})^q) = x = H(\sum\limits_{i=1}^{S}(p_{i})^{q})$$
3. $$H()$$ is an invertible function (it is continuous and monotonic, i.e. either only increasing or only decreasing)
• Solve for $$D$$ in terms of $$x$$
4. $$H(\sum\limits_{i=1}^{D}(\frac{1}{D})^q) = x$$
5. $$H(D(\frac{1}{D})^q) = x$$
6. $$(\frac{1}{D})^{q-1} = H^{-1}(x)$$
• Apply $$H^{-1}$$ to both sides of (5), then simplify the left side:
• $$(1/D)^q = 1/(D^q) = D^{-q}$$
• $$D(D^{-q}) = D^1D^{-q} = D^{1-q} = 1/D^{-(1-q)} = 1/(D^{q-1}) = (1/D)^{q-1}$$
7. $$D = (\frac{1}{H^{-1}(x)})^\frac{1}{q-1}$$
• Solve for $$D$$ in terms of $$p_i$$
8. $$D = (\frac{1}{H^{-1}(H(\sum\limits_{i=1}^{S}(p_{i})^{q}))})^\frac{1}{q-1}$$
• This substitutes the right-hand side of (2) for $$x$$ in (7)
9. $$D = (\frac{1}{\sum\limits_{i=1}^{S}(p_{i})^{q}})^\frac{1}{q-1}$$
• By (3), $$H$$ is invertible, so applying $$H^{-1}$$ to $$H$$ cancels both functions
10. $$D = (\sum\limits_{i=1}^{S}p_i{}^q)^\frac{1}{1-q}$$
• Algebra similar to that in (6)
• Corollary - diversity depends only on species frequencies and order $$q$$, not on the particular entropy function
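
The closed form derived above can be checked numerically. The following is a minimal base-R sketch, not the vegan implementation: `hill_number` is a hypothetical helper name introduced here, and the $$q = 1$$ and $$q = Inf$$ cases are handled through their limits (exponential Shannon and the reciprocal of the largest frequency).

```r
# Hypothetical helper (not part of vegan): Hill number of order q,
# computed directly from (sum(p^q))^(1/(1-q)).
hill_number <- function(abundances, q) {
  p <- abundances / sum(abundances)  # convert counts to frequencies
  p <- p[p > 0]                      # drop absent species so log(0) never occurs
  if (q == 1) {
    return(exp(-sum(p * log(p))))    # limit as q -> 1: exponential of Shannon entropy
  }
  if (is.infinite(q)) {
    return(1 / max(p))               # limit as q -> Inf: reciprocal of the largest frequency
  }
  sum(p^q)^(1 / (1 - q))
}

A <- c(1, 43, 40, 5, 1)
orders <- c(0, 1, 2, Inf)
hill_A <- sapply(orders, function(q) hill_number(A, q))
names(hill_A) <- orders
print(round(hill_A, 6))
```

For the community `A` used earlier, these should agree with the entries at orders 0, 1, 2 and Inf in the `exp(renyi(A))` output shown above.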

| Entropy index | Equation | To convert to diversity |
|---|---|---|
| Species richness | $$x = \sum_{i=1}^{S}p_i{}^0$$ | $$x$$ |
| Shannon entropy | $$x = -\sum_{i=1}^{S}p_i\textrm{ln}(p_i)$$ | $$e^x$$ |
| Gini-Simpson index | $$x = 1-\sum_{i=1}^{S}p_i{}^2$$ | $$1/(1-x)$$ |
| Renyi entropy | $$x = (-\textrm{ln}\sum_{i=1}^{S} p_{i}{}^{q})/(q-1)$$ | $$e^x$$ |

• Note on logarithms and conversion
• Entropy depends on the log base; diversity does not
• The equation for converting Shannon entropy to diversity is $$e^x$$ because the Shannon entropy equation above uses the natural logarithm (base = $$e$$)
• If a different log base is used to calculate the entropy, that base is raised to the entropy instead (e.g. $$2^x$$ for base 2)
• The same holds for any other entropy index that uses a logarithm
library(vegan)
A <- rep(1/16, 16)
base.e <- c(diversity(A), exp(diversity(A)))
base.2 <- c(diversity(A, base = 2),2^(diversity(A, base = 2)))
data.frame(base.e, base.2, row.names = c("Shannon", "diversity")) #Entropy differs, diversity doesn't
##              base.e base.2
## Shannon    2.772589      4
## diversity 16.000000     16
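
The reason the base cancels can be written out in one line. Writing $$H_b$$ for Shannon entropy computed with log base $$b$$ (a subscript notation introduced here only for this illustration), a change of base rescales the entropy by a constant that the matching exponent undoes:

$$
H_b = -\sum_{i=1}^{S}p_i\log_b(p_i) = \frac{H_e}{\ln b}
\quad\Rightarrow\quad
b^{H_b} = e^{H_b\ln b} = e^{H_e}
$$

For the 16 equally common species above, $$H_e = \ln 16 \approx 2.772589$$ and $$H_2 = 4$$, and both $$e^{H_e}$$ and $$2^{H_2}$$ equal 16.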

### Hill numbers, order, and Renyi entropy

• $$^qD = (\sum\limits_{i=1}^{S}p_i{}^q)^{(1/(1-q))}$$ are known as Hill numbers and give values of diversity at various orders of $$q$$
• Order, $$q$$, dictates the sensitivity of a diversity metric to common and rare species
• $$q = 0$$ is completely insensitive to species frequency (i.e. all species weighted equally)
• Corresponds to the reciprocal of the frequency-weighted harmonic mean of the species frequencies
• $$q = 1$$ weighs all species by their frequency
• Corresponds to the reciprocal of the frequency-weighted geometric mean
• The value of a Hill number is undefined at $$q = 1$$ but is obtained from its limit
• $$(\sum\limits_{i=1}^{S}p_i{}^1)^{(1/(1-1))} = (\sum\limits_{i=1}^{S}p_i)^{1/0}$$ is undefined, but the limit as $$q \to 1$$ exists and equals $$\exp(-\sum\limits_{i=1}^{S}p_i\textrm{ln}(p_i))$$, the exponential of Shannon entropy
• $$q < 1$$ favors the rarer species, while $$q > 1$$ increasingly favors the more common species
• $$q = 2$$ specifically corresponds to the reciprocal of the frequency-weighted arithmetic mean
• Extended interpretation for ENS/diversity - ENS/diversity represents the number of species in an equally common community that is as diverse as the part of the population emphasized by the order $$q$$
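
The correspondence between order and means noted above can be checked numerically. This is a minimal base-R sketch, assuming each mean is weighted by the species frequencies themselves; the function and variable names are illustrative and not from vegan. It also shows the Hill number closing in on exponential Shannon as $$q$$ approaches 1 from either side.

```r
p <- c(1, 43, 40, 5, 1) / 90           # frequencies of population A

# Hill number straight from the formula (only valid for q != 1)
hill <- function(p, q) sum(p^q)^(1 / (1 - q))

# Frequency-weighted means of the species frequencies
harmonic   <- 1 / sum(p * (1 / p))     # weighted harmonic mean (equals 1/S)
geometric  <- exp(sum(p * log(p)))     # weighted geometric mean
arithmetic <- sum(p * p)               # weighted arithmetic mean

exp_shannon <- exp(-sum(p * log(p)))   # exponential of Shannon entropy

print(c(q0 = hill(p, 0), from_mean = 1 / harmonic))
print(c(q2 = hill(p, 2), from_mean = 1 / arithmetic))

# q = 1 itself is undefined, but orders just either side bracket exp(Shannon)
print(c(below = hill(p, 0.999), limit = exp_shannon, above = hill(p, 1.001)))
print(1 / geometric)                   # same value as exp_shannon
```

Each pair printed for $$q = 0$$ and $$q = 2$$ should match, and the two orders near 1 should sit just above and just below the exponential Shannon value.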

library(vegan)
A <- c(1,43,40,5,1)
exp(renyi(A)) 
##        0     0.25      0.5        1        2        4        8       16
## 5.000000 3.914042 3.255867 2.648191 2.330265 2.222133 2.182638 2.158958
##       32       64      Inf
## 2.136983 2.117379 2.093023
## attr(,"class")
## [1] "renyi"   "numeric"
• For order = 1 - “Altogether, population A is as diverse as a community with 2.648191 equally common species”
• For order = $$Inf$$ - “When considering only the most abundant species, population A is as diverse as a community with 2.093023 equally common species”
• Since increasing values of $$q$$ above 1 focus more and more on the most abundant species, $$Inf$$ is the value of $$q$$ that is most focused on the abundant species
• Population A is an extreme example of a population dominated by a few abundant species (i.e. very uneven), so even the diversity at $$q = 2$$ is close to that at $$q = Inf$$ and could be said to approximate the diversity of the high-abundance species
• However, in a population that is less dominated by a few abundant species (i.e. more even), only extreme $$q$$ values like $$Inf$$ will capture the diversity of only the high-abundance species
• Because $$q = 1$$ weighs every species exactly by its frequency, favoring neither rare nor common species, exponential Shannon diversity is one of the best/most interpretable single measures of diversity.
• Hill numbers satisfy the “doubling rule”: if every species is replaced by two species with half its frequency, the diversity doubles at every order

library(vegan)
A <- sample(1:100, 10, replace = TRUE) #Random abundances: the values below change between runs, the ratio does not
A <- A/sum(A)
B <- c(A/2, A/2) #Population with same number of individuals, but twice as many species
Hill.A <- exp(renyi(A))
Hill.B <- exp(renyi(B))
data.frame(Hill.A, Hill.B, "Ratio Hill B:A" = Hill.B/Hill.A)
##         Hill.A    Hill.B Ratio.Hill.B.A
## 0    10.000000 20.000000              2
## 0.25  8.894739 17.789478              2
## 0.5   8.083409 16.166818              2
## 1     7.048758 14.097516              2
## 2     6.086147 12.172293              2
## 4     5.404975 10.809949              2
## 8     5.017098 10.034196              2
## 16    4.825909  9.651818              2
## 32    4.710580  9.421161              2
## 64    4.624034  9.248068              2
## Inf   4.516484  9.032967              2
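
Why the ratio is exactly 2 at every order follows from the Hill number formula. As a short derivation, assuming (as in the code above) that community B is built by splitting each of the $$S$$ species of A into two species with frequency $$p_i/2$$, and taking $$q \neq 1$$:

$$
{}^qD_B = \left(\sum_{i=1}^{S}2\left(\frac{p_i}{2}\right)^q\right)^{1/(1-q)}
= \left(2^{1-q}\sum_{i=1}^{S}p_i{}^q\right)^{1/(1-q)}
= 2\left(\sum_{i=1}^{S}p_i{}^q\right)^{1/(1-q)}
= 2\;{}^qD_A
$$

The factor of 2 does not depend on $$q$$, so it carries over to the $$q = 1$$ limit as well.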
• Renyi entropy $$=\frac{1}{1-q} \textrm{ln}\sum\limits_{i=1}^S p_{i}{}^{q}$$
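
This is also why the examples call `exp(renyi(A))`: exponentiating the Renyi entropy of order $$q$$ returns the Hill number of the same order. Writing $$H_q$$ for the Renyi entropy (a label introduced here for convenience):

$$
e^{H_q} = \exp\left(\frac{1}{1-q}\textrm{ln}\sum_{i=1}^{S}p_i{}^q\right) = \left(\sum_{i=1}^{S}p_i{}^q\right)^{1/(1-q)} = {}^qD
$$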

• Special cases of order values

| Order | Hill number equivalent | Renyi entropy equivalent |
|---|---|---|
| 0 | Species richness | / |
| 1 | / | Shannon entropy |
| 2 | Inverse Simpson index\* | / |
| Inf | Reciprocal of the Berger-Parker index | / |

\*$$\textrm{Inverse Simpson index} = 1/(1-\textrm{Gini-Simpson index})$$

### References

1. “Entropy and diversity”, Lou Jost, Oikos, 2006