Required packages

  • library(HDMD) #Contains Atchley factor data
    library(dplyr); library(reshape2); library(tidyr) #Packages for data cleaning and shaping
    library(vegan) #Contains similarity functions
    library(ggplot2) #Plotting package
    library(gplots); library(RColorBrewer) #Heatmap packages

Background

Components of the immune system

  • The immune system is made up of several different types of cells, each performing a specialized function in the body’s defense against infective pathogens
    • Most of these cell types circulate in the blood and are collectively known as white blood cells
  • The adaptive immune system is the component of the immune system that can recognize new pathogens and coordinate a more rapid and robust memory response if the same pathogen is encountered again in the future
    • This phenomenon is the mechanism underlying the efficacy of vaccination
  • Lymphocytes, including T-cells and B-cells, are the immune cells that make up the majority of the adaptive immune system

How T-cells contribute to adaptive immunity

  • T-cells recognize pathogens through their T-cell receptor (TCR), a protein expressed on their cellular surface
  • The TCR protein is encoded by the Tcr gene, which undergoes a unique genetic process in each T-cell.
  • During the development of each T-cell, several components of the Tcr gene are swapped and edited in a process known as VDJ recombination to produce an entirely unique TCR molecule
  • Thus, each T-cell has the potential to recognize a different pathogen due to its unique TCR sequence
    • It is estimated that humans can theoretically generate 10^15 different TCR sequences
  • When a T-cell encounters a pathogen that can interact with its TCR, it clonally expands into many identical cellular clones, which direct the immune response against that pathogen
  • When the pathogen is eliminated, a portion of these expanded cells remain, leaving a pool of cells to respond if the pathogen should be encountered again
    • This pool of cells, rather than a single cell, underlies the more rapid and robust nature of the secondary immune response

The problem

  • The dynamics of an immune response can be visualized by changes in the TCR repertoire
    • A TCR repertoire is the combined set of TCR sequences from all T-cells in an organism
    • Changes in the frequencies of specific TCR sequences may correspond to pathogen-stimulated proliferation of the cells that expressed them.
    • In other words, TCRs that are derived from a T-cell that can recognize a particular pathogen will make up a larger component of the TCR repertoire
  • However, with a theoretical population of 10^15 different sequences, it can be very difficult to visualize changes in the frequencies of a handful of specific sequences
    • Experimentally, this can be mitigated by artificially reducing the total number of possible sequences in model organisms
  • Moreover, comparing frequencies of TCR sequences does not account for similarities in amino acid structure between different sequences
  • Thus, interpretation of the TCR repertoire could be improved by:
    1. Evaluating the full TCR repertoire, rather than simplified model systems
    2. Factoring in sequence similarity in addition to differences in sequence frequency
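
The two ideas above (sequence frequency and sequence similarity) can be illustrated with a minimal base R sketch. Everything here is an assumption for illustration: the sequences are made-up toy data, `clone_freq` and `lookup` are hypothetical helper names, and plain edit distance (`adist` from base R) is only a simple stand-in for a similarity measure, not necessarily the one used later in this analysis.

```r
# Toy repertoires (made-up data): one TCR amino acid sequence per sampled T-cell
before <- c("CASSLGQAYEQYF", "CASSLGQAYEQYF", "CASSPDRGNTEAFF", "CASSQETQYF")
after  <- c("CASSLGQAYEQYF", "CASSPDRGNTEAFF", "CASSPDRGNTEAFF",
            "CASSPDRGNTEAFF", "CASSPDRGSTEAFF")

# Hypothetical helper: fraction of the repertoire made up by each unique sequence
clone_freq <- function(seqs) {
  counts <- table(seqs)
  setNames(as.numeric(counts) / length(seqs), names(counts))
}

# Hypothetical helper: look up frequencies, using 0 for sequences that are absent
lookup <- function(freq, seqs) {
  out <- freq[seqs]
  out[is.na(out)] <- 0
  unname(out)
}

freq_before <- clone_freq(before)
freq_after  <- clone_freq(after)

# 1. Frequency comparison: one row per unique sequence across both samples
all_seqs <- union(names(freq_before), names(freq_after))
freq_table <- data.frame(
  sequence = all_seqs,
  before   = lookup(freq_before, all_seqs),
  after    = lookup(freq_after, all_seqs)
)
freq_table$change <- freq_table$after - freq_table$before
print(freq_table)

# 2. Sequence similarity: pairwise edit distance between unique sequences
#    (smaller distance = more similar; a stand-in for a biochemical measure)
dist_matrix <- adist(all_seqs)
dimnames(dist_matrix) <- list(all_seqs, all_seqs)
print(dist_matrix)
```

In this toy example, a sequence that expands after exposure shows a positive `change`, while the distance matrix shows that a frequency table alone treats two nearly identical sequences as unrelated.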