- B-Cells produce antibodies, which bind to matching antigens on pathogens outside of cells. This either disables the pathogen directly or tags it: the antibody's Fc region gives other cells (like the neutrophils and macrophages of the innate immune system) a convenient handle that they can attach to with their Fc receptors in order to find and kill those enemies more easily.
- Killer T-Cells determine whether a cell is infected with a known pathogen by inspecting the MHC1 proteins on its surface. These proteins continually display small peptides (short amino acid segments) cut from the proteins inside the cell, including those of any pathogen, and the Killer T-Cell checks whether its T-Cell receptor can bind to them. Once a Killer T-Cell has determined that its unique receptor binds the antigen, and has received the required confirmatory signals that this is indeed a pathogen it has previously been warned about, it instructs the cell to kill itself gracefully.
- Natural Killer cells look for cells that are not displaying any peptides in MHC1 proteins. When they find one, they send it a similar command to self-destruct (as Killer T-Cells do).
- That they have T-Cell receptor proteins on their surface whose Alpha and Beta chains have been properly formed from the TRA and TRB gene segments respectively.
- That their T-Cell receptors are able to bind to peptides presented to them in your own body's MHC1 proteins. This is partly achieved by the CD8 protein which will bind to a part of the MHC1 protein during docking.
- That these T-Cells will not react to the body's own proteins (if they do, they are immediately ordered to self destruct)
- That they do not bind too weakly or too strongly to their specific antigen, but just the correct amount (like in Goldilocks and the three bears).
The specific amino acids in the TAX protein peptide (depicted in grey) are: LLFGYPVYV
If you want to explore the 3D structure of the above complex for yourself, you can find it on the Protein Data Bank website under the entry 1AO7.
In this same way, all kinds of proteins, both our own body proteins and those of invading viruses, are digested into small peptide sequences and then presented in MHC1 receptors on the surface of all of our body cells (except red blood cells) for inspection by the Killer T-Cells. There would normally be no Killer T-Cells that target normal body proteins, because they would have been eliminated by the strict screening process in the thymus.
If you are interested in more T-Cell to MHC "docking" examples, have a look at this data. It comes from an article on the topic.
Next, I will show you how one would go about finding the specific gene segments that were stitched together to produce the specific T-Cell receptor proteins that recognise this virus peptide so elegantly.
The first step is to download the actual sequences of amino acids that make up the different chains of the T-Cell Receptors. This is done by clicking on the Download-FASTA menu item:
This will provide you with a FASTA text file containing the following sequences:
>1AO7_1|Chain A|HLA-A 0201|Homo sapiens (9606)
GSHSMRYFFTSVSRPGRGEPRFIAVGYVDDTQFVRFDSDAASQRMEPRAPWIEQEGPEYWDGETRKVKAHSQTHRVDLGTLRGYYNQSEAGSHTVQRMYGCDVGSDWRFLRGYHQYAYDGKDYIALKEDLRSWTAADMAAQTTKHKWEAAHVAEQLRAYLEGTCVEWLRRYLENGKETLQRTDAPKTHMTHHAVSDHEATLRCWALSFYPAEITLTWQRDGEDQTQDTELVETRPAGDGTFQKWAAVVVPSGQEQRYTCHVQHEGLPKPLTLRWE
(This is the main chain of the MHC1 protein, which interacts with the T-Cell receptor and also holds the peptide being presented. HLA stands for Human Leukocyte Antigen. This is also the protein that differs so much between different people, making organ transplantation very difficult.)
>1AO7_2|Chain B|BETA-2 MICROGLOBULIN|Homo sapiens (9606)
MIQRTPKIQVYSRHPAENGKSNFLNCYVSGFHPSDIEVDLLKNGERIEKVEHSDLSFSKDWSFYLLYCTEFTPTEKDEYACRVNHVTLSQPCIVKWDRDM
(This is part of the MHC1 protein complex, but it does not interact directly with the T-Cell receptor.) See this Wikipedia article.
>1AO7_3|Chain C|TAX PEPTIDE|Human T-lymphotropic virus 1 (11908)
LLFGYPVYV
(This is the Viral Peptide sequence)
>1AO7_4|Chain D|T CELL RECEPTOR ALPHA|Homo sapiens (9606)
KEVEQNSGPLSVPEGAIASLNCTYSDRGSQSFFWYRQYSGKSPELIMSIYSNGDKEDGRFTAQLNKASQYVSLLIRDSQPSDSATYLCAVTTDSWGKLQFGAGTQVVVTPDIQNPDPAVYQLRDSKSSDKSVCLFTDFDSQTNVSQSKDSDVYITDKTVLDMRSMDFKSNSAVAWSNKSDFACANAFNNSIIPEDTFFPSPESS
(This is the T-Cell Alpha chain protein sequence)
>1AO7_5|Chain E|T CELL RECEPTOR BETA|Homo sapiens (9606)
NAGVTQTPKFQVLKTGQSMTLQCAQDMNHEYMSWYRQDPGMGLRLIHYSVGAGITDQGEVPNGYNVSRSTTEDFPLRLLSAAPSQTSVYFCASRPGLAGGRPEQYFGPGTRLTVTEDLKNVFPPEVAVFEPSEAEISHTQKATLVCLATGFYPDHVELSWWVNGKEVHSGVSTDPQPLKEQPALNDSRYALSSRLRVSATFWQNPRNHFRCQVQFYGLSENDEWTQDRAKPVTQIVSAEAWGRAD
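If you would rather read the downloaded file with a script than by eye, a minimal Python sketch could look like the one below. It assumes the header layout shown above (entity id, chain, description and organism separated by "|"). The function names parse_fasta and split_header are my own and not part of any tool used in this post, and only the two shortest records are embedded as sample data.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_header(header):
    """Split a header of the assumed form 'entity|chain|description|organism'."""
    keys = ("entity", "chain", "description", "organism")
    return dict(zip(keys, (part.strip() for part in header.split("|"))))


# Two records copied from the FASTA listing in this post.
SAMPLE = """>1AO7_2|Chain B|BETA-2 MICROGLOBULIN|Homo sapiens (9606)
MIQRTPKIQVYSRHPAENGKSNFLNCYVSGFHPSDIEVDLLKNGERIEKVEHSDLSFSKDWSFYLLYCTEFTPTEKDEYACRVNHVTLSQPCIVKWDRDM
>1AO7_3|Chain C|TAX PEPTIDE|Human T-lymphotropic virus 1 (11908)
LLFGYPVYV
"""

records = parse_fasta(SAMPLE)
for header, sequence in records:
    info = split_header(header)
    print(info["chain"], info["description"], len(sequence))
```

The same two functions should work on the complete five-record file, as long as it keeps this header layout.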
We start off by locating where the T-Cell Receptor Alpha gene is located on the human genome:
I just type TRAV into the GENES field (which you can also reach by pressing CTRL-G).
Selecting any of them and pressing Enter immediately jumps to Chromosome 14 and highlights where the TRAV genes can be found.
You can also filter the display to only show the required genes starting with TRA or TRB.
Now make sure you have built the local BLAST search database for chromosome 14:
Next step is to paste the T-Cell Receptor Alpha sequence into the first align search box:
KEVEQNSGPLSVPEGAIASLNCTYSDRGSQSFFWYRQYSGKSPELIMSIYSNGDKEDGRFTAQLNKASQYVSLLIRDSQPSDSATYLCAVTTDSWGKLQFGAGTQVVVTPDIQNPDPAVYQLRDSKSSDKSVCLFTDFDSQTNVSQSKDSDVYITDKTVLDMRSMDFKSNSAVAWSNKSDFACANAFNNSIIPEDTFFPSPESS
This will execute the BLAST (Basic Local Alignment Search Tool) command:
tblastn.exe -task tblastn -evalue 1 -num_threads 4 -max_target_seqs 10 -outfmt "6 qaccver saccver pident length mismatch gapopen qstart qend sstart send evalue bitscore frames sseq" -db "E:\Genomes\hg38\Blast\hg38" -query "E:\Genomes\hg38\Temp_hg38\Query.fa" -out "E:\Genomes\hg38\Temp_hg38\QueryResults.txt" -seqidlist "E:\Genomes\hg38\Temp_hg38\QuerySequenceIds.txt"
It will give you the BLAST output:
The Visual Genome browser will then interpret the matches and provide an output in which the gene segment names are presented against the highest-matching entries from BLAST.
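The BLAST command above asks for tab-separated output (-outfmt 6) with a custom column list, so the results file is easy to read from a script too. The sketch below is only an illustration of that: it reuses the column names from the command, but the two sample rows are invented placeholders (not real BLAST results), and parse_blast_tabular is a name I made up.

```python
import csv
import io

# Column names exactly as listed in the -outfmt string of the command above.
COLUMNS = ("qaccver saccver pident length mismatch gapopen qstart qend "
           "sstart send evalue bitscore frames sseq").split()
INT_FIELDS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")
FLOAT_FIELDS = ("pident", "evalue", "bitscore")


def parse_blast_tabular(text):
    """Turn tab-separated BLAST output into a list of dicts with typed values."""
    hits = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        for key in INT_FIELDS:
            hit[key] = int(hit[key])
        for key in FLOAT_FIELDS:
            hit[key] = float(hit[key])
        hits.append(hit)
    return hits


# INVENTED sample rows, for illustration only (coordinates and scores are made up).
SAMPLE_ROWS = [
    ["Query_1", "chr14", "90.476", "21", "2", "0", "90", "110",
     "1000", "1062", "1e-05", "45.0", "0/1", "MTTDSWGKFQFGAGTQVVVTP"],
    ["Query_1", "chr14", "60.000", "10", "4", "0", "1", "10",
     "5000", "5029", "0.5", "20.0", "0/2", "KEVAQNAGPL"],
]
SAMPLE_TEXT = "\n".join("\t".join(row) for row in SAMPLE_ROWS)

hits = parse_blast_tabular(SAMPLE_TEXT)
best = max(hits, key=lambda hit: hit["bitscore"])
print(best["saccver"], best["sstart"], best["send"], best["pident"])
```

Sorting or filtering on bitscore and pident is then a one-liner, which is roughly the kind of ranking the browser has to do before it can name the best-matching gene segments.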
This indicates that this T-Cell Alpha sequence is highly likely made up of:
TRAV12-2 (Variable gene segment)
TRAJ24 (Joining gene segment)
TRAC (Constant gene segment)
And this is indeed the case. When I align the query sequence from the Protein Databank with the protein from the HG38 Human genome sequence I get:
The top sequence represents the query sequence, while the bottom sequence represents the actual amino acids obtained from the human reference genome HG38.
From this output on the "Comparisons" tab you can see that there is a 99.02% identity/match with the query sequence for this combination of V, J and C segments.
202 of the 204 amino acids in the query match (202/204 = 99.02%). The total of 275 appears to be the length of the protein assembled from the reference genome, so the other 73 positions either differ from the query or are not covered by it.
The join between V and J happens after amino acid 113 and the constant segment starts after 135.
There is no Diversity (D) segment in the Alpha chain.
MTTDSWGKFQFGAGTQVVVTP
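The 99.02% figure can be checked with a few lines of Python. This is only a sketch under two assumptions of mine: that the sequence above is the genome-derived joining region, and that the 21 residues of chain D lining up with it are the ones ending in ...VVVTP (I picked that stretch by eye). The query length of 204 is simply the number of amino acids in chain D of the FASTA file.

```python
def count_mismatches(a, b):
    """Count positions at which two equally long sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


QUERY_J_REGION = "VTTDSWGKLQFGAGTQVVVTP"   # stretch of chain D (chosen by eye)
REFERENCE_J = "MTTDSWGKFQFGAGTQVVVTP"      # sequence quoted in this post
QUERY_LENGTH = 204                         # amino acids in chain D

mismatches = count_mismatches(QUERY_J_REGION, REFERENCE_J)
# Assumption: every other position of the query matches the reference exactly.
identity = 100 * (QUERY_LENGTH - mismatches) / QUERY_LENGTH
print(f"{mismatches} mismatches, {identity:.2f}% identity")
```

Under those assumptions the two differences (V instead of M, and L instead of F) account for the whole gap between 202 and 204 matching amino acids.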
When we go to the TRAJ24 gene segment in the genome:
Make sure the display settings are as follows:
We can search for the protein, allowing 1 mismatch in the amino acid sequence, by putting the MTTDSWGKFQFGAGTQVVVTP sequence in the Search box. This searches for the protein in all 6 reading frames.
Because gene segments do not always contain a multiple of 3 bases, a codon at a join can be split between two segments, so one or more amino acids might not match. We therefore put 1 in the mismatches field, as depicted above. This highlights the matching sequence in the genome:
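To give an idea of what a six-frame search with mismatches involves, here is a small Python sketch. It is not the browser's actual implementation, just my own illustration of the idea. The DNA string is a toy sequence whose codons I chose myself to spell the first ten amino acids of the peptide above; it is not the real genomic sequence of TRAJ24. All function names are hypothetical.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the usual TCAG-ordered table.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(dna):
    """Translate DNA codon by codon; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def reverse_complement(dna):
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frames(dna):
    """Yield (frame label, protein) for three forward and three reverse frames."""
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            yield f"{strand}{offset + 1}", translate(seq[offset:])


def find_with_mismatches(dna, peptide, max_mismatches=1):
    """Return (frame, protein offset, matched window, mismatches) for each hit."""
    hits = []
    for frame, protein in six_frames(dna):
        for start in range(len(protein) - len(peptide) + 1):
            window = protein[start:start + len(peptide)]
            mismatches = sum(a != b for a, b in zip(window, peptide))
            if mismatches <= max_mismatches:
                hits.append((frame, start, window, mismatches))
    return hits


# TOY sequence: two padding bases, invented codons for MTTDSWGKFQ, two more bases.
TOY_DNA = "GG" + "ATGACCACCGATAGCTGGGGCAAATTTCAG" + "TA"

forward_hits = find_with_mismatches(TOY_DNA, "MTTDSWGKLQ", max_mismatches=1)
reverse_hits = find_with_mismatches(reverse_complement(TOY_DNA), "MTTDSWGKLQ")
print(forward_hits)
print(reverse_hits)
```

The search peptide deliberately has L where the toy sequence encodes F, so it is only found because one mismatch is allowed. Running the same search on the reverse complement shows why all six frames need to be checked.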
Now let us do the same with the T-Cell Receptor Beta chain:
>1AO7_5|Chain E|T CELL RECEPTOR BETA|Homo sapiens (9606)
NAGVTQTPKFQVLKTGQSMTLQCAQDMNHEYMSWYRQDPGMGLRLIHYSVGAGITDQGEVPNGYNVSRSTTEDFPLRLLSAAPSQTSVYFCASRPGLAGGRPEQYFGPGTRLTVTEDLKNVFPPEVAVFEPSEAEISHTQKATLVCLATGFYPDHVELSWWVNGKEVHSGVSTDPQPLKEQPALNDSRYALSSRLRVSATFWQNPRNHFRCQVQFYGLSENDEWTQDRAKPVTQIVSAEAWGRAD
We navigate to Chromosome 7 and again use a filter: op_gene^TRB
After again running the "Search Align", it has found:
TRBV6-5 to be the Variable segment
TRBC2 to be the constant segment
But we are not sure which makes up the Diversity (D) and Joining (J) segments.
This time we start by selecting the TRBV6-5 and TRBC2 segments which we have a more than 96% certainty of.
Then we press CTRL+select any joining segment. This will go through all of the joining segments and then compare the resulting protein with the query sequence entered in the top text box on the Comparisons tab. It will keep the best matching one.
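The "try every joining segment and keep the best one" idea can be sketched in Python as follows. Please treat this as a simplified stand-in and not as what the browser really does: instead of assembling the full V-D-J-C protein for every candidate, it slides each candidate J sequence along a fragment of the query and counts matching positions. The TRBJ2-7 entry uses the amino acids that appear in this post; the two decoys are made-up sequences, not real gene segments.

```python
def best_ungapped_match(query, candidate):
    """Slide candidate along query; return (matches, start) of the best window."""
    best = (0, -1)
    for start in range(len(query) - len(candidate) + 1):
        window = query[start:start + len(candidate)]
        matches = sum(a == b for a, b in zip(window, candidate))
        if matches > best[0]:
            best = (matches, start)
    return best


def pick_best_segment(query, candidates):
    """Score every candidate and return the best name plus all scores."""
    scored = {name: best_ungapped_match(query, seq)
              for name, seq in candidates.items()}
    # Compare by fraction of matching positions so lengths do not bias the result.
    best_name = max(scored, key=lambda name: scored[name][0] / len(candidates[name]))
    return best_name, scored


# Fragment of the Beta chain query (chain E) around the V-D-J-C junctions.
QUERY_FRAGMENT = "SVYFCASRPGLAGGRPEQYFGPGTRLTVTEDLKNVFPPEVAVFEP"

CANDIDATE_J = {
    "TRBJ2-7": "SYEQYFGPGTRLTVT",     # amino acids as shown in this post
    "decoy_1": "NTEAFFGQGTRLTVV",     # MADE-UP decoy, for illustration only
    "decoy_2": "SGNTIYFGEGSWLTVV",    # MADE-UP decoy, for illustration only
}

best_name, scores = pick_best_segment(QUERY_FRAGMENT, CANDIDATE_J)
print(best_name, scores[best_name])
```

Scoring by the fraction of matching positions is just one simple choice; a real tool would more likely score a proper alignment of the whole assembled protein.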
After doing this we find that the diversity and joining segments are likely:
TRBD1
and
TRBJ2-7
The following protein sequence is assembled from
TRBV6-5 => 1-114 (Bases=344/3 remaining bases=2)
TRBD1 => 115-118 (Bases=12/3 remaining bases=0)
TRBJ2-7 => 119-134 (Bases=47/3 remaining bases=2)
TRBC2 => 135-314 (Bases=539/3 remaining bases=2)
1         11        21        31        41        51        61        71
MSIGLLCCAALSLLWAGPVNAGVTQTPKFQVLKTGQSMTLQCAQDMNHEYMSWYRQDPGMGLRLIHYSVGAGITDQGEVP
81        91        101       111       121       131       141       151
NGYNVSRSTTEDFPLRLLSAAPSQTSVYFCASSYSGQGASYEQYFGPGTRLTVTEDLKNVFPPKVAVFEPSEAEISHTQK
161       171       181       191       201       211       221       231
ATLVCLATGFYPDHVELSWWVNGKEVHSGVSTDPQPLKEQPALNDSRYCLSSRLRVSATFWQNPRNHFRCQVQFYGLSEN
241       251       261       271       281       291       301       311
DEWTQDRAKPVTQIVSAEAWGRADCGFTSESYQQGVLSATILYEILLGKATLYAVLVSALVLMAMVKRKDSRG
If you look at the remaining bases, you can see that bases left over from the previous segment still contribute to the first codon of the next segment. This is because segments do not always contain a multiple of 3 bases, so they do not always end on a full codon.
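The amino acid ranges listed above follow directly from the base counts. The sketch below is my own reconstruction of that arithmetic: it assumes a codon is attributed to the segment in which its third base falls, which happens to reproduce the ranges in the listing. The function name codon_ranges is hypothetical.

```python
# Base counts per segment, copied from the listing above.
SEGMENT_BASES = [("TRBV6-5", 344), ("TRBD1", 12), ("TRBJ2-7", 47), ("TRBC2", 539)]


def codon_ranges(segments):
    """Return (name, first codon, last codon, own remainder, carried-over bases)."""
    rows = []
    total_bases = 0
    completed = 0  # number of codons completed so far
    for name, bases in segments:
        total_bases += bases
        now_complete = total_bases // 3
        rows.append((name, completed + 1, now_complete,
                     bases % 3, total_bases % 3))
        completed = now_complete
    return rows


rows = codon_ranges(SEGMENT_BASES)
for name, first, last, own_rest, carried in rows:
    print(f"{name:8s} => {first}-{last} "
          f"(bases mod 3 = {own_rest}, carried into next segment = {carried})")
```

The last column shows how many bases each segment hands over to the next one: 2 after TRBV6-5, 2 after TRBD1, 1 after TRBJ2-7 and 0 at the end, since the four segments add up to 942 bases, which is exactly 314 codons.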
When we jump to the TRBJ2-7 gene segment we can see the Joining gene segment:
The browser will now show the amino acids that are encoded in the normal reading frame:
We want to see how this segment produces: ASYEQYFGPGTRLTVT
This will then show where the protein sequence matches on the genome in one of the 6 reading frames and we see that there is a protein which starts 2 bases earlier:
We now use the feature that will look for proteins coded on the genome:
This will generate the protein that is formed by this reading frame:
>chr7:142797454-142797501 (Bases=48, Codons=16)
TGCTCCTACGAGCAGTACTTCGGGCCGGGCACCAGGCTCACGGTCACA
>Protein of chr7:142797454-142797501 (L=16)
CSYEQYFGPGTRLTVT
When we then compare it with the protein sequence we are looking for:
ASYEQYFGPGTRLTVT
CSYEQYFGPGTRLTVT
(The first letter mismatches because the bases at the join between D and J do not add up to a multiple of 3, so they do not make up a full codon of 3 bases.)
When we follow the same procedure by going to gene segment TRBD1:
When we again put SGQGA in the search box:
>chr7:142786211-142786225 (Bases=15, Codons=5)
TGGGGACAGGGGGCC
>Protein of chr7:142786211-142786225 (L=5)
WGQGA
SGQGA
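Both translations above can be double-checked outside the browser. The following Python sketch translates the two DNA stretches quoted in this post with the standard genetic code and reports where they differ from the amino acids in the assembled protein. The helper names are my own.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the usual TCAG-ordered table.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(dna):
    """Translate DNA codon by codon; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def differing_positions(a, b):
    """Return the 1-based positions at which two sequences differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


# DNA stretches and expected amino acids, copied from this post.
CHECKS = [
    ("TRBJ2-7", "TGCTCCTACGAGCAGTACTTCGGGCCGGGCACCAGGCTCACGGTCACA",
     "ASYEQYFGPGTRLTVT"),
    ("TRBD1", "TGGGGACAGGGGGCC", "SGQGA"),
]

results = {}
for name, dna, expected in CHECKS:
    protein = translate(dna)
    results[name] = (protein, differing_positions(protein, expected))
    print(name, protein, "differs from", expected, "at", results[name][1])
```

In both cases only the first amino acid differs, which fits the explanation that the codon at each join is shared with the neighbouring segment.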
TRBV6-5 => 1-114
TRBD1 => 115-118
TRBJ2-7 => 119-134
TRBC2 => 135-314
If you want to learn more about activation of Cytotoxic (CD8 positive or Killer)-T Cells via Dendritic cells, please go and read the following excellent article on the topic:
Activation of CD8 T Lymphocytes during Viral Infections