CELL2007 Molecular Exploration Project
Group 7
Key:
Csemi – TESK1 from Cynoglossus semilaevis
Dreri – PREDICTED TESK1 from Danio rerio
TESK2 – TESK2 from Homo sapiens
Ggall – PREDICTED TESK1 from Gallus gallus
Cferu – TESK1 from Camelus ferus
Clupf – PREDICTED TESK1 from Canis lupus familiaris
Btour – TESK1 from Bos taurus
Mmula – TESK1 from Macaca mulatta
Hsap1 – TESK1 from Homo sapiens
Ptrog – TESK1 from Pan troglodytes
Mmusc – TESK1 from Mus musculus
Rnorv – TESK1 from Rattus norvegicus
Cgris – TESK1 from Cricetulus griseus
Of these homologous sequences, Csemi, Cferu, Btour, Mmula, Mmusc, and Cgris were found through a BLAST search using Homo sapiens TESK1 as the query. Dreri, Ggall, Clupf, Ptrog, and Rnorv were instead found through HomoloGene on the NCBI website. Mmula and Mmusc appear in HomoloGene as well as in the BLAST results.
MSA
Figure 4: Annotated MSA of selected FASTA sequences. Click on figure for full document. The conserved domains were identified with a batch domain analysis in ScanProsite.
Figure 5: Image of the ScanProsite domain analysis for TESK1. Click on link for full domain analysis.
[5] Sigrist CJA, de Castro E, Cerutti L, Cuche BA, Hulo N, Bridge A, Bougueleret L, Xenarios I. New and continuing developments at PROSITE. Nucleic Acids Res. 2012; doi: 10.1093/nar/gks1067. PubMed: 23161676
The MSA was run on FASTA sequences for TESK from a range of species using Clustal Omega [4]. BoxShade was then used to highlight the residues conserved between these sequences.
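To illustrate the kind of column-by-column comparison that underlies conservation shading, the following is a minimal sketch in Python. It is not BoxShade's actual algorithm: the function name `column_conservation`, the 0.5 threshold, the mask symbols, and the toy alignment are all assumptions made for illustration, and the input is assumed to be a list of already-aligned sequences of equal length.

```python
from collections import Counter


def column_conservation(aligned, threshold=0.5):
    """Return a mask string with one symbol per alignment column.

    '*' : the same residue occurs in every sequence
    '+' : the most common residue occurs in at least `threshold` of sequences
    ' ' : the column is not conserved (or contains only gaps)

    Hypothetical helper for illustration; the threshold is an assumed value.
    """
    if not aligned:
        raise ValueError("no sequences supplied")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")

    mask = []
    for column in zip(*aligned):
        # Gaps are ignored when picking the most common residue,
        # but still count towards the total number of sequences.
        residues = [r for r in column if r != "-"]
        if not residues:
            mask.append(" ")
            continue
        _, count = Counter(residues).most_common(1)[0]
        fraction = count / len(column)
        if fraction == 1.0:
            mask.append("*")
        elif fraction >= threshold:
            mask.append("+")
        else:
            mask.append(" ")
    return "".join(mask)


# Toy alignment (made up, not taken from the TESK1 data).
toy = ["MKV-LG", "MKVALG", "MRV-LA"]
print(column_conservation(toy))
```

Counting gaps in the denominator means a residue present in only a few sequences is not reported as conserved, which is one reasonable choice among several.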
To determine what these conserved regions corresponded to, a batch domain analysis was performed using ScanProsite (Figure 5).
From Figure 4 it can be deduced that although a significant proportion of the whole TESK1 sequence is conserved, only the kinase domain can be identified through sequence analysis. Within the conserved kinase domain of each homologous protein there is an ATP binding site and an active site. The consensus sequences for these features have been highlighted on the figure.
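Signature features such as an ATP binding site are described in PROSITE as patterns, where elements are separated by `-`, `x` means any residue, `[...]` lists allowed residues, `{...}` lists disallowed residues, and `(n)` or `(n,m)` gives a repeat count. As a rough sketch of how such a pattern can be matched against a sequence, the Python below translates this syntax into a regular expression. The function name `prosite_to_regex`, the example pattern, and the example sequence are hypothetical; the pattern is not an actual PROSITE entry, and this is not how ScanProsite itself is implemented.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression.

    Handles x, [..], {..}, (n), (n,m) and the '<' / '>' terminal anchors.
    Simplified sketch: anchors inside square brackets are not supported.
    """
    regex = []
    for element in pattern.rstrip(".").split("-"):
        prefix = "^" if element.startswith("<") else ""
        suffix = "$" if element.endswith(">") else ""
        element = element.lstrip("<").rstrip(">")

        # Split the element into its core and an optional repeat count.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError("empty pattern element")
        core, low, high = match.groups()

        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"    # any residue except these

        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        regex.append(prefix + core + repeat + suffix)
    return "".join(regex)


# Illustrative pattern and sequence (both made up for this example).
toy_pattern = "[LIV]-G-{P}-G-x(2,4)-K"
toy_sequence = "MSLGAGTFVKQ"

hit = re.search(prosite_to_regex(toy_pattern), toy_sequence)
if hit:
    # Report 1-based residue positions, as is usual for protein sequences.
    print(hit.start() + 1, hit.end(), hit.group())
```

Running the same regular expression over every sequence in the alignment would show whether a given motif falls at equivalent positions in each homologue.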
No domain was successfully predicted for the part of the protein beyond approximately residue 320. This is consistent with the structural predictions and the other domain analyses. Despite the limitations of these analyses, a further domain analysis was run on the TESK1 FASTA sequence using Robetta. This indicated that the region beyond residue 320 could be a glycoprotein (http://robetta.bakerlab.org/results.jsp?id=54271).
However, the confidence of this prediction was low, and it would need to be confirmed experimentally.

