Thursday, August 27, 2015

Hemicentin 1 (HMCN1)

The human hemicentin 1 protein is encoded by the HMCN1 gene on chromosome 1 at position 
chr1:185703683-186160085 (the full gene position given in the FASTA header below). The gene consists of 107 exons on the + strand, which are spliced together to code for a protein of length 5636 amino acids. 
Mutations in this gene may be associated with age-related macular degeneration.


I have arranged the amino acids in rows of 93, which gives the following ordered pattern of amino acids....
Notice how the exon splice junctions (marked with red blocks) line up nicely. There are also clearly visible bands of hydrophobic (pink) and polar uncharged (blue) amino acids. The negatively charged (green) and positively charged (cyan) amino acids are scattered at the edges of the hydrophobic regions.
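
For readers who want to reproduce this kind of layout, here is a minimal sketch in Python of the two steps described above: wrapping the sequence into fixed-width rows and replacing each residue with its class. The function names (`classify`, `wrap`, `class_grid`) and the one-character class codes are made up for illustration. The exact membership of each class is an assumption, since the grouping used for the image is not spelled out here and textbooks differ on residues such as G, P, C, H, W and Y.

```python
# Residue classes named in the post. The exact membership of each class is an
# assumption: textbooks differ on where G, P, C, H, W and Y belong.
CLASSES = {
    "H": "AVLIMFWGP",   # hydrophobic
    "P": "STNQCY",      # polar uncharged
    "-": "DE",          # negatively charged
    "+": "KRH",         # positively charged
}
CODE_FOR = {aa: code for code, members in CLASSES.items() for aa in members}


def classify(residue):
    """Return the one-character class code for a residue, '?' if unknown."""
    return CODE_FOR.get(residue.upper(), "?")


def wrap(sequence, width=93):
    """Split a sequence into consecutive rows of `width` residues."""
    return [sequence[i:i + width] for i in range(0, len(sequence), width)]


def class_grid(sequence, width=93):
    """Wrap the sequence, then replace every residue by its class code."""
    return ["".join(classify(aa) for aa in row) for row in wrap(sequence, width)]


if __name__ == "__main__":
    # First 40 residues of the FASTA record, wrapped at 10 to keep the demo short.
    demo = "MISWEVVHTVFLFALLYSSLAQDASPQSEIRAEEIPEGAS"
    for row in class_grid(demo, width=10):
        print(row)
```

Printing the rows one under another makes any vertical banding visible even in plain text. Trying different values of `width` is a quick way to see which row length makes the repeats line up.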


Here is the same protein using a heat map.



When one looks at the amino acid FASTA letters, the same pattern is observable.


For those interested, here is the FASTA sequence:

>HMCN1 ; Homo sapiens hemicentin 1 (HMCN1), mRNA. ; Protein length = 5636 (uc001grq.1 Full gene position = chr1:185703683-186160085)
MISWEVVHTVFLFALLYSSLAQDASPQSEIRAEEIPEGASTLAFVFDVTGSMYDDLVQVIEGASKILETSLKRPKRPLFNFALVPFHDPEIGPV
TITTDPKKFQYELRELYVQGGGDCPEMSIGAIKIALEISLPGSFIYVFTDARSKDYRLTHEVLQLIQQKQSQVVFVLTGDCDDRTHIGYKVYEE
IASTSSGQVFHLDKKQVNEVLKWVEEAVQASKVHLLSTDHLEQAVNTWRIPFDPSLKEVTVSLSGPSPMIEIRNPLGKLIKKGFGLHELLNIHN
SAKVVNVKEPEAGMWTVKTSSSGRHSVRITGLSTIDFRAGFSRKPTLDFKKTVSRPVQGIPTYVLLNTSGISTPARIDLLELLSISGSSLKTIP
VKYYPHRKPYGIWNISDFVPPNEAFFLKVTGYDKDDYLFQRVSSVSFSSIVPDAPKVTMPEKTPGYYLQPGQIPCSVDSLLPFTLSFVRNGVTL
GVDQYLKESASVNLDIAKVTLSDEGFYECIAVSSAGTGRAQTFFDVSEPPPVIQVPNNVTVTPGERAVLTCLIISAVDYNLTWQRNDRDVRLAE
PARIRTLANLSLELKSVKFNDAGEYHCMVSSEGGSSAASVFLTVQEPPKVTVMPKNQSFTGGSEVSIMCSATGYPKPKIAWTVNDMFIVGSHRY
RMTSDGTLFIKNAAPKDAGIYGCLASNSAGTDKQNSTLRYIEAPKLMVVQSELLVALGDITVMECKTSGIPPPQVKWFKGDLELRPSTFLIIDP
LLGLLKIQETQDLDAGDYTCVAINEAGRATGKITLDVGSPPVFIQEPADVSMEIGSNVTLPCYVQGYPEPTIKWRRLDNMPIFSRPFSVSSISQ
LRTGALFILNLWASDKGTYICEAENQFGKIQSETTVTVTGLVAPLIGISPSVANVIEGQQLTLPCTLLAGNPIPERRWIKNSAMLLQNPYITVR
SDGSLHIERVQLQDGGEYTCVASNVAGTNNKTTSVVVHVLPTIQHGQQILSTIEGIPVTLPCKASGNPKPSVIWSKKGELISTSSAKFSAGADG
SLYVVSPGGEESGEYVCTATNTAGYAKRKVQLTVYVRPRVFGDQRGLSQDKPVEISVLAGEEVTLPCEVKSLPPPIITWAKETQLISPFSPRHT
FLPSGSMKITETRTSDSGMYLCVATNIAGNVTQAVKLNVHVPPKIQRGPKHLKVQVGQRVDIPCNAQGTPLPVITWSKGGSTMLVDGEHHVSNP
DGTLSIDQATPSDAGIYTCVATNIAGTDETEITLHVQEPPTVEDLEPPYNTTFQERVANQRIEFPCPAKGTPKPTIKWLHNGRELTGREPGISI
LEDGTLLVIASVTPYDNGEYICVAVNEAGTTERKYNLKVHVPPVIKDKEQVTNVSVLLNQLTNLFCEVEGTPSPIIMWYKDNVQVTESSTIQTV
NNGKILKLFRATPEDAGRYSCKAINIAGTSQKYFNIDVLVPPTIIGTNFPNEVSVVLNRDVALECQVKGTPFPDIHWFKDGKPLFLGDPNVELL
DRGQVLHLKNARRNDKGRYQCTVSNAAGKQAKDIKLTIYIPPSIKGGNVTTDISVLINSLIKLECETRGLPMPAITWYKDGQPIMSSSQALYID
KGQYLHIPRAQVSDSATYTCHVANVAGTAEKSFHVDVYVPPMIEGNLATPLNKQVVIAHSLTLECKAAGNPSPILTWLKDGVPVKANDNIRIEA
GGKKLEIMSAQEIDRGQYICVATSVAGEKEIKYEVDVLVPPAIEGGDETSYFIVMVNNLLELDCHVTGSPPPTIMWLKDGQLIDERDGFKILLN
GRKLVIAQAQVSNTGLYRCMAANTAGDHKKEFEVTVHVPPTIKSSGLSERVVVKYKPVALQCIANGIPNPSITWLKDDQPVNTAQGNLKIQSSG
RVLQIAKTLLEDAGRYTCVATNAAGETQQHIQLHVHEPPSLEDAGKMLNETVLVSNPVQLECKAAGNPVPVITWYKDNRLLSGSTSMTFLNRGQ
IIDIESAQISDAGIYKCVAINSAGATELFYSLQVHVAPSISGSNNMVAVVVNNPVRLECEARGIPAPSLTWLKDGSPVSSFSNGLQVLSGGRIL
ALTSAQISDTGRYTCVAVNAAGEKQRDIDLRVYVPPNIMGEEQNVSVLISQAVELLCQSDAIPPPTLTWLKDGHPLLKKPGLSISENRSVLKIE
DAQVQDTGRYTCEATNVAGKTEKNYNVNIWVPPNIGGSDELTQLTVIEGNLISLLCESSGIPPPNLIWKKKGSPVLTDSMGRVRILSGGRQLQI
SIAEKSDAALYSCVASNVAGTAKKEYNLQVYIRPTITNSGSHPTEIIVTRGKSISLECEVQGIPPPTVTWMKDGHPLIKAKGVEILDEGHILQL
KNIHVSDTGRYVCVAVNVAGMTDKKYDLSVHAPPSIIGNHRSPENISVVEKNSVSLTCEASGIPLPSITWFKDGWPVSLSNSVRILSGGRMLRL
MQTTMEDAGQYTCVVRNAAGEERKIFGLSVLVPPHIVGENTLEDVKVKEKQSVTLTCEVTGNPVPEITWHKDGQPLQEDEAHHIISGGRFLQIT
NVQVPHTGRYTCLASSPAGHKSRSFSLNVFVSPTIAGVGSDGNPEDVTVILNSPTSLVCEAYSYPPATITWFKDGTPLESNRNIRILPGGRTLQ
ILNAQEDNAGRYSCVATNEAGEMIKHYEVKVYIPPIINKGDLWGPGLSPKEVKIKVNNTLTLECEAYAIPSASLSWYKDGQPLKSDDHVNIAAN
GHTLQIKEAQISDTGRYTCVASNIAGEDELDFDVNIQVPPSFQKLWEIGNMLDTGRNGEAKDVIINNPISLYCETNAAPPPTLTWYKDGHPLTS
SDKVLILPGGRVLQIPRAKVEDAGRYTCVAVNEAGEDSLQYDVRVLVPPIIKGANSDLPEEVTVLVNKSALIECLSSGSPAPRNSWQKDGQPLL
EDDHHKFLSNGRILQILNTQITDIGRYVCVAENTAGSAKKYFNLNVHVPPSVIGPKSENLTVVVNNFISLTCEVSGFPPPDLSWLKNEQPIKLN
TNTLIVPGGRTLQIIRAKVSDGGEYTCIAINQAGESKKKFSLTVYVPPSIKDHDSESLSVVNVREGTSVSLECESNAVPPPVITWYKNGRMITE
STHVEILADGQMLHIKKAEVSDTGQYVCRAINVAGRDDKNFHLNVYVPPSIEGPEREVIVETISNPVTLTCDATGIPPPTIAWLKNHKRIENSD
SLEVRILSGGSKLQIARSQHSDSGNYTCIASNMEGKAQKYYFLSIQVPPSVAGAEIPSDVSVLLGENVELVCNANGIPTPLIQWLKDGKPIASG
ETERIRVSANGSTLNIYGALTSDTGKYTCVATNPAGEEDRIFNLNVYVTPTIRGNKDEAEKLMTLVDTSINIECRATGTPPPQINWLKNGLPLP
LSSHIRLLAAGQVIRIVRAQVSDVAVYTCVASNRAGVDNKHYNLQVFAPPNMDNSMGTEEITVLKGSSTSMACITDGTPAPSMAWLRDGQPLGL
DAHLTVSTHGMVLQLLKAETEDSGKYTCIASNEAGEVSKHFILKVLEPPHINGSEEHEEISVIVNNPLELTCIASGIPAPKMTWMKDGRPLPQT
DQVQTLGGGEVLRISTAQVEDTGRYTCLASSPAGDDDKEYLVRVHVPPNIAGTDEPRDITVLRNRQVTLECKSDAVPPPVITWLRNGERLQATP
RVRILSGGRYLQINNADLGDTANYTCVASNIAGKTTREFILTVNVPPNIKGGPQSLVILLNKSTVLECIAEGVPTPRITWRKDGAVLAGNHARY
SILENGFLHIQSAHVTDTGRYLCMATNAAGTDRRRIDLQVHVPPSIAPGPTNMTVIVNVQTTLACEATGIPKPSINWRKNGHLLNVDQNQNSYR
LLSSGSLVIISPSVDDTATYECTVTNGAGDDKRTVDLTVQVPPSIADEPTDFLVTKHAPAVITCTASGVPFPSIHWTKNGIRLLPRGDGYRILS
SGAIEILATQLNHAGRYTCVARNAAGSAHRHVTLHVHEPPVIQPQPSELHVILNNPILLPCEATGTPSPFITWQKEGINVNTSGRNHAVLPSGG
LQISRAVREDAGTYMCVAQNPAGTALGKIKLNVQVPPVISPHLKEYVIAVDKPITLSCEADGLPPPDITWHKDGRAIVESIRQRVLSSGSLQIA
FVQPGDAGHYTCMAANVAGSSSTSTKLTVHVPPRIRSTEGHYTVNENSQAILPCVADGIPTPAINWKKDNVLLANLLGKYTAEPYGELILENVV
LEDSGFYTCVANNAAGEDTHTVSLTVHVLPTFTELPGDVSLNKGEQLRLSCKATGIPLPKLTWTFNNNIIPAHFDSVNGHSELVIERVSKEDSG
TYVCTAENSVGFVKAIGFVYVKEPPVFKGDYPSNWIEPLGGNAILNCEVKGDPTPTIQWNRKGVDIEISHRIRQLGNGSLAIYGTVNEDAGDYT
CVATNEAGVVERSMSLTLQSPPIITLEPVETVINAGGKIILNCQATGEPQPTITWSRQGHSISWDDRVNVLSNNSLYIADAQKEDTSEFECVAR
NLMGSVLVRVPVIVQVHGGFSQWSAWRACSVTCGKGIQKRSRLCNQPLPANGGKPCQGSDLEMRNCQNKPCPVDGSWSEWSLWEECTRSCGRGN
QTRTRTCNNPSVQHGGRPCEGNAVEIIMCNIRPCPVHGAWSAWQPWGTCSESCGKGTQTRARLCNNPPPAFGGSYCDGAETQMQVCNERNCPIH
GKWATWASWSACSVSCGGGARQRTRGCSDPVPQYGGRKCEGSDVQSDFCNSDPCPTHGNWSPWSGWGTCSRTCNGGQMRRYRTCDNPPPSNGGR
ACGGPDSQIQRCNTDMCPVDGSWGSWHSWSQCSASCGGGEKTRKRLCDHPVPVKGGRPCPGDTTQVTRCNVQACPGGPQRARGSVIGNINDVEF
GIAFLNATITDSPNSDTRIIRAKITNVPRSLGSAMRKIVSILNPIYWTTAKEIGEAVNGFTLTNAVFKRETQVEFATGEILQMSHIARGLDSDG
SLLLDIVVSGYVLQLQSPAEVTVKDYTEDYIQTGPGQLYAYSTRLFTIDGISIPYTWNHTVFYDQAQGRMPFLVETLHASSVESDYNQIEETLG
FKIHASISKGDRSNQCPSGFTLDSVGPFCADEDECAAGNPCSHSCHNAMGTYYCSCPKGLTIAADGRTCQDIDECALGRHTCHAGQDCDNTIGS
YRCVVRCGSGFRRTSDGLSCQDINECQESSPCHQRCFNAIGSFHCGCEPGYQLKGRKCMDVNECRQNVCRPDQHCKNTRGGYKCIDLCPNGMTK
AENGTCIDIDECKDGTHQCRYNQICENTRGSYRCVCPRGYRSQGVGRPCMDINECEQVPKPCAHQCSNTPGSFKCICPPGQHLLGDGKSCAGLE
RLPNYGTQYSSYNLARFSPVRNNYQPQQHYRQYSHLYSSYSEYRNSRTSLSRTRRTIRKTCPEGSEASHDTCVDIDECENTDACQHECKNTFGS
YQCICPPGYQLTHNGKTCQDIDECLEQNVHCGPNRMCFNMRGSYQCIDTPCPPNYQRDPVSGFCLKNCPPNDLECALSPYALEYKLVSLPFGIA
TNQDLIRLVAYTQDGVMHPRTTFLMVDEEQTVPFALRDENLKGVVYTTRPLREAETYRMRVRASSYSANGTIEYQTTFIVYIAVSAYPY*
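
To work with the record above programmatically, a small parser is enough. The following is only a sketch: `read_fasta_record` is a hypothetical helper name, it handles a single record, and it assumes the trailing `*` is a stop marker that should be dropped.

```python
def read_fasta_record(text):
    """Parse a single-record FASTA string into (header, sequence).

    The header is the first line without its leading '>'. Sequence lines are
    joined, spaces are dropped, and a trailing '*' (stop marker) is removed.
    """
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    header = lines[0][1:].strip()
    sequence = "".join(lines[1:]).replace(" ", "")
    return header, sequence.rstrip("*")
```

Calling `len()` on the returned sequence is a quick sanity check against the protein length stated in the header, keeping in mind that the stop marker has been removed.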
