WERAM Information


Tag Content
WERAM ID WERAM-Hos-0018
Ensembl Protein ID ENSP00000347325.3
Uniprot Accession Q8NEZ4; KMT2C_HUMAN; Q8NC02; Q8NDF6; Q9H9P4; Q9NR13; Q9P222; Q9UDR7
Genbank Protein ID NP_733751.2
Protein Name Histone-lysine N-methyltransferase 2C
Genbank Nucleotide ID NM_170606.2
Gene Name KMT2C;HALR;MLL3
Ensembl Information
Ensembl Gene ID      Ensembl Transcript ID   Ensembl Protein ID
ENSG00000055609.17   ENST00000355193.6       ENSP00000347325.3
ENSG00000055609.17   ENST00000262189.10      ENSP00000262189.6
ENSG00000055609.17   ENST00000360104.7       ENSP00000353218.3
ENSG00000055609.17   ENST00000418673.1       ENSP00000403483.1
ENSG00000055609.17   ENST00000424877.5       ENSP00000410411.1
ENSG00000055609.17   ENST00000558084.5       ENSP00000453752.1
Details
Type   Family   Domain   Substrates   AA   References (PMIDs)
HMT    SET1     SET      H3K4         K    25537518; 20951770; 26886794
Status Reviewed
Classification
Type        Family   E-value    Score   Start   End
HMT         SET1     2.00e-44   152.8   4772    4887
Me_Reader   PHD      6.70e-25   89.3    284     1137
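The Start and End columns appear to be 1-based, inclusive residue positions in the protein sequence: the SET1 alignment in the Domain Profile begins at residue 4772 and ends at 4887. Under that assumption, a domain can be cut out of the sequence as sketched below. The function name `slice_domain` and the `first_position` parameter are hypothetical helpers for illustration, and the fragment used here is only the last 171 residues of the record (positions 4741 to 4911), not the full protein.

```python
def slice_domain(fragment, start, end, first_position=1):
    """Return residues start..end (1-based, inclusive) from a sequence fragment.

    first_position is the protein coordinate of fragment[0]; it defaults to 1
    for a full-length sequence.
    """
    lo = start - first_position          # convert to a 0-based index
    hi = end - first_position + 1        # inclusive end -> exclusive slice end
    if lo < 0 or hi > len(fragment):
        raise ValueError("requested range lies outside the fragment")
    return fragment[lo:hi]


# C-terminal tail of ENSP00000347325.3 as listed in this record (4741-4911).
tail = (
    "TVTGELNAPY SKQFVHSKSS QYRKMKTEWK SNVYLARSRI QGLGLYAARD IEKHTMVIEY"
    "IGTIIRNEVA NRKEKLYESQ NRGVYMFRMD NDHVIDATLT GGPARYINHS CAPNCVAEVV"
    "TFERGHKIII SSSRRIQKGE ELCYDYKFDF EDDQHKIPCH CGAVNCRKWM N"
).replace(" ", "")

# SET1 hit from the Classification table: Start 4772, End 4887.
set_domain = slice_domain(tail, 4772, 4887, first_position=4741)
print(len(set_domain), set_domain[:10], set_domain[-7:])
```

The same call with `first_position=1` would apply to the complete 4911-residue sequence.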
Organism Homo sapiens
NCBI Taxa ID 9606
Functional Description
Histone methyltransferase that methylates 'Lys-4' of histone H3. H3 'Lys-4' methylation represents a specific tag for epigenetic transcriptional activation. Central component of the MLL2/3 complex, a coactivator complex of nuclear receptors involved in transcriptional coactivation. KMT2C/MLL3 may be a catalytic subunit of this complex. May be involved in leukemogenesis and developmental disorders.
Domain Profile
  HMT SET1

           SET1.txt    2 elevakskikglglvakkeiekeelviEYvGevirsevadkrekeyekkeigvylfrldedaevvvdatkkgniarfinhscepNceakv 91  
++++a+s+i+glgl+a+++iek+++viEY+G++ir+eva+++ek ye++++gvy+fr+d+d +v+dat +g+ ar+inhsc+pNc+a+v
ENSP00000347325.3 4772 NVYLARSRIQGLGLYAARDIEKHTMVIEYIGTIIRNEVANRKEKLYESQNRGVYMFRMDND--HVIDATLTGGPARYINHSCAPNCVAEV 4859
7999*********************************************************..*************************** PP
SET1.txt 92 vavdgekkiviyakraIekgeeltydYk 119
v++++ +ki+i ++r+I+kgeel+ydYk
ENSP00000347325.3 4860 VTFERGHKIIISSSRRIQKGEELCYDYK 4887
***************************7 PP

  Me_Reader PHD

            PHD.txt   3 iClvCgkddegekemvqCde.CddwfHlkCvklp 35 
C +C++ ++ + +C+e C + +H+ C +
ENSP00000347325.3 284 RCAFCKHLGAT---IKCCEEkCTQMYHYPCAAGA 314
6****433333...6688889*********8655 PP
PHD.txt 3 iClvCgkddegekemvqCdeCddwfHlkCvklplsslpegkswyCpsCk 51
C vC+ +++ + C +C + +H+ C++++ ++l+ w Cp+Ck
ENSP00000347325.3 343 NCAVCDSPGDL-LDQFFCTTCGQHYHGMCLDIAVTPLKRA-GWQCPECK 389
5****544444.45999****************8888865.7******7 PP
PHD.txt 3 iClvCgkddegekemvqCdeCddwfHlkCvklplsslpegkswyCpsCk 51
+C+ C++++e++k m+ Cd+Cd+ +H+ C+++ ++s+p + w C++C+
ENSP00000347325.3 390 VCQNCKQSGEDSK-MLVCDTCDKGYHTFCLQPVMKSVPTN-GWKCKNCR 436
8****88888876.************************77.7******8 PP
PHD.txt 2 tiClvCgkddegeke..mvqCdeCddwfHlkCvklplsslpeg..kswyCpsCk 51
++C++Cgk+++ e + m+ C+ C++w+Hl+C k++ ++l + ++++C Ck
ENSP00000347325.3 465 NLCPFCGKCYHPELQkdMLHCNMCKRWVHLECDKPTDHELDTQlkEEYICMYCK 518
68999998888865566*******************77777666668******8 PP
PHD.txt 2 tiClvCgkddegeke.mvqCdeCddwfHlkCvklplsslpegkswyCpsCk 51
++C+vCg ++g++ ++ C++C + +H +Cv+++ +++ +k w+C +C+
ENSP00000347325.3 958 DMCVVCGSFGQGAEGrLLACSQCGQCYHPYCVSIKITKVVLSKGWRCLECT 1008
68****7544443334*******************999996658******7 PP
PHD.txt 2 tiClvCgkddegekemvqCdeCddwfHlkCvklplsslpegkswyCpsCk 51
t+C +Cgk+ + + ++ Cd Cd +H++C+++pl+++p+g w C+ C+
ENSP00000347325.3 1008 TVCEACGKATDPGR-LLLCDDCDISYHTYCLDPPLQTVPKG-GWKCKWCV 1055
68999976666655.9*************************.9**99996 PP
PHD.txt 3 iClvCgkddegekemvqCdeCddwfHlkCvklplsslpeg.ks..wyCpsCk 51
+C+vC +++ +e+ ++qC +Cd+w+H+ C +l+ ++ e+ ++ + C C+
ENSP00000347325.3 1086 SCPVCYRNYREEDLILQCRQCDRWMHAVCQNLNTEEEVENvADigFDCSMCR 1137
7****777777777*******************5444455422458999997 PP

Protein Sequence
MSSEEDKSVE QPQPPPPPPE EPGAPAPSPA AADKRPRGRP RKDGASPFQR ARKKPRSRGK 60
TAVEDEDSMD GLETTETETI VETEIKEQSA EEDAEAEVDN SKQLIPTLQR SVSEESANSL 120
VSVGVEAKIS EQLCAFCYCG EKSSLGQGDL KQFRITPGFI LPWRNQPSNK KDIDDNSNGT 180
YEKMQNSAPR KQRGQRKERS PQQNIVSCVS VSTQTASDDQ AGKLWDELSL VGLPDAIDIQ 240
ALFDSTGTCW AHHRCVEWSL GVCQMEEPLL VNVDKAVVSG STERCAFCKH LGATIKCCEE 300
KCTQMYHYPC AAGAGTFQDF SHIFLLCPEH IDQAPERSKE DANCAVCDSP GDLLDQFFCT 360
TCGQHYHGMC LDIAVTPLKR AGWQCPECKV CQNCKQSGED SKMLVCDTCD KGYHTFCLQP 420
VMKSVPTNGW KCKNCRICIE CGTRSSSQWH HNCLICDNCY QQQDNLCPFC GKCYHPELQK 480
DMLHCNMCKR WVHLECDKPT DHELDTQLKE EYICMYCKHL GAEMDRLQPG EEVEIAELTT 540
DYNNEMEVEG PEDQMVFSEQ AANKDVNGQE STPGIVPDAV QVHTEEQQKS HPSESLDTDS 600
LLIAVSSQHT VNTELEKQIS NEVDSEDLKM SSEVKHICGE DQIEDKMEVT ENIEVVTHQI 660
TVQQEQLQLL EEPETVVSRE ESRPPKLVME SVTLPLETLV SPHEESISLC PEEQLVIERL 720
QGEKEQKENS ELSTGLMDSE MTPTIEGCVK DVSYQGGKSI KLSSETESSF SSSADISKAD 780
VSSSPTPSSD LPSHDMLHNY PSALSSSAGN IMPTTYISVT PKIGMGKPAI TKRKFSPGRP 840
RSKQGAWSTH NTVSPPSWSP DISEGREIFK PRQLPGSAIW SIKVGRGSGF PGKRRPRGAG 900
LSGRGGRGRS KLKSGIGAVV LPGVSTADIS SNKDDEENSM HNTVVLFSSS DKFTLNQDMC 960
VVCGSFGQGA EGRLLACSQC GQCYHPYCVS IKITKVVLSK GWRCLECTVC EACGKATDPG 1020
RLLLCDDCDI SYHTYCLDPP LQTVPKGGWK CKWCVWCRHC GATSAGLRCE WQNNYTQCAP 1080
CASLSSCPVC YRNYREEDLI LQCRQCDRWM HAVCQNLNTE EEVENVADIG FDCSMCRPYM 1140
PASNVPSSDC CESSLVAQIV TKVKELDPPK TYTQDGVCLT ESGMTQLQSL TVTVPRRKRS 1200
KPKLKLKIIN QNSVAVLQTP PDIQSEHSRD GEMDDSREGE LMDCDGKSES SPEREAVDDE 1260
TKGVEGTDGV KKRKRKPYRP GIGGFMVRQR SRTGQGKTKR SVIRKDSSGS ISEQLPCRDD 1320
GWSEQLPDTL VDESVSVTES TEKIKKRYRK RKNKLEETFP AYLQEAFFGK DLLDTSRQSK 1380
ISLDNLSEDG AQLLYKTNMN TGFLDPSLDP LLSSSSAPTK SGTHGPADDP LADISEVLNT 1440
DDDILGIISD DLAKSVDHSD IGPVTDDPSS LPQPNVNQSS RPLSEEQLDG ILSPELDKMV 1500
TDGAILGKLY KIPELGGKDV EDLFTAVLSP ANTQPTPLPQ PPPPTQLLPI HNQDAFSRMP 1560
LMNGLIGSSP HLPHNSLPPG SGLGTFSAIA QSSYPDARDK NSAFNPMASD PNNSWTSSAP 1620
TVEGENDTMS NAQRSTLKWE KEEALGEMAT VAPVLYTNIN FPNLKEEFPD WTTRVKQIAK 1680
LWRKASSQER APYVQKARDN RAALRINKVQ MSNDSMKRQQ QQDSIDPSSR IDSELFKDPL 1740
KQRESEHEQE WKFRQQMRQK SKQQAKIEAT QKLEQVKNEQ QQQQQQQFGS QHLLVQSGSD 1800
TPSSGIQSPL TPQPGNGNMS PAQSFHKELF TKQPPSTPTS TSSDDVFVKP QAPPPPPAPS 1860
RIPIQDSLSQ AQTSQPPSPQ VFSPGSSNSR PPSPMDPYAK MVGTPRPPPV GHSFSRRNSA 1920
APVENCTPLS SVSRPLQMNE TTANRPSPVR DLCSSSTTNN DPYAKPPDTP RPVMTDQFPK 1980
SLGLSRSPVV SEQTAKGPIA AGTSDHFTKP SPRADVFQRQ RIPDSYARPL LTPAPLDSGP 2040
GPFKTPMQPP PSSQDPYGSV SQASRRLSVD PYERPALTPR PIDNFSHNQS NDPYSQPPLT 2100
PHPAVNESFA HPSRAFSQPG TISRPTSQDP YSQPPGTPRP VVDSYSQSSG TARSNTDPYS 2160
QPPGTPRPTT VDPYSQQPQT PRPSTQTDLF VTPVTNQRHS DPYAHPPGTP RPGISVPYSQ 2220
PPATPRPRIS EGFTRSSMTR PVLMPNQDPF LQAAQNRGPA LPGPLVRPPD TCSQTPRPPG 2280
PGLSDTFSRV SPSAARDPYD QSPMTPRSQS DSFGTSQTAH DVADQPRPGS EGSFCASSNS 2340
PMHSQGQQFS GVSQLPGPVP TSGVTDTQNT VNMAQADTEK LRQRQKLREI ILQQQQQKKI 2400
AGRQEKGSQD SPAVPHPGPL QHWQPENVNQ AFTRPPPPYP GNIRSPVAPP LGPRYAVFPK 2460
DQRGPYPPDV ASMGMRPHGF RFGFPGGSHG TMPSQERFLV PPQQIQGSGV SPQLRRSVSV 2520
DMPRPLNNSQ MNNPVGLPQH FSPQSLPVQQ HNILGQAYIE LRHRAPDGRQ RLPFSAPPGS 2580
VVEASSNLRH GNFIPRPDFP GPRHTDPMRR PPQGLPNQLP VHPDLEQVPP SQQEQGHSVH 2640
SSSMVMRTLN HPLGGEFSEA PLSTSVPSET TSDNLQITTQ PSDGLEEKLD SDDPSVKELD 2700
VKDLEGVEVK DLDDEDLENL NLDTEDGKVV ELDTLDNLET NDPNLDDLLR SGEFDIIAYT 2760
DPELDMGDKK SMFNEELDLP IDDKLDNQCV SVEPKKKEQE NKTLVLSDKH SPQKKSTVTN 2820
EVKTEVLSPN SKVESKCETE KNDENKDNVD TPCSQASAHS DLNDGEKTSL HPCDPDLFEK 2880
RTNRETAGPS ANVIQASTQL PAQDVINSCG ITGSTPVLSS LLANEKSDNS DIRPSGSPPP 2940
PTLPASPSNH VSSLPPFIAP PGRVLDNAMN SNVTVVSRVN HVFSQGVQVN PGLIPGQSTV 3000
NHSLGTGKPA TQTGPQTSQS GTSSMSGPQQ LMIPQTLAQQ NRERPLLLEE QPLLLQDLLD 3060
QERQEQQQQR QMQAMIRQRS EPFFPNIDFD AITDPIMKAK MVALKGINKV MAQNNLGMPP 3120
MVMSRFPFMG QVVTGTQNSE GQNLGPQAIP QDGSITHQIS RPNPPNFGPG FVNDSQRKQY 3180
EEWLQETQQL LQMQQKYLEE QIGAHRKSKK ALSAKQRTAK KAGREFPEED AEQLKHVTEQ 3240
QSMVQKQLEQ IRKQQKEHAE LIEDYRIKQQ QQCAMAPPTM MPSVQPQPPL IPGATPPTMS 3300
QPTFPMVPQQ LQHQQHTTVI SGHTSPVRMP SLPGWQPNSA PAHLPLNPPR IQPPIAQLPI 3360
KTCTPAPGTV SNANPQSGPP PRVEFDDNNP FSESFQERER KERLREQQER QRIQLMQEVD 3420
RQRALQQRME MEQHGMVGSE ISSSRTSVSQ IPFYSSDLPC DFMQPLGPLQ QSPQHQQQMG 3480
QVLQQQNIQQ GSINSPSTQT FMQTNERRQV GPPSFVPDSP SIPVGSPNFS SVKQGHGNLS 3540
GTSFQQSPVR PSFTPALPAA PPVANSSLPC GQDSTITHGH SYPGSTQSLI QLYSDIIPEE 3600
KGKKKRTRKK KRDDDAESTK APSTPHSDIT APPTPGISET TSTPAVSTPS ELPQQADQES 3660
VEPVGPSTPN MAAGQLCTEL ENKLPNSDFS QATPNQQTYA NSEVDKLSME TPAKTEEIKL 3720
EKAETESCPG QEEPKLEEQN GSKVEGNAVA CPVSSAQSPP HSAGAPAAKG DSGNELLKHL 3780
LKNKKSSSLL NQKPEGSICS EDDCTKDNKL VEKQNPAEGL QTLGAQMQGG FGCGNQLPKT 3840
DGGSETKKQR SKRTQRTGEK AAPRSKKRKK DEEEKQAMYS STDTFTHLKQ QNNLSNPPTP 3900
PASLPPTPPP MACQKMANGF ATTEELAGKA GVLVSHEVTK TLGPKPFQLP FRPQDDLLAR 3960
ALAQGPKTVD VPASLPTPPH NNQEELRIQD HCGDRDTPDS FVPSSSPESV VGVEVSRYPD 4020
LSLVKEEPPE PVPSPIIPIL PSTAGKSSES RRNDIKTEPG TLYFASPFGP SPNGPRSGLI 4080
SVAITLHPTA AENISSVVAA FSDLLHVRIP NSYEVSSAPD VPSMGLVSSH RINPGLEYRQ 4140
HLLLRGPPPG SANPPRLVSS YRLKQPNVPF PPTSNGLSGY KDSSHGIAES AALRPQWCCH 4200
CKVVILGSGV RKSFKDLTLL NKDSRESTKR VEKDIVFCSN NCFILYSSTA QAKNSENKES 4260
IPSLPQSPMR ETPSKAFHQY SNNISTLDVH CLPQLPEKAS PPASPPIAFP PAFEAAQVEA 4320
KPDELKVTVK LKPRLRAVHG GFEDCRPLNK KWRGMKWKKW SIHIVIPKGT FKPPCEDEID 4380
EFLKKLGTSL KPDPVPKDYR KCCFCHEEGD GLTDGPARLL NLDLDLWVHL NCALWSTEVY 4440
ETQAGALINV ELALRRGLQM KCVFCHKTGA TSGCHRFRCT NIYHFTCAIK AQCMFFKDKT 4500
MLCPMHKPKG IHEQELSYFA VFRRVYVQRD EVRQIASIVQ RGERDHTFRV GSLIFHTIGQ 4560
LLPQQMQAFH SPKALFPVGY EASRLYWSTR YANRRCRYLC SIEEKDGRPV FVIRIVEQGH 4620
EDLVLSDISP KGVWDKILEP VACVRKKSEM LQLFPAYLKG EDLFGLTVSA VARIAESLPG 4680
VEACENYTFR YGRNPLMELP LAVNPTGCAR SEPKMSAHVK RFVLRPHTLN STSTSKSFQS 4740
TVTGELNAPY SKQFVHSKSS QYRKMKTEWK SNVYLARSRI QGLGLYAARD IEKHTMVIEY 4800
IGTIIRNEVA NRKEKLYESQ NRGVYMFRMD NDHVIDATLT GGPARYINHS CAPNCVAEVV 4860
TFERGHKIII SSSRRIQKGE ELCYDYKFDF EDDQHKIPCH CGAVNCRKWM N 4911
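The sequence above is laid out in blocks of ten residues, with a running position count at the end of each line. A minimal sketch for turning that layout back into a plain sequence string follows; the function name `parse_blocked_sequence` is hypothetical, and the sketch assumes every line ends with its cumulative count, as it does in this record.

```python
def parse_blocked_sequence(lines):
    """Join 10-residue blocks and drop the trailing position count on each line."""
    parts = []
    for line in lines:
        tokens = line.split()
        # The last token is the running position (e.g. "60"), not sequence.
        if tokens and tokens[-1].isdigit():
            tokens = tokens[:-1]
        parts.append("".join(tokens))
    return "".join(parts)


# First and last lines of the protein sequence in this record.
example_lines = [
    "MSSEEDKSVE QPQPPPPPPE EPGAPAPSPA AADKRPRGRP RKDGASPFQR ARKKPRSRGK 60",
    "TFERGHKIII SSSRRIQKGE ELCYDYKFDF EDDQHKIPCH CGAVNCRKWM N 4911",
]
sequence = parse_blocked_sequence(example_lines)
print(len(sequence), sequence[:12], sequence[-5:])
```

The same function should work on the nucleotide listing, since it uses the same block-and-count layout.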
Nucleotide Sequence
GAGGTGCGCG CGCCCGCGCC GATGTGTGTG AGTGCGTGTC CTGCTCGCTC CATGTTGCCG 60
CCTCTCCCGG TACCTGCTGC TGCTCCCGGG GCTGCGGGAA ATGCGAGAGG CTGAGCCGGG 120
GAGGAGGAAC CCGAGCAGCA GCGGCGGCGG CGGCGGCCGC GGCGGCGGGA GCCCCCCAGG 180
AGGAGGACCG GGATCCATGT GTCTTTCCTG GTGACTAGGA TGTCGTCGGA GGAGGACAAG 240
AGCGTGGAGC AGCCGCAGCC GCCGCCACCA CCCCCCGAGG AGCCTGGAGC CCCGGCCCCG 300
AGCCCCGCAG CCGCAGACAA AAGACCTCGG GGCCGGCCTC GCAAAGATGG CGCTTCCCCT 360
TTCCAGAGAG CCAGAAAGAA ACCTCGAAGT AGGGGGAAAA CTGCAGTGGA AGATGAGGAC 420
AGCATGGATG GGCTGGAGAC AACAGAAACA GAAACGATTG TGGAAACAGA AATCAAAGAA 480
CAATCTGCAG AAGAGGATGC TGAAGCAGAA GTGGATAACA GCAAACAGCT AATTCCAACT 540
CTTCAGCGAT CTGTGTCTGA GGAATCGGCA AACTCCCTGG TCTCTGTTGG TGTAGAAGCC 600
AAAATCAGTG AACAGCTCTG CGCTTTTTGT TACTGTGGGG AAAAAAGTTC CTTAGGACAA 660
GGAGACTTAA AACAATTCAG AATAACGCCT GGATTTATCT TGCCATGGAG AAACCAACCT 720
TCTAACAAGA AGGACATTGA TGACAACAGC AATGGAACCT ATGAGAAAAT GCAAAACTCA 780
GCACCACGAA AACAAAGAGG ACAGAGAAAA GAACGATCTC CTCAGCAGAA TATAGTATCT 840
TGTGTAAGTG TAAGCACCCA GACAGCTTCA GATGATCAAG CTGGTAAACT GTGGGATGAA 900
CTCAGTCTGG TTGGGCTTCC AGATGCCATT GATATCCAAG CCTTATTTGA TTCTACAGGC 960
ACTTGTTGGG CTCATCACCG TTGTGTGGAG TGGTCACTAG GAGTATGCCA GATGGAAGAA 1020
CCATTGTTAG TGAACGTGGA CAAAGCTGTT GTCTCAGGGA GCACAGAACG ATGTGCATTT 1080
TGTAAGCACC TTGGAGCCAC TATCAAATGC TGTGAAGAGA AATGTACCCA GATGTATCAT 1140
TATCCTTGTG CTGCAGGAGC CGGCACCTTT CAGGATTTCA GTCACATCTT CCTGCTTTGT 1200
CCAGAACACA TTGACCAAGC TCCTGAAAGA TCGAAGGAAG ATGCAAACTG TGCAGTGTGC 1260
GACAGCCCGG GAGACCTCTT AGATCAGTTC TTTTGTACTA CTTGTGGTCA GCACTATCAT 1320
GGAATGTGCC TGGATATAGC GGTTACTCCA TTAAAACGTG CAGGTTGGCA ATGTCCTGAG 1380
TGCAAAGTGT GCCAGAACTG CAAACAATCG GGAGAAGATA GCAAGATGCT AGTGTGTGAT 1440
ACGTGTGACA AAGGGTATCA TACTTTTTGT CTTCAACCAG TTATGAAATC AGTACCAACC 1500
AATGGCTGGA AATGCAAAAA TTGCAGAATA TGTATAGAGT GTGGCACACG GTCTAGTTCT 1560
CAGTGGCACC ACAATTGCCT GATATGTGAC AATTGTTACC AACAGCAGGA TAACTTATGT 1620
CCCTTCTGTG GGAAGTGTTA TCATCCAGAA TTGCAGAAAG ACATGCTTCA TTGTAATATG 1680
TGCAAAAGGT GGGTTCACCT AGAGTGTGAC AAACCAACAG ATCATGAACT GGATACTCAG 1740
CTCAAAGAAG AGTATATCTG CATGTATTGT AAACACCTGG GAGCTGAGAT GGATCGTTTA 1800
CAGCCAGGTG AGGAAGTGGA GATAGCTGAG CTCACTACAG ATTATAACAA TGAAATGGAA 1860
GTTGAAGGCC CTGAAGATCA AATGGTATTC TCAGAGCAGG CAGCTAATAA AGATGTCAAC 1920
GGTCAGGAGT CCACTCCTGG AATTGTTCCA GATGCGGTTC AAGTCCACAC TGAAGAGCAA 1980
CAGAAGAGTC ATCCCTCAGA AAGTCTTGAC ACAGATAGTC TTCTTATTGC TGTATCATCC 2040
CAACATACAG TGAATACTGA ATTGGAAAAA CAGATTTCTA ATGAAGTTGA TAGTGAAGAC 2100
CTGAAAATGT CTTCTGAAGT GAAGCATATT TGTGGCGAAG ATCAAATTGA AGATAAAATG 2160
GAAGTGACAG AAAACATTGA AGTCGTTACA CACCAGATCA CTGTGCAGCA AGAACAACTG 2220
CAGTTGTTAG AGGAACCTGA AACAGTGGTA TCCAGAGAAG AATCAAGGCC TCCAAAATTA 2280
GTCATGGAAT CTGTCACTCT TCCACTAGAA ACCTTAGTGT CCCCACATGA GGAAAGTATT 2340
TCATTATGTC CTGAGGAACA GTTGGTTATA GAAAGGCTAC AAGGAGAAAA GGAACAGAAA 2400
GAAAATTCTG AACTTTCTAC TGGATTGATG GACTCTGAAA TGACTCCTAC AATTGAGGGT 2460
TGTGTGAAAG ATGTTTCATA CCAAGGAGGC AAATCTATAA AGTTATCATC TGAGACAGAG 2520
TCATCATTTT CATCATCAGC AGACATAAGC AAGGCAGATG TGTCTTCCTC CCCAACACCT 2580
TCTTCAGACT TGCCTTCGCA TGACATGCTG CATAATTACC CTTCAGCTCT TAGTTCCTCT 2640
GCTGGAAACA TCATGCCAAC AACTTACATC TCAGTCACTC CAAAAATTGG CATGGGTAAA 2700
CCAGCTATTA CTAAGAGAAA ATTTTCTCCT GGTAGACCTC GGTCCAAACA GGGGGCTTGG 2760
AGTACCCATA ATACAGTGAG CCCACCTTCC TGGTCCCCAG ACATTTCAGA AGGTCGGGAA 2820
ATTTTTAAAC CCAGGCAGCT TCCTGGCAGT GCCATTTGGA GCATCAAAGT GGGCCGTGGG 2880
TCTGGATTTC CAGGAAAGCG GAGACCTCGA GGTGCAGGAC TGTCGGGGCG AGGTGGCCGA 2940
GGCAGGTCAA AGCTGAAAAG TGGAATCGGA GCTGTTGTAT TACCTGGGGT GTCTACTGCA 3000
GATATTTCAT CAAATAAGGA TGATGAAGAA AACTCTATGC ACAATACAGT TGTGTTGTTT 3060
TCTAGCAGTG ACAAGTTCAC TTTGAATCAG GATATGTGTG TAGTTTGTGG CAGTTTTGGC 3120
CAAGGAGCAG AAGGAAGATT ACTTGCCTGT TCTCAGTGTG GTCAGTGTTA CCATCCATAC 3180
TGTGTCAGTA TTAAGATCAC TAAAGTGGTT CTTAGCAAAG GTTGGAGGTG TCTTGAGTGC 3240
ACTGTGTGTG AGGCCTGTGG GAAGGCAACT GACCCAGGAA GACTCCTGCT GTGTGATGAC 3300
TGTGACATAA GTTATCACAC CTACTGCCTA GACCCTCCAT TGCAGACAGT TCCCAAAGGA 3360
GGCTGGAAGT GCAAATGGTG TGTTTGGTGC AGACACTGTG GAGCAACATC TGCAGGTCTA 3420
AGATGTGAAT GGCAGAACAA TTACACACAG TGCGCTCCTT GTGCAAGCTT ATCTTCCTGT 3480
CCAGTCTGCT ATCGAAACTA TAGAGAAGAA GATCTTATTC TGCAATGTAG ACAATGTGAT 3540
AGATGGATGC ATGCAGTTTG TCAGAACTTA AATACTGAGG AAGAAGTGGA AAATGTAGCA 3600
GACATTGGTT TTGATTGTAG CATGTGCAGA CCCTATATGC CTGCGTCTAA TGTGCCTTCC 3660
TCAGACTGCT GTGAATCTTC ACTTGTAGCA CAAATTGTCA CAAAAGTAAA AGAGCTAGAC 3720
CCACCCAAGA CTTATACCCA GGATGGTGTG TGTTTGACTG AATCAGGGAT GACTCAGTTA 3780
CAGAGCCTCA CAGTTACAGT TCCAAGAAGA AAACGGTCAA AACCAAAATT GAAATTGAAG 3840
ATTATAAATC AGAATAGCGT GGCCGTCCTT CAGACCCCTC CAGACATCCA ATCAGAGCAT 3900
TCAAGGGATG GTGAAATGGA TGATAGTCGA GAAGGAGAAC TTATGGATTG TGATGGAAAA 3960
TCAGAATCTA GTCCTGAGCG GGAAGCTGTG GATGATGAAA CTAAGGGAGT GGAAGGAACA 4020
GATGGTGTCA AAAAGAGAAA AAGGAAACCA TACAGACCAG GTATTGGTGG ATTTATGGTG 4080
CGGCAAAGAA GTCGAACTGG GCAAGGGAAA ACCAAAAGAT CTGTGATCAG AAAAGATTCC 4140
TCAGGCTCTA TTTCCGAGCA GTTACCTTGC AGAGATGATG GCTGGAGTGA GCAGTTACCA 4200
GATACTTTAG TTGATGAATC TGTTTCTGTT ACTGAAAGCA CTGAAAAAAT AAAGAAGAGA 4260
TACCGAAAAA GGAAAAATAA GCTTGAAGAA ACTTTCCCTG CCTATTTACA AGAAGCTTTC 4320
TTTGGAAAAG ATCTTCTAGA TACAAGTAGA CAAAGCAAGA TAAGTTTAGA TAATCTGTCA 4380
GAAGATGGAG CTCAGCTTTT ATATAAAACA AACATGAACA CAGGTTTCTT GGATCCTTCC 4440
TTAGATCCAC TACTTAGTTC ATCCTCGGCT CCAACAAAAT CTGGAACTCA CGGTCCTGCT 4500
GATGACCCAT TAGCTGATAT TTCTGAAGTT TTAAACACAG ATGATGACAT TCTTGGAATA 4560
ATTTCAGATG ATCTAGCAAA ATCAGTTGAT CATTCAGATA TTGGTCCTGT CACTGATGAT 4620
CCTTCCTCTT TGCCTCAGCC AAATGTCAAT CAGAGTTCAC GACCATTAAG TGAAGAACAG 4680
CTAGATGGGA TCCTCAGTCC TGAACTAGAC AAAATGGTCA CAGATGGAGC AATTCTTGGA 4740
AAATTATATA AAATTCCAGA GCTTGGCGGA AAAGATGTTG AAGACTTATT TACAGCTGTA 4800
CTTAGTCCTG CGAACACTCA GCCAACTCCA TTGCCACAGC CTCCCCCACC AACACAGCTG 4860
TTGCCAATAC ACAATCAGGA TGCTTTTTCA CGGATGCCTC TCATGAATGG CCTTATTGGA 4920
TCCAGTCCTC ATCTCCCACA TAATTCTTTG CCACCTGGAA GCGGACTGGG AACTTTCTCT 4980
GCAATTGCAC AATCCTCTTA TCCTGATGCC AGGGATAAAA ATTCAGCCTT TAATCCAATG 5040
GCAAGTGATC CTAACAACTC TTGGACATCA TCAGCTCCCA CTGTGGAAGG AGAAAATGAC 5100
ACAATGTCGA ATGCCCAGAG AAGCACGCTT AAGTGGGAGA AAGAGGAGGC TCTGGGTGAA 5160
ATGGCAACTG TTGCCCCAGT TCTCTACACC AATATTAATT TCCCCAACTT AAAGGAAGAA 5220
TTCCCTGATT GGACTACTAG AGTGAAGCAA ATTGCCAAAT TGTGGAGAAA AGCAAGCTCA 5280
CAAGAAAGAG CACCATATGT GCAAAAAGCC AGAGATAACA GAGCTGCTTT ACGCATTAAT 5340
AAAGTACAGA TGTCAAATGA TTCCATGAAA AGGCAGCAAC AGCAAGATAG CATTGATCCC 5400
AGCTCTCGTA TTGATTCGGA GCTTTTTAAA GATCCTTTAA AGCAAAGAGA ATCAGAACAT 5460
GAACAGGAAT GGAAATTTAG ACAGCAAATG CGTCAGAAAA GTAAGCAGCA AGCTAAAATT 5520
GAAGCCACAC AGAAACTTGA ACAGGTGAAA AATGAGCAGC AGCAGCAGCA ACAACAGCAA 5580
TTTGGTTCTC AGCATCTTCT GGTGCAGTCT GGTTCAGATA CACCAAGTAG TGGGATACAG 5640
AGTCCCTTGA CACCTCAGCC TGGCAATGGA AATATGTCTC CTGCACAGTC ATTCCATAAA 5700
GAACTGTTTA CAAAACAGCC ACCCAGTACC CCTACGTCTA CATCTTCAGA TGATGTGTTT 5760
GTAAAGCCAC AAGCTCCACC TCCTCCTCCA GCCCCATCCC GGATTCCCAT CCAGGATAGT 5820
CTTTCTCAGG CTCAGACTTC TCAGCCACCC TCACCGCAAG TGTTTTCACC TGGGTCCTCT 5880
AACTCACGAC CACCATCTCC AATGGATCCA TATGCAAAAA TGGTTGGTAC CCCTCGACCA 5940
CCTCCTGTGG GCCATAGTTT TTCCAGAAGA AATTCTGCTG CACCAGTGGA AAACTGTACA 6000
CCTTTATCAT CGGTATCTAG GCCCCTTCAA ATGAATGAGA CAACAGCAAA TAGGCCATCC 6060
CCTGTCAGAG ATTTATGTTC TTCTTCCACG ACAAATAATG ACCCCTATGC AAAACCTCCA 6120
GACACACCTA GGCCTGTGAT GACAGATCAA TTTCCCAAAT CCTTGGGCCT ATCCCGGTCT 6180
CCTGTAGTTT CAGAACAAAC TGCAAAAGGC CCTATAGCAG CTGGAACCAG TGATCACTTT 6240
ACTAAACCAT CTCCTAGGGC AGATGTGTTT CAAAGACAAA GGATACCTGA CTCATATGCA 6300
CGACCCTTGT TGACACCTGC ACCTCTTGAT AGTGGTCCTG GACCTTTTAA GACTCCAATG 6360
CAACCTCCTC CATCCTCTCA GGATCCTTAT GGATCAGTGT CACAGGCATC AAGGCGATTG 6420
TCTGTTGACC CTTATGAAAG GCCTGCTTTG ACACCAAGAC CTATAGATAA TTTTTCTCAT 6480
AATCAGTCAA ATGATCCATA TAGTCAGCCT CCCCTTACCC CACATCCAGC AGTGAATGAA 6540
TCTTTTGCCC ATCCTTCAAG GGCTTTTTCC CAGCCTGGAA CCATATCAAG GCCAACATCT 6600
CAGGACCCAT ACTCCCAACC CCCAGGAACT CCACGACCTG TTGTAGATTC TTATTCCCAA 6660
TCTTCAGGAA CAGCTAGGTC CAATACAGAC CCTTACTCTC AACCTCCTGG AACTCCCCGG 6720
CCTACTACTG TTGACCCATA TAGTCAGCAG CCCCAAACCC CAAGACCATC TACACAAACT 6780
GACTTGTTTG TTACACCTGT AACAAATCAG AGGCATTCTG ATCCATATGC TCATCCTCCT 6840
GGAACACCAA GACCTGGAAT TTCTGTCCCT TACTCTCAGC CACCAGCAAC ACCAAGGCCA 6900
AGGATTTCAG AGGGTTTTAC TAGGTCCTCA ATGACAAGAC CAGTCCTCAT GCCAAATCAG 6960
GATCCTTTCC TGCAAGCAGC ACAAAACCGA GGACCAGCTT TACCTGGCCC GTTGGTAAGG 7020
CCACCTGATA CATGTTCCCA GACACCTAGG CCCCCTGGAC CTGGTCTTTC AGACACATTT 7080
AGCCGTGTTT CCCCATCTGC TGCCCGTGAT CCCTATGATC AGTCTCCAAT GACTCCAAGA 7140
TCTCAGTCTG ACTCTTTTGG AACAAGTCAA ACTGCCCATG ATGTTGCTGA TCAGCCAAGG 7200
CCTGGATCAG AGGGGAGCTT CTGTGCATCT TCAAACTCTC CAATGCACTC CCAAGGCCAG 7260
CAGTTCTCTG GTGTCTCCCA ACTTCCTGGA CCTGTGCCAA CTTCAGGAGT AACTGATACA 7320
CAGAATACTG TAAATATGGC CCAAGCAGAT ACAGAGAAAT TGAGACAGCG GCAGAAGTTA 7380
CGTGAAATCA TTCTCCAGCA GCAACAGCAG AAGAAGATTG CAGGTCGACA GGAGAAGGGG 7440
TCACAGGACT CACCCGCAGT GCCTCATCCA GGGCCTCTTC AACACTGGCA ACCAGAGAAT 7500
GTTAACCAGG CTTTCACCAG ACCCCCACCT CCCTATCCTG GGAACATTAG GTCTCCTGTT 7560
GCCCCTCCTT TAGGACCTAG ATATGCTGTT TTCCCAAAAG ATCAGCGTGG ACCCTATCCT 7620
CCTGATGTTG CTAGTATGGG GATGAGACCT CATGGATTTA GATTTGGATT TCCAGGAGGT 7680
AGTCATGGTA CCATGCCGAG TCAAGAGCGC TTCCTTGTGC CTCCTCAGCA AATACAGGGA 7740
TCTGGAGTTT CTCCACAGCT AAGAAGATCA GTATCTGTAG ATATGCCTAG GCCTTTAAAT 7800
AACTCACAAA TGAATAATCC AGTTGGACTT CCTCAGCATT TTTCACCACA GAGCTTGCCA 7860
GTTCAGCAGC ACAACATACT GGGCCAAGCA TATATTGAAC TGAGACATAG GGCTCCTGAC 7920
GGAAGGCAAC GGCTGCCTTT CAGTGCTCCA CCTGGCAGCG TTGTAGAGGC ATCTTCTAAT 7980
CTGAGACATG GAAACTTCAT TCCCCGGCCA GACTTTCCGG GCCCTAGACA CACAGACCCC 8040
ATGCGACGAC CTCCCCAGGG TCTACCTAAT CAGCTACCTG TGCACCCAGA TTTGGAACAA 8100
GTGCCACCAT CTCAACAAGA GCAAGGTCAT TCTGTCCATT CATCTTCTAT GGTCATGAGG 8160
ACTCTGAACC ATCCACTAGG TGGTGAATTT TCAGAAGCTC CTTTGTCAAC ATCTGTACCG 8220
TCTGAAACAA CGTCTGATAA TTTACAGATA ACCACCCAGC CTTCTGATGG TCTAGAGGAA 8280
AAACTTGATT CTGATGACCC TTCTGTGAAG GAACTGGATG TTAAAGACCT TGAGGGGGTT 8340
GAAGTCAAAG ACTTAGATGA TGAAGATCTT GAAAACTTAA ATTTAGATAC AGAGGATGGC 8400
AAGGTAGTTG AATTGGATAC TTTAGATAAT TTGGAAACTA ATGATCCCAA CCTGGATGAC 8460
CTCTTAAGGT CAGGAGAGTT TGATATCATT GCATATACAG ATCCAGAACT TGACATGGGA 8520
GATAAGAAAA GCATGTTTAA TGAGGAACTA GACCTTCCAA TTGATGATAA GTTAGATAAT 8580
CAGTGTGTAT CTGTTGAACC AAAAAAAAAG GAACAAGAAA ACAAAACTCT GGTTCTCTCT 8640
GATAAACATT CACCACAGAA AAAATCCACT GTTACCAATG AGGTAAAAAC GGAAGTACTG 8700
TCTCCAAATT CTAAGGTGGA ATCCAAATGT GAAACTGAAA AAAATGATGA GAATAAAGAT 8760
AATGTTGACA CTCCTTGCTC ACAGGCTTCT GCTCACTCAG ACCTAAATGA TGGAGAAAAG 8820
ACTTCTTTGC ATCCTTGTGA TCCAGATCTA TTTGAGAAAA GAACCAATCG AGAAACTGCT 8880
GGCCCCAGTG CAAATGTCAT TCAGGCATCC ACTCAACTAC CTGCTCAAGA TGTAATAAAC 8940
TCTTGTGGCA TAACTGGATC AACTCCAGTT CTCTCAAGTT TACTTGCTAA TGAGAAATCT 9000
GATAATTCAG ACATTAGGCC ATCGGGGTCT CCACCACCAC CAACTCTGCC GGCCTCCCCA 9060
TCCAATCATG TGTCAAGTTT GCCTCCTTTC ATAGCACCGC CTGGCCGTGT TTTGGATAAT 9120
GCCATGAATT CTAATGTGAC AGTAGTCTCT AGGGTAAACC ATGTTTTTTC TCAGGGTGTG 9180
CAGGTAAACC CAGGGCTCAT TCCAGGTCAA TCAACAGTTA ACCACAGTCT GGGGACAGGA 9240
AAACCTGCAA CTCAAACTGG GCCTCAAACA AGTCAGTCTG GTACCAGTAG CATGTCTGGA 9300
CCCCAACAGC TAATGATTCC TCAAACATTA GCACAGCAGA ATAGAGAGAG GCCCCTTCTT 9360
CTAGAAGAAC AGCCTCTACT TCTACAGGAT CTTTTGGATC AAGAAAGGCA AGAACAGCAG 9420
CAGCAAAGAC AGATGCAAGC CATGATTCGT CAGCGATCAG AACCGTTCTT CCCTAATATT 9480
GATTTTGATG CAATTACAGA TCCTATAATG AAAGCCAAAA TGGTGGCCCT TAAAGGTATA 9540
AATAAAGTGA TGGCACAAAA CAATCTGGGC ATGCCACCAA TGGTGATGAG CAGGTTCCCT 9600
TTTATGGGCC AGGTGGTAAC TGGAACACAG AACAGTGAAG GACAGAACCT TGGACCACAG 9660
GCCATTCCTC AGGATGGCAG TATAACACAT CAGATTTCTA GGCCTAATCC TCCAAATTTT 9720
GGTCCAGGCT TTGTCAATGA TTCACAGCGT AAGCAGTATG AAGAGTGGCT CCAGGAGACC 9780
CAACAGCTGC TTCAAATGCA GCAGAAGTAT CTTGAAGAAC AAATTGGTGC TCACAGAAAA 9840
TCTAAGAAGG CCCTTTCAGC TAAACAACGT ACTGCCAAGA AAGCTGGGCG TGAATTTCCA 9900
GAGGAAGATG CAGAACAACT CAAGCATGTT ACTGAACAGC AAAGCATGGT TCAGAAACAG 9960
CTAGAACAGA TTCGTAAACA ACAGAAAGAA CATGCTGAAT TGATTGAAGA TTATCGGATC 10020
AAACAGCAGC AGCAATGTGC AATGGCCCCA CCTACCATGA TGCCCAGTGT CCAGCCCCAG 10080
CCACCCCTAA TTCCAGGTGC CACTCCACCC ACCATGAGCC AACCCACCTT TCCCATGGTG 10140
CCACAGCAGC TTCAGCACCA GCAGCACACA ACAGTTATTT CTGGCCATAC TAGCCCTGTT 10200
AGAATGCCCA GTTTACCTGG ATGGCAACCC AACAGTGCTC CTGCCCACCT GCCCCTCAAT 10260
CCTCCTAGAA TTCAGCCCCC AATTGCCCAG TTACCAATAA AAACTTGTAC ACCAGCCCCA 10320
GGGACAGTCT CAAATGCAAA TCCACAGAGT GGACCACCAC CTCGGGTAGA ATTTGATGAC 10380
AACAATCCCT TTAGTGAAAG TTTTCAAGAA CGGGAACGTA AGGAACGTTT ACGAGAACAG 10440
CAAGAGAGAC AACGGATCCA ACTCATGCAG GAGGTAGATA GACAAAGAGC TTTGCAGCAG 10500
AGGATGGAAA TGGAGCAGCA TGGTATGGTG GGCTCTGAGA TAAGTAGTAG TAGGACATCT 10560
GTGTCCCAGA TTCCCTTCTA CAGTTCCGAC TTACCTTGTG ATTTTATGCA ACCTCTAGGA 10620
CCCCTTCAGC AGTCTCCACA ACACCAACAG CAAATGGGGC AGGTTTTACA GCAGCAGAAT 10680
ATACAACAAG GATCAATTAA TTCACCCTCC ACCCAAACTT TCATGCAGAC TAATGAGCGA 10740
AGGCAGGTAG GCCCTCCTTC ATTTGTTCCT GATTCACCAT CAATCCCTGT TGGAAGCCCA 10800
AATTTTTCTT CTGTGAAGCA GGGACATGGA AATCTTTCTG GGACCAGCTT CCAGCAGTCC 10860
CCAGTGAGGC CTTCTTTTAC ACCTGCTTTA CCAGCAGCAC CTCCAGTAGC TAATAGCAGT 10920
CTCCCATGTG GCCAAGATTC TACTATAACC CATGGACACA GTTATCCGGG ATCAACCCAA 10980
TCGCTCATTC AGTTGTATTC TGATATAATC CCAGAGGAAA AAGGGAAAAA GAAAAGAACA 11040
AGAAAGAAGA AAAGAGATGA TGATGCAGAA TCCACCAAGG CTCCATCAAC TCCCCATTCA 11100
GATATAACTG CCCCACCGAC TCCAGGCATC TCAGAAACTA CCTCTACTCC TGCAGTGAGC 11160
ACACCCAGTG AGCTTCCTCA ACAAGCCGAC CAAGAGTCGG TGGAACCAGT CGGCCCATCC 11220
ACTCCCAATA TGGCAGCAGG CCAGCTATGT ACAGAATTAG AGAACAAACT GCCCAATAGT 11280
GATTTCTCAC AAGCAACTCC AAATCAACAG ACGTATGCAA ATTCAGAAGT AGACAAGCTC 11340
TCCATGGAAA CCCCTGCCAA AACAGAAGAG ATAAAACTGG AAAAGGCTGA GACAGAGTCC 11400
TGCCCAGGCC AAGAGGAGCC TAAATTGGAG GAACAGAATG GTAGTAAGGT AGAAGGAAAC 11460
GCTGTAGCCT GTCCTGTCTC CTCAGCACAG AGTCCTCCCC ATTCTGCTGG GGCCCCTGCT 11520
GCCAAAGGAG ACTCAGGGAA TGAACTTCTG AAACACTTGT TGAAAAATAA AAAGTCATCT 11580
TCTCTTTTGA ATCAAAAACC TGAGGGCAGT ATTTGTTCAG AAGATGACTG TACAAAGGAT 11640
AATAAACTAG TTGAGAAGCA GAACCCAGCT GAAGGACTGC AAACTTTGGG GGCTCAAATG 11700
CAAGGTGGTT TTGGATGTGG CAACCAGTTG CCAAAAACAG ATGGAGGAAG TGAAACCAAG 11760
AAACAGCGAA GCAAACGGAC TCAGAGGACG GGTGAGAAAG CAGCACCTCG CTCAAAGAAA 11820
AGGAAAAAGG ACGAAGAGGA GAAACAAGCT ATGTACTCTA GCACTGACAC GTTTACCCAC 11880
TTGAAACAGC AGAATAATTT AAGTAATCCT CCAACACCCC CTGCCTCTCT TCCTCCTACA 11940
CCACCTCCTA TGGCTTGTCA GAAGATGGCC AATGGTTTTG CAACAACTGA AGAACTTGCT 12000
GGAAAAGCCG GAGTGTTAGT GAGCCATGAA GTTACCAAAA CTCTAGGACC TAAACCATTT 12060
CAGCTGCCCT TCAGACCCCA GGACGACTTG TTGGCCCGAG CTCTTGCTCA GGGCCCCAAG 12120
ACAGTTGATG TGCCAGCCTC CCTCCCAACA CCACCTCATA ACAATCAGGA AGAATTAAGG 12180
ATACAGGATC ACTGTGGTGA TCGAGATACT CCTGACAGTT TTGTTCCCTC ATCCTCTCCT 12240
GAGAGTGTGG TTGGGGTAGA AGTGAGCAGG TATCCAGATC TGTCATTGGT CAAGGAGGAG 12300
CCTCCAGAAC CGGTGCCGTC CCCCATCATT CCAATTCTTC CTAGCACTGC TGGGAAAAGT 12360
TCAGAATCAA GAAGGAATGA CATCAAAACT GAGCCAGGCA CTTTATATTT TGCGTCACCT 12420
TTTGGTCCTT CCCCAAATGG TCCCAGATCA GGTCTTATAT CTGTAGCAAT TACTCTGCAT 12480
CCTACAGCTG CTGAGAACAT TAGCAGTGTT GTGGCTGCAT TTTCCGACCT TCTTCACGTC 12540
CGAATCCCTA ACAGCTATGA GGTTAGCAGT GCTCCAGATG TCCCATCCAT GGGTTTGGTC 12600
AGTAGCCACA GAATCAACCC GGGTTTGGAG TATCGACAGC ATTTACTTCT CCGTGGGCCT 12660
CCGCCAGGAT CTGCAAACCC TCCCAGATTA GTGAGCTCTT ACCGGCTGAA GCAGCCTAAT 12720
GTACCATTTC CTCCAACAAG CAATGGTCTT TCTGGATATA AGGATTCTAG TCATGGTATT 12780
GCAGAAAGCG CAGCACTCAG ACCACAGTGG TGTTGTCATT GTAAAGTGGT TATTCTTGGA 12840
AGTGGTGTGC GGAAATCTTT CAAAGATCTG ACCCTTTTGA ACAAGGATTC CCGAGAAAGC 12900
ACCAAGAGGG TAGAGAAGGA CATTGTCTTC TGTAGTAATA ACTGCTTTAT TCTTTATTCA 12960
TCAACTGCAC AAGCGAAAAA CTCAGAAAAC AAGGAATCCA TTCCTTCATT GCCACAATCA 13020
CCTATGAGAG AAACGCCTTC CAAAGCATTT CATCAGTACA GCAACAACAT CTCCACTTTG 13080
GATGTGCACT GTCTCCCCCA GCTCCCAGAG AAAGCTTCTC CCCCTGCCTC ACCACCCATC 13140
GCCTTCCCTC CTGCTTTTGA AGCAGCCCAA GTCGAGGCCA AGCCAGATGA GCTGAAGGTG 13200
ACAGTCAAGC TGAAGCCTCG GCTAAGAGCT GTCCATGGTG GGTTTGAAGA TTGCAGGCCG 13260
CTCAATAAAA AATGGAGAGG AATGAAATGG AAGAAGTGGA GCATTCATAT TGTAATCCCT 13320
AAGGGGACAT TTAAACCACC TTGTGAGGAT GAAATAGATG AATTTCTAAA GAAATTGGGC 13380
ACTTCCCTTA AACCTGATCC TGTGCCCAAA GACTATCGGA AATGTTGCTT TTGTCATGAA 13440
GAAGGTGATG GATTGACAGA TGGACCAGCA AGGCTACTCA ACCTTGACTT GGATCTGTGG 13500
GTCCACTTGA ACTGCGCTCT GTGGTCCACG GAGGTCTATG AGACTCAGGC TGGTGCCTTA 13560
ATAAATGTGG AGCTAGCTCT GAGGAGAGGC CTACAAATGA AATGTGTCTT CTGTCACAAG 13620
ACGGGTGCCA CTAGTGGATG CCACAGATTT CGATGCACCA ACATTTATCA CTTCACTTGC 13680
GCCATTAAAG CACAATGCAT GTTTTTTAAG GACAAAACTA TGCTTTGCCC CATGCACAAA 13740
CCAAAGGGAA TTCATGAGCA AGAATTAAGT TACTTTGCAG TCTTCAGGAG GGTCTATGTT 13800
CAGCGTGATG AGGTGCGACA GATTGCTAGC ATCGTGCAAC GAGGAGAACG GGACCATACC 13860
TTTCGCGTGG GTAGCCTCAT CTTCCACACA ATTGGTCAGC TGCTTCCACA GCAGATGCAA 13920
GCATTCCATT CTCCTAAAGC ACTCTTCCCT GTGGGCTATG AAGCCAGCCG GCTGTACTGG 13980
AGCACTCGCT ATGCCAATAG GCGCTGCCGC TACCTGTGCT CCATTGAGGA GAAGGATGGG 14040
CGCCCAGTGT TTGTCATCAG GATTGTGGAA CAAGGCCATG AAGACCTGGT TCTAAGTGAC 14100
ATCTCACCTA AAGGTGTCTG GGATAAGATT TTGGAGCCTG TGGCATGTGT GAGAAAAAAG 14160
TCTGAAATGC TCCAGCTTTT CCCAGCGTAT TTAAAAGGAG AGGATCTGTT TGGCCTGACC 14220
GTCTCTGCAG TGGCACGCAT AGCGGAATCA CTTCCTGGGG TTGAGGCATG TGAAAATTAT 14280
ACCTTCCGAT ACGGCCGAAA TCCTCTCATG GAACTTCCTC TTGCCGTTAA CCCCACAGGT 14340
TGTGCCCGTT CTGAACCTAA AATGAGTGCC CATGTCAAGA GGTTTGTGTT AAGGCCTCAC 14400
ACCTTAAACA GCACCAGCAC CTCAAAGTCA TTTCAGAGCA CAGTCACTGG AGAACTGAAC 14460
GCACCTTATA GTAAACAGTT TGTTCACTCC AAGTCATCGC AGTACCGGAA GATGAAAACT 14520
GAATGGAAAT CCAATGTGTA TCTGGCACGG TCTCGGATTC AGGGGCTGGG CCTGTATGCT 14580
GCTCGAGACA TTGAGAAACA CACCATGGTC ATTGAGTACA TCGGGACTAT CATTCGAAAC 14640
GAAGTAGCCA ACAGGAAAGA GAAGCTTTAT GAGTCTCAGA ACCGTGGTGT GTACATGTTC 14700
CGCATGGATA ACGACCATGT GATTGACGCG ACGCTCACAG GAGGGCCCGC AAGGTATATC 14760
AACCATTCGT GTGCACCTAA TTGTGTGGCT GAAGTGGTGA CTTTTGAGAG AGGACACAAA 14820
ATTATCATCA GCTCCAGTCG GAGAATCCAG AAAGGAGAAG AGCTCTGCTA TGACTATAAG 14880
TTTGACTTTG AAGATGACCA GCACAAGATT CCGTGTCACT GTGGAGCTGT GAACTGCCGG 14940
AAGTGGATGA ACTGAAATGC ATTCCTTGCT AGCTCAGCGG GCGGCTTGTC CCTAGGAAGA 15000
GGCGATTCAA CACACCATTG GAATTTTGCA GACAGAAAGA GATTTTTGTT TTCTGTTTTA 15060
TGACTTTTTG AAAAAGCTTC TGGGAGTTCT GATTTCCTCA GTCCTTTAGG TTAAAGCAGC 15120
GCCAGGAGGA AGCTGACAGA AGCAGCGTTC CTGAAGTGGC CGAGGTTAAA CGGAATCACA 15180
GAATGGTCCA GCACTTTTGC TTTTTTTTCT TTTCCTTTTC TTTTTTTTTT GTTTGTTTTT 15240
TGTTTTGTTT TTCCCTTGTG GGTGGGTTTC ATTGTTTTGG TTTTCTAGTC TCACTAAGGA 15300
GAAACTTTTA CTGGGGCAAA GAGCCGATGG CTGCCCTGCC CCGGGCAGGG GCCTTCCTAT 15360
GAATGTAAGA CTGAAATCAC CAGCGAGGGG GACAGAGAGT GCTGGCCACG GCCTTATTAA 15420
AAAGGGGCAG GCCCTCTAAC TTCAAAATGT TTTTAAATAA AGTAGACACC ACTGAACAAG 15480
GAATGTACTG AAATGACTTC CTTAGGGATA GAGCTAAGGG ATAATAACTT GCACTAAATA 15540
CATTTAAATA CTTGATTCCA TGAGTCAGTT TATTGTAGTT TTTGATTTCT GTAAAATAAG 15600
AGAAACTTTT GTATTTATTA TTGAATAAGT GAATGAAGCT ATTTTTAAAT AAAGTTAGAA 15660
GAAAGCCAAG CTGCTGCTGT TACCTGCAGA ACTAACAAAC CCTGTTACTT TGTACAGATA 15720
TGTAAATATT TTGAGAAAAA ATACAGTATA AAAATAGTTA TTGACCAAAT GCTACCAGGC 15780
TCTGCAGCAG CTCGGGGGCT TATAAAATGT TCATAGGGAT GTTACAATAT AATTTTGTGT 15840
TATAAAATAT GCCATTATAA TTATGTAATA ACCAAAATTT CAACCTAGAG TGTTGGGGGT 15900
TTTTTGGAAA CCGCAGTCTA TTAGTACTCA ATGGTTTTAT ACACCTTACT TCTGACAGAG 15960
CGGGGCGTAT GCTACGACTA CAACTTTTAT AGCTGTTTTG GTAATTTAAA CTAATTTTTT 16020
CATATTATAT TGTTGCATCC CTACTTCTTC AGTCAGGTTT TTTTGTGCTT ACAATTTGTG 16080
ATAACTGTGA ATAACTGCTT AAAAATACAC CCAAATGGAG GCTGAATTTT TTCTTCAGCA 16140
AAAGTAGTTT TGATTAGAAC TTTGTTTCAG CCACAGAGAA TCATGTAAAC GTAATAGGAT 16200
CATGTAGCAG AAACTTAAAT CTAACCCTTT AGCCTTCTAT TTAACACAAA AATTTGAAAA 16260
AGTTAAAAAA AAAAAGGAGA TGTGATTATG CTTACAGCTG CAGGACTCTG GCAATAGGGT 16320
TTTTGGAAGA TGTAATTTTA AAATGTGTTT GTATGAACTG TTTGTTTACA TTTCTTTAAT 16380
AAAAAAAACA CTGTTTTGTG TTTGCTTGTA GAAACTTAAT CAGCATTTTG AACCAGGTTA 16440
GCTTTTTATT TTGTACTTAA AATTCTGGTA CTGACACTTC ACAGGCTAAG TATAAAATGA 16500
AGTTTTGTGT GCACAATTCA AGTGGACTGT AAACTGTTGG TATATTCAGT GATGCAGTTC 16560
TGAACTTGTA TATGGCATGA TGTATTTTTA TCTTACAGAA TAAATCAATT GTATATATTT 16620
TTCTCTTGAT AAATAGCTGT ATGAAATTTG TTTCCTGAAT ATTTTTCTTC TCTTGTACAA 16680
TATCCTGACA TCCTACCAGT ATTTGTCCTA CCGGGTTTTT GTTGTTTTCT GTTCTGTATA 16740
ATAGTATCTA ATGTTGGCAA AAATTGAATT TTTTGAAGTA TACAGAGTGT TATGGGTTTT 16800
GGAATTTGTG GACACAGATT TAGAAGATCA CCATTTACAA ATAAAATATT TTACATCT 16859
Sequence Source Ensembl
Keyword

KW-0002--3D-structure
KW-0007--Acetylation
KW-0010--Activator
KW-0025--Alternative splicing
KW-0156--Chromatin regulator
KW-0175--Coiled coil
KW-0181--Complete proteome
KW-0238--DNA-binding
KW-0479--Metal-binding
KW-0489--Methyltransferase
KW-0539--Nucleus
KW-0597--Phosphoprotein
KW-0621--Polymorphism
KW-1185--Reference proteome
KW-0677--Repeat
KW-0949--S-adenosyl-L-methionine
KW-0804--Transcription
KW-0805--Transcription regulation
KW-0808--Transferase
KW-0862--Zinc
KW-0863--Zinc-finger

Interpro

IPR003889--FYrich_C
IPR003888--FYrich_N
IPR009071--HMG_box_dom
IPR000637--HMGI/Y_DNA-bd_CS
IPR003616--Post-SET_dom
IPR001214--SET_dom
IPR001594--Znf_DHHC_palmitoyltrfase
IPR011011--Znf_FYVE_PHD
IPR001965--Znf_PHD
IPR019787--Znf_PHD-finger
IPR001841--Znf_RING
IPR013083--Znf_RING/FYVE/PHD

PROSITE

PS51543--FYRC
PS51542--FYRN
PS00354--HMGI_Y
PS50868--POST_SET
PS50280--SET
PS50216--ZF_DHHC
PS01359--ZF_PHD_1
PS50016--ZF_PHD_2
PS50089--ZF_RING_2

Pfam

PF05965--FYRC
PF05964--FYRN
PF00628--PHD
PF00856--SET

Gene Ontology

GO:0035097--C:histone methyltransferase complex
GO:0044666--C:MLL3/4 complex
GO:0005654--C:nucleoplasm
GO:0005634--C:nucleus
GO:0003677--F:DNA binding
GO:0042800--F:histone methyltransferase activity (H3-K4 specific)
GO:0018024--F:histone-lysine N-methyltransferase activity
GO:0044822--F:poly(A) RNA binding
GO:0008270--F:zinc ion binding
GO:0051568--P:histone H3-K4 methylation
GO:0034968--P:histone lysine methylation
GO:0006355--P:regulation of transcription, DNA-templated
GO:0006351--P:transcription, DNA-templated
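Each Gene Ontology line above packs three fields into one string: the GO identifier, a one-letter aspect code, and the term name, in the form `ID--X:term`. The sketch below splits such a line into those fields. The function name `parse_go_line` is hypothetical, and reading C, F and P as cellular component, molecular function and biological process is an assumption based on the usual GO aspect abbreviations rather than something the record states.

```python
ASPECTS = {
    "C": "cellular component",
    "F": "molecular function",
    "P": "biological process",
}


def parse_go_line(line):
    """Split 'GO:0000000--X:term name' into id, aspect and term."""
    go_id, _, rest = line.strip().partition("--")
    # Only the first ':' separates the aspect code from the term name.
    aspect_code, _, term = rest.partition(":")
    return {
        "id": go_id,
        "aspect": ASPECTS.get(aspect_code, aspect_code),
        "term": term,
    }


go_rows = [
    parse_go_line(line)
    for line in [
        "GO:0044666--C:MLL3/4 complex",
        "GO:0042800--F:histone methyltransferase activity (H3-K4 specific)",
        "GO:0006355--P:regulation of transcription, DNA-templated",
    ]
]
for row in go_rows:
    print(row["id"], "|", row["aspect"], "|", row["term"])
```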

Orthology
WERAM ID Ensembl Protein ID Species Identity E-value Score
WERAM-Pat-0168 ENSPTRP00000046674.3 Pan troglodytes 99 0.0 8513
WERAM-Gog-0210 ENSGGOP00000027941.1 Gorilla gorilla 96 0.0 8183
WERAM-Paa-0005 ENSPANP00000006150.1 Papio anubis 95 0.0 8052
WERAM-Ict-0124 ENSSTOP00000012342.2 Ictidomys tridecemlineatus 87 0.0 7454
WERAM-Aim-0154 ENSAMEP00000014067.1 Ailuropoda melanoleuca 88 0.0 7439
WERAM-Caf-0059 ENSCAFP00000007370.4 Canis familiaris 87 0.0 7438
WERAM-Mum-0152 ENSMUSP00000043874.7 Mus musculus 85 0.0 7227
WERAM-Fec-0096 ENSFCAP00000008002.3 Felis catus 86 0.0 7174
WERAM-Tut-0198 ENSTTRP00000016174.1 Tursiops truncatus 85 0.0 7130
WERAM-Cap-0018 ENSCPOP00000001682.2 Cavia porcellus 84 0.0 7127
WERAM-Bot-0193 ENSBTAP00000028347.5 Bos taurus 84 0.0 7074
WERAM-Ova-0045 ENSOARP00000005594.1 Ovis aries 83 0.0 6925
WERAM-Mup-0098 ENSMPUP00000009152.1 Mustela putorius furo 85 0.0 6870
WERAM-Chs-0076 ENSCSAP00000002864.1 Chlorocebus sabaeus 97 0.0 6448
WERAM-Nol-0046 ENSNLEP00000005663.2 Nomascus leucogenys 97 0.0 6371
WERAM-Gaga-0065 ENSGALP00000010110.4 Gallus gallus 72 0.0 6067
WERAM-Anp-0025 ENSAPLP00000003226.1 Anas platyrhynchos 73 0.0 6007
WERAM-Meg-0048 ENSMGAP00000004625.2 Meleagris gallopavo 72 0.0 5849
WERAM-Tas-0126 ENSTSYP00000013377.1 Tarsius syrichta 91 0.0 5727
WERAM-Tag-0008 ENSTGUP00000000641.1 Taeniopygia guttata 72 0.0 5657
WERAM-Orc-0115 ENSOCUP00000009766.3 Oryctolagus cuniculus 85 0.0 5583
WERAM-Mam-0056 ENSMMUP00000009467.2 Macaca mulatta 91 0.0 5568
WERAM-Eqc-0189 ENSECAP00000020200.1 Equus caballus 88 0.0 5402
WERAM-Poa-0169 ENSPPYP00000020408.2 Pongo abelii 98 0.0 5402
WERAM-Anc-0148 ENSACAP00000014142.2 Anolis carolinensis 67 0.0 5370
WERAM-Lac-0079 ENSLACP00000010253.1 Latimeria chalumnae 62 0.0 4809
WERAM-Otg-0078 ENSOGAP00000005885.2 Otolemur garnettii 86 0.0 4736
WERAM-Caj-0209 ENSCJAP00000036628.3 Callithrix jacchus 95 0.0 4722
WERAM-Dio-0019 ENSDORP00000002189.1 Dipodomys ordii 81 0.0 4670
WERAM-Ora-0065 ENSOANP00000009850.3 Ornithorhynchus anatinus 73 0.0 4649
WERAM-Sah-0130 ENSSHAP00000013860.1 Sarcophilus harrisii 74 0.0 4647
WERAM-Myl-0122 ENSMLUP00000010086.2 Myotis lucifugus 83 0.0 4352
WERAM-Loa-0084 ENSLAFP00000006640.4 Loxodonta africana 85 0.0 3166
WERAM-Mim-0010 ENSMICP00000000977.1 Microcebus murinus 86 0.0 3004
WERAM-Ptv-0086 ENSPVAP00000007862.1 Pteropus vampyrus 78 0.0 2692
WERAM-Chh-0008 ENSCHOP00000000593.1 Choloepus hoffmanni 83 0.0 2406
WERAM-Sus-0158 ENSSSCP00000023447.1 Sus scrofa 85 0.0 2272
WERAM-Leo-0127 ENSLOCP00000015481.1 Lepisosteus oculatus 53 0.0 2182
WERAM-Mod-0039 ENSMODP00000005827.3 Monodelphis domestica 70 0.0 2181
WERAM-Fia-0086 ENSFALP00000007141.1 Ficedula albicollis 76 0.0 2057
WERAM-Vip-0013 ENSVPAP00000001498.1 Vicugna pacos 83 0.0 1950
WERAM-Ten-0186 ENSTNIP00000018287.1 Tetraodon nigroviridis 46 0.0 1730
WERAM-Pes-0170 ENSPSIP00000020005.1 Pelodiscus sinensis 87 0.0 1730
WERAM-Xet-0065 ENSXETP00000021458.2 Xenopus tropicalis 64 0.0 1642
WERAM-Prc-0090 ENSPCAP00000008256.1 Procavia capensis 78 0.0 1615
WERAM-Tar-0071 ENSTRUP00000014027.1 Takifugu rubripes 54 0.0 1456
WERAM-Gaa-0138 ENSGACP00000017696.1 Gasterosteus aculeatus 55 0.0 1456
WERAM-Dar-0184 ENSDARP00000115827.2 Danio rerio 47 0.0 1447
WERAM-Asm-0011 ENSAMXP00000001840.1 Astyanax mexicanus 53 0.0 1441
WERAM-Orla-0181 ENSORLP00000020984.1 Oryzias latipes 55 0.0 1432
WERAM-Ocp-0124 ENSOPRP00000012666.1 Ochotona princeps 83 0.0 1411
WERAM-Mae-0021 ENSMEUP00000001693.1 Macropus eugenii 68 0.0 1375
WERAM-Pof-0064 ENSPFOP00000005925.2 Poecilia formosa 48 0.0 1364
WERAM-Xim-0205 ENSXMAP00000016470.1 Xiphophorus maculatus 48 0.0 1359
WERAM-Ran-0259 ENSRNOP00000072878.1 Rattus norvegicus 77 0.0 1097
WERAM-Orn-0120 ENSONIP00000012272.1 Oreochromis niloticus 57 0.0 1012
WERAM-Dan-0166 ENSDNOP00000021359.1 Dasypus novemcinctus 73 0.0 914
WERAM-Pem-0016 ENSPMAP00000002311.1 Petromyzon marinus 72 0.0 870
WERAM-Gam-0044 ENSGMOP00000004922.1 Gadus morhua 73 0.0 829
WERAM-Ect-0018 ENSETEP00000001138.1 Echinops telfairi 75 0.0 759
WERAM-Soa-0040 ENSSARP00000003919.1 Sorex araneus 63 0.0 746
WERAM-Cii-0007 ENSCINP00000003156.3 Ciona intestinalis 56 0.0 649
WERAM-Drm-0022 FBpp0070347 Drosophila melanogaster 54 1e-175 617
WERAM-Cis-0001 ENSCSAVP00000000096.1 Ciona savignyi 51 1e-158 560
WERAM-Ere-0080 ENSEEUP00000007438.1 Erinaceus europaeus 42 2e-150 533
WERAM-Cae-0045 T12D8.1 Caenorhabditis elegans 35 2e-94 347
WERAM-Tub-0051 ENSTBEP00000006325.1 Tupaia belangeri 37 6e-46 186
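The orthology rows are whitespace-separated, but the species name spans two or three words (for example "Mustela putorius furo"), so a naive split into six columns does not work. One way to handle this, sketched below, is to take the first two and last three tokens as fixed columns and treat everything in between as the species. The `Ortholog` type and `parse_ortholog_row` function are hypothetical names, and treating Identity as a whole-number percentage is an assumption.

```python
from typing import NamedTuple


class Ortholog(NamedTuple):
    weram_id: str
    protein_id: str
    species: str
    identity: int      # assumed to be percent identity
    e_value: float
    score: int


def parse_ortholog_row(line):
    """Parse one orthology row whose species name may contain spaces."""
    tokens = line.split()
    if len(tokens) < 6:
        raise ValueError("expected at least six whitespace-separated fields")
    identity, e_value, score = tokens[-3:]
    species = " ".join(tokens[2:-3])   # everything between the IDs and the numbers
    return Ortholog(tokens[0], tokens[1], species,
                    int(identity), float(e_value), int(score))


ferret = parse_ortholog_row(
    "WERAM-Mup-0098 ENSMPUP00000009152.1 Mustela putorius furo 85 0.0 6870"
)
fly = parse_ortholog_row(
    "WERAM-Drm-0022 FBpp0070347 Drosophila melanogaster 54 1e-175 617"
)
print(ferret.species, ferret.identity, fly.species, fly.e_value)
```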
Created Date 25-Jun-2016