WERAM Information


Tag Content
WERAM ID WERAM-Mum-0008
Ensembl Protein ID ENSMUSP00000110337.1
Uniprot Accession P55200; KMT2A_MOUSE; E9QNE7; Q3UEU1; Q3USE7
Genbank Protein ID NP_001074518.1
Protein Name Histone-lysine N-methyltransferase 2A
Genbank Nucleotide ID NM_001081049.1
Gene Name KMT2A
Ensembl Information
Ensembl Gene ID        Ensembl Transcript ID  Ensembl Protein ID
ENSMUSG00000002028.12  ENSMUST00000114689.7   ENSMUSP00000110337.1
ENSMUSG00000002028.12  ENSMUST00000002095.9   ENSMUSP00000002095.3
Details
Type  Family  Domain  Substrates  AA  References (PMIDs)
HMT   SET1    SET     H3K4        K   21113167
Status Reviewed
Classification
Type       Family       E-value   Score  Start  End
HMT        SET1         1.90e-49  168.2  3827   3942
Me_Reader  PHD          2.00e-12  48.6   1432   1980
Ac_Reader  Bromodomain  0.00019   22.9   1705   1740
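The Start and End columns appear to be 1-based, inclusive residue positions: the SET1 hit at 3827-3942 covers the same residues as the SET1 alignment in the Domain Profile section. Under that assumption, a domain's length and subsequence could be computed as in the minimal Python sketch below. The function names are hypothetical and are not part of WERAM.

```python
def domain_length(start: int, end: int) -> int:
    """Length of a domain given 1-based, inclusive coordinates (assumed convention)."""
    return end - start + 1


def extract_domain(sequence: str, start: int, end: int) -> str:
    """Slice a 1-based, inclusive range out of a 0-based Python string."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]


# Coordinates copied from the Classification table.
hits = {"SET1": (3827, 3942), "PHD": (1432, 1980), "Bromodomain": (1705, 1740)}
lengths = {name: domain_length(start, end) for name, (start, end) in hits.items()}
```

Note that the PHD row spans 1432-1980, which covers all four PHD alignments listed in the Domain Profile section, so its length reflects the whole region and not a single PHD finger.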
Organism Mus musculus
NCBI Taxa ID 10090
Functional Description
Histone methyltransferase that plays an essential role in early development and hematopoiesis. Catalytic subunit of the MLL1/MLL complex, a multiprotein complex that mediates both methylation of 'Lys-4' of histone H3 (H3K4me) and acetylation of 'Lys-16' of histone H4 (H4K16ac). In the MLL1/MLL complex, it specifically mediates H3K4me, a specific tag for epigenetic transcriptional activation. Has weak methyltransferase activity by itself, and requires other components of the MLL1/MLL complex to obtain full methyltransferase activity. Has no activity toward histone H3 phosphorylated on 'Thr-3', less activity toward H3 dimethylated on 'Arg-8' or 'Lys-9', while it has higher activity toward H3 acetylated on 'Lys-9'. Required for transcriptional activation of HOXA9. Promotes PPP1R15A-induced apoptosis (By similarity). Plays a critical role in the control of circadian gene expression and is essential for the transcriptional activation mediated by the CLOCK-ARNTL/BMAL1 heterodimer. Establishes a permissive chromatin state for circadian transcription by mediating a rhythmic methylation of 'Lys-4' of histone H3 (H3K4me), and this histone modification directs the circadian acetylation at H3K9 and H3K14, allowing the recruitment of CLOCK-ARNTL/BMAL1 to chromatin (PubMed:21113167).
Domain Profile
  HMT SET1

              SET1.txt    2 elevakskikglglvakkeiekeelviEYvGevirsevadkrekeyekkeigvylfrldedaevvvdatkkgniarfinhscepNce 88  
++ v++s+i+g+gl++k++i+++e+viEY+G+virs ++dkrek+y++k+ig+y+fr+d++ vvdat++gn+arfinhscepNc+
ENSMUSP00000110337.1 3827 AVGVYRSPIHGRGLFCKRNIDAGEMVIEYAGNVIRSIQTDKREKYYDSKGIGCYMFRIDDS--EVVDATMHGNAARFINHSCEPNCY 3911
58899********************************************************..************************ PP
SET1.txt 89 akvvavdgekkiviyakraIekgeeltydYk 119
++v+++dg+k+ivi+a+r+I +geeltydYk
ENSMUSP00000110337.1 3912 SRVINIDGQKHIVIFAMRKIYRGEELTYDYK 3942
******************************7 PP

  Me_Reader PHD

               PHD.txt    3 iClvCgkddegekemvqCdeCddwfHlkCvklplsslpeg.kswyCpsCk 51  
+C++C +++ e+v+C+ C + fH C++ ++++l+++ ++w+C++Ck
ENSMUSP00000110337.1 1432 VCFLCASSGHV--EFVYCQVCCEPFHKFCLEENERPLEDQlENWCCRRCK 1479
8****554444..59******************6666655778******7 PP
PHD.txt 2 tiClvCgkddegekemvqCdeCddwfHlkCvklplsslp.egks.wyCpsCke 52
++C vCg++++ +k++++C++C++ +H +C++++ + p ++k+ w+C +C++
ENSMUSP00000110337.1 1479 KFCHVCGRQHQATKQLLECNKCRNSYHPECLGPNYPTKPtKKKKvWICTKCVR 1531
59****988888888*******************88888444336******86 PP
PHD.txt 2 tiClvCgkddegeke...mvqCdeCddwfHlkCvklp.........lsslpegkswyCpsCke 52
++C++C+k+++++++ m+qC +Cd+w+H kC +l+ ls+lpe+ +++C +C+e
ENSMUSP00000110337.1 1566 NFCPLCDKCYDDDDYeskMMQCGKCDRWVHSKCESLSgtedemyeiLSNLPESVAYTCVNCTE 1628
789999877666555566*******************999999998899998899******97 PP
PHD.txt 3 iClvCgkddegekemvqCde.CddwfHlkCvklplsslpegkswyCpsCk 51
C +C+k+++ + +C + C +H C + ++k+ yC++++
ENSMUSP00000110337.1 1934 RCEFCQKPGAT---VGCCLTsCTSNYHFMCSRAKNCVFLDDKKVYCQRHR 1980
599**766665...4566558*************5555577899**9997 PP

  Ac_Reader Bromodomain

             BROMO.txt   27 kePmdLstikerleegnYsspeefvkDvrlifnNak 62  
++P+dL+ +k+r+++g Y s++ef +D+ i++ a
ENSMUSP00000110337.1 1705 QQPLDLEGVKKRMDQGSYVSVLEFSDDIVKIIQAAI 1740
79***************************9998775 PP

Protein Sequence
MAHSCRWRFP ARPGTTGGGG GGGRRGLGGA PRQRVPALLL PPGPQAGGGG PGAPPSPPAV 60
AAAAAGSSGA GVPGGAAAAS AASSSSASSS SSSSSSASSG PALLRVGPGF DAALQVSAAI 120
GTNLRRFRAV FGESGGGGGS GEDEQFLGFG SDEEVRVRSP TRSPSVKASP RKPRGRPRSG 180
SDRNPAILSD PSVFSPLNKS ETKSADKIKK KDSKSIEKKR GRPPTFPGVK IKITHGKDIA 240
ELTQGSKEDS LKKVKRTPSA MFQQATKIKK LRAGKLSPLK SKFKTGKLQI GRKGVQIVRR 300
RGRPPSTERI KTPSGLLINS ELEKPQKVRK DKEGTPPLTK EDKTVVRQSP RRIKPVRIIP 360
SCKRTDATIA KQLLQRAKKG AQKKIEKEAA QLQGRKVKTQ VKNIRQFIMP VVSAISSRII 420
KTPRRFIEDE DYDPPMKIAR LESTPNSRFS ATSCGSSEKS SAASQHSSQM SSDSSRSSSP 480
SIDTTSDSQA SEEIQALPEE RSNTPEVHTP LPISQSPENE SNDRRSRRYS MSERSFGSRA 540
TKKLPTLQSA PQQQTSSSPP PPLLTPPPPL QPASGISDHT PWLMPPTIPL ASPFLPASAA 600
PMQGKRKSIL REPTFRWTSL KHSRSEPQYF SSAKYAKEGL IRKPIFDNFR PPPLTPEDVG 660
FASGFSASGT AASARLFSPL HSGTRFDIHK RSPILRAPRF TPSEAHSRIF ESVTLPSNRT 720
SSGASSSGVS NRKRKRKVFS PIRSEPRSPS HSMRTRSGRL STSELSPLTP PSSVSSSLSI 780
PVSPLAASAL NPTFTFPSHS LTQSGESTEK NQRARKQTSA LAEPFSSNSP ALFPWFTPGS 840
QTEKGRKKDT APEELSKDRD ADKSVEKDKS RERDREREKE NKRESRKEKR KKGSDIQSSS 900
ALYPVGRVSK EKVAGEDVGT SSSAKKATGR KKSSSLDSGA DVAPVTLGDT TAVKAKILIK 960
KGRGNLEKNN LDLGPAAPSL EKERTPCLSA PSSSTVKHST SSIGSMLAQA DKLPMTDKRV 1020
ASLLKKAKAQ LCKIEKSKSL KQTDQPKAQG QESDSSETSV RGPRIKHVCR RAAVALGRKR 1080
AVFPDDMPTL SALPWEEREK ILSSMGNDDK SSVAGSEDAE PLAPPIKPIK PVTRNKAPQE 1140
PPVKKGRRSR RCGQCPGCQV PEDCGICTNC LDKPKFGGRN IKKQCCKMRK CQNLQWMPSK 1200
ASLQKQTKAV KKKEKKSKTT EKKESKESTA VKSPLEPAQK AAPPPREEPA PKKSSSEPPP 1260
RKPVEEKSEE GGAPAPAPAP EPKQVSAPAS RKSSKQVSQP AAVVPPQPPS TAPQKKEAPK 1320
AVPSEPKKKQ PPPPEPGPEQ SKQKKVAPRP SIPVKQKPKD KEKPPPVSKQ ENAGTLNILN 1380
PLSNGISSKQ KIPADGVHRI RVDFKEDCEA ENVWEMGGLG ILTSVPITPR VVCFLCASSG 1440
HVEFVYCQVC CEPFHKFCLE ENERPLEDQL ENWCCRRCKF CHVCGRQHQA TKQLLECNKC 1500
RNSYHPECLG PNYPTKPTKK KKVWICTKCV RCKSCGSTTP GKGWDAQWSH DFSLCHDCAK 1560
LFAKGNFCPL CDKCYDDDDY ESKMMQCGKC DRWVHSKCES LSGTEDEMYE ILSNLPESVA 1620
YTCVNCTERH PAEWRLALEK ELQASLKQVL TALLNSRTTS HLLRYRQAAK PPDLNPETEE 1680
SIPSRSSPEG PDPPVLTEVS KQDEQQPLDL EGVKKRMDQG SYVSVLEFSD DIVKIIQAAI 1740
NSDGGQPEIK KANSMVKSFF IRQMERVFPW FSVKKSRFWE PNKVSNNSGM LPNAVLPPSL 1800
DHNYAQWQER EESSHTEQPP LMKKIIPAPK PKGPGEPDSP TPLHPPTPPI LSTDRSREDS 1860
PELNPPPGID DNRQCALCLM YGDDSANDAG RLLYIGQNEW THVNCALWSA EVFEDDDGSL 1920
KNVHMAVIRG KQLRCEFCQK PGATVGCCLT SCTSNYHFMC SRAKNCVFLD DKKVYCQRHR 1980
DLIKGEVVPE NGFEVFRRVF VDFEGISLRR KFLNGLEPEN IHMMIGSMTI DCLGILNDLS 2040
DCEDKLFPIG YQCSRVYWST TDARKRCVYT CKIMECRPPV VEPDINSTVE HDDNRTIAHS 2100
PSSFIDASCK DSQSTAAILS PPSPDRPHSQ TSGSCYYHVI SKVPRIRTPS YSPTQRSPGC 2160
RPLPSAGSPT PTTHEIVTVG DPLLSSGLRS IGSRRHSTSS LSPLRSKLRI MSPVRTGSAY 2220
SRSSVSSVPS LGTATDPEAS AKASDRGGLL SSSANLGHSA PPSSSSQRTV GGSKTSHLDG 2280
SSPSEVKRCS ASDLVPKGSL VKGEKNRTSS SKSTDGSAHS TAYPGIPKLT PQVHNATPGE 2340
LNISKIGSFA EPSTVPFSSK DTVSYPQLHL RGQRSDRDQH MDPSQSVKPS PNEDGEIKTL 2400
KLPGMGHRPS ILHEHIGSSS RDRRQKGKKS SKETCKEKHS SKSYLEPGQV TTGEEGNLKP 2460
EFADEVLTPG FLGQRPCNNV SSEKIGDKVL PLSGVPKGQS TQVEGSSKEL QAPRKCSVKV 2520
TPLKMEGENQ SKNTQKESGP GSPAHIESVC PAEPVSASRS PGAGPGVQPS PNNTLSQDPQ 2580
SNNYQNLPEQ DRNLMIPDGP KPQEDGSFKR RYPRRSARAR SNMFFGLTPL YGVRSYGEED 2640
IPFYSNSTGK KRGKRSAEGQ VDGADDLSTS DEDDLYYYNF TRTVISSGGE ERLASHNLFR 2700
EEEQCDLPKI SQLDGVDDGT ESDTSVTATS RKSSQIPKRN GKENGTENLK IDRPEDAGEK 2760
EHVIKSAVGH KNEPKLDNCH SVSRVKAQGQ DSLEAQLSSL ESSRRVHTST PSDKNLLDTY 2820
NAELLKSDSD NNNSDDCGNI LPSDIMDFVL KNTPSMQALG ESPESSSSEL LTLGEGLGLD 2880
SNREKDIGLF EVFSQQLPAT EPVDSSVSSS ISAEEQFELP LELPSDLSVL TTRSPTVPSQ 2940
NPSRLAVISD SGEKRVTITE KSVASSEGDP ALLSPGVDPA PEGHMTPDHF IQGHMDADHI 3000
SSPPCGSVEQ GHGNSQDLTR NSGTPGLQVP VSPTVPVQNQ KYVPSSTDSP GPSQISNAAV 3060
QTTPPHLKPA TEKLIVVNQN MQPLYVLQTL PNGVTQKIQL TSPVSSTPSV METNTSVLGP 3120
MGSGLTLTTG LNPSLPPSPS LFPPASKGLL SVPHHQHLHS FPAAAQSSFP PNISSPPSGL 3180
LIGVQPPPDP QLLGSEANQR TDLTTTVATP SSGLKKRPIS RLHTRKNKKL APSSAPSNIA 3240
PSDVVSNMTL INFTPSQLSN HPSLLDLGSL NPSSHRTVPN IIKRSKSGIM YFEQAPLLPP 3300
QSVGGTAATA AGSSTISQDT SHLTSGPVSA LASGSSVLNV VSMQTTAAPT SSTSVPGHVT 3360
LANQRLLGTP DIGSISHLLI KASHQSLGIQ DQPVALPPSS GMFPQLGTSQ TPSAAAMTAA 3420
SSICVLPSSQ TAGMTAASPP GEAEEHYKLQ RGNQLLAGKT GTLTSQRDRD PDSAPGTQPS 3480
NFTQTAEAPN GVRLEQNKTL PSAKPASSAS PGSSPSSGQQ SGSSSVPGPT KPKPKAKRIQ 3540
LPLDKGSGKK HKVSHLRTSS EAHIPHRDTD PAPQPSVTRT PRANREQQDA AGVEQPSQKE 3600
CGQPAGPVAA LPEVQATQNP ANEQENAEPK AMEEEESGFS SPLMLWLQQE QKRKESITER 3660
KPKKGLVFEI SSDDGFQICA ESIEDAWKSL TDKVQEARSN ARLKQLSFAG VNGLRMLGIL 3720
HDAVVFLIEQ LAGAKHCRNY KFRFHKPEEA NEPPLNPHGS ARAEVHLRQS AFDMFNFLAS 3780
KHRQPPEYNP NDEEEEEVQL KSARRATSMD LPMPMRFRHL KKTSKEAVGV YRSPIHGRGL 3840
FCKRNIDAGE MVIEYAGNVI RSIQTDKREK YYDSKGIGCY MFRIDDSEVV DATMHGNAAR 3900
FINHSCEPNC YSRVINIDGQ KHIVIFAMRK IYRGEELTYD YKFPIEDASN KLPCNCGAKK 3960
CRKFLN 3966
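The listing above groups residues into blocks of ten and ends each line with a running position counter. A minimal Python sketch for turning that layout back into one contiguous sequence string follows. The function name is hypothetical, and it assumes the last whitespace-separated token on each line is the counter.

```python
def parse_blocked_sequence(text: str) -> str:
    """Join space-separated residue blocks, dropping each line's trailing counter."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # The final token on a line is the running residue count, not sequence.
        if tokens and tokens[-1].isdigit():
            tokens = tokens[:-1]
        residues.extend(tokens)
    return "".join(residues)


# Final two lines of the protein sequence listing.
tail = """FINHSCEPNC YSRVINIDGQ KHIVIFAMRK IYRGEELTYD YKFPIEDASN KLPCNCGAKK 3960
CRKFLN 3966"""
tail_sequence = parse_blocked_sequence(tail)
```

The same function should work on the nucleotide listing, since it uses the same block-and-counter layout.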
Nucleotide Sequence
CTGCTTCACT TACGGGGCGA ACATGGCGCA CAGCTGTCGG TGGCGCTTCC CCGCCCGACC 60
CGGGACCACC GGGGGCGGCG GCGGCGGGGG GCGCCGGGGC CTAGGGGGCG CCCCGCGGCA 120
ACGCGTCCCG GCCTTGCTGC TTCCCCCCGG GCCCCAGGCC GGCGGTGGCG GCCCCGGGGC 180
GCCCCCCTCC CCCCCGGCGG TGGCGGCGGC GGCGGCGGGA AGCAGCGGGG CTGGGGTTCC 240
AGGGGGAGCG GCCGCCGCCT CAGCAGCCTC TTCGTCGTCC GCCTCGTCTT CGTCTTCGTC 300
ATCGTCCTCA GCCTCCTCAG GGCCGGCCCT GCTCCGGGTG GGCCCGGGCT TCGACGCGGC 360
GCTGCAGGTC TCGGCCGCCA TCGGCACCAA CCTGCGCCGG TTCCGGGCAG TGTTTGGGGA 420
GAGCGGCGGG GGAGGCGGCA GCGGAGAGGA TGAGCAGTTC TTAGGTTTTG GCTCAGATGA 480
AGAAGTCAGA GTGCGAAGCC CCACCAGGTC TCCTTCAGTT AAAGCTAGTC CTCGAAAACC 540
TCGCGGGAGA CCTAGAAGTG GCTCTGACCG GAACCCAGCC ATCCTCTCAG ACCCATCTGT 600
GTTTTCCCCT CTAAACAAAT CAGAGACCAA ATCTGCAGAT AAAATCAAGA AGAAAGATTC 660
TAAGAGCATA GAAAAGAAGA GAGGAAGACC TCCTACCTTC CCTGGGGTAA AAATCAAAAT 720
CACACATGGA AAGGACATTG CAGAGTTAAC ACAAGGAAGC AAAGAGGATA GCCTGAAAAA 780
AGTTAAACGG ACCCCTTCTG CCATGTTCCA GCAAGCCACA AAGATTAAAA AGTTAAGAGC 840
AGGTAAACTG TCTCCTCTCA AGTCTAAGTT TAAGACAGGG AAGCTCCAAA TAGGAAGGAA 900
GGGGGTGCAG ATTGTAAGAC GGCGAGGAAG GCCTCCATCC ACAGAACGGA TAAAGACCCC 960
TTCAGGTCTT CTCATTAATT CTGAACTGGA AAAGCCTCAG AAGGTCCGGA AAGACAAGGA 1020
AGGAACACCC CCTCTCACAA AAGAAGATAA GACAGTTGTC AGACAAAGCC CTCGAAGGAT 1080
TAAACCGGTC AGGATTATTC CTTCTTGTAA AAGGACAGAT GCCACAATTG CTAAGCAACT 1140
CCTGCAGAGG GCAAAGAAGG GGGCGCAGAA GAAAATTGAG AAAGAAGCGG CTCAGCTGCA 1200
GGGAAGGAAG GTGAAGACGC AGGTCAAGAA TATCCGGCAG TTCATCATGC CTGTGGTCAG 1260
CGCCATCTCC TCGAGGATCA TCAAGACTCC CCGGCGGTTC ATAGAGGATG AGGATTATGA 1320
CCCACCCATG AAGATTGCAC GCCTGGAATC TACCCCGAAC AGCAGATTCA GCGCCACGTC 1380
CTGTGGATCC TCGGAGAAGT CCAGTGCCGC CTCCCAGCAC TCCTCTCAGA TGTCTTCAGA 1440
CTCCTCCCGA TCCAGCAGCC CCAGTATCGA TACCACCTCA GATTCTCAGG CCTCTGAAGA 1500
GATCCAGGCA CTTCCCGAGG AGCGCAGTAA TACCCCCGAA GTTCATACTC CACTGCCTAT 1560
TTCCCAGTCC CCAGAAAATG AGAGTAATGA TAGGAGAAGC AGACGGTATT CGATGTCTGA 1620
GAGAAGCTTT GGATCTAGAG CAACTAAAAA ATTACCAACT CTACAAAGTG CCCCCCAGCA 1680
GCAGACCTCC TCCTCGCCAC CTCCGCCTCT GCTCACCCCT CCCCCTCCAC TGCAGCCAGC 1740
CTCCGGCATC TCTGACCACA CACCTTGGCT TATGCCTCCC ACCATCCCTT TAGCATCACC 1800
ATTCCTGCCT GCTTCTGCTG CTCCCATGCA AGGGAAGCGG AAATCTATTT TGCGGGAGCC 1860
AACATTTAGG TGGACTTCTT TAAAACATTC GAGGTCAGAG CCACAGTACT TTTCCTCAGC 1920
AAAGTATGCC AAAGAAGGTC TGATTCGCAA ACCAATATTT GATAACTTCC GACCCCCTCC 1980
GCTGACTCCC GAGGATGTCG GCTTTGCTTC TGGTTTTTCT GCATCTGGTA CTGCCGCTTC 2040
GGCCCGGTTG TTTTCACCAC TCCATTCTGG AACAAGGTTT GATATTCATA AAAGGAGCCC 2100
CATTCTGAGA GCTCCCAGAT TTACTCCAAG TGAGGCACAC TCTAGAATAT TTGAGTCTGT 2160
GACCTTGCCT AGTAATCGAA CTTCTTCTGG AGCGTCCTCT TCGGGAGTAT CTAATAGAAA 2220
AAGGAAAAGG AAAGTGTTTA GTCCGATTCG GTCTGAACCA AGATCACCTT CTCACTCCAT 2280
GAGGACAAGA AGCGGAAGGC TTAGCACCTC TGAGCTGTCA CCTCTCACTC CCCCGTCTTC 2340
TGTCTCCTCC TCATTAAGCA TTCCCGTTAG TCCTCTTGCC GCTAGTGCCT TAAACCCAAC 2400
TTTTACTTTT CCTTCTCATT CCCTAACTCA GTCTGGGGAA TCTACAGAAA AAAATCAGAG 2460
AGCAAGGAAG CAGACTAGTG CTCTGGCAGA GCCATTCTCG TCAAATAGCC CTGCTCTCTT 2520
CCCATGGTTC ACCCCAGGCT CTCAGACCGA GAAGGGGAGA AAGAAAGACA CAGCCCCGGA 2580
GGAGCTGTCC AAAGATCGCG ATGCTGACAA GAGCGTGGAG AAGGACAAGA GTAGAGAGAG 2640
AGACCGGGAG CGAGAGAAGG AGAATAAGCG GGAATCAAGG AAAGAGAAAA GGAAAAAGGG 2700
CTCAGACATT CAGAGTAGCT CTGCTTTGTA TCCTGTGGGT CGGGTTTCCA AAGAGAAGGT 2760
TGCTGGAGAA GATGTTGGCA CTTCATCTTC TGCCAAAAAA GCAACAGGGC GGAAGAAGTC 2820
TTCGTCACTT GATTCTGGGG CTGATGTTGC TCCTGTGACT CTTGGGGACA CAACAGCTGT 2880
CAAAGCCAAA ATTCTTATAA AGAAAGGGAG AGGAAATCTG GAAAAAAACA ACTTGGACCT 2940
CGGCCCAGCT GCCCCGTCCC TGGAGAAGGA GAGAACCCCC TGCCTTTCCG CTCCTTCATC 3000
TAGCACTGTT AAACACTCCA CTTCCTCCAT AGGCTCCATG TTGGCTCAGG CAGACAAGCT 3060
TCCAATGACT GACAAGAGGG TTGCCAGCCT CCTAAAAAAG GCCAAAGCCC AGCTCTGCAA 3120
GATTGAGAAG AGTAAGAGTC TCAAACAGAC TGACCAGCCC AAAGCACAGG GTCAAGAAAG 3180
TGATTCATCA GAAACCTCTG TTCGAGGACC CCGGATTAAA CATGTCTGCA GAAGAGCTGC 3240
TGTTGCCCTT GGCCGCAAAC GAGCTGTGTT TCCTGATGAC ATGCCCACCT TGAGTGCCTT 3300
ACCGTGGGAA GAACGAGAAA AAATTTTGTC TTCCATGGGG AATGATGACA AGTCATCAGT 3360
TGCTGGCTCA GAAGATGCCG AGCCTCTTGC TCCTCCCATC AAACCAATTA AGCCTGTCAC 3420
CAGAAACAAG GCACCTCAGG AGCCTCCGGT GAAGAAAGGG CGGCGATCAA GGCGGTGCGG 3480
ACAATGTCCT GGCTGCCAGG TGCCTGAGGA CTGTGGCATT TGCACTAATT GCCTGGACAA 3540
GCCCAAGTTT GGTGGCCGCA ATATAAAGAA GCAATGCTGC AAGATGAGGA AATGTCAGAA 3600
TCTGCAGTGG ATGCCTTCCA AAGCCTCCCT TCAGAAGCAG ACTAAAGCTG TGAAAAAGAA 3660
AGAGAAAAAG TCTAAGACCA CTGAAAAGAA AGAGAGCAAA GAGAGCACTG CTGTGAAGAG 3720
CCCCTTGGAG CCTGCTCAGA AGGCTGCCCC GCCACCGCGG GAGGAGCCTG CCCCAAAGAA 3780
GAGCAGCAGT GAGCCTCCAC CCCGCAAACC TGTGGAAGAA AAGAGTGAAG AAGGGGGTGC 3840
CCCTGCGCCT GCCCCTGCGC CTGAACCCAA ACAGGTCAGC GCGCCAGCAT CCCGGAAGTC 3900
CAGCAAGCAG GTCTCCCAGC CAGCAGCCGT CGTCCCCCCT CAGCCTCCTA GCACAGCACC 3960
GCAGAAAAAA GAAGCTCCCA AGGCCGTTCC AAGTGAGCCC AAGAAAAAGC AACCTCCACC 4020
CCCAGAACCA GGGCCAGAGC AAAGCAAGCA GAAAAAAGTG GCCCCCCGCC CAAGTATCCC 4080
TGTAAAACAA AAACCAAAGG ACAAGGAGAA GCCACCTCCA GTAAGTAAAC AAGAGAATGC 4140
AGGCACTTTG AACATCCTCA ACCCACTCTC GAATGGCATC AGTTCTAAGC AGAAAATCCC 4200
AGCAGATGGA GTCCACAGGA TCAGAGTGGA CTTTAAGGAA GACTGTGAAG CAGAAAATGT 4260
GTGGGAGATG GGAGGCTTAG GGATCCTGAC CTCTGTCCCC ATAACACCCA GAGTAGTGTG 4320
CTTTCTCTGT GCCAGCAGTG GGCATGTAGA GTTTGTGTAT TGCCAAGTGT GTTGTGAACC 4380
CTTCCACAAG TTTTGCTTAG AGGAGAATGA GCGCCCCCTG GAGGACCAGC TGGAAAACTG 4440
GTGTTGTCGC CGCTGCAAGT TTTGCCATGT GTGTGGAAGA CAGCATCAGG CTACAAAGCA 4500
GTTGCTGGAG TGTAACAAGT GCCGAAACAG CTATCACCCC GAGTGCCTGG GACCAAACTA 4560
CCCCACCAAA CCCACGAAGA AAAAGAAAGT GTGGATCTGC ACCAAGTGTG TCCGCTGCAA 4620
GAGCTGTGGC TCCACCACTC CAGGCAAAGG GTGGGACGCA CAGTGGTCTC ACGATTTCTC 4680
ACTGTGCCAT GACTGTGCCA AACTCTTTGC TAAAGGGAAC TTCTGCCCTC TCTGTGACAA 4740
GTGCTACGAT GACGATGACT ACGAGAGCAA GATGATGCAG TGCGGGAAGT GTGACCGCTG 4800
GGTCCACTCC AAGTGCGAGA GTCTCTCAGG TACAGAAGAT GAGATGTATG AGATTCTGTC 4860
CAACTTGCCA GAAAGTGTGG CCTACACGTG TGTGAACTGC ACTGAGCGGC ACCCCGCAGA 4920
GTGGAGACTG GCCCTGGAGA AGGAGCTGCA GGCGTCCCTC AAGCAGGTTC TCACGGCCCT 4980
GTTGAATTCT CGGACTACCA GTCACTTGCT GCGCTACCGT CAGGCTGCCA AGCCTCCAGA 5040
CTTAAACCCT GAGACTGAGG AAAGCATACC TTCCCGAAGC TCCCCAGAGG GGCCAGACCC 5100
TCCTGTTCTT ACTGAGGTCA GCAAGCAGGA TGAACAGCAG CCGTTAGACC TCGAAGGGGT 5160
CAAGAAGAGA ATGGACCAGG GCAGCTACGT ATCTGTGTTG GAGTTCAGCG ATGATATTGT 5220
GAAGATCATT CAGGCAGCCA TTAACTCAGA TGGAGGGCAG CCAGAGATAA AAAAAGCCAA 5280
CAGCATGGTC AAGTCTTTCT TCATTCGGCA AATGGAGCGA GTTTTTCCGT GGTTCAGTGT 5340
CAAAAAGTCT AGATTTTGGG AGCCAAATAA AGTATCAAAC AACAGTGGGA TGTTACCAAA 5400
CGCAGTGCTT CCGCCTTCAC TTGACCATAA TTATGCTCAG TGGCAGGAGC GAGAGGAGAG 5460
CAGCCACACT GAGCAGCCTC CTCTAATGAA GAAAATCATT CCAGCTCCCA AACCCAAAGG 5520
ACCCGGAGAG CCAGACTCGC CCACGCCGCT CCACCCGCCT ACACCCCCGA TCTTGAGTAC 5580
TGATCGGAGT CGAGAAGACA GTCCAGAGCT GAATCCACCC CCAGGCATCG ATGACAACCG 5640
ACAGTGTGCA CTGTGTCTGA TGTACGGCGA TGACAGTGCT AATGATGCTG GCCGTTTGCT 5700
GTACATTGGC CAAAATGAGT GGACACATGT GAACTGTGCT TTGTGGTCAG CAGAAGTGTT 5760
TGAAGATGAT GACGGATCAC TGAAGAATGT GCATATGGCT GTGATTAGGG GCAAGCAGCT 5820
GAGATGTGAA TTCTGCCAGA AGCCAGGAGC CACCGTGGGT TGCTGCCTCA CATCTTGCAC 5880
CAGCAACTAC CATTTCATGT GTTCCCGGGC CAAGAACTGT GTCTTCCTGG ATGATAAAAA 5940
AGTGTATTGT CAGCGGCATC GGGATTTGAT CAAAGGCGAG GTGGTTCCTG AGAATGGATT 6000
TGAAGTTTTT AGAAGAGTGT TTGTAGATTT TGAAGGAATC AGCTTGCGCA GGAAGTTCCT 6060
TAATGGCTTG GAACCAGAAA ATATCCACAT GATGATAGGC TCAATGACAA TCGACTGTTT 6120
GGGAATCCTG AATGACCTCT CTGACTGTGA AGATAAACTC TTTCCTATTG GATACCAGTG 6180
TTCTCGGGTG TACTGGAGCA CCACAGATGC CCGGAAGCGC TGTGTGTACA CATGCAAGAT 6240
CATGGAGTGC CGCCCTCCTG TTGTAGAGCC GGATATCAAC AGCACGGTTG AGCACGATGA 6300
CAATAGGACC ATTGCCCATA GCCCATCATC ATTTATAGAT GCATCGTGTA AGGACAGTCA 6360
AAGCACAGCT GCAATTCTCA GTCCTCCGTC GCCAGATCGG CCTCATTCAC AGACCTCAGG 6420
CTCCTGTTAT TATCATGTCA TCTCGAAGGT CCCTAGGATT CGAACACCCA GCTACTCGCC 6480
TACACAGAGG TCCCCTGGCT GCCGCCCATT GCCTTCTGCA GGAAGTCCTA CCCCAACCAC 6540
TCACGAAATC GTCACAGTCG GTGACCCGTT ACTGTCTTCT GGTCTTCGGA GCATTGGCTC 6600
TAGGCGTCAC AGTACTTCTT CCTTGTCACC CCTGCGGTCC AAGCTCCGCA TAATGTCTCC 6660
AGTGAGAACG GGGAGCGCTT ACTCCAGGAG TAGTGTTTCC TCAGTCCCCA GCCTTGGGAC 6720
TGCCACAGAT CCTGAGGCCA GTGCCAAAGC ATCGGATCGA GGAGGGCTGT TGAGTTCAAG 6780
TGCTAATCTC GGGCACAGCG CTCCCCCCTC TTCAAGCTCA CAGAGGACAG TTGGAGGCTC 6840
CAAAACCAGT CATCTGGATG GGTCGTCACC CTCGGAAGTG AAGCGGTGTA GTGCTTCAGA 6900
CTTGGTACCC AAAGGCTCCT TAGTAAAGGG AGAGAAAAAC AGAACTTCAA GTTCCAAGAG 6960
CACAGATGGA TCTGCACATA GCACAGCTTA CCCTGGAATC CCTAAACTGA CACCACAGGT 7020
TCATAACGCA ACTCCTGGAG AACTAAACAT TAGCAAAATT GGCAGTTTTG CTGAACCCTC 7080
TACAGTGCCC TTTTCTTCTA AGGATACAGT GTCCTACCCA CAGCTCCACT TGAGGGGCCA 7140
AAGAAGTGAC AGAGACCAGC ACATGGATCC TTCCCAGTCA GTAAAGCCCT CTCCAAATGA 7200
AGATGGTGAA ATCAAAACCT TGAAGCTCCC TGGTATGGGC CACAGGCCAT CCATTCTACA 7260
TGAACACATA GGGTCTAGTT CTAGAGACAG GAGACAGAAA GGGAAAAAGT CTTCTAAAGA 7320
GACTTGCAAA GAAAAGCATT CCAGTAAATC CTACTTGGAA CCTGGCCAGG TGACAACCGG 7380
TGAGGAAGGA AACCTAAAGC CAGAGTTTGC TGATGAGGTG TTGACTCCTG GGTTTCTTGG 7440
GCAACGACCA TGTAATAATG TTTCATCTGA GAAGATTGGA GATAAAGTCC TTCCTCTTTC 7500
AGGAGTCCCT AAAGGTCAAT CCACACAAGT GGAAGGATCT TCCAAGGAGT TACAGGCACC 7560
CCGGAAGTGC TCGGTCAAAG TGACACCTCT GAAGATGGAA GGTGAGAATC AATCCAAAAA 7620
CACCCAGAAA GAGAGTGGCC CTGGCTCCCC CGCACACATA GAGTCAGTGT GCCCAGCAGA 7680
GCCAGTCTCA GCCTCCAGAA GCCCAGGAGC TGGCCCAGGA GTTCAGCCGA GCCCCAACAA 7740
TACCTTATCC CAAGATCCTC AAAGTAACAA CTACCAGAAT CTTCCAGAAC AGGACAGAAA 7800
CCTGATGATT CCAGATGGCC CCAAGCCTCA GGAGGATGGC TCTTTTAAAA GGCGGTACCC 7860
CCGGCGCAGT GCCCGCGCAC GGTCTAACAT GTTCTTTGGG CTCACCCCAC TGTATGGAGT 7920
CAGGTCTTAC GGTGAAGAAG ACATTCCGTT CTACAGCAAT TCCACTGGGA AAAAGCGAGG 7980
AAAGAGATCA GCAGAAGGCC AGGTAGATGG GGCCGACGAC CTGAGCACTT CCGACGAAGA 8040
TGACTTATAC TATTACAACT TCACGCGGAC TGTGATTTCC TCGGGTGGAG AGGAGCGGCT 8100
GGCCTCCCAT AATTTATTTC GGGAGGAAGA ACAATGTGAT CTTCCAAAAA TTTCACAGCT 8160
GGATGGTGTG GATGATGGGA CAGAGAGTGA CACCAGTGTC ACTGCCACAA GCAGGAAAAG 8220
CAGCCAGATT CCAAAGAGAA ATGGCAAAGA AAATGGAACA GAAAACTTAA AGATTGATCG 8280
ACCTGAAGAT GCTGGCGAGA AAGAGCATGT CATTAAGAGT GCTGTTGGCC ACAAAAACGA 8340
GCCAAAGCTG GATAACTGCC ACTCTGTAAG CAGAGTGAAA GCACAGGGCC AGGATTCCTT 8400
GGAAGCTCAG CTCAGCTCCC TGGAATCGAG CCGCAGAGTC CACACAAGCA CCCCCTCAGA 8460
CAAAAACTTA CTGGATACTT ACAACGCTGA GCTGCTGAAG TCAGACTCTG ACAATAACAA 8520
CAGTGATGAC TGTGGGAACA TCCTGCCTTC AGATATCATG GACTTTGTAC TAAAGAATAC 8580
TCCATCTATG CAGGCCCTGG GTGAGAGCCC GGAGTCGTCC TCCTCTGAGC TCTTGACTCT 8640
TGGTGAAGGA CTGGGTCTTG ACAGTAATAG GGAAAAGGAT ATAGGTCTTT TTGAAGTGTT 8700
TTCTCAGCAA CTGCCAGCGA CAGAGCCTGT GGACAGTAGT GTCTCCTCTT CCATCTCAGC 8760
AGAGGAGCAG TTTGAGCTGC CTCTTGAGCT GCCATCTGAC CTCTCTGTCC TGACCACCCG 8820
CAGCCCCACT GTCCCCAGCC AGAATCCCAG CAGACTGGCC GTAATCTCAG ATTCAGGGGA 8880
GAAGCGAGTG ACCATCACAG AAAAATCAGT AGCCTCTTCC GAGGGTGACC CAGCCCTGCT 8940
GAGTCCAGGA GTAGACCCTG CTCCTGAAGG CCACATGACA CCCGATCATT TCATCCAAGG 9000
ACACATGGAT GCAGACCATA TCTCCAGCCC TCCCTGTGGC TCCGTGGAAC AAGGCCATGG 9060
CAACAGTCAG GATTTAACTA GAAACAGTGG CACTCCTGGC CTTCAGGTAC CTGTTTCCCC 9120
CACTGTTCCC GTCCAGAACC AGAAGTATGT GCCCAGTTCC ACTGACAGCC CTGGCCCATC 9180
TCAGATCTCT AACGCAGCTG TCCAGACCAC TCCACCCCAC CTGAAACCAG CCACTGAGAA 9240
ACTCATTGTT GTTAATCAGA ACATGCAGCC ACTTTATGTT CTCCAGACTC TTCCAAATGG 9300
AGTGACCCAA AAAATCCAGT TGACCTCTCC TGTTAGTTCT ACACCCAGTG TGATGGAGAC 9360
AAATACCTCG GTATTGGGGC CCATGGGAAG TGGGCTCACC CTGACCACAG GACTAAACCC 9420
AAGCTTGCCA CCCTCTCCGT CTCTGTTCCC TCCTGCTAGC AAAGGATTGC TCTCTGTGCC 9480
TCACCACCAG CACCTACATT CCTTCCCCGC AGCTGCTCAA AGTAGTTTCC CTCCCAACAT 9540
CAGCAGTCCT CCTTCAGGCC TGCTCATTGG GGTCCAGCCT CCTCCCGATC CCCAACTTCT 9600
GGGTTCAGAA GCCAACCAGA GGACAGACCT CACTACTACA GTGGCCACTC CATCCTCTGG 9660
ACTCAAGAAA AGACCCATAT CTCGTCTGCA CACTCGAAAG AATAAAAAAC TTGCTCCCTC 9720
TAGTGCCCCT TCAAATATTG CCCCTTCTGA TGTGGTTTCT AACATGACGC TGATTAACTT 9780
CACACCCTCC CAGCTTTCAA ACCACCCCAG TCTGTTAGAC TTGGGGTCAC TTAACCCTTC 9840
ATCTCACCGA ACTGTCCCCA ACATCATAAA AAGATCTAAA TCTGGCATCA TGTATTTTGA 9900
ACAGGCACCC CTGTTACCAC CACAGAGTGT GGGCGGAACC GCTGCCACAG CAGCGGGCTC 9960
ATCAACGATA AGCCAGGATA CTAGCCACCT GACATCTGGG CCTGTGTCTG CCTTGGCATC 10020
CGGTTCGTCC GTCCTGAATG TGGTATCCAT GCAAACTACA GCAGCCCCTA CAAGTAGCAC 10080
ATCAGTTCCA GGTCATGTCA CCTTAGCCAA CCAGAGGTTG CTTGGGACCC CAGATATTGG 10140
CTCAATAAGC CATCTTCTAA TCAAAGCCAG CCACCAGAGC CTGGGCATTC AGGACCAGCC 10200
TGTGGCTTTA CCACCAAGTT CAGGAATGTT CCCCCAGCTG GGGACATCAC AGACTCCCTC 10260
TGCTGCTGCA ATGACAGCAG CATCTAGTAT CTGTGTGCTC CCCTCTTCTC AGACTGCAGG 10320
CATGACAGCT GCATCCCCTC CTGGGGAGGC CGAAGAACAC TATAAGCTAC AGCGAGGAAA 10380
CCAGCTCCTA GCTGGCAAAA CCGGTACCCT GACTTCACAG CGGGACCGGG ATCCCGATTC 10440
TGCTCCGGGG ACCCAGCCGT CCAACTTCAC CCAGACAGCA GAAGCTCCTA ACGGTGTGAG 10500
GCTGGAGCAA AACAAGACTT TACCCTCAGC TAAGCCAGCC AGCTCCGCCT CTCCAGGGAG 10560
CTCCCCATCC TCTGGACAGC AGTCAGGAAG TTCCTCAGTG CCAGGTCCCA CTAAACCAAA 10620
ACCAAAAGCC AAACGGATTC AGCTGCCCCT AGACAAGGGG AGCGGCAAGA AGCACAAAGT 10680
TTCCCATTTG CGGACCAGTT CTGAAGCACA CATTCCACAC CGAGACACCG ACCCCGCACC 10740
CCAGCCCTCA GTGACACGGA CTCCTAGAGC GAACAGGGAG CAGCAGGATG CAGCTGGAGT 10800
GGAGCAGCCA TCGCAGAAGG AGTGTGGGCA GCCGGCAGGA CCAGTGGCTG CTCTTCCAGA 10860
GGTCCAGGCA ACACAGAATC CAGCCAATGA GCAAGAAAAT GCAGAACCTA AAGCAATGGA 10920
AGAAGAAGAG AGTGGTTTCA GCTCTCCTCT GATGCTCTGG CTCCAGCAAG AACAAAAGAG 10980
GAAAGAAAGC ATTACTGAGA GGAAGCCCAA GAAAGGACTC GTTTTTGAAA TTTCAAGTGA 11040
TGATGGCTTT CAGATCTGTG CGGAAAGTAT TGAAGATGCC TGGAAGTCAC TGACAGATAA 11100
AGTCCAGGAG GCACGATCAA ATGCCCGCCT GAAGCAGCTC TCATTTGCAG GTGTGAACGG 11160
TTTGCGGATG CTGGGGATTC TCCATGATGC CGTTGTGTTT CTGATTGAGC AGCTGGCTGG 11220
GGCCAAGCAC TGTCGGAATT ACAAATTCCG TTTCCACAAA CCAGAAGAGG CCAACGAACC 11280
CCCCTTGAAC CCTCACGGCT CAGCCAGGGC TGAGGTCCAC CTAAGGCAAT CAGCATTTGA 11340
CATGTTTAAC TTCCTGGCTT CTAAACATCG ACAGCCCCCT GAGTACAACC CTAACGATGA 11400
GGAAGAGGAA GAGGTCCAGC TGAAATCCGC ACGGAGGGCA ACAAGCATGG ATCTCCCAAT 11460
GCCCATGAGA TTCCGGCACT TGAAGAAGAC TTCTAAGGAG GCGGTTGGTG TCTACAGGTC 11520
TCCCATCCAT GGTCGGGGTC TTTTCTGTAA GAGAAACATC GATGCAGGAG AGATGGTGAT 11580
TGAATACGCC GGCAACGTCA TCCGCTCCAT CCAGACAGAC AAGCGTGAGA AGTACTATGA 11640
CAGCAAGGGC ATTGGTTGCT ACATGTTCCG AATTGATGAC TCGGAGGTAG TGGATGCCAC 11700
CATGCATGGA AATGCTGCAC GCTTCATCAA TCACTCTTGT GAGCCTAACT GCTACTCCCG 11760
GGTCATCAAT ATTGATGGGC AGAAGCACAT TGTCATCTTC GCCATGCGTA AGATCTACCG 11820
GGGGGAGGAG CTCACCTATG ACTATAAGTT CCCCATTGAG GACGCCAGCA ACAAGCTACC 11880
CTGCAACTGT GGCGCCAAAA AATGCCGCAA GTTCCTGAAC TAAAGCTGTT CATCTTCCTG 11940
TGATGGAGAA CCAGGACCCA GGGCCACCCA AAGCCATGCT GAAGGACTTC CCAGCACCCA 12000
AGAGCTCCAA GGATTGAGCA GGCAGTTGAG GGTCCTCTGG CTGGTCCCTA GTGTCCTACA 12060
TATACATCAT GTGATCATAG TCTTGGAGAG AGAAGGGTCT CAAAGAAAAG ATCCCCAGAT 12120
GGCTTTCCCC TGGGCCCTCT TTGATTGTTG AAAAACCTGA GAAACTGGTT CCTGGGAGAA 12180
TTTGCCTGCA AGGAGCATGT AGAGGGTTCC TTACAGTGGG TCTGAGCATG TCCTCAGAGA 12240
GCAGTTTGTC ATCCTCATCT TAGCCCTCTC CCTAAAAACG ATGGGTCAGA CAAGACCCCA 12300
GATACAGGGT TGGTGAGATA CCTGGTAGTT TGCCAGTTAG GCCAGTCCTG TGGCCATCTG 12360
TTGAACAAAC AAATGACCTA GTGGTTTTCC CTACTATCTG CCCACTTAAG AGTTCACTTT 12420
GGTTGGGAGA CAGGTTTCCT AGCACCTCCG GTGTCAAAAG GCTGTCTTGG GGTTGTGCCA 12480
ATTAATTACC AAACATTGAG CCTGTGGCTG TAAGTGGGAG TGTTACCCTG TGAGCCTTAC 12540
CGTAGCCAGT GACCTTTCTT GACGATAGGA GCGGCTCCCT CTCCATCCCT TCTCTTTACT 12600
CCCTCCTCCC CTCCTCCATC CTTCATCTGC TGCTTTCCCA TTCTTTCTGG GTAGCGGGAG 12660
CTTGCCTCCC TGCTCAAGGG CACTCCCTAC TTGGTATAGG AAGTGCCTAC AGAAAGTCCC 12720
CCAAGCCAGT AAGCACTCCA GGTGGGGAAT TGGACAGAAG CCGTTGGCCG TAACCAGACG 12780
GAATTTGGAG ATCTCATAAA GCTCCATTGA GAGTTTTAAA GAGATGTATG TAGCGAGGTT 12840
TTTTTAAACA AGAGACTAAA GATTATTTAA ATAGGATTTG AGTCATGCAG CAGCCTGAGT 12900
CCATAGCCAG GATATGCCCA TCCCCTTCCC AGGACGTGCT TACTCTCTTT CCCCTTTCTG 12960
AAGACATAGG AAGATGAGTT TCTAAAAGGT CAGGGTCCAG CTGAAAGAAC ACTAATCAGA 13020
TTTCAAGGCC CCAAACTTGG GGGACTAGAC CACATGTGCT AAGGGACCTC TGCCACCCGT 13080
GTGCAGCCTG TGGCTGAGCA AGTTCAATGA CACTACTGCC CTGGTTACTC CTTAGGGTGT 13140
GGACAGCCAG CAGCAAATGT TTCTTTCTTC CCCCAAGACA GAGTCTTGAA CCTGTTAGAT 13200
TAAGTCATTG GATTTTCCTC TGTTCTGTTT ACAGTTTACT ATTTAAGGTT TTATAATGTA 13260
AATATATTTT GTATATTTTT CTATTTGAAG CACTTCATAG GGAGAAGCAC TTATGACGAG 13320
GCTATTTTTA AACCGCGGTA TTATCCTAAT TTAAAAGAAG ATCGGTTTTT AATGATTTTT 13380
TATTTTCATA GGATGAAGTG AGAGAAAATA TTCAGCTGTC CACACAAAGT CTGGTTTCCC 13440
TGCCCAGCTT CCCCCTGGAA AGTGTACTTT TTGTTGTTCC ATGTGTAGCT CGTTTGTGCC 13500
CATTGACATA AATGTTCCTT GGGTCTGCTC TTTATAGTAA CTGAAAAAGA AGGTCACCCA 13560
CTCCATTAGG CCACTGCCCT CCAGGGCCAA GGACTGAGGG TACAGAACCT TAGCACACCA 13620
GTGTCTCCTC CTCTTTGCTG TATTGCCCCC TCCCTCTGGA TCAGCCCCAG AGTGGGAAGC 13680
AGCAGGCTCC GTTTCGCTCC CCTTTCTCTT TCTGAGTTAG CAACCAAGAA GCTGCAACTT 13740
GACATTCGCC ATCACATCTG CCTCATCCAT CACCTCCTTT CCTTCTCTGC CCACCAAGTC 13800
CTTGTACCCG CAGAGAACCC ACTGACCGCC TCCTGCCCTC TCGGGGCAGA TTGTTGAACC 13860
TGAAGCACAG TATGACCACT CACGATCAAG CAGATCTCTG CGCCTGCCAC AAGGTTTCAG 13920
GGTAGCGTAG TCCGAGTGGA GGGCAGGGCA CCCTTTCTCT TATGGAAGTC AGCAAAGCAA 13980
TATGATGCAG CCCAGAACTC TCTGCCAGGA CTCGTGGCTC TGCTGTGCCT TCCATCCTGG 14040
GCTCCTTTCC TTCTGTGACC TTAAGAACTT TGTCTGGTGG CTTTGCTGGA ACATTGTCAC 14100
TGTTTCCACT GTGGGAAGCC CAGCACTGTG GCCAGGATGG CAGAGATTTC CTTGTCATCA 14160
TGGAGAAGTG CCAGCAGGGG ACTGGGGAAA AGCACTCTAC CCAGACCTCA CCTCCTACCC 14220
TCCTTTGCCC ATGAACAAGA CGCAGTGGCC CAAGGGGTTT CACTAGTGTC TGGTTTCCTT 14280
CTTATTGCAC TGTGTGAGGT TTTTTTGTAA ATCCTTGTAT TCCTAATTTT TTTTTATGAA 14340
AAAATGTAAG CTGCATTTGT TACTGAAAGA TTAAATGCAC TGATGGGTCA TGCGTTCATC 14400
CTGAGAGACC CAAAGGCCAG TCAGAGGGTG GGGGGAACTC AGCTAACAGA CCTAGTCACT 14460
GCCCTGCTAG GCCATGCTGT ACTGTGAGCC CCCTCCTCAT CCTCTTCCTA CAACCCCAAT 14520
CCCTGAGGAC GGGGGGAACC CACCTTTCCT CCTCCTCCAG CTGGTTTGCC TTGCCCTCCC 14580
ACTCACTGTC AACCACAGAA ACGAGAAATT CCTCTTTCAG CTCAGCCTTG AGTCCATTGC 14640
CAAAATTCAG CATACCTGCC AGCAACTTGG GGGATAAGCC AGAGTATCCC CACAAGCGGG 14700
AGAGAAGGCA ACAAAACAGA AGGCACAGCT GTCTCCAAAA CACATCTGCT TTGTTTTGAA 14760
AGTGACCAAG AGAACCTCCG CACAAAAGTG CAGGTTGAGG ACTTTGCGCT GGGTCATTCC 14820
CAAGAATCCC CCAAGGGGCA ACCCACCGCC TCAGGAGTGA CAGCTGCGGA CCTCGAGGGT 14880
TCCGGCTTTG CTGCTAGAAC CTGTTGTGGC TGCGTTTCCT GGTGGCAGTG ACAACTGTGT 14940
AACCAGAATA GCTGCATGGC GCTGACCCTT TGGTCGGAAC TTGGTCCCTG GGCTCCCTCA 15000
GTGCTGCCCA TGTCCAGCAC AACCCCTCCC TGTTTTTAAA CCAATCACAA TTAAGGAGGA 15060
AGCCCTGGCA CTTCTTAGGT TTTCAACCCA AACTCCTTTT TCAGGACCCA GCTCACCTCT 15120
GTCAAACCCC GGCCAATCCA ATAAGCACCA TGCAGCAACC TTGATTTAAA AAAAGAAGAA 15180
AAGAAGAAAA AAAAAAAACT TAAAATAAAA TAAAATAAAA AAATACAACA CACATACACA 15240
AAAAAATCTT TTAATGAATG TATCTTTCTA AAGGACTGAC GCTCAATCAA ATACCTGAAA 15300
ATACTAGAGG TCACAGCCTC ATCTGATGTT AACTTTTATT GGATTTGGGA TTCTTTTTCA 15360
TAGAAACCAA GTTGTTTTTT GTTGTTATTG TTGTTGTTGT TGTTGTTGTT TTTTAAGGAA 15420
AAGCGGGTCA TTGCAAAGGG CTGGGTGTAA TTTTACGTTT CCTTTCCTTC ATTTTAAAGC 15480
AATACAAAGT TATTGAACAG ATAGTTTTGT GCCGAATCAT GAATACCAGT CAAGTCTCAC 15540
ACTCTGAAAA CTTGCAACCT TTTTGTTTGT TTGTTTTGGT TTTCAAATAA ATATAAATAT 15600
ATATATATAT AGGAACTAAT ATAGGAATGC ACCATTGTAA CAAAGCCTAG TTCAGTCCAT 15660
GGCTTTTACT TCTCTTAACA CTATAGATAA GGATTGTGCT ACAGTTGCTA GTGGGGCAGG 15720
GAAATGTCAG GCTCCCAGTG ACAGTGACGG TGGTGCTGAC TCCACATACG GATGACAGAC 15780
GCCCGCCTGT CCCGAGAGGA GCGTGCAGAG CAGATCTGCT CTCACCTGGC TTCTTCCTGC 15840
CTGTGGACGT TGCCAGTCGG TACCTGTACT CCTCGTCTAC TTCCGGTTAT GAATGTTGGG 15900
GTCACCACCT GCATCTAGGG GAAAATTGTG TTCTGTGCTT TCTGGTATCT TGTTCTGGGG 15960
TACACTAGTT CTGTCTTTCA ACCAAGAAAA AAAAATAGAT TTGTGGTGTT TCTTTTCTTG 16020
AACTTTAACA GTCTCCTTAG CAAATACAGG TAGTTGAATA ATTGTTTCAT GAGCTGAACA 16080
GTGGCAAGCT TCATTTCTAG AATAAGACAT TTCTCTACAG CTGTGTCATG TACAATGGAT 16140
CTTTTGGGGG TTTTTTGTTT GTTTTGTTTG CTTCCCTTTT TTTCCTTGTG TTCTTCCAAG 16200
CTTCTGGTTA GAGACAAAGT GGGGGGGGGG AAGGAAAACG TGTCTGAAGC CCATCAGTGT 16260
TAACTCCCTG AGACAGGGAT GAAGGAAAAT ACTTTAATAT TCAAAAAATA ATAATGCTGA 16320
AAGCTCTCTA CGAAAGACTG AATGTAAAAG TAAAAAGTGT ACATAGTTGT AAAAACAAAA 16380
AAAGGAGTTT TTAAACATGT TTATTTTCTA TGCACTTTTT TTTATTTAAG TGATAGTTTA 16440
ATTAATAAAC ATGTCAAGTT TATTGCTGCA 16471
Sequence Source Ensembl
Keyword

KW-0007--Acetylation
KW-0025--Alternative splicing
KW-0090--Biological rhythms
KW-0103--Bromodomain
KW-0156--Chromatin regulator
KW-0181--Complete proteome
KW-0238--DNA-binding
KW-1017--Isopeptide bond
KW-0479--Metal-binding
KW-0489--Methyltransferase
KW-0539--Nucleus
KW-0597--Phosphoprotein
KW-0621--Polymorphism
KW-1185--Reference proteome
KW-0677--Repeat
KW-0949--S-adenosyl-L-methionine
KW-0804--Transcription
KW-0805--Transcription regulation
KW-0808--Transferase
KW-0832--Ubl conjugation
KW-0862--Zinc
KW-0863--Zinc-finger

Interpro

IPR001487--Bromodomain
IPR003889--FYrich_C
IPR003888--FYrich_N
IPR016569--MeTrfase_trithorax
IPR003616--Post-SET_dom
IPR001214--SET_dom
IPR002857--Znf_CXXC
IPR011011--Znf_FYVE_PHD
IPR001965--Znf_PHD
IPR019787--Znf_PHD-finger
IPR013083--Znf_RING/FYVE/PHD

PROSITE

PS50014--BROMODOMAIN_2
PS51543--FYRC
PS51542--FYRN
PS50868--POST_SET
PS50280--SET
PS51058--ZF_CXXC
PS01359--ZF_PHD_1
PS50016--ZF_PHD_2

Pfam

PF05965--FYRC
PF05964--FYRN
PF00628--PHD
PF00856--SET
PF02008--zf-CXXC

Gene Ontology

GO:0005737--C:cytoplasm
GO:0035097--C:histone methyltransferase complex
GO:0071339--C:MLL1 complex
GO:0005654--C:nucleoplasm
GO:0005634--C:nucleus
GO:0003682--F:chromatin binding
GO:0001046--F:core promoter sequence-specific DNA binding
GO:0003677--F:DNA binding
GO:0042800--F:histone methyltransferase activity (H3-K4 specific)
GO:0042802--F:identical protein binding
GO:0070577--F:lysine-acetylated histone binding
GO:0042803--F:protein homodimerization activity
GO:0044212--F:transcription regulatory region DNA binding
GO:0045322--F:unmethylated CpG binding
GO:0008270--F:zinc ion binding
GO:0009952--P:anterior/posterior pattern specification
GO:0032922--P:circadian regulation of gene expression
GO:0050890--P:cognition
GO:0060216--P:definitive hemopoiesis
GO:0006306--P:DNA methylation
GO:0035162--P:embryonic hemopoiesis
GO:0035640--P:exploration behavior
GO:0044648--P:histone H3-K4 dimethylation
GO:0051568--P:histone H3-K4 methylation
GO:0080182--P:histone H3-K4 trimethylation
GO:0043984--P:histone H4-K16 acetylation
GO:0048873--P:homeostasis of number of cells within a tissue
GO:0051899--P:membrane depolarization
GO:0008285--P:negative regulation of cell proliferation
GO:0018026--P:peptidyl-lysine monomethylation
GO:2001040--P:positive regulation of cellular response to drug
GO:0051571--P:positive regulation of histone H3-K4 methylation
GO:0045944--P:positive regulation of transcription from RNA polymerase II promoter
GO:0045893--P:positive regulation of transcription, DNA-templated
GO:0032411--P:positive regulation of transporter activity
GO:0009791--P:post-embryonic development
GO:0006461--P:protein complex assembly
GO:0010468--P:regulation of gene expression
GO:0071440--P:regulation of histone H3-K14 acetylation
GO:1901674--P:regulation of histone H3-K27 acetylation
GO:0051569--P:regulation of histone H3-K4 methylation
GO:2000615--P:regulation of histone H3-K9 acetylation
GO:0048172--P:regulation of short-term neuronal synaptic plasticity
GO:0009416--P:response to light stimulus
GO:0035864--P:response to potassium ion
GO:0048536--P:spleen development
GO:0006351--P:transcription, DNA-templated
GO:0008542--P:visual learning

Orthology
WERAM ID         Ensembl Protein ID    Species                     Identity  E-value  Score
WERAM-Ran-0100   ENSRNOP00000020573.6  Rattus norvegicus           96        0.0      4511
WERAM-Eqc-0038   ENSECAP00000006426.1  Equus caballus              90        0.0      4218
WERAM-Otg-0095   ENSOGAP00000008374.2  Otolemur garnettii          90        0.0      4193
WERAM-Aim-0147   ENSAMEP00000013764.1  Ailuropoda melanoleuca      89        0.0      4192
WERAM-Ova-0104   ENSOARP00000010552.1  Ovis aries                  89        0.0      4183
WERAM-Caf-0134   ENSCAFP00000018720.5  Canis familiaris            89        0.0      4170
WERAM-Nol-0074   ENSNLEP00000008984.1  Nomascus leucogenys         90        0.0      4148
WERAM-Mam-0038   ENSMMUP00000007251.2  Macaca mulatta              89        0.0      4148
WERAM-Loa-0136   ENSLAFP00000012406.4  Loxodonta africana          89        0.0      4147
WERAM-Chs-0032   ENSCSAP00000015460.1  Chlorocebus sabaeus         89        0.0      4142
WERAM-Bot-0163   ENSBTAP00000024084.5  Bos taurus                  89        0.0      4138
WERAM-Hos-0096   ENSP00000374157.5     Homo sapiens                89        0.0      4135
WERAM-Pat-0034   ENSPTRP00000040970.5  Pan troglodytes             89        0.0      4133
WERAM-Poa-0036   ENSPPYP00000004505.2  Pongo abelii                89        0.0      4129
WERAM-Paa-0065   ENSPANP00000009106.1  Papio anubis                89        0.0      4121
WERAM-Cap-0070   ENSCPOP00000005242.2  Cavia porcellus             89        0.0      4076
WERAM-Orc-0100   ENSOCUP00000008738.2  Oryctolagus cuniculus       88        0.0      4073
WERAM-Tut-0049   ENSTTRP00000004041.1  Tursiops truncatus          87        0.0      4023
WERAM-Myl-0155   ENSMLUP00000012825.2  Myotis lucifugus            84        0.0      3934
WERAM-Sah-0138   ENSSHAP00000014807.1  Sarcophilus harrisii        82        0.0      3849
WERAM-Mod-0116   ENSMODP00000016853.4  Monodelphis domestica       83        0.0      3848
WERAM-Ptv-0056   ENSPVAP00000005910.1  Pteropus vampyrus           83        0.0      3831
WERAM-Prc-0053   ENSPCAP00000005189.1  Procavia capensis           85        0.0      3830
WERAM-Dio-0068   ENSDORP00000006787.1  Dipodomys ordii             83        0.0      3719
WERAM-Ocp-0107   ENSOPRP00000010334.2  Ochotona princeps           83        0.0      3675
WERAM-Ict-0127   ENSSTOP00000012923.2  Ictidomys tridecemlineatus  88        0.0      3642
WERAM-Gaga-0075  ENSGALP00000011008.4  Gallus gallus               78        0.0      3536
WERAM-Gog-0013   ENSGGOP00000000936.2  Gorilla gorilla             88        0.0      3525
WERAM-Tag-0002   ENSTGUP00000000072.1  Taeniopygia guttata         78        0.0      3524
WERAM-Fia-0123   ENSFALP00000010386.1  Ficedula albicollis         78        0.0      3503
WERAM-Pes-0058   ENSPSIP00000007945.1  Pelodiscus sinensis         78        0.0      3502
WERAM-Anp-0040   ENSAPLP00000004456.1  Anas platyrhynchos          79        0.0      3501
WERAM-Meg-0028   ENSMGAP00000002448.2  Meleagris gallopavo         75        0.0      3313
WERAM-Soa-0133   ENSSARP00000013142.1  Sorex araneus               80        0.0      3070
WERAM-Dan-0088   ENSDNOP00000008968.3  Dasypus novemcinctus        84        0.0      3051
WERAM-Tas-0109   ENSTSYP00000011196.1  Tarsius syrichta            83        0.0      2882
WERAM-Anc-0073 ENSACAP00000006937.3 Anolis carolinensis 65 0.0 2804
WERAM-Lac-0176 ENSLACP00000020625.1 Latimeria chalumnae 65 0.0 2787
WERAM-Mup-0053 ENSMPUP00000004604.1 Mustela putorius furo 89 0.0 2768
WERAM-Chh-0055 ENSCHOP00000006343.1 Choloepus hoffmanni 88 0.0 2723
WERAM-Tub-0079 ENSTBEP00000009516.1 Tupaia belangeri 84 0.0 2698
WERAM-Ect-0072 ENSETEP00000007880.1 Echinops telfairi 81 0.0 2642
WERAM-Mae-0003 ENSMEUP00000000157.1 Macropus eugenii 79 0.0 2352
WERAM-Vip-0054 ENSVPAP00000005197.1 Vicugna pacos 87 0.0 1788
WERAM-Ora-0035 ENSOANP00000005967.3 Ornithorhynchus anatinus 83 0.0 1345
WERAM-Leo-0025 ENSLOCP00000004893.1 Lepisosteus oculatus 76 0.0 1255
WERAM-Xet-0067 ENSXETP00000022279.3 Xenopus tropicalis 74 0.0 1191
WERAM-Ere-0135 ENSEEUP00000014133.1 Erinaceus europaeus 79 0.0 1131
WERAM-Dar-0010 ENSDARP00000095298.3 Danio rerio 68 0.0 1129
WERAM-Mim-0138 ENSMICP00000013960.1 Microcebus murinus 84 0.0 1124
WERAM-Orn-0070 ENSONIP00000007849.1 Oreochromis niloticus 69 0.0 1123
WERAM-Gaa-0091 ENSGACP00000011913.1 Gasterosteus aculeatus 70 0.0 1123
WERAM-Pof-0188 ENSPFOP00000015902.2 Poecilia formosa 69 0.0 1088
WERAM-Xim-0097 ENSXMAP00000008788.1 Xiphophorus maculatus 64 0.0 1069
WERAM-Ten-0223 ENSTNIP00000002397.1 Tetraodon nigroviridis 66 0.0 1067
WERAM-Gam-0119 ENSGMOP00000012588.1 Gadus morhua 66 0.0 1048
WERAM-Orla-0075 ENSORLP00000009606.2 Oryzias latipes 45 0.0 865
WERAM-Sus-0021 ENSSSCP00000003118.2 Sus scrofa 48 0.0 770
WERAM-Caj-0107 ENSCJAP00000018740.2 Callithrix jacchus 50 0.0 767
WERAM-Fec-0070 ENSFCAP00000005933.3 Felis catus 50 0.0 766
WERAM-Tar-0189 ENSTRUP00000038926.1 Takifugu rubripes 50 0.0 728
WERAM-Asm-0037 ENSAMXP00000004791.1 Astyanax mexicanus 46 0.0 719
WERAM-Pem-0014 ENSPMAP00000002218.1 Petromyzon marinus 59 0.0 690
WERAM-Cis-0045 ENSCSAVP00000009955.1 Ciona savignyi 36 1e-131 470
WERAM-Cii-0034 ENSCINP00000025384.2 Ciona intestinalis 54 3e-100 366
WERAM-Drm-0010 FBpp0082406 Drosophila melanogaster 47 8e-89 328
WERAM-Tum-0027 CAZ85029 Tuber melanosporum 50 3e-42 173
WERAM-Cae-0021 C26E6.9a Caenorhabditis elegans 54 4e-42 173
WERAM-Php-0006 PP1S101_4V6.1 Physcomitrella patens 54 5e-42 172
WERAM-Org-0116 ORGLA12G0159200.1 Oryza glaberrima 54 6e-42 172
WERAM-Ors-0112 OS12T0613200-02 Oryza sativa 54 8e-42 172
WERAM-Sei-0078 Si021071m Setaria italica 54 2e-41 171
WERAM-Orbr-0127 OB12G25360.1 Oryza brachyantha 53 2e-41 171
WERAM-Asn-0015 CADANIAP00003254 Aspergillus nidulans 53 3e-41 170
WERAM-Brd-0092 BRADI4G01790.1 Brachypodium distachyon 54 3e-41 170
WERAM-Ast-0003 CADATEAP00001100 Aspergillus terreus 52 3e-41 170
WERAM-Coi-0035 EAS31778 Coccidioides immitis 52 4e-41 170
WERAM-Asc-0034 CADACLAP00008186 Aspergillus clavatus 52 4e-41 169
WERAM-Asni-0037 CADANGAP00014055 Aspergillus niger 52 4e-41 169
WERAM-Aso-0006 CADAORAP00000676 Aspergillus oryzae 52 5e-41 169
WERAM-Thc-0094 EOY15831 Theobroma cacao 55 5e-41 169
WERAM-Prp-0018 EMJ21490 Prunus persica 55 8e-41 169
WERAM-Sol-0003 Solyc01g006880.2.1 Solanum lycopersicum 53 8e-41 169
Created Date 25-Jun-2016