----------------------------------------------------------------------------
DATA Set 2: COG0532
----------------------------------------------------------------------------


Organism: Caulobacter crescentus CB15
Description: translation initiation factor IF-2
PID: 16124298
Length of upstream non-coding sequence: 195
Label: CC0042

Organism: Rickettsia prowazekii
Description: TRANSLATION INITIATION FACTOR IF-2 (infB)
PID: 15604407
Length of upstream non-coding sequence: 214
Label: RP552

Organism: Synechocystis sp. PCC 6803
Description: initiation factor IF-2
PID: 16329288
Length of upstream non-coding sequence: 578
Label: slr0744

Organism: Mycobacterium leprae
Description: initiation factor IF-2
PID: 15827818
Length of upstream non-coding sequence: 65
Label: ML1556

Organism: Mycobacterium tuberculosis H37Rv
Description: infB
PID: 15609976
Length of upstream non-coding sequence: 86
Label: Rv2839c

CC0042          --------------MEVRPGPFLTWNERFFSRVPPGGSTSERMSDENENGRPGGRTPMTL
RP552           ------------------------------------------MTDNQE-----------I
slr0744         MNNAKVRIYDLSKELNLENRDILDICERLNVAAKSHSSTISESDAERIKAAAEKFTPQQP
ML1556          ------------------MAGKARVHELAKELGVTSKEVLARLNEQGEFVKSASSTVEAP
Rv2839c         -----------------MAAGKARVHELAKELGVTSKEVLARLSEQGEFVKSASSTVEAP
                                                             :              

CC0042          KPRQGSVSAGVVKQSFSHGRTKTVVVETKRTRTHAPASGNLAAPSSAERRHGEAPAPRPA
RP552           KPKK-----------LTLGNSKLLLNKSFDSLTGAQSFVNAKSKTLVEVRK---------
slr0744         KKPRVASRPESKEDKSDPKQQKILAIHHKQEKSGGPSPARPTPPPRPKLQAPKAPTPPQP
ML1556          VARRLRESFGGIKPAADKGAEQVATKAQAKRLGESLDQTLDRALDKAVAGNGATTAAPVQ
Rv2839c         VARRLRESFGGSKPAPAKGTAKSPGKGPDK----SLDKALDAAID-MAAGNGKATAAPAK
                   :                 :            .       .                 

CC0042          PPQGGGGGSAGGLSQEELRARQRVVDAAREAQARQVAEQAAAEARARAAQEAAQREAAAK
RP552           --SSIGSTTTISLNKERNSLDQSVIDSNKE--------------------EFNRRLSILK
slr0744         PVAKASAPKIQKQEEPAQEAPKSVAPPTQPLAPPPVPSLQSPPS------KPAPPTPPAK
ML1556          VDHSAAVVPIVAGEGPSTAHREELAPPAGQPS------------------EQPGVPLPGQ
Rv2839c         AADSGGAA-IVSPTTPAAPEPPTAVPP---------------------------------
                     .                    .                                 

CC0042          AAAERAAAAPPPVAQAPAAPAPAAPVTPPPAAPQAPRPVAQAPVAPSAPRQDAPRQDTRA
RP552           KAAEQSKLHDP---------AQISTLSKLASINQSINSKNEQSITDKAVEQKHQNIE---
slr0744         KAAPAPRLAGPPGRTASPNKTAVPAPAKPKVNRPEIVSLKDNRGQARSPGDREEKVAIAA
ML1556          QGTPAAPHPGHPGMPTGPHPGPAPKPGGRPPRVGNNPFSSAQSVARPIPRPPAPRP----
Rv2839c         --SPQAPHP---GMAPGARPGPVPKPGIRTPRVGNNPFSSAQPADRPIPRPPAPRPGTAR
                  :  .                 .                              .     

CC0042          AAPGQTRTYEPSRDRRDDRPSTTTYRPAPQGDRPFNQRAPRPDANANFGQRAPRPEGDRP
RP552           -------------------------------DNKVEIAAKIVQDNENISSQIPKKKKE--
slr0744         PEPPKPK------------------------VELRRPKPPRPEEDENLPELLEFPPLSRG
ML1556          ---------------------------------SASPSSMSPRPGGAVGGGGPRPPRT--
Rv2839c         PGVPR---------------------------PGASPGSMPPRPGGAVGG--ARPPRP--
                                                      .     .  .            

CC0042          RGPRPDGDRPQGDRGGYRGDRPQGDRPQGDRPQQTVRYSALAPRPAPGARGPGGPPRGPR
RP552           ---------------------------------------TLAKSVLVGMR--------TR
slr0744         KGVDGDNDADDGD--------------------------LLSTEKPKPKLKRPTPPRLGK
ML1556          -------------------------------------------GVPRPGGGRPGAPVGGR
Rv2839c         -------------------------------------------GAPRPGG-RPGAPGAGR
                                                                           :

CC0042          PGVPAAAPATPEIQRATRSAPRPGGGAMDRRPDEDDDRRKNAAPNKAVSRVKGAPQRREG
RP552           YGIEEEP-------------------ALEKTVDN-----KVVVPKIKLEESK----KFKK
slr0744         PDQWEDDEDEKANKAKAANKGKRRPKMDDDDDDLDIDGDNGPKPTLVSLSIARPPKPKSL
ML1556          SDAGGGN---------------YRGGGVGALPGGGSGGFRGRPGGGGHGGGGRPGQRGGA
Rv2839c         SDAGGGN---------------YRGGGVGAAPGTG---FRGRPGGG---GGGRPGQRGGA
                 .                              .      .                    

CC0042          RLTIQAVAGDGDSADRMRSLASVRRAREREKEKRRGGAVEQARVAREVVIPDVITVQELS
RP552           ADLFNMLSDDENGSGRTRSLASIKRARE--KEKRKLVSQVPEKVYREVTIPEVIGVGDLA
slr0744         AAKPSTPTVAKVKKPTLKSEAGSSAGGSSRSRGDRRDRKEVVQKPEVIMLDRSLTVRDLA
ML1556          AGAFGRPGGAPRRGR--KSKRQKRQEYDSMQAPVVGGVRLPHGNGETIRLARGASLSDFA
Rv2839c         AGAFGRPGGAPRRGR--KSKRQKRQEYDSMQAPVVGGVRLPHGNGETIRLARGASLSDFA
                                 :*        .  .              . : :     : :::

CC0042          NRMAVRGVDIIKFLMRQGVMLKINDVIDNDTAELVATEFGHTVKRVS--EADVEEGFIGA
RP552           NAMSERVADVIKELMKLGILANASQTIDADTAELVATNLGHTVTRVQ--ESDVEN-ILIN
slr0744         DLLKISETDIIKRLFLKGVAVQITQTLDEETARMVAESFEVAVETPERVAAAAKTTEMLD
ML1556          EKIDANPAALVQALFNLGEMVTATQSVGDETLELLGSEMNYNVQVVSPEDEDRELLEPFD
Rv2839c         DKIDANPAALVQALFNLGEMVTATQSVGDETLELLGSEMNYNVQVVSPEDEDRELLESFD
                : :    . ::: *:  *     .: :. :* .::. .:   *   .      :      

CC0042          DDH------DEHMDLRPPVVTIMGHVDHGKTSLLDALRSTDVAAGEAGGITQHIGAYQVR
RP552           DDK------VEDLRTRAPVVTVMGHVDHGKTSLLDALKSTDIAAGELGGITQHIGAYRVT
slr0744         EAD------LDNLVRRPPVVTIMGHVDHGKTTLLDSIRKTKVAQGEAGGITQHIGAYHVE
ML1556          LTYGEDQGDEDELQVRPPVVTVMGHVDHGKTRLLDTIRKANVREAEAGGITQHIGAYQVG
Rv2839c         LSYGEDEGGEEDLQVRPPVVTVMGHVDHGKTRLLDTIRKANVREAEAGGITQHIGAYQVA
                          :.:  *.****:********* ***:::.:.:  .* **********:* 

CC0042          LKD---GQRVTFLDTPGHAAFSSMRARGANITDIVVLVVAGDDGVMPQTIEAIKHAKAAE
RP552           LAD---SKAITFIDTPGHEAFSEMRSRGAKVTDIVIIVVAADDGIKTQTVEAINHAKAAG
slr0744         VEHNDKTEQIVFLDTPGHEAFTAMRARGAKVTDIAILVVAADDGVQPQTKEAISHAKAAG
ML1556          VDLDGSERLITFIDTPGHEAFTAMRARGAKATDIAILVVAADDGVMPQTVEAINHAQAAD
Rv2839c         VDLDGSQRLITFIDTPGHEAFTAMRARGAKATDIAILVVAADDGVMPQTVEAINHAQAAD
                :      . :.*:***** **: **:***: ***.::***.***: .** ***.**:** 

CC0042          VPIIVAVNKMDKPGSDPTRVVNELLQHEIVVESLGGDTQLIEVSAKARTGLDNLLEAILL
RP552           VPIIVAINKIDKPDIDIERIKNELYVHEIIGEEAGGDVIFIPISALKKINLDKLEEAILL
slr0744         VPLIVAINKVDKPEANPDRIKQELSELGLLAEEWGGDTIMVPVSALNGDNLDGLLEMILL
ML1556          VPIVVAVNKIDKEGADPAKIRGQLTEYGLVAEDFGGDTMFIDISAKVGTNIEALLEAVLL
Rv2839c         VPIVVAVNKIDKEGADPAKIRGQLTEYGLVPEEFGGDTMFVDISAKQGTNIEALEEAVLL
                **::**:**:**   :  ::  :*    :: *. ***. :: :**    .:: * * :**

CC0042          QA-EVLDLKANPDRSADGVVIEAKLDKGRGAVSTVLVNRGTLKRGDIVVAGSQWGKVRAL
RP552           IS-EMQDLKASPFGLASGVVIESKIEKGRGTLTTILVQRGTLRNGDIIIAGTSYGKVKKM
slr0744         VS-EVEELVANPNRQAKGTVIEANLDRTRGPVATLLIQNGTLRVGDAIVVGAVYGKIRAM
ML1556          TADAALDLRANSGMEAQGVAIEAHLDRGRGPVATVLVQRGTLRIGDSVVAGDAYGRVRRM
Rv2839c         TADAALDLRANPDMEAQGVAIEAHLDRGRGPVATVLVQRGTLRVGDSVVAGDAYGRVRRM
                 :    :* *..   *.*..**::::: **.::*:*::.***: ** ::.*  :*::: :

CC0042          LNERNEQLQEAGPATPVEILGLDGVPSPGDAFAVVENEARARELTEYRIRLKREKSMAPV
RP552           INDKGREILEATPSVPVEIQGLNEVPFAGDQFNVVQNEKQAKDIAEYRIRLAKEKKIS-V
slr0744         IDDRGDKVEEASPSFAVEILGLGDVPAAGDEFEVFTNEKDARLQAEARAMEDRQTRLQQA
ML1556          VDEHGVDIEAALPSSPVQVIGFTSVPGAGDNFLVVDEDRIARQIADRRSARKRNALAARS
Rv2839c         VDEHGEDVEVALPSRPVQVIGFTSVPGAGDNFLVVDEDRIARQIADRRSARKRNALAARS
                ::::. .:  * *: .*:: *:  ** .** * *. ::  *:  :: *    ::      

CC0042          GAGASMADMMAKLQ--DKKLKELPLVIKADVQGSAEAIIGSLDKMATD-EVRARIILSGA
RP552           ASRSSLEELLLKASG-NSKIKELPLIIKCDVQGSIEAISGSLLKLPSD-EIKLRILHSGV
slr0744         MSSRKVTLSSISAQAQEGELKELNIILKADVQGSLGAILGSLEQLPQG-EVQIRVLLASP
ML1556          RKRISLEDLDSALK----ETSQLNLILKGDNAGTVEALEEALMGIQVDDEVALRVIDRGV
Rv2839c         RKRISLEDLDSALK----ETSQLNLILKGDNAGTVEALEEALMGIQVDDEVVLRVIDRGV
                    .:       .    : .:* :::* *  *:  *:  :*  :  . *:  *::  . 

CC0042          GAISESDVMLAKGAGAPVIGFNVRASAQARALAEREGVEIRYYAIIYDLLDDIKGVLSGM
RP552           GPITESDVSLAHVSSAIVVGFNVRAWANALTAAEKTKVDIRYYSIIYNLIDDVKAIMSGM
slr0744         GEVTETDVDLAAASGAIIIGFNTTLASGARQAADQEGVDIREYDIIYKLLDDIQGAMEGL
ML1556          GGITETNVNLASASDAIIIGFNVRAEGKATELASREGVEIRYYLVIYQAIDDIEKALRGM
Rv2839c         GGITETNVNLASASDAVIIGFNVRAEGKATELASREGVEIRYYSVIYQAIDEIEQALRGL
                * ::*::* **  :.* ::***.   . *   *.:  *:** * :**. :*:::  : *:

CC0042          LAPIQRETFLGNAEVLQAFDISKIGKVAGCKVTEGVVRKGAKVRIIRQDIVVLELGTLQT
RP552           LEPIVREQYIGSVEIRQIFNITKIGKIAGSYVTKGIIKKGAGVRLLR-DNVVIHAGKLKT
slr0744         LDPEEIESSLGTAEVRAVFPVGRG-NIAGCYVQSGKIIRNRNLRVRRGDQVLFEG-NIDS
ML1556          LKPIYEENQLGRAEIRALFRSSKVGIIAGCIISSGVVRRNAKVRLLRDNIVVTDNLTVTS
Rv2839c         LKPIYEENQLGRAEIRALFRSSKVGLIAGCLVTSGVMRRNAKARLLRDNIVVAENLSIAS
                * *   *  :* .*:   *   :   :**. : .* : :.   *: * : *: .  .: :

CC0042          LKRFKDEVNEVPVGQECGMMFAGFQDIKVGDTIECFTVEEIKRQLD-
RP552           LKRFKDEVKEVREGYECGIAFENYEDIREGDTVEVFELVQEQRQL--
slr0744         LKRIKEDVREVNAGYECGIGCSKFNDWKEGDIIEAYEMTMKRRTLAT
ML1556          LRREKDDVTEVREGFECGMTLG-YSDIKEGDVIESYELVEKQRA---
Rv2839c         LRREKDDVTEVRDGFECGLTLG-YADIKEGDVIESYELVQKERA---
                *:* *::* **  * ***:    : * : ** :* : :   .*