----------------------------------------------------------------------------
DATA Set 2: COG0532
----------------------------------------------------------------------------

Organism: Caulobacter crescentus CB15
Description: translation initiation factor IF-2
PID: 16124298
Length of upstream non-coding sequence: 195
Label: CC0042

Organism: Rickettsia prowazekii
Description: TRANSLATION INITIATION FACTOR IF-2 (infB)
PID: 15604407
Length of upstream non-coding sequence: 214
Label: RP552

Organism: Synechocystis sp. PCC 6803
Description: initiation factor IF-2
PID: 16329288
Length of upstream non-coding sequence: 578
Label: slr0744

Organism: Mycobacterium leprae
Description: initiation factor IF-2
PID: 15827818
Length of upstream non-coding sequence: 65
Label: ML1556

Organism: Mycobacterium tuberculosis H37Rv
DEFINITION infB
PID: 15609976
Length of upstream non-coding sequence: 86
Label: Rv2839c

CC0042   --------------MEVRPGPFLTWNERFFSRVPPGGSTSERMSDENENGRPGGRTPMTL
RP552    ------------------------------------------MTDNQE-----------I
slr0744  MNNAKVRIYDLSKELNLENRDILDICERLNVAAKSHSSTISESDAERIKAAAEKFTPQQP
ML1556   ------------------MAGKARVHELAKELGVTSKEVLARLNEQGEFVKSASSTVEAP
Rv2839c  -----------------MAAGKARVHELAKELGVTSKEVLARLSEQGEFVKSASSTVEAP
         :

CC0042   KPRQGSVSAGVVKQSFSHGRTKTVVVETKRTRTHAPASGNLAAPSSAERRHGEAPAPRPA
RP552    KPKK-----------LTLGNSKLLLNKSFDSLTGAQSFVNAKSKTLVEVRK---------
slr0744  KKPRVASRPESKEDKSDPKQQKILAIHHKQEKSGGPSPARPTPPPRPKLQAPKAPTPPQP
ML1556   VARRLRESFGGIKPAADKGAEQVATKAQAKRLGESLDQTLDRALDKAVAGNGATTAAPVQ
Rv2839c  VARRLRESFGGSKPAPAKGTAKSPGKGPDK----SLDKALDAAID-MAAGNGKATAAPAK
         : : . .

CC0042   PPQGGGGGSAGGLSQEELRARQRVVDAAREAQARQVAEQAAAEARARAAQEAAQREAAAK
RP552    --SSIGSTTTISLNKERNSLDQSVIDSNKE--------------------EFNRRLSILK
slr0744  PVAKASAPKIQKQEEPAQEAPKSVAPPTQPLAPPPVPSLQSPPS------KPAPPTPPAK
ML1556   VDHSAAVVPIVAGEGPSTAHREELAPPAGQPS------------------EQPGVPLPGQ
Rv2839c  AADSGGAA-IVSPTTPAAPEPPTAVPP---------------------------------
         . .

CC0042   AAAERAAAAPPPVAQAPAAPAPAAPVTPPPAAPQAPRPVAQAPVAPSAPRQDAPRQDTRA
RP552    KAAEQSKLHDP---------AQISTLSKLASINQSINSKNEQSITDKAVEQKHQNIE---
slr0744  KAAPAPRLAGPPGRTASPNKTAVPAPAKPKVNRPEIVSLKDNRGQARSPGDREEKVAIAA
ML1556   QGTPAAPHPGHPGMPTGPHPGPAPKPGGRPPRVGNNPFSSAQSVARPIPRPPAPRP----
Rv2839c  --SPQAPHP---GMAPGARPGPVPKPGIRTPRVGNNPFSSAQPADRPIPRPPAPRPGTAR
         : . . .

CC0042   AAPGQTRTYEPSRDRRDDRPSTTTYRPAPQGDRPFNQRAPRPDANANFGQRAPRPEGDRP
RP552    -------------------------------DNKVEIAAKIVQDNENISSQIPKKKKE--
slr0744  PEPPKPK------------------------VELRRPKPPRPEEDENLPELLEFPPLSRG
ML1556   ---------------------------------SASPSSMSPRPGGAVGGGGPRPPRT--
Rv2839c  PGVPR---------------------------PGASPGSMPPRPGGAVGG--ARPPRP--
         . . .

CC0042   RGPRPDGDRPQGDRGGYRGDRPQGDRPQGDRPQQTVRYSALAPRPAPGARGPGGPPRGPR
RP552    ---------------------------------------TLAKSVLVGMR--------TR
slr0744  KGVDGDNDADDGD--------------------------LLSTEKPKPKLKRPTPPRLGK
ML1556   -------------------------------------------GVPRPGGGRPGAPVGGR
Rv2839c  -------------------------------------------GAPRPGG-RPGAPGAGR
         :

CC0042   PGVPAAAPATPEIQRATRSAPRPGGGAMDRRPDEDDDRRKNAAPNKAVSRVKGAPQRREG
RP552    YGIEEEP-------------------ALEKTVDN-----KVVVPKIKLEESK----KFKK
slr0744  PDQWEDDEDEKANKAKAANKGKRRPKMDDDDDDLDIDGDNGPKPTLVSLSIARPPKPKSL
ML1556   SDAGGGN---------------YRGGGVGALPGGGSGGFRGRPGGGGHGGGGRPGQRGGA
Rv2839c  SDAGGGN---------------YRGGGVGAAPGTG---FRGRPGGG---GGGRPGQRGGA
         . . .

CC0042   RLTIQAVAGDGDSADRMRSLASVRRAREREKEKRRGGAVEQARVAREVVIPDVITVQELS
RP552    ADLFNMLSDDENGSGRTRSLASIKRARE--KEKRKLVSQVPEKVYREVTIPEVIGVGDLA
slr0744  AAKPSTPTVAKVKKPTLKSEAGSSAGGSSRSRGDRRDRKEVVQKPEVIMLDRSLTVRDLA
ML1556   AGAFGRPGGAPRRGR--KSKRQKRQEYDSMQAPVVGGVRLPHGNGETIRLARGASLSDFA
Rv2839c  AGAFGRPGGAPRRGR--KSKRQKRQEYDSMQAPVVGGVRLPHGNGETIRLARGASLSDFA
         :* . . . : : : :::

CC0042   NRMAVRGVDIIKFLMRQGVMLKINDVIDNDTAELVATEFGHTVKRVS--EADVEEGFIGA
RP552    NAMSERVADVIKELMKLGILANASQTIDADTAELVATNLGHTVTRVQ--ESDVEN-ILIN
slr0744  DLLKISETDIIKRLFLKGVAVQITQTLDEETARMVAESFEVAVETPERVAAAAKTTEMLD
ML1556   EKIDANPAALVQALFNLGEMVTATQSVGDETLELLGSEMNYNVQVVSPEDEDRELLEPFD
Rv2839c  DKIDANPAALVQALFNLGEMVTATQSVGDETLELLGSEMNYNVQVVSPEDEDRELLESFD
         : : . ::: *: * .: :. :* .::. .: * . :

CC0042   DDH------DEHMDLRPPVVTIMGHVDHGKTSLLDALRSTDVAAGEAGGITQHIGAYQVR
RP552    DDK------VEDLRTRAPVVTVMGHVDHGKTSLLDALKSTDIAAGELGGITQHIGAYRVT
slr0744  EAD------LDNLVRRPPVVTIMGHVDHGKTTLLDSIRKTKVAQGEAGGITQHIGAYHVE
ML1556   LTYGEDQGDEDELQVRPPVVTVMGHVDHGKTRLLDTIRKANVREAEAGGITQHIGAYQVG
Rv2839c  LSYGEDEGGEEDLQVRPPVVTVMGHVDHGKTRLLDTIRKANVREAEAGGITQHIGAYQVA
         :.: *.****:********* ***:::.:.: .* **********:*

CC0042   LKD---GQRVTFLDTPGHAAFSSMRARGANITDIVVLVVAGDDGVMPQTIEAIKHAKAAE
RP552    LAD---SKAITFIDTPGHEAFSEMRSRGAKVTDIVIIVVAADDGIKTQTVEAINHAKAAG
slr0744  VEHNDKTEQIVFLDTPGHEAFTAMRARGAKVTDIAILVVAADDGVQPQTKEAISHAKAAG
ML1556   VDLDGSERLITFIDTPGHEAFTAMRARGAKATDIAILVVAADDGVMPQTVEAINHAQAAD
Rv2839c  VDLDGSQRLITFIDTPGHEAFTAMRARGAKATDIAILVVAADDGVMPQTVEAINHAQAAD
         : . :.*:***** **: **:***: ***.::***.***: .** ***.**:**

CC0042   VPIIVAVNKMDKPGSDPTRVVNELLQHEIVVESLGGDTQLIEVSAKARTGLDNLLEAILL
RP552    VPIIVAINKIDKPDIDIERIKNELYVHEIIGEEAGGDVIFIPISALKKINLDKLEEAILL
slr0744  VPLIVAINKVDKPEANPDRIKQELSELGLLAEEWGGDTIMVPVSALNGDNLDGLLEMILL
ML1556   VPIVVAVNKIDKEGADPAKIRGQLTEYGLVAEDFGGDTMFIDISAKVGTNIEALLEAVLL
Rv2839c  VPIVVAVNKIDKEGADPAKIRGQLTEYGLVPEEFGGDTMFVDISAKQGTNIEALEEAVLL
         **::**:**:** : :: :* :: *. ***. :: :** .:: * * :**

CC0042   QA-EVLDLKANPDRSADGVVIEAKLDKGRGAVSTVLVNRGTLKRGDIVVAGSQWGKVRAL
RP552    IS-EMQDLKASPFGLASGVVIESKIEKGRGTLTTILVQRGTLRNGDIIIAGTSYGKVKKM
slr0744  VS-EVEELVANPNRQAKGTVIEANLDRTRGPVATLLIQNGTLRVGDAIVVGAVYGKIRAM
ML1556   TADAALDLRANSGMEAQGVAIEAHLDRGRGPVATVLVQRGTLRIGDSVVAGDAYGRVRRM
Rv2839c  TADAALDLRANPDMEAQGVAIEAHLDRGRGPVATVLVQRGTLRVGDSVVAGDAYGRVRRM
         : :* *.. *.*..**::::: **.::*:*::.***: ** ::.* :*::: :

CC0042   LNERNEQLQEAGPATPVEILGLDGVPSPGDAFAVVENEARARELTEYRIRLKREKSMAPV
RP552    INDKGREILEATPSVPVEIQGLNEVPFAGDQFNVVQNEKQAKDIAEYRIRLAKEKKIS-V
slr0744  IDDRGDKVEEASPSFAVEILGLGDVPAAGDEFEVFTNEKDARLQAEARAMEDRQTRLQQA
ML1556   VDEHGVDIEAALPSSPVQVIGFTSVPGAGDNFLVVDEDRIARQIADRRSARKRNALAARS
Rv2839c  VDEHGEDVEVALPSRPVQVIGFTSVPGAGDNFLVVDEDRIARQIADRRSARKRNALAARS
         ::::. .: * *: .*:: *: ** .** * *. :: *: :: * ::

CC0042   GAGASMADMMAKLQ--DKKLKELPLVIKADVQGSAEAIIGSLDKMATD-EVRARIILSGA
RP552    ASRSSLEELLLKASG-NSKIKELPLIIKCDVQGSIEAISGSLLKLPSD-EIKLRILHSGV
slr0744  MSSRKVTLSSISAQAQEGELKELNIILKADVQGSLGAILGSLEQLPQG-EVQIRVLLASP
ML1556   RKRISLEDLDSALK----ETSQLNLILKGDNAGTVEALEEALMGIQVDDEVALRVIDRGV
Rv2839c  RKRISLEDLDSALK----ETSQLNLILKGDNAGTVEALEEALMGIQVDDEVVLRVIDRGV
         .: . : .:* :::* * *: *: :* : . *: *:: .

CC0042   GAISESDVMLAKGAGAPVIGFNVRASAQARALAEREGVEIRYYAIIYDLLDDIKGVLSGM
RP552    GPITESDVSLAHVSSAIVVGFNVRAWANALTAAEKTKVDIRYYSIIYNLIDDVKAIMSGM
slr0744  GEVTETDVDLAAASGAIIIGFNTTLASGARQAADQEGVDIREYDIIYKLLDDIQGAMEGL
ML1556   GGITETNVNLASASDAIIIGFNVRAEGKATELASREGVEIRYYLVIYQAIDDIEKALRGM
Rv2839c  GGITETNVNLASASDAVIIGFNVRAEGKATELASREGVEIRYYSVIYQAIDEIEQALRGL
         * ::*::* ** :.* ::***. . * *.: *:** * :**. :*::: : *:

CC0042   LAPIQRETFLGNAEVLQAFDISKIGKVAGCKVTEGVVRKGAKVRIIRQDIVVLELGTLQT
RP552    LEPIVREQYIGSVEIRQIFNITKIGKIAGSYVTKGIIKKGAGVRLLR-DNVVIHAGKLKT
slr0744  LDPEEIESSLGTAEVRAVFPVGRG-NIAGCYVQSGKIIRNRNLRVRRGDQVLFEG-NIDS
ML1556   LKPIYEENQLGRAEIRALFRSSKVGIIAGCIISSGVVRRNAKVRLLRDNIVVTDNLTVTS
Rv2839c  LKPIYEENQLGRAEIRALFRSSKVGLIAGCLVTSGVMRRNAKARLLRDNIVVAENLSIAS
         * * * :* .*: * : :**. : .* : :. *: * : *: . .: :

CC0042   LKRFKDEVNEVPVGQECGMMFAGFQDIKVGDTIECFTVEEIKRQLD-
RP552    LKRFKDEVKEVREGYECGIAFENYEDIREGDTVEVFELVQEQRQL--
slr0744  LKRIKEDVREVNAGYECGIGCSKFNDWKEGDIIEAYEMTMKRRTLAT
ML1556   LRREKDDVTEVREGFECGMTLG-YSDIKEGDVIESYELVEKQRA---
Rv2839c  LRREKDDVTEVRDGFECGLTLG-YADIKEGDVIESYELVQKERA---
         *:* *::* ** * ***: : * : ** :* : : .*