I installed BLAST and CLUSTALW.
I downloaded the protein and genome files for all the Bacteria.
Using an awk script, I filtered out all the proteins lacking an upstream region of at least 50 base pairs.
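The awk script itself is not reproduced here. As a rough illustration, the same filter could be sketched in Python. This sketch assumes that "upstream region" means the unannotated gap between a gene's start and the nearest neighboring gene on the upstream side, and that gene coordinates have already been parsed from the genome annotation. The `Gene` record and the function names are hypothetical, and circular wraparound of the chromosome is ignored for simplicity.

```python
from typing import List, NamedTuple

MIN_UPSTREAM_BP = 50  # threshold stated in the text


class Gene(NamedTuple):
    name: str
    start: int   # 1-based, inclusive; start < end regardless of strand
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def upstream_gap(genes: List[Gene], i: int, genome_length: int) -> int:
    """Length of the gap just upstream of genes[i].

    `genes` must be sorted by start. Coordinates are treated as linear
    (an assumption; a circular chromosome would need wraparound handling).
    """
    g = genes[i]
    if g.strand == "+":
        # Upstream of a forward gene is toward lower coordinates.
        prev_end = max((h.end for h in genes[:i]), default=0)
        return max(0, g.start - prev_end - 1)
    # Upstream of a reverse gene is toward higher coordinates.
    next_start = min((h.start for h in genes[i + 1:]), default=genome_length + 1)
    return max(0, next_start - g.end - 1)


def keep_with_upstream(genes: List[Gene], genome_length: int,
                       min_bp: int = MIN_UPSTREAM_BP) -> List[Gene]:
    """Keep only genes whose upstream gap is at least `min_bp` long."""
    ordered = sorted(genes, key=lambda g: g.start)
    return [g for i, g in enumerate(ordered)
            if upstream_gap(ordered, i, genome_length) >= min_bp]
```

Overlapping or nested genes yield a gap of zero, so they are dropped by the filter.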
Using the filtered proteins, I created a BLAST database to query against.
I took the filtered protein list for Streptococcus pyogenes and ran it as a query against the new BLAST database.
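The notes do not say which BLAST release was used. Assuming the current BLAST+ command-line tools (`makeblastdb` and `blastp`), the database build and the query could look roughly like the sketch below. Older installations would use `formatdb` and `blastall` instead. All file names and the E-value cutoff are placeholders, not the values actually used.

```python
import subprocess
from typing import List


def makeblastdb_cmd(fasta: str, db_name: str) -> List[str]:
    """Command to build a protein BLAST database from a FASTA file."""
    return ["makeblastdb", "-in", fasta, "-dbtype", "prot", "-out", db_name]


def blastp_cmd(query: str, db_name: str, out_path: str,
               evalue: float = 1e-5) -> List[str]:
    """Command to run a protein query against the database.

    `-outfmt 6` requests tab-separated output, which is convenient for
    later processing. The E-value cutoff is an assumed placeholder.
    """
    return ["blastp", "-query", query, "-db", db_name,
            "-evalue", str(evalue), "-outfmt", "6", "-out", out_path]


def run_search(all_proteins: str, query_proteins: str, out_path: str,
               db_name: str = "filtered_db") -> None:
    """Build the database, then query it; raises if either tool fails."""
    subprocess.run(makeblastdb_cmd(all_proteins, db_name), check=True)
    subprocess.run(blastp_cmd(query_proteins, db_name, out_path), check=True)
```

A call such as `run_search("filtered_all.faa", "filtered_spyogenes.faa", "hits.tsv")` would then produce one tab-separated line per hit, given that the BLAST+ binaries are on the PATH.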
I processed the results using an awk script, collecting all the proteins and their upstream segments into files for later processing.
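As an illustration of this post-processing step (again in Python rather than awk), the sketch below groups hits by query and formats sequences as FASTA. It assumes the BLAST results are in the standard 12-column tabular format, where column 1 is the query ID, column 2 the subject ID, and column 11 the E-value. The cutoff and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_hits(tabular_lines: Iterable[str],
               max_evalue: float = 1e-5) -> Dict[str, List[str]]:
    """Map each query ID to its distinct subject IDs, in order of appearance."""
    hits: Dict[str, List[str]] = defaultdict(list)
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue > max_evalue:
            continue  # too weak a match under the assumed cutoff
        if subject not in hits[query]:
            hits[query].append(subject)
    return dict(hits)


def format_fasta(records: Iterable[Tuple[str, str]], width: int = 60) -> str:
    """Render (id, sequence) pairs as FASTA text with wrapped sequence lines."""
    out = []
    for seq_id, seq in records:
        out.append(f">{seq_id}")
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"
```

Writing one file per query, containing the upstream segment of each matching protein, would then be a loop over the grouped hits with a lookup into the previously extracted upstream sequences.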
I took three of the top hits (on the order of 500 matches) and ran CLUSTALW on them.
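A sketch of this last step follows. It assumes that "top hits" means the queries with the most matches, and that the binary is installed as `clustalw` (some installations name it `clustalw2`). The `-INFILE`, `-ALIGN`, and `-OUTFILE` options are standard CLUSTALW command-line options. The file names and helper functions are hypothetical.

```python
import subprocess
from typing import Dict, List


def top_queries(hits: Dict[str, List[str]], n: int = 3) -> List[str]:
    """Return the n query IDs with the most matches (an assumed ranking)."""
    return sorted(hits, key=lambda q: -len(hits[q]))[:n]


def clustalw_cmd(fasta_in: str, aln_out: str,
                 binary: str = "clustalw") -> List[str]:
    """Command to run a full multiple alignment on one FASTA file."""
    return [binary, f"-INFILE={fasta_in}", "-ALIGN", f"-OUTFILE={aln_out}"]


def align(fasta_in: str, aln_out: str) -> None:
    """Run CLUSTALW; raises if the tool is missing or exits with an error."""
    subprocess.run(clustalw_cmd(fasta_in, aln_out), check=True)
```

With several hundred sequences per file, the progressive alignment can take a while, so running each selected file as a separate job is a reasonable choice.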