Because Pyrococcus abyssi has such a high coding percentage (91%), I found that the requirement of 50 bp of non-coding upstream sequence was the most difficult restriction to satisfy. I started by searching the Pyrococcus abyssi genome for proteins that had more than 50 bp of non-coding upstream sequence, had a COG ID, and had a gene name and gene description that did not include "hypothetical", "predicted", "putative", or "uncharacterized". Then I checked the COG ID to make sure there were at least 5 other prokaryotes with a closely matching protein. Finally, I checked each of those prokaryotes to ensure they all had at least 50 bp of non-coding upstream sequence. The logic behind this is that if one prokaryote has 50 bp of non-coding sequence upstream of a protein, other prokaryotes with similar proteins should have it as well. In practice, this held true for only about half of the few proteins I checked.
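The filtering steps described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure actually used: the `Gene` record, the function names, and the `genome_length` parameter are all hypothetical, the genome is treated as linear (the real chromosome is circular), the upstream gap is measured to the nearest neighbouring gene on either strand, and the threshold is applied as "at least 50 bp".

```python
from dataclasses import dataclass
from typing import Optional

# Words that mark a poorly characterized annotation (taken from the text above).
EXCLUDED_WORDS = ("hypothetical", "predicted", "putative", "uncharacterized")


@dataclass
class Gene:
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    name: str
    description: str
    cog_id: Optional[str]
    start: int
    end: int
    strand: str  # "+" or "-"


def upstream_gap(gene, genes, genome_length):
    """Non-coding bp between a gene and its nearest upstream neighbour.

    Assumption: linear coordinates, so a gene with no upstream neighbour
    is measured against the end of the sequence instead.
    """
    if gene.strand == "+":
        gap = gene.start - 1              # distance to sequence start
    else:
        gap = genome_length - gene.end    # distance to sequence end
    for other in genes:
        if other is gene:
            continue
        if gene.strand == "+" and other.start < gene.start:
            # Neighbour lies at lower coordinates; overlap counts as 0.
            gap = min(gap, max(0, gene.start - other.end - 1))
        elif gene.strand == "-" and other.end > gene.end:
            # Upstream of a minus-strand gene is at higher coordinates.
            gap = min(gap, max(0, other.start - gene.end - 1))
    return gap


def is_well_annotated(gene):
    """True if name and description exist and contain no excluded word."""
    text = f"{gene.name} {gene.description}".lower()
    return bool(gene.name and gene.description) and not any(
        word in text for word in EXCLUDED_WORDS
    )


def candidate_genes(genes, genome_length, min_bp=50):
    """First step: COG ID present, informative annotation, enough upstream gap."""
    return [
        g for g in genes
        if g.cog_id
        and is_well_annotated(g)
        and upstream_gap(g, genes, genome_length) >= min_bp
    ]


def passes_ortholog_check(ortholog_gaps, min_genomes=5, min_bp=50):
    """Later steps: at least `min_genomes` other prokaryotes share the COG,
    and every one of them has at least `min_bp` of upstream gap.

    `ortholog_gaps` maps an organism name to the upstream gap of its match.
    """
    return len(ortholog_gaps) >= min_genomes and all(
        gap >= min_bp for gap in ortholog_gaps.values()
    )
```

In this sketch, `candidate_genes` covers the initial search within one genome, and `passes_ortholog_check` would be applied to each surviving candidate once the upstream gaps of its COG matches in other prokaryotes have been looked up by hand or from a database.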