Using "proteomes" will pull in TrEMBL identifiers which won't get parsed in add_uniprot_annotations.py #1

@neely

I recommended that a student use the human reference proteome as the database for the db-to-db blaster (https://github.com/pwilmart/PAW_BLAST). When it came time to add annotations with add_uniprot_annotations.py, the TrEMBL entries came back as na.

An example protein in the 9606 reference proteome that will have na for its gene symbol is SEPTIN3 (A0A2R8Y4H2).

Why there are two SEPTIN3 entries in the reference proteome (one Swiss-Prot, one TrEMBL) is beyond me, and I don't believe this would be an issue if the canonical database were used for BLASTing. Still, if any TrEMBL entries come through in canonical, they won't get mapped by add_uniprot_annotations.py.
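For anyone wanting to check a database before BLASTing, here is a rough sketch (not part of PAW_BLAST) that tallies FASTA headers by their UniProt database prefix, `sp` for Swiss-Prot and `tr` for TrEMBL. The function name `count_uniprot_databases` and the sample headers are made up for illustration; only the accession A0A2R8Y4H2 comes from this issue, and the description text in the headers is placeholder.

```python
from collections import Counter


def count_uniprot_databases(fasta_lines):
    """Tally UniProt FASTA headers by database prefix ('sp' or 'tr').

    Hypothetical helper, not part of PAW_BLAST. Assumes headers follow
    the UniProt layout: >db|ACCESSION|ENTRY_NAME description...
    Returns (Counter of prefixes, list of TrEMBL accessions).
    """
    counts = Counter()
    trembl_accessions = []
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        parts = line[1:].split("|")
        if len(parts) < 3:
            counts["other"] += 1  # header is not in UniProt format
            continue
        db, accession = parts[0], parts[1]
        counts[db] += 1
        if db == "tr":
            trembl_accessions.append(accession)
    return counts, trembl_accessions


# Illustrative input only: the first header is a placeholder entry.
sample = [
    ">sp|P00000|EXAMPLE_HUMAN Placeholder Swiss-Prot entry",
    "MSEQ",
    ">tr|A0A2R8Y4H2|A0A2R8Y4H2_HUMAN Placeholder TrEMBL entry",
    "MSEQ",
]
counts, trembl = count_uniprot_databases(sample)
print(dict(counts), trembl)
```

To run it on a real file, pass the open file handle instead of the list, e.g. `count_uniprot_databases(open(path))`, where `path` is wherever the downloaded FASTA lives. Any accession it reports under `tr` is one that would likely end up as na after annotation.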

An easier answer is likely to change keywlist_download.py to also fetch TrEMBL, but that might be a huge file.

Not the expert here, B-)
