uniprotlib

Note: This library was vibe coded with Claude. It works, it's tested, but review accordingly.

Python library for parsing UniProt XML files. Handles both single-entry downloads and multi-GB gzip-compressed database dumps with bounded memory usage.

Installation

pip install uniprotlib

Or with uv:

uv add uniprotlib

Usage

from uniprotlib import parse_xml

# single file
for entry in parse_xml("Q9Y261.xml"):
    print(entry.primary_accession, entry.protein_name)

# gzipped bulk download
for entry in parse_xml("uniprot_sprot.xml.gz"):
    print(entry.gene.primary, entry.organism.scientific_name)

# multiple files
for entry in parse_xml("human.xml.gz", "mouse.xml.gz"):
    print(entry.primary_accession)

parse_xml() returns an iterator that yields UniProtEntry objects. Gzip detection is automatic based on the .gz extension. Memory stays bounded regardless of file size.
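For readers wondering how an iterator over a multi-GB dump can keep memory bounded, here is a minimal sketch of the general technique using only the standard library. It is not uniprotlib's implementation: `open_maybe_gzip`, `local_name`, and `iter_accessions` are hypothetical names made up for this illustration, and the sketch assumes each record is an `<entry>` element directly under the document root.

```python
import gzip
import io
import xml.etree.ElementTree as ET
from typing import BinaryIO, Iterator


def open_maybe_gzip(path: str) -> BinaryIO:
    """Open path in binary mode, decompressing when the name ends in .gz."""
    return gzip.open(path, "rb") if path.endswith(".gz") else open(path, "rb")


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix of the form '{uri}name'."""
    return tag.rsplit("}", 1)[-1]


def iter_accessions(stream: BinaryIO) -> Iterator[str]:
    """Yield the first <accession> of each top-level <entry>, one at a time."""
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    depth = 1  # we are now inside the root element
    for event, elem in context:
        if event == "start":
            depth += 1
            continue
        depth -= 1
        # depth == 1 means a direct child of the root has just been closed
        if depth == 1 and local_name(elem.tag) == "entry":
            for child in elem:
                if local_name(child.tag) == "accession":
                    yield child.text or ""
                    break
            # Drop the finished entry so the tree never grows with file size.
            root.clear()


if __name__ == "__main__":
    # Hypothetical two-entry document, used only for demonstration.
    sample = (
        b'<uniprot><entry><accession>A00001</accession></entry>'
        b'<entry><accession>A00002</accession></entry></uniprot>'
    )
    for accession in iter_accessions(io.BytesIO(sample)):
        print(accession)
```

The key step is `root.clear()` after each entry: without it, `iterparse` still builds the whole tree in memory even though elements are handed out incrementally.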

Parsed fields

| Model | Fields |
| --- | --- |
| `UniProtEntry` | primary_accession, accessions, entry_name, dataset, protein_name, gene, organism, sequence, keywords, db_references |
| `Gene` | primary, synonyms, ordered_locus_names, orf_names |
| `Organism` | scientific_name, common_name, tax_id, lineage |
| `Sequence` | value, length, mass, checksum |
| `DbReference` | type, id, molecule, properties |

All model classes are fully type-annotated dataclasses, and the package ships a py.typed marker so type checkers can use the annotations.
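One practical consequence of the models being dataclasses is that the standard `dataclasses.asdict` helper can turn a nested entry into plain dictionaries, for example to dump it as JSON. The sketch below uses stand-in classes so that it runs without the library installed: the field names follow the table above, but the types, defaults, and the `ExampleEntry` and `entry_to_json` names are assumptions for illustration, not uniprotlib's actual definitions.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


# Stand-in models: field names mirror the table, types are assumed.
@dataclass
class Organism:
    scientific_name: str
    common_name: Optional[str] = None
    tax_id: Optional[str] = None
    lineage: list[str] = field(default_factory=list)


@dataclass
class ExampleEntry:
    primary_accession: str
    organism: Organism


def entry_to_json(entry) -> str:
    """Serialize any dataclass instance, including nested dataclasses."""
    return json.dumps(asdict(entry), sort_keys=True)


if __name__ == "__main__":
    # Placeholder values, not real UniProt data.
    entry = ExampleEntry("A00001", Organism("Example organism", tax_id="0"))
    print(entry_to_json(entry))
```

If the real models contain only JSON-friendly field types, the same `asdict` call should work on the objects yielded by `parse_xml()`, though that is worth checking against the library's source.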

Development

Requires Python >= 3.12 and uv.

uv sync
uv run pytest tests/ -v

License

MIT

mpreusse/uniprotlib
