Some tools I find useful for working with IG/TR receptor sequences, including support for allele sequence naming and the creation of custom IgBlast databases. Please see the documentation for further details.
Changes in version 0.0.65:
- Further fixes to imports
Changes in version 0.0.64:
- Fix import of aux_formats (which regressed in 0.0.63)
Changes in version 0.0.63:
- In extract_imgt_refs, include obviously incomplete V sequences (length <= 280) unless the functional_only option is set
- Upgrade setuptools dependency. If you have trouble installing receptor_utils, please upgrade pip.
Changes in version 0.0.62:
- Fix import of aux_formats
- Add support for C genes in download_germline_set
Changes in version 0.0.61:
- Internal restructuring of aux_formats.py: no change to functionality
Changes in version 0.0.61:
- Fix handling of incomplete codon at 3' end in create_alignment
Changes in version 0.0.60:
- New script create_alignment: creates an alignment of alleles from a single gene, showing nucleotide sequences with amino acid translations and silent/nonsilent mutations. Supports V, D, and J sequence types with type-specific formatting including CDR region delineation for V sequences.
Changes in version 0.0.57:
- annotate_j: better handling of non-standard motifs and prioritization of results. Canonical [WF]GxG motifs are always preferred, but non-standard motifs such as [WF]A.G (seen in TRGJP1, TRGJP2, TRAJ16) and [CWF][AG].G (seen in TRAJ35) are accommodated with a warning. Likewise, solutions with no stop codons are preferred, but stop codons are accommodated if necessary, as they may be excised during recombination.
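The preference order described above can be illustrated with a small sketch. This is not the annotate_j implementation: the function name `pick_j_motif`, the input format (a dict of reading frame to amino acid translation, with `*` for stop codons) and the exact tie-breaking are assumptions made for illustration only.

```python
import re

# Motif patterns in descending order of preference (index 0 is canonical).
# 'x' in [WF]GxG is written as '.' in regex syntax.
MOTIFS = [
    ("[WF]GxG", re.compile(r"[WF]G.G")),
    ("[WF]A.G", re.compile(r"[WF]A.G")),
    ("[CWF][AG].G", re.compile(r"[CWF][AG].G")),
]


def pick_j_motif(translations):
    """Choose the preferred motif hit across reading frames (hypothetical helper).

    translations: dict mapping frame number to an amino acid string,
    where '*' marks a stop codon. Returns a dict, or None if nothing matches.
    """
    candidates = []
    for frame, aa in translations.items():
        for rank, (label, pattern) in enumerate(MOTIFS):
            match = pattern.search(aa)
            if match:
                # Sort key: motif rank first, then presence of a stop codon.
                candidates.append((rank, "*" in aa, frame, label, match.start()))
    if not candidates:
        return None
    rank, has_stop, frame, label, start = min(candidates)
    return {
        "frame": frame,
        "motif": label,
        "start": start,
        "has_stop": has_stop,
        # Assumed rule: warn for any non-canonical motif or any stop codon.
        "warn": rank > 0 or has_stop,
    }
```

Here a canonical hit in any frame wins over a non-standard hit, and among equally ranked motifs a frame without stop codons wins, which is one reading of the rule stated in the changelog entry.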
Changes in version 0.0.56:
- annotate_j: accommodate (with warning) the non-standard motif [CWF][AG].G seen in TRAJ35
Changes in version 0.0.55:
- annotate_j: accommodate (with warning) the non-standard motif [WF]A.G seen in TRGJP1, TRGJP2, TRAJ16
Changes in version 0.0.54:
- Fix name of ungapped file in download_germline_set
- Improve naming of novel D alleles
Changes in version 0.0.53:
- Fix bug in download_germline_set that masked the correct error message when multiple sets were selected
Changes in version 0.0.52:
- Fix missing dependency
Changes in version 0.0.51:
- Updates to download_germline_set, to create files for IgBlast.
- New documentation sections: using AIRR-C sets with IgBlast, using AIRR-C sets with MiXCR
Changes in version 0.0.50:
- Added a new utility, download_germline_set, to download germline sets from the Open Germline Receptor Database (OGRDB)
Changes in version 0.0.49:
- Added an option to allow at_coords to be used with FASTA files containing multiple sequences
- Fixed problems in name_alleles that could be caused by erroneously long V-sequences
Changes in version 0.0.48:
- Added an option to make_igblast_ndm to specify CDR positions, for use with IMGT-gapped germline sets that do not follow the canonical alignment. Added further explanation to the documentation.
Changes in version 0.0.47:
- write_csv now takes an optional scan_all argument. If True, all records to be added are scanned for keywords and the columns are extended to include keywords found in any records
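The effect of scan_all can be sketched with the standard csv module. This is an illustrative stand-in, not the library's write_csv: the name `write_csv_sketch`, the use of an open text handle instead of a filename, and the behaviour when scan_all is False (columns taken from the first record, extra keys dropped) are all assumptions.

```python
import csv
import io


def write_csv_sketch(handle, records, scan_all=False):
    """Write a list of dicts as CSV to an open text handle (hypothetical helper)."""
    if not records:
        return []
    # Assumed default: columns come from the first record only.
    fieldnames = list(records[0].keys())
    if scan_all:
        # Extend the columns with any keyword found in later records,
        # keeping first-seen order.
        for rec in records[1:]:
            for key in rec:
                if key not in fieldnames:
                    fieldnames.append(key)
    writer = csv.DictWriter(handle, fieldnames=fieldnames,
                            restval="", extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return fieldnames


records = [
    {"name": "A", "seq": "ACGT"},
    {"name": "B", "seq": "GGCC", "notes": "novel"},
]

buf_first = io.StringIO()
cols_first = write_csv_sketch(buf_first, records)

buf_all = io.StringIO()
cols_all = write_csv_sketch(buf_all, records, scan_all=True)
```

With scan_all=True the `notes` column appears in the header and is left blank for records that lack it.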
Changes in version 0.0.46:
- fix issue with naming of D novel alleles - this could cause existing alleles to be named as novel by the utilities
Changes in version 0.0.45:
- added dependency for biopython version >=1.81
Changes in version 0.0.44:
- minor fix to novel allele naming
- fixed a bug that prevented sequence subsets being shown by identical_seqs
Changes in version 0.0.43:
- remove dependency on deprecated Bio.pairwise2
- improve naming of insertions, e.g. IGHV1-203_i7g_i7a would now be IGHV1-203_i7ga
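The insertion-naming change can be pictured as merging adjacent insertion tokens that share a position. The sketch below is inferred from the single example given above and is not the library's naming code: the function name `merge_insertions` and the assumed token grammar (`i`, a position, then lowercase bases, with tokens separated by underscores) are illustrative assumptions.

```python
import re

# Assumed grammar for an insertion token: 'i' + position + inserted bases.
INSERTION = re.compile(r"i(\d+)([acgtn]+)")


def merge_insertions(name):
    """Merge consecutive insertion tokens at the same position (hypothetical helper)."""
    base, *tokens = name.split("_")
    merged = []
    for tok in tokens:
        current = INSERTION.fullmatch(tok)
        previous = INSERTION.fullmatch(merged[-1]) if merged else None
        if current and previous and current.group(1) == previous.group(1):
            # Same position as the previous insertion: append the bases.
            merged[-1] += current.group(2)
        else:
            merged.append(tok)
    return "_".join([base] + merged)
```

Tokens that do not match the assumed insertion pattern are passed through unchanged.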
Changes in version 0.0.42:
- better handling of long target sequences
Changes in version 0.0.41:
- annotate_j: fix issue with processing FASTA input
Changes in version 0.0.40:
- The submodule name receptor_utils.number_ighv has been changed to receptor_utils.number_v to reflect its wider scope. The old name will continue to work for the time being but will raise a deprecation warning.
- In receptor_utils.simple_bio_seq, write_fasta(seqs, filename) has become write_fasta(filename, seqs) for consistency with write_csv. The old calling pattern will continue to work for the time being but will raise a deprecation warning.
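A common way to keep an old calling pattern working while warning about it is to detect the swapped arguments by type. The sketch below illustrates that idea only; it is not the receptor_utils code. The names `normalise_args` and `write_fasta_sketch` are hypothetical, and it assumes that seqs is a dict mapping sequence names to sequences and that filename is a string.

```python
import warnings


def normalise_args(first, second):
    """Return (filename, seqs), accepting the old (seqs, filename) order.

    Hypothetical helper: assumes seqs is a dict and filename is a str.
    """
    if isinstance(first, dict) and isinstance(second, str):
        warnings.warn(
            "write_fasta(seqs, filename) is deprecated; "
            "use write_fasta(filename, seqs)",
            DeprecationWarning,
            stacklevel=3,
        )
        return second, first
    return first, second


def write_fasta_sketch(filename, seqs):
    """Write name -> sequence pairs as FASTA, tolerating the old argument order."""
    filename, seqs = normalise_args(filename, seqs)
    with open(filename, "w") as handle:
        for name, seq in seqs.items():
            handle.write(f">{name}\n{seq}\n")
```

New code would call `write_fasta_sketch("out.fasta", {"IGHV1-2*01": "CAGGTG"})`; the reversed order still writes the file but emits a DeprecationWarning.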
Changes in version 0.0.39:
- annotate_j and make_igblast_ndm will now accept a germline set in AIRR Community JSON format, as an alternative to providing the set in FASTA format.
Changes in version 0.0.38:
- Improve reporting of issues with conserved residues
- Change URL for fetching IMGT reference sets to use https