williamdlees/receptor_utils

receptor_utils

Some tools I find useful for working with IG/TR receptor sequences, including support for allele sequence naming and the creation of custom IgBlast databases. Please see the documentation for further details.

Changes in version 0.0.65:

  • Further fixes to imports

Changes in version 0.0.64:

  • Fix import of aux_formats (which regressed in 0.0.63)

Changes in version 0.0.63:

  • In extract_imgt_refs, include obviously incomplete V sequences (length <= 280) unless the functional_only option is set
  • Upgrade setuptools dependency. If you have trouble installing receptor_utils, please upgrade pip.

Changes in version 0.0.62:

  • Fix import of aux_formats
  • Add support for C genes in download_germline_set

Changes in version 0.0.61:

  • Internal restructuring of aux_formats.py: no change to functionality

Changes in version 0.0.61:

  • Fix handling of incomplete codon at 3' end in create_alignment

Changes in version 0.0.60:

  • New script create_alignment: creates an alignment of alleles from a single gene, showing nucleotide sequences with amino acid translations and silent/nonsilent mutations. Supports V, D, and J sequence types with type-specific formatting including CDR region delineation for V sequences.

Changes in version 0.0.57:

  • annotate_j: better handling of non-standard motifs and prioritization of results. Canonical [WF]GxG motifs are always preferred, but non-standard motifs such as [WF]A.G (seen in TRGJP1, TRGJP2, TRAJ16) and [CWF][AG].G (seen in TRAJ35) are accommodated with a warning. Likewise, solutions with no stop codons are preferred, but stop codons are accommodated if necessary, as they may be excised during recombination.
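
The preference order described above can be illustrated with a small standalone sketch. It is not the annotate_j implementation: the function name `find_j_motif`, the returned fields, and the exact tie-breaking order (motif rank first, then absence of stop codons) are assumptions made for illustration.

```python
import re
from itertools import product

# Standard genetic code, built from the usual compact TCAG-ordered table.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Motifs in order of preference; only the first is canonical.
MOTIFS = [
    ("[WF]G.G", re.compile(r"[WF]G.G")),
    ("[WF]A.G", re.compile(r"[WF]A.G")),
    ("[CWF][AG].G", re.compile(r"[CWF][AG].G")),
]


def translate(nt):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(CODONS.get(nt[i:i + 3], "X") for i in range(0, len(nt) - 2, 3))


def find_j_motif(seq):
    """Return the preferred motif hit across the three forward frames, or None.

    Hypothetical helper: hits are ranked by motif preference first, then by
    whether the frame is free of stop codons (an assumed ordering).
    """
    seq = seq.upper()
    candidates = []
    for frame in range(3):
        protein = translate(seq[frame:])
        has_stop = "*" in protein
        for rank, (label, pattern) in enumerate(MOTIFS):
            for match in pattern.finditer(protein):
                candidates.append((rank, has_stop, frame, match.start(), label))
    if not candidates:
        return None
    rank, has_stop, frame, aa_pos, label = min(candidates)
    return {"label": label, "frame": frame, "aa_pos": aa_pos,
            "has_stop": has_stop, "canonical": rank == 0}
```

A caller could emit a warning whenever the returned `canonical` field is false or `has_stop` is true, mirroring the behaviour described in the changelog entry.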

Changes in version 0.0.56:

  • annotate_j: accommodate (with warning) the non-standard motif [CWF][AG].G seen in TRAJ35

Changes in version 0.0.55:

  • annotate_j: accommodate (with warning) the non-standard motif [WF]A.G seen in TRGJP1, TRGJP2, TRAJ16

Changes in version 0.0.54:

  • Fix name of ungapped file in download_germline_set
  • Improve naming of novel D alleles

Changes in version 0.0.53:

  • Fix bug in download_germline_set that masked the correct error message when multiple sets were selected

Changes in version 0.0.52:

  • Fix missing dependency

Changes in version 0.0.51:

  • Updates to download_germline_set, to create files for IgBlast.
  • New documentation sections: using AIRR-C sets with IgBlast, using AIRR-C sets with MiXCR

Changes in version 0.0.50:

  • Added a new utility, download_germline_set, to download germline sets from the Open Germline Receptor Database (OGRDB)

Changes in version 0.0.49:

  • Added an option to allow at_coords to be used with FASTA files containing multiple sequences
  • Fixed problems in name_alleles that could be caused by erroneously long V-sequences

Changes in version 0.0.48:

  • Added an option to make_igblast_ndm to specify CDR positions, for use with IMGT-gapped germline sets that do not follow the canonical alignment. Added further explanation to the documentation.

Changes in version 0.0.47:

  • write_csv now takes an optional scan_all argument. If True, all records to be added are scanned for keywords and the columns are extended to include keywords found in any records
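
The effect of scanning all records for column names can be sketched with the standard library. This is a minimal illustration, not the receptor_utils code: `write_rows_sketch` is a hypothetical helper that writes to an open handle, and the choice to silently drop unknown keys when `scan_all` is false is an assumption.

```python
import csv


def write_rows_sketch(handle, rows, scan_all=False):
    """Write a list of dicts as CSV to an open text handle.

    Columns come from the first record; with scan_all=True they are
    extended with keys found in any later record, in first-seen order.
    """
    columns = list(rows[0].keys()) if rows else []
    if scan_all:
        for row in rows[1:]:
            for key in row:
                if key not in columns:
                    columns.append(key)
    # Assumption: keys outside the column list are ignored rather than
    # raising an error; missing values are written as empty strings.
    writer = csv.DictWriter(handle, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
```

With records `[{"a": 1}, {"a": 2, "b": 3}]`, the sketch writes only column `a` by default and columns `a` and `b` when `scan_all=True`.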

Changes in version 0.0.46:

  • fix issue with naming of D novel alleles - this could cause existing alleles to be named as novel by the utilities

Changes in version 0.0.45:

  • added dependency for biopython version >=1.81

Changes in version 0.0.44:

  • minor fix to novel allele naming
  • fixed a bug that prevented sequence subsets being shown by identical_seqs

Changes in version 0.0.43:

  • remove dependency on deprecated Bio.pairwise2
  • improve naming of insertions, e.g. IGHV1-203_i7g_i7a would now be IGHV1-203_i7ga
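
The change in insertion naming can be illustrated as a post-processing step on a name. This is only a sketch under assumptions: the real naming logic is not shown in this README, `merge_insertions` is a hypothetical helper, and it assumes insertion tokens have the form `i<position><bases>` and are separated by underscores.

```python
import re

# Assumed token shape for an insertion: 'i', a position, then inserted bases.
INSERTION = re.compile(r"i(\d+)([acgtn]+)")


def merge_insertions(name):
    """Merge adjacent insertion tokens that share the same position."""
    merged = []
    for token in name.split("_"):
        current = INSERTION.fullmatch(token)
        previous = INSERTION.fullmatch(merged[-1]) if merged else None
        if current and previous and current.group(1) == previous.group(1):
            # Same position: append the inserted bases to the previous token.
            merged[-1] += current.group(2)
        else:
            merged.append(token)
    return "_".join(merged)


print(merge_insertions("IGHV1-203_i7g_i7a"))
```

Tokens at different positions, and any token that is not an insertion, are left unchanged.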

Changes in version 0.0.42:

  • better handling of long target sequences

Changes in version 0.0.41:

  • annotate_j: fix issue with processing FASTA input

Changes in version 0.0.40:

  • The submodule name receptor_utils.number_ighv has been changed to receptor_utils.number_v to reflect its wider scope. The old name will continue to work for the time being but will raise a deprecation warning.
  • In receptor_utils.simple_bio_seq, write_fasta(seqs, filename) has become write_fasta(filename, seqs) for consistency with write_csv. The old calling pattern will continue to work for the time being but will raise a deprecation warning.
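
One common way to keep an old argument order working while warning about it is to detect swapped arguments by type. The sketch below shows that pattern only; it is not the receptor_utils implementation. `write_fasta_sketch` is a hypothetical stand-in, and it assumes sequences are passed as a dict mapping names to sequence strings.

```python
import warnings


def write_fasta_sketch(filename, seqs):
    """Write a dict of {name: sequence} to a FASTA file.

    Accepts the old (seqs, filename) order as well, with a warning.
    """
    if isinstance(filename, dict) and isinstance(seqs, str):
        # Old calling pattern detected: swap the arguments and warn.
        warnings.warn(
            "write_fasta(seqs, filename) is deprecated; "
            "use write_fasta(filename, seqs)",
            DeprecationWarning,
            stacklevel=2,
        )
        filename, seqs = seqs, filename
    with open(filename, "w") as handle:
        for name, seq in seqs.items():
            handle.write(f">{name}\n{seq}\n")
```

New code should call it as `write_fasta_sketch("out.fasta", {"allele": "ACGT"})`, matching the argument order of write_csv.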

Changes in version 0.0.39:

  • annotate_j and make_igblast_ndm will now accept a germline set in AIRR Community JSON format, as an alternative to providing the set in FASTA format.

Changes in version 0.0.38:

  • Improve reporting of issues with conserved residues
  • Change URL for fetching IMGT reference sets to use https

About

Misc utilities for receptor gene sequences
