Blosum matrix files #2

Merged
mzueva merged 3 commits into main from MILAB-4282-blosum-matrices
Feb 17, 2026
@mzueva (Contributor) commented Feb 17, 2026

Include BLOSUM matrix files in the software package

@gemini-code-assist
Summary of Changes

Hello @mzueva, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates essential BLOSUM matrix files into the software package, ensuring they are properly distributed, included in Docker builds, and managed within the project's build process. This enhancement provides necessary data for sequence alignment functionalities, making the software more robust for bioinformatics tasks.

Highlights

  • BLOSUM Matrix Files Added: Five new BLOSUM matrix files (blosum40.out, blosum50.out, blosum62.out, blosum80.out, blosum90.out) were added to the data/ directory, providing essential scoring matrices for sequence alignment.
  • Build Process Integration: A new prebuild script was added to package.json to automatically copy the data/ directory into each dld/*/data/ subdirectory, ensuring these files are available during the build process.
  • Docker Image Update: The Dockerfile was updated to include the newly added data/ directory, copying its contents to /usr/bin/ within the Docker image for proper distribution.
  • Git Ignore Configuration: The .gitignore file was modified to exclude dld/*/data/ directories from version control, preventing redundant tracking of copied data.
  • Version Bump: A changeset entry was created, indicating a minor version update for @platforma-open/soedinglab.software-mmseqs2 to reflect the inclusion of these new matrix files.
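
The prebuild copy step described above can be sketched as follows. This is a minimal illustration, not the script from package.json (which is not shown in this conversation): the function name `copy_data_to_dld` and the use of Python are assumptions, and only the `data/` to `dld/*/data/` layout comes from the summary.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def copy_data_to_dld(root: Path) -> list[Path]:
    """Copy root/data into every root/dld/<name>/data and return the targets.

    Hypothetical helper: mirrors the described prebuild behavior only.
    """
    source = root / "data"
    targets: list[Path] = []
    for sub in sorted((root / "dld").glob("*")):
        if not sub.is_dir():
            continue  # skip stray files directly under dld/
        target = sub / "data"
        # dirs_exist_ok allows re-running the step on an already populated tree.
        shutil.copytree(source, target, dirs_exist_ok=True)
        targets.append(target)
    return targets


if __name__ == "__main__":
    for target in copy_data_to_dld(Path.cwd()):
        print(f"copied data/ -> {target}")
```

Because the copies under dld/*/data/ are regenerated on every build, ignoring them in .gitignore keeps the repository tracking only the single source copy in data/.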

Changelog

  • .changeset/early-worlds-deny.md
    • Added a changeset entry for a minor version update, noting the inclusion of matrix files.
  • .gitignore
    • Updated to ignore data directories within dld subdirectories.
  • Dockerfile
    • Modified to copy the new data directory containing BLOSUM matrices into the Docker image.
  • data/blosum40.out
    • Added BLOSUM40 matrix file.
  • data/blosum50.out
    • Added BLOSUM50 matrix file.
  • data/blosum62.out
    • Added BLOSUM62 matrix file.
  • data/blosum80.out
    • Added BLOSUM80 matrix file.
  • data/blosum90.out
    • Added BLOSUM90 matrix file.
  • package.json
    • Added a prebuild script to copy the data directory to dld subdirectories.

Activity

  • No specific activity (comments, reviews, approvals) was provided in the context for this pull request.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds BLOSUM matrix files to the software package. The changes include adding the data files themselves, updating the .gitignore and package.json to handle them during the build process, and modifying the Dockerfile to include them in the image. While the overall goal is clear and the approach is mostly sound, there are a couple of areas for improvement. The Dockerfile copies the data files to an incorrect location, and the data files themselves are missing a final newline character, which can cause issues with some tools. My review includes suggestions to address these points.
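
The missing-final-newline point can be checked mechanically. The following is a minimal sketch, assuming the matrices live in a data/ directory and match the pattern `blosum*.out` as listed in the summary; the function name `files_missing_final_newline` is hypothetical and is not part of the repository.

```python
from __future__ import annotations

from pathlib import Path


def files_missing_final_newline(data_dir: Path, pattern: str = "blosum*.out") -> list[Path]:
    """Return the non-empty files in data_dir whose last byte is not a newline."""
    missing: list[Path] = []
    for path in sorted(data_dir.glob(pattern)):
        content = path.read_bytes()
        # Empty files are skipped; only a non-empty file can lack a terminator.
        if content and not content.endswith(b"\n"):
            missing.append(path)
    return missing


if __name__ == "__main__":
    for path in files_missing_final_newline(Path("data")):
        print(f"{path}: no newline at end of file")
```

Run from the repository root, it prints one line per offending file and nothing when every matrix file ends with a newline.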

Dockerfile Outdated

high

Copying data files into /usr/bin is not standard practice, as this directory is reserved for executables. A more appropriate location according to the Filesystem Hierarchy Standard (FHS) is /usr/share. mmseqs2 is known to search for data in standard locations such as /usr/share/mmseqs/data, so placing the data in a similar path such as /usr/share/mmseqs2/data is recommended. The current COPY command also places the files directly in /usr/bin rather than in a data subdirectory, because COPY copies the contents of a source directory, not the directory itself.

COPY ./data /usr/share/mmseqs2/data

@mzueva mzueva requested a review from DenKoren February 17, 2026 12:47
@mzueva mzueva merged commit b42841b into main Feb 17, 2026
8 checks passed
2 participants