proteina for peptide modifies by JingXuu · Pull Request #2 · bigict/proteina

JingXuu · 2025-10-10T09:02:04Z

set receptor as motif, scaffold peptide
change features: res_seq_pdb_idx(peptide+200) and chain_break_per_res
add metrics calculation codes
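
A minimal sketch of what the two features named above could look like, for illustration only. The function name `build_peptide_features`, the 1-based numbering, and the convention of flagging the two residues that flank the chain boundary are assumptions; the PR description only states that the peptide index is shifted (by 200) and that a per-residue chain-break feature exists.

```python
from typing import List, Tuple


def build_peptide_features(
    receptor_len: int, peptide_len: int, offset: int = 200
) -> Tuple[List[int], List[int]]:
    """Return (res_seq_pdb_idx, chain_break_per_res) for receptor + peptide.

    Assumptions (not confirmed by the PR): residues are numbered from 1,
    the peptide continues the receptor numbering shifted by `offset`, and
    the chain-break flag is 1 on the two residues flanking the boundary.
    """
    receptor_idx = list(range(1, receptor_len + 1))
    peptide_idx = [receptor_len + offset + i for i in range(1, peptide_len + 1)]
    res_seq_pdb_idx = receptor_idx + peptide_idx

    chain_break = [0] * (receptor_len + peptide_len)
    if receptor_len > 0 and peptide_len > 0:
        chain_break[receptor_len - 1] = 1  # last receptor residue
        chain_break[receptor_len] = 1      # first peptide residue
    return res_seq_pdb_idx, chain_break


idx, breaks = build_peptide_features(receptor_len=3, peptide_len=2)
print(idx)     # [1, 2, 3, 204, 205]
print(breaks)  # [0, 0, 1, 1, 0]
```

The large index gap gives a sequence-position encoding a clear signal that the peptide is not a continuation of the receptor chain; keeping `offset` a parameter makes it easy to try other values.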

Copilot

Pull Request Overview

This PR implements peptide design functionality for the proteina model by setting the receptor as a motif and designing scaffold peptides. The key modifications include chain-aware PDB writing, peptide-specific feature engineering, and comprehensive metrics calculation capabilities for evaluation.

Key changes:

Enhanced PDB writing to support multi-chain structures with proper chain indexing
Modified training pipeline to handle peptide-specific features including residue indexing with offsets and chain break markers
Added comprehensive metrics calculation infrastructure including RMSD computation, structure diversity analysis, and designability assessment
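
For readers unfamiliar with the RMSD metric mentioned above, here is a minimal sketch of the formula on already-superimposed coordinates. It is not the code under `test/cal_metrics/`; the function name `rmsd` is hypothetical, and a real pipeline would typically align the two structures (for example with the Kabsch algorithm) before calling it.

```python
import math
from typing import Sequence


def rmsd(coords_a: Sequence[Sequence[float]], coords_b: Sequence[Sequence[float]]) -> float:
    """Root-mean-square deviation between two equally sized coordinate sets.

    Assumes the structures are already superimposed and that both inputs
    use the same length unit (the result is in that unit).
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))


reference = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
predicted = [[0.0, 0.0, 0.0], [1.0, 0.0, 2.0]]
print(rmsd(reference, predicted))  # sqrt(4 / 2), about 1.414
```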

Reviewed Changes

Copilot reviewed 29 out of 31 changed files in this pull request and generated 9 comments.

File	Description
proteinfoundation/utils/ff_utils/pdb_utils.py	Enhanced PDB writing with chain support and motif-based structure organization
proteinfoundation/proteinflow/model_trainer_base.py	Added peptide features and validation RMSD tracking
proteinfoundation/motif_inference.py	Enhanced inference with PDB patching and RMSD calculation
test/cal_metrics/	Complete metrics calculation suite including RMSD, diversity, and designability
configs/	Configuration files for peptide training and inference

Copilot · 2025-10-10T09:04:09Z

proteinfoundation/proteinflow/proteina.py

            boundaries = torch.linspace(
-                0.0, max_dist_boundary, num_dist_buckets - 1, device=pair_pred.device
+                0.1, max_dist_boundary, num_dist_buckets - 1, device=pair_pred.device


Hardcoded value 0.1 for distance boundary start may cause issues with very short distances. Consider using 0.0 or making this configurable.
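
To make the trade-off concrete, here is a small pure-Python sketch of how the first boundary affects binning. It assumes a distance is assigned to the bucket whose index equals the number of boundaries strictly below it, which may differ from what `proteina.py` actually does; `linspace`, `bucket_index`, and the example values are illustrative, not taken from the repository.

```python
from bisect import bisect_left
from typing import List


def linspace(start: float, stop: float, num: int) -> List[float]:
    """Evenly spaced values from start to stop inclusive (num >= 1)."""
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def bucket_index(distance: float, boundaries: List[float]) -> int:
    """Index of the bucket: the count of boundaries strictly below `distance`."""
    return bisect_left(boundaries, distance)


num_dist_buckets = 5      # illustrative value
max_dist_boundary = 2.0   # illustrative value, in nm

from_zero = linspace(0.0, max_dist_boundary, num_dist_buckets - 1)
from_tenth = linspace(0.1, max_dist_boundary, num_dist_buckets - 1)

# Starting at 0.0, bucket 0 only ever receives a distance of exactly 0.0.
print(bucket_index(0.05, from_zero))   # 1
# Starting at 0.1, every distance up to 0.1 shares bucket 0.
print(bucket_index(0.05, from_tenth))  # 0
```

Under this assumption, a start of 0.0 leaves bucket 0 almost unused, while a start of 0.1 merges all distances up to 0.1 into it. Exposing the start value as a configuration option (for example a hypothetical `min_dist_boundary`) would let both behaviours be tested.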

Copilot · 2025-10-10T09:04:09Z

proteinfoundation/proteinflow/model_trainer_base.py

            n=n, shape=batch_shape, device=self.device, dtype=dtype, mask=mask
        )

+        batch['x_1'] = x_1 #Ensure that the unit used during training is nanometers (nm) not ang!!!!!


Informal abbreviation: 'ang' should be written as 'Å' or 'angstroms'.

Suggested change

-        batch['x_1'] = x_1 #Ensure that the unit used during training is nanometers (nm) not ang!!!!!
+        batch['x_1'] = x_1 # Ensure that the unit used during training is nanometers (nm), not angstroms (Å)!
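
Since the comment warns about mixing units, an explicit conversion helper can make the intent harder to miss than a comment alone. The sketch below is an illustration, not code from the PR; the names `angstrom_to_nm` and `nm_to_angstrom` are hypothetical, and it uses plain nested lists rather than tensors to stay self-contained.

```python
from typing import List

ANGSTROM_PER_NM = 10.0  # 1 nm = 10 Å


def angstrom_to_nm(coords: List[List[float]]) -> List[List[float]]:
    """Convert coordinates read from a PDB file (Å) to nanometers."""
    return [[value / ANGSTROM_PER_NM for value in atom] for atom in coords]


def nm_to_angstrom(coords: List[List[float]]) -> List[List[float]]:
    """Convert model-space coordinates (nm) back to Å, e.g. before writing a PDB."""
    return [[value * ANGSTROM_PER_NM for value in atom] for atom in coords]


pdb_coords = [[10.0, 20.0, 5.0]]   # Å, as stored in PDB files
print(angstrom_to_nm(pdb_coords))  # [[1.0, 2.0, 0.5]]
```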

Copilot · 2025-10-10T09:04:10Z

proteinfoundation/proteinflow/model_trainer_base.py

        if random.random() > 0.5 and self.cfg_exp.training.self_cond:
            x_pred_sc, _ = self.predict_clean(batch)
            batch["x_sc"] = self.detach_gradients(x_pred_sc)
+        #import pdb; pdb.set_trace()


Commented-out debug statement should be removed from production code.

Suggested change

-        #import pdb; pdb.set_trace()

Copilot · 2025-10-10T09:04:10Z

proteinfoundation/motif_inference.py

                              motif_min_length = cfg.motif_min_length, 
                              motif_max_length = cfg.motif_max_length, 
                              motif_task_name = cfg.motif_task_name)
+    #import pdb;pdb.set_trace()


Commented-out debug statement should be removed from production code.

Suggested change

-    #import pdb;pdb.set_trace()

Copilot · 2025-10-10T09:04:10Z

proteinfoundation/metrics/designability.py

    Returns:
        List of sequences (strings)
    """
+    # import pdb;pdb.set_trace()


Commented-out debug statement should be removed from production code.

Suggested change

-    # import pdb;pdb.set_trace()

Copilot · 2025-10-10T09:04:11Z

test/cal_metrics/cal_metrics.py

+# cal_metrics.py (only code & comments in English)
+


Orphaned comment that doesn't relate to surrounding code should be removed or clarified.

Suggested change

-# cal_metrics.py (only code & comments in English)

Copilot · 2025-10-10T09:04:11Z

proteinfoundation/utils/ff_utils/pdb_utils.py

+                    chain_index=np.concatenate([prot_rec.chain_index, prot_pep.chain_index], axis=0),
+                    b_factors=np.concatenate([prot_rec.b_factors, prot_pep.b_factors], axis=0),
+                )
+                pdb_prot = to_pdb(full_prot, model=1, add_end=True)  # Now `add_end=True` for automatic ENDMDL and END


Comment mentions 'Now add_end=True' but doesn't explain why this change was made or when it should be used.

Suggested change

-                pdb_prot = to_pdb(full_prot, model=1, add_end=True)  # Now `add_end=True` for automatic ENDMDL and END
+                # Use add_end=True to ensure ENDMDL and END records are included in the PDB output.
+                # This is necessary when writing combined structures (e.g., motif and peptide as separate chains)
+                # to properly terminate the PDB file for downstream compatibility. In other cases, set add_end=False
+                # if termination records are not required.
+                pdb_prot = to_pdb(full_prot, model=1, add_end=True)
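
As background on the termination records discussed here, the following sketch shows where `MODEL`, `ENDMDL`, and `END` sit in a PDB file. The helper `wrap_model` is hypothetical and is not the project's `to_pdb`; it assumes, following the diff comment, that a single `add_end` flag controls both `ENDMDL` and `END`.

```python
from typing import List


def wrap_model(atom_lines: List[str], model: int = 1, add_end: bool = True) -> str:
    """Wrap already formatted ATOM/TER lines in MODEL ... ENDMDL / END records.

    Assumption: `add_end` toggles both ENDMDL and END together. With
    add_end=False the caller is responsible for closing the model.
    """
    lines = [f"MODEL     {model:>4d}"]  # serial number right-justified in columns 11-14
    lines.extend(atom_lines)
    if add_end:
        lines.append("ENDMDL")  # closes the current MODEL
        lines.append("END")     # marks the end of the file
    return "\n".join(lines) + "\n"


atoms = [
    "ATOM      1  CA  ALA A   1       0.000   0.000   0.000  1.00  0.00           C",
    "TER",
    "ATOM      2  CA  GLY B   1       3.800   0.000   0.000  1.00  0.00           C",
    "TER",
]
print(wrap_model(atoms), end="")
```

Some downstream tools read up to `END` and may treat a file without it as truncated, which is one reason to emit the record when writing a single combined receptor and peptide structure.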

Copilot · 2025-10-10T09:04:11Z

test/cal_metrics/run_mpnn_boltz.py

+            pep_seqs = pep_seqss[1:]
+            print(pep_seqs)


Debug print statement should be removed or replaced with proper logging.
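
One way to follow this suggestion is to route the output through the standard `logging` module, so it can be silenced or enabled without editing the code. This is a sketch only: the function name `select_peptide_sequences` and the logger name are hypothetical, and the `[1:]` slice simply mirrors the diff above without asserting why the first entry is skipped.

```python
import logging
from typing import List

logger = logging.getLogger("run_mpnn_boltz")  # hypothetical logger name


def select_peptide_sequences(all_seqs: List[str]) -> List[str]:
    """Drop the first entry (as in the diff) and log the rest at DEBUG level."""
    pep_seqs = all_seqs[1:]
    # Lazy %-style formatting: the message is only built if DEBUG is enabled.
    logger.debug("Peptide sequences: %s", pep_seqs)
    return pep_seqs


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    select_peptide_sequences(["FIRST", "PEPTIDEA", "PEPTIDEB"])
```

With the default WARNING level the message is suppressed; passing `level=logging.DEBUG` to `logging.basicConfig` brings it back when troubleshooting.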

Copilot · 2025-10-10T09:04:11Z

proteinfoundation/motif_inference.py

+    gt_base = os.path.basename(gt_pdb_path)
+    m_case = re.search(r'([0-9][A-Za-z0-9]{3})', gt_base)
+    case_id = m_case.group(1) if m_case else os.path.splitext(gt_base)[0]
+    #import pdb;pdb.set_trace()


Another commented-out debug statement should be removed from production code.

Suggested change

-    #import pdb;pdb.set_trace()
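
The case-ID extraction in this hunk can be exercised in isolation. The sketch below wraps the same regular expression and fallback in a standalone function; the name `case_id_from_path` and the example paths are made up for illustration.

```python
import os
import re

# Same pattern as in the diff: one digit followed by three alphanumerics,
# which matches the shape of a 4-character PDB ID such as "1abc".
_CASE_ID = re.compile(r"([0-9][A-Za-z0-9]{3})")


def case_id_from_path(gt_pdb_path: str) -> str:
    """Return the first PDB-ID-like token in the file name, else the file stem."""
    gt_base = os.path.basename(gt_pdb_path)
    match = _CASE_ID.search(gt_base)
    return match.group(1) if match else os.path.splitext(gt_base)[0]


print(case_id_from_path("/data/gt/1abc_gt.pdb"))     # 1abc
print(case_id_from_path("/data/gt/receptor.pdb"))    # receptor
print(case_id_from_path("/data/gt/run_2024_x.pdb"))  # 2024
```

The last call shows a limitation of the pattern: any digit followed by three alphanumerics matches, so a year or run number earlier in the file name would be picked up instead of the intended ID.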

chungongyu added 3 commits August 28, 2025 15:05

set receptor as motif, scaffold peptide

9ed870a

change features: res_seq_pdb_idx(peptide+64) and chain_break_per_res

e86c331

add metrics calculation

c99f5b6

JingXuu requested a review from Copilot October 10, 2025 09:02

JingXuu wants to merge 3 commits into bigict:dev from JingXuu:dev
