Skip to content

Recombination anchors visualization#4824

Open
dcmonti wants to merge 25 commits intomasterfrom
recombination-output
Open

Recombination anchors visualization#4824
dcmonti wants to merge 25 commits intomasterfrom
recombination-output

Conversation

@dcmonti
Copy link
Contributor

@dcmonti dcmonti commented Feb 9, 2026

Changelog Entry

To be copied to the draft changelog by merger:

Added the total number of recombination in a chain, recombinant anchor are now marked in the chain dump file

dcmonti and others added 25 commits January 26, 2026 09:50
Copy link
Member

@adamnovak adamnovak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are some comments on Davide's actual code, my code, and code I synthesized.

Comment on lines +88 to +90
# Build seed lookup by ID
seed_by_id = {s['seed_id']: s for s in seeds}
seed_ids = set(seed_by_id.keys())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like wasted effort; seed_by_id is used only to get the set of IDs, and the set of IDs is used only to filter the transitions.

Comment on lines +109 to +117
seeds_data.append({
'index': i,
'seed_num': s['seed_num'],
'read_pos': s['read_pos'],
'ref_pos': s['ref_pos'],
'strand': s['strand'],
'seed_id': s['seed_id'],
'max_score': max_score if max_score > float('-inf') else 0
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be better, or more succinct, as a dict() copy?

Comment on lines +140 to +141
# CSS and JS are plain strings (no f-string brace escaping needed).
# Only the data script uses f-string interpolation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This speaks to what the code generator was told to do, not what the reader needs to know.

Comment on lines +204 to +207
' <script type="text/javascript">\n'
f' var SEEDS_COMPRESSED = "{seeds_compressed}";\n'
f' var TRANSITIONS_COMPRESSED = "{transitions_compressed}";\n'
' </script>'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this doesn't need the CDATA escaping because we know it contains neither < nor & inside the script?

constructor(seeds, transitions) {
/// The raw seed array, for iteration and extent computation.
this.seeds = seeds;
/// The raw transition array, for iteration and extent computation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need the transitions to compute the view extent if they can only go between the seeds?

sys.stderr.write("Install the matplotalt package to generate figure alt text\n")
alt_text = None
except Exception as e:
print("Could not generate alt text due to internal error")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also go to stderr.

while (here != TracedScore::nowhere()) {
//std::cerr << "\trecs: " << chain_scores[here].rec_num << " paths:\t" << std::bitset<64>(chain_scores[here].paths) << std::endl;
//std::cerr << "\t\tanchor #" << here << ": " << to_chain[here] << std::endl;
// Mark here as used. Happens once per item, and so limits runtime.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should probably be #ifdef debug-flagged off instead of commented out.

max_indel_bases,
recomb_penalty,
show_work);
//std::cerr << "[REC INFO] Recombination number for chain: " << best_past_ending_score_ever.rec_num << "\tscore: " << best_past_ending_score_ever.score << "\tpaths: " << best_past_ending_score_ever.paths << std::endl;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also could be an #ifdef and not a comment.

points_per_possible_match,
max_indel_bases
).front();
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like an extra layer of braces that isn't doing anything.

Comment on lines +255 to +269
for (auto& handle_and_positions : found->second) {
std::string path_name = path_graph->get_path_name(handle_and_positions.first);
for (auto& position : handle_and_positions.second) {
// Dump all the seed positions so we can select seeds we want to know about
seedpos.line();
seedpos.field(seed_anchors.at(seed_num).read_start());
seedpos.field(path_name);
seedpos.field(position.first);
seedpos.field(position.second ? "-" : "+");
seedpos.field(seed_num);
std::stringstream ss;
ss << seed_anchors.at(seed_num);
seedpos.field(ss.str());
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid this whole loop if seedpos is falsey.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments