Conversation
…the selected seed, and the top chain always
…e manageably sized
adamnovak
left a comment
There was a problem hiding this comment.
Here are some comments on Davide's actual code, my code, and code I synthesized.
| # Build seed lookup by ID | ||
| seed_by_id = {s['seed_id']: s for s in seeds} | ||
| seed_ids = set(seed_by_id.keys()) |
There was a problem hiding this comment.
This seems like wasted effort; seed_by_id is used only to get the set of IDs, and the set of IDs is used only to filter the transitions.
| seeds_data.append({ | ||
| 'index': i, | ||
| 'seed_num': s['seed_num'], | ||
| 'read_pos': s['read_pos'], | ||
| 'ref_pos': s['ref_pos'], | ||
| 'strand': s['strand'], | ||
| 'seed_id': s['seed_id'], | ||
| 'max_score': max_score if max_score > float('-inf') else 0 | ||
| }) |
There was a problem hiding this comment.
This might be better, or more succinct, as a dict() copy?
| # CSS and JS are plain strings (no f-string brace escaping needed). | ||
| # Only the data script uses f-string interpolation. |
There was a problem hiding this comment.
This speaks to what the code generator was told to do, not what the reader needs to know.
| ' <script type="text/javascript">\n' | ||
| f' var SEEDS_COMPRESSED = "{seeds_compressed}";\n' | ||
| f' var TRANSITIONS_COMPRESSED = "{transitions_compressed}";\n' | ||
| ' </script>' |
There was a problem hiding this comment.
I guess this doesn't need the CDATA escaping because we know it contains neither < nor & inside the script?
| constructor(seeds, transitions) { | ||
| /// The raw seed array, for iteration and extent computation. | ||
| this.seeds = seeds; | ||
| /// The raw transition array, for iteration and extent computation. |
There was a problem hiding this comment.
Do we really need the transitions to compute the view extent if they can only go between the seeds?
| sys.stderr.write("Install the matplotalt package to generate figure alt text\n") | ||
| alt_text = None | ||
| except Exception as e: | ||
| print("Could not generate alt text due to internal error") |
There was a problem hiding this comment.
This should also go to stderr.
| while (here != TracedScore::nowhere()) { | ||
| //std::cerr << "\trecs: " << chain_scores[here].rec_num << " paths:\t" << std::bitset<64>(chain_scores[here].paths) << std::endl; | ||
| //std::cerr << "\t\tanchor #" << here << ": " << to_chain[here] << std::endl; | ||
| // Mark here as used. Happens once per item, and so limits runtime. |
There was a problem hiding this comment.
These should probably be #ifdef debug-flagged off instead of commented out.
| max_indel_bases, | ||
| recomb_penalty, | ||
| show_work); | ||
| //std::cerr << "[REC INFO] Recombination number for chain: " << best_past_ending_score_ever.rec_num << "\tscore: " << best_past_ending_score_ever.score << "\tpaths: " << best_past_ending_score_ever.paths << std::endl; |
There was a problem hiding this comment.
This also could be an #ifdef and not a comment.
| points_per_possible_match, | ||
| max_indel_bases | ||
| ).front(); | ||
| { |
There was a problem hiding this comment.
This looks like an extra layer of braces that isn't doing anything.
| for (auto& handle_and_positions : found->second) { | ||
| std::string path_name = path_graph->get_path_name(handle_and_positions.first); | ||
| for (auto& position : handle_and_positions.second) { | ||
| // Dump all the seed positions so we can select seeds we want to know about | ||
| seedpos.line(); | ||
| seedpos.field(seed_anchors.at(seed_num).read_start()); | ||
| seedpos.field(path_name); | ||
| seedpos.field(position.first); | ||
| seedpos.field(position.second ? "-" : "+"); | ||
| seedpos.field(seed_num); | ||
| std::stringstream ss; | ||
| ss << seed_anchors.at(seed_num); | ||
| seedpos.field(ss.str()); | ||
| } | ||
| } |
There was a problem hiding this comment.
We should avoid this whole loop if seedpos is falsey.
Changelog Entry
To be copied to the draft changelog by merger:
Added the total number of recombination in a chain, recombinant anchor are now marked in the chain dump file