BowTieBuilder Code Review #168
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| FROM python:3.8-bullseye | ||
|
|
||
| WORKDIR /btb | ||
| RUN wget https://raw.githubusercontent.com/Reed-CompBio/BowTieBuilder-Algorithm/main/btb.py | ||
| RUN pip install networkx==2.8 | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| # BowTieBuilder Docker image | ||
|
|
||
| A Docker image for [BowTieBuilder](https://github.com/Reed-CompBio/BowTieBuilder-Algorithm) that is available on [DockerHub](https://hub.docker.com/repository/docker/reedcompbio/bowtiebuilder). | ||
|
|
||
| To create the Docker image run: | ||
| ``` | ||
| docker build -t reedcompbio/bowtiebuilder:v1 -f Dockerfile . | ||
| ``` | ||
| from this directory. | ||
|
|
||
| ## Original Paper | ||
|
|
||
| The original paper for [BowTieBuilder] can be accessed here: | ||
|
|
||
| Supper, J., Spangenberg, L., Planatscher, H. et al. BowTieBuilder: modeling signal transduction pathways. BMC Syst Biol 3, 67 (2009). https://doi.org/10.1186/1752-0509-3-67 |
|
Collaborator
We want all of the local neighborhood files back.
This file was deleted.
This file was deleted.
This file was deleted.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,174 @@ | ||
| # need to define a new btb class and contain the following functions | ||
| # - generate_inputs | ||
| # - run | ||
| # - parse_output | ||
|
|
||
| import warnings | ||
| from pathlib import Path | ||
|
|
||
| import pandas as pd | ||
|
|
||
| from spras.containers import prepare_volume, run_container | ||
| from spras.interactome import ( | ||
| convert_undirected_to_directed, | ||
| reinsert_direction_col_directed, | ||
| ) | ||
|
|
||
| from spras.prm import PRM | ||
|
|
||
| __all__ = ['BowtieBuilder'] | ||
|
|
||
| class BowtieBuilder(PRM): | ||
| required_inputs = ['sources', 'targets', 'edges'] | ||
|
|
||
| # generate_inputs is adapted from meo.py because both have the same input requirements | ||
| @staticmethod | ||
| def generate_inputs(data, filename_map): | ||
| """ | ||
| Access fields from the dataset and write the required input files | ||
| @param data: dataset | ||
| @param filename_map: a dict mapping file types in the required_inputs to the filename for that type | ||
| @return: | ||
| """ | ||
| for input_type in BowtieBuilder.required_inputs: | ||
| if input_type not in filename_map: | ||
| raise ValueError(f"{input_type} filename is missing") | ||
| print("FILEMAP NAME: ", filename_map) | ||
| print("DATA HEAD: ") | ||
| print( data.node_table.head()) | ||
| print("DATA INTERACTOME: ") | ||
| print(data.interactome.head()) | ||
|
|
||
| # Get sources and write to file, repeat for targets | ||
| # Does not check whether a node is a source and a target | ||
| for node_type in ['sources', 'targets']: | ||
| nodes = data.request_node_columns([node_type]) | ||
| if nodes is None: | ||
| raise ValueError(f'No {node_type} found in the node files') | ||
|
|
||
| # TODO test whether this selection is needed, what values could the column contain that we would want to | ||
| # include or exclude? | ||
| nodes = nodes.loc[nodes[node_type]] | ||
|
Contributor
Let's check how BTB handles these cases - we might need to put in some checks here.
Contributor
If BTB catches these issues and throws an error that's fine, I think. We just want to make sure that it does handle these issues.
Contributor (Author)
I'm a bit unclear on what you mean/what we should do about this.
||
| if(node_type == "sources"): | ||
| nodes.to_csv(filename_map["sources"], sep= '\t', index=False, columns=['NODEID'], header=False) | ||
| print("NODES: ") | ||
| print(nodes) | ||
| elif(node_type == "targets"): | ||
| nodes.to_csv(filename_map["targets"], sep= '\t', index=False, columns=['NODEID'], header=False) | ||
| print("NODES: ") | ||
| print(nodes) | ||
|
|
||
|
|
||
| # Create network file | ||
| edges = data.get_interactome() | ||
|
|
||
gabeah marked this conversation as resolved.
|
||
| # Format into directed graph | ||
| edges = convert_undirected_to_directed(edges) | ||
|
|
||
| edges.to_csv(filename_map['edges'], sep='\t', index=False, header=False) | ||
|
|
||
|
|
||
|
|
||
| # Skips parameter validation step | ||
| @staticmethod | ||
| def run(sources=None, targets=None, edges=None, output_file=None, container_framework="docker"): | ||
| """ | ||
| Run BowTieBuilder with Docker | ||
| @param sources: input source file (required) | ||
| @param targets: input target file (required) | ||
| @param edges: input edge file (required) | ||
| @param output_file: path to the output pathway file (required) | ||
| @param container_framework: choose the container runtime framework, currently supports "docker" or "singularity" (optional) | ||
| """ | ||
|
|
||
| # Input checks exercised by pytest (the container repeats them) | ||
| # Checking here avoids having to interpret errors raised inside the container | ||
|
|
||
| if not sources or not targets or not edges or not output_file: | ||
| raise ValueError('Required BowtieBuilder arguments are missing') | ||
|
|
||
| if not Path(sources).exists() or not Path(targets).exists() or not Path(edges).exists(): | ||
| raise ValueError('Missing input file') | ||
|
|
||
| # Fail early on edge rows that would cause index errors in btb.py | ||
| with open(edges, 'r') as edge_file: | ||
| for line_number, line in enumerate(edge_file, start=1): | ||
| if len(line.strip().split('\t')) < 3: | ||
| raise IndexError(f'Edge file line {line_number} has fewer than three tab-separated columns') | ||
|
|
||
| work_dir = '/btb' | ||
|
|
||
| # Each volume is a tuple (src, dest) | ||
| volumes = list() | ||
|
|
||
| bind_path, source_file = prepare_volume(sources, work_dir) | ||
| volumes.append(bind_path) | ||
|
|
||
| bind_path, target_file = prepare_volume(targets, work_dir) | ||
| volumes.append(bind_path) | ||
|
|
||
| bind_path, edges_file = prepare_volume(edges, work_dir) | ||
| volumes.append(bind_path) | ||
|
|
||
| # BowTieBuilder writes to the file passed via --output, so the output directory is mounted into the container | ||
| out_dir = Path(output_file).parent | ||
| # Create the output directory so that it exists before it is mounted | ||
| out_dir.mkdir(parents=True, exist_ok=True) | ||
| bind_path, mapped_out_dir = prepare_volume(str(out_dir), work_dir) | ||
| volumes.append(bind_path) | ||
| mapped_out_prefix = mapped_out_dir + '/raw-pathway.txt' # Use posix path inside the container | ||
|
|
||
| command = ['python', | ||
| 'btb.py', | ||
| '--edges', | ||
| edges_file, | ||
| '--sources', | ||
| source_file, | ||
| '--target', | ||
| target_file, | ||
| '--output', | ||
| mapped_out_prefix] | ||
| # command = ['ls', '-R'] | ||
|
|
||
|
|
||
| print('Running BowtieBuilder with arguments: {}'.format(' '.join(command)), flush=True) | ||
|
|
||
| container_suffix = "bowtiebuilder:v1" | ||
| out = run_container(container_framework, | ||
| container_suffix, | ||
| command, | ||
| volumes, | ||
| work_dir) | ||
| print(out) | ||
| print("Source file: ", source_file) | ||
| print("target file: ", target_file) | ||
| print("edges file: ", edges_file) | ||
| print("mapped out dir: ", mapped_out_dir) | ||
| print("mapped out prefix: ", mapped_out_prefix) | ||
|
|
||
|
|
||
| # Output is already written to raw-pathway.txt file | ||
| # output_edges = Path(next(out_dir.glob('out*-ranked-edges.txt'))) | ||
| # output_edges.rename(output_file) | ||
|
|
||
|
|
||
| @staticmethod | ||
| def parse_output(raw_pathway_file, standardized_pathway_file): | ||
| """ | ||
| Convert a predicted pathway into the universal format | ||
| @param raw_pathway_file: pathway file produced by an algorithm's run function | ||
| @param standardized_pathway_file: the same pathway written in the universal format | ||
| """ | ||
| # What about multiple raw_pathway_files | ||
| print("PARSING OUTPUT BTB") | ||
| df = pd.read_csv(raw_pathway_file, sep='\t') | ||
| df = reinsert_direction_col_directed(df) | ||
| print(df) | ||
| df.to_csv(standardized_pathway_file, header=False, index=False, sep='\t') | ||
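
The comment in `generate_inputs` notes that nothing checks whether a node is both a source and a target, and the review thread asks whether a check is needed. Below is a minimal sketch of one such check, assuming the node table has boolean `sources` and `targets` columns and a `NODEID` column, as the diff suggests. The helper name `find_source_target_overlap` and the example table are hypothetical, and how BTB itself treats overlapping nodes is not established in this review.

```python
import pandas as pd


def find_source_target_overlap(nodes: pd.DataFrame) -> list:
    """Return the NODEIDs flagged as both a source and a target (hypothetical helper)."""
    # Only an explicit True counts as flagged; missing values compare as False
    is_source = nodes["sources"].eq(True)
    is_target = nodes["targets"].eq(True)
    return sorted(nodes.loc[is_source & is_target, "NODEID"])


# Illustrative node table, not taken from the pull request
example = pd.DataFrame({
    "NODEID": ["A", "B", "D"],
    "sources": [True, True, True],
    "targets": [False, None, True],
})

overlap = find_source_target_overlap(example)
if overlap:
    print(f"Nodes listed as both source and target: {overlap}")
```

Whether an overlap should raise an error, emit a warning, or be passed through to BTB is a design decision the thread leaves open.
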
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| Node1 Node2 | ||
| A D | ||
| B D | ||
| C D | ||
| D F | ||
| D G | ||
| D E |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| A D 5 | ||
| B D 1.3 | ||
| C D 0.4 | ||
| D E 4.5 | ||
| D F 2 | ||
| D G 3.2 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| A D 5 | ||
| B D 1.3 | ||
| C 0.4 | ||
| D E 4.5 | ||
| D F 2 | ||
| D G 3.2 |
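
The edge file above contains a row for `C` with a missing column, which is the case the pre-flight check in `run` is meant to catch. The sketch below shows one way to report every malformed row at once rather than stopping at the first. The function name `find_malformed_edges` is hypothetical, and the assumption that the third column must parse as a number is inferred from the fixtures rather than confirmed by `btb.py`.

```python
def find_malformed_edges(lines):
    """Return (line_number, text) for rows that are not 'source<TAB>target<TAB>weight'."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\n")
        if not text.strip():
            continue  # ignore blank lines
        fields = text.split("\t")
        # Require exactly three non-empty, tab-separated fields
        if len(fields) != 3 or not all(field.strip() for field in fields):
            problems.append((number, text))
            continue
        # Assumption: the third field is a numeric edge weight
        try:
            float(fields[2])
        except ValueError:
            problems.append((number, text))
    return problems


rows = ["A\tD\t5", "B\tD\t1.3", "C\t0.4", "D\tE\t4.5"]
bad = find_malformed_edges(rows)
for number, text in bad:
    print(f"line {number}: {text!r}")
```

Passing an open file handle instead of a list works the same way, since the function only iterates over lines.
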
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,2 +1,3 @@ | ||
| A | ||
| B | ||
| C |
I wonder if we should have this be a specific commit hash, in case we update `btb.py`? Maybe not something for us to worry about now - at least this way if we ever do update `btb.py`, this Dockerfile will build a new image with the most recent version of `btb.py`.

I'll leave as-is, and we can go back if we change our minds? I'm also unsure how accessing a specific commit hash would work
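
On the question of how a specific commit hash would be accessed: raw.githubusercontent.com URLs take a ref in the position where the Dockerfile currently has `main`, and that ref can be a full commit SHA instead of a branch name. The sketch below only builds the two URLs for comparison. The helper `raw_github_url` is hypothetical, and the commit value is a placeholder, not a real commit in the BowTieBuilder-Algorithm repository.

```python
def raw_github_url(owner: str, repo: str, ref: str, path: str) -> str:
    """Build a raw.githubusercontent.com URL for a file at a branch, tag, or commit."""
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{path}"


# Placeholder only: substitute the SHA of the commit that should be pinned
PLACEHOLDER_COMMIT = "0123456789abcdef0123456789abcdef01234567"

# Follows the branch, so a rebuild picks up whatever btb.py is on main at that time
floating = raw_github_url("Reed-CompBio", "BowTieBuilder-Algorithm", "main", "btb.py")

# Fixed to one commit, so a rebuild always fetches the same btb.py
pinned = raw_github_url("Reed-CompBio", "BowTieBuilder-Algorithm", PLACEHOLDER_COMMIT, "btb.py")

print(floating)
print(pinned)
```

The trade-off is the one raised in the comment above: the branch URL stays current automatically, while the pinned URL makes image builds reproducible at the cost of a manual update whenever `btb.py` changes.
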