Updated the PR to point to |
This PR adds functionality and analysis notebooks for a deeper dive into the link prediction (LP) performance of BioBLP-X versus RotatE models, as a function of the node degree of the entity being predicted. It inspects whether attribute encodings help the BioBLP models obtain better representations than RotatE models for entities in sparser regions of the graph, i.e. entities with fewer incoming and outgoing edges.
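The idea of relating LP performance to node degree can be sketched in plain Python. This is a minimal illustration, not the code in `notebooks/nb_utils/eval_utils.py`: the function names, the bucket edges, and the toy triples and ranks are all hypothetical, and the per-entity ranks are assumed to come from a separate evaluation step (for example pykeen's rank-based evaluation).

```python
from collections import Counter, defaultdict


def node_degrees(triples):
    """Count the total (in + out) degree of every entity in (head, relation, tail) triples."""
    degree = Counter()
    for head, _relation, tail in triples:
        degree[head] += 1  # outgoing edge
        degree[tail] += 1  # incoming edge
    return degree


def mrr_by_degree_bucket(ranked_predictions, degree, bucket_edges=(1, 5)):
    """Group 1-based ranks by the degree of the predicted entity and return the MRR per bucket.

    ranked_predictions: iterable of (predicted_entity, rank) pairs (hypothetical input format).
    bucket_edges: ascending upper bounds; degrees above the last edge share one bucket.
    """
    def bucket_label(d):
        for edge in bucket_edges:
            if d <= edge:
                return f"<={edge}"
        return f">{bucket_edges[-1]}"

    reciprocal_ranks = defaultdict(list)
    for entity, rank in ranked_predictions:
        # Entities unseen in the triples get degree 0 and land in the lowest bucket.
        reciprocal_ranks[bucket_label(degree.get(entity, 0))].append(1.0 / rank)
    return {label: sum(rr) / len(rr) for label, rr in reciprocal_ranks.items()}


# Toy data, invented for illustration only.
training_triples = [
    ("drugA", "targets", "prot1"),
    ("drugA", "targets", "prot2"),
    ("drugB", "targets", "prot1"),
]
ranks = [("drugA", 1), ("drugB", 4), ("prot2", 2)]

degree = node_degrees(training_triples)
mrr = mrr_by_degree_bucket(ranks, degree)
print(mrr)
```

Comparing the per-bucket metric of two models side by side is then one way to check whether one of them holds up better on low-degree entities.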
Introduces:
- `NodeDegreeAnalyser` (in `notebooks/nb_utils/eval_utils.py`), which parameterises the model and the entity type being analysed.
- Evaluation on the `head`, `tail`, or `both` sides of a triple (using pykeen's evaluation modules).
- Analysis notebooks (see "Steps to conduct the Analysis of the effect of Node Degree..." below on how to wire this up).

Note: The expectation is that `notebooks/nb_utils/eval_utils.py` will be subsumed into the `bioblp` package in a future commit, once the immediate paper deadlines have passed. In the interim, the support functionality lives within `nb_utils`.
Steps to conduct the Analysis of the effect of Node Degree on LP performance for model of choice:
Future Work (Things that need to be changed):
`NodeDegreeAnalyser` (in `notebooks/nb_utils/eval_utils.py`) and the plotting functionality have confusing parameter names such as `node_endpoint_type` and `eval_on_node_endpoint`. The former refers to the position, within the triples, of the attributed entity being analysed. The latter refers to the side being predicted when evaluation metrics are obtained: `head`, `tail`, or `both`. Currently it is difficult to tell at a glance whether we are talking about predicting the `head`/`tail` of a triple, or about the position in which a certain entity type such as drug/protein/disease occurs.
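To make the distinction concrete, here is a small sketch that keeps the two concepts in separate functions. Everything in it is hypothetical: `Triple`, `select_triples`, `prediction_targets`, and the argument names `position` and `side` are illustrative and are not the names used in `eval_utils.py`. Reading `both` as "either side" for the position filter is also an assumption.

```python
from dataclasses import dataclass

VALID_SIDES = {"head", "tail", "both"}


@dataclass(frozen=True)
class Triple:
    head: str
    relation: str
    tail: str


def select_triples(triples, entities_of_type, position):
    """Keep triples where an entity of the analysed type occupies `position`.

    This mirrors the role described for `node_endpoint_type`: WHERE the entity
    type (e.g. drug) occurs. 'both' is assumed to mean either side.
    """
    if position not in VALID_SIDES:
        raise ValueError(f"position must be one of {sorted(VALID_SIDES)}")
    selected = []
    for t in triples:
        in_head = t.head in entities_of_type
        in_tail = t.tail in entities_of_type
        if (
            (position == "head" and in_head)
            or (position == "tail" and in_tail)
            or (position == "both" and (in_head or in_tail))
        ):
            selected.append(t)
    return selected


def prediction_targets(triple, side):
    """Return the entities the model must predict for this triple.

    This mirrors the role described for `eval_on_node_endpoint`: WHICH side is
    masked and predicted during evaluation.
    """
    if side not in VALID_SIDES:
        raise ValueError(f"side must be one of {sorted(VALID_SIDES)}")
    if side == "head":
        return [triple.head]
    if side == "tail":
        return [triple.tail]
    return [triple.head, triple.tail]


# Toy data, invented for illustration only.
triples = [
    Triple("drugA", "targets", "prot1"),
    Triple("prot1", "associated_with", "diseaseX"),
]
drugs = {"drugA"}

drug_head_triples = select_triples(triples, drugs, position="head")
targets = [prediction_targets(t, side="tail") for t in drug_head_triples]
print(drug_head_triples, targets)
```

In this toy run the analysed entity type (drug) sits in the head position while the tail is the side being predicted, which is exactly the combination the current parameter names make hard to read.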