This project explores graph similarity: it compares graphs using distance metrics computed on the eigenvectors of their Laplacian matrices.
Files:
- DD.mat: Located in the datasets/ folder; provides the adjacency matrices of proteins.
- eigen.py: Has methods to compute the Laplacian of a matrix and, ultimately, its eigenvectors.
- five_node_test.ipynb: Has code to test a graph with 5 nodes and display similarities and other distance metrics.
- seven_node_test.ipynb: Has code to test a graph with 7 nodes and display similarities and other distance metrics.
- twelve_node_test.ipynb: Has code to test a graph with 12 nodes and display similarities and other distance metrics.
- generate_dataset.py: Generates positive and negative adjacency matrices given an adjacency matrix.
- process_data.py: Loads the DD.mat file from the datasets/ folder and returns the adjacency matrix at the passed index from its list of adjacency matrices.
- test.py: Used to test on real datasets, i.e. DD.mat. It has a method that takes the original matrix selected through process_data.py, generates positive and negative matches for it, and computes distance metrics between the original and each generated match.
- similarity_algorithms.py: Has functions to calculate Euclidean, Minkowski, and Manhattan distances between two eigenvectors.
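As a rough illustration of what eigen.py and similarity_algorithms.py are described as doing, the sketch below builds the Laplacian L = D - A of a small graph, takes its eigendecomposition with NumPy, and defines the three distance metrics. The function names and the choice of the unnormalized Laplacian are assumptions made for illustration; the actual code in this repository may differ.

```python
import numpy as np

def laplacian(adjacency):
    """Unnormalized graph Laplacian L = D - A (an assumed choice)."""
    adjacency = np.asarray(adjacency, dtype=float)
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency

def minkowski(u, v, p):
    """Minkowski distance of order p between two vectors."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(np.sum(np.abs(u - v) ** p) ** (1.0 / p))

def manhattan(u, v):
    """Manhattan distance is the Minkowski distance with p = 1."""
    return minkowski(u, v, 1)

def euclidean(u, v):
    """Euclidean distance is the Minkowski distance with p = 2."""
    return minkowski(u, v, 2)

# Path graph on 3 nodes: 0 - 1 - 2
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
L = laplacian(A)

# eigh suits symmetric matrices; eigenvalues come back in ascending order,
# with the matching eigenvectors as the columns of the second result.
eigenvalues, eigenvectors = np.linalg.eigh(L)
print(np.round(eigenvalues, 6))  # close to 0, 1 and 3 for this path graph

# Distances between two plain vectors, to show the metric helpers.
print(euclidean([0, 0], [3, 4]), manhattan([0, 0], [3, 4]))
```

Note that eigenvectors are only defined up to sign (and up to a change of basis when eigenvalues repeat), so any comparison between eigenvectors of two graphs needs a consistent convention for choosing them.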
Steps to run custom graphs with 5, 7 and 12 nodes:
- Open five_node_test.ipynb, seven_node_test.ipynb, or twelve_node_test.ipynb in your IDE, such as VS Code.
- Click on the first cell.
- Click on "Run cell and below". Scroll down to see all results.
Steps to run real datasets from DD.mat:
- Open test.py in your IDE.
- Run the command "python test.py" in a terminal. The script reads the adjacency matrix at index "x" (defined in the code) from DD.mat, creates 20 positive or negative adjacency matrices based on the original matrix, and computes the different distance metrics between the original matrix and each generated matrix. Comment/uncomment lines #25/#26 to switch between generating positive and negative matrices for comparison.
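The comparison that test.py is described as performing can be sketched roughly as follows. Everything here is an assumption for illustration: this README does not say how positive and negative matrices are produced, so the sketch treats a positive match as a random relabeling of the nodes and a negative match as the original with a few edges toggled. It also compares sorted Laplacian eigenvalues rather than eigenvectors, a deliberate substitution, because eigenvalues do not change under relabeling. A small random graph stands in for a DD.mat entry, and all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def spectrum(adjacency):
    """Sorted eigenvalues of the unnormalized Laplacian D - A."""
    lap = np.diag(adjacency.sum(axis=1)) - adjacency
    return np.linalg.eigvalsh(lap)

def positive_match(adjacency):
    """Assumed meaning of 'positive': same graph, nodes relabeled."""
    perm = rng.permutation(adjacency.shape[0])
    return adjacency[np.ix_(perm, perm)]

def negative_match(adjacency, flips=3):
    """Assumed meaning of 'negative': toggle a few distinct node pairs."""
    n = adjacency.shape[0]
    rows, cols = np.triu_indices(n, k=1)
    chosen = rng.choice(len(rows), size=flips, replace=False)
    out = adjacency.copy()
    for i, j in zip(rows[chosen], cols[chosen]):
        # Flip the edge in both directions to keep the matrix symmetric.
        out[i, j] = out[j, i] = 1 - out[i, j]
    return out

# Stand-in for one adjacency matrix from DD.mat: a random undirected graph.
n = 8
upper = np.triu(rng.integers(0, 2, size=(n, n)), k=1)
original = (upper + upper.T).astype(float)

base = spectrum(original)
# Euclidean distance between spectra, for 20 generated matrices of each kind.
pos = [np.linalg.norm(base - spectrum(positive_match(original))) for _ in range(20)]
neg = [np.linalg.norm(base - spectrum(negative_match(original))) for _ in range(20)]

print("max positive distance:", max(pos))
print("min negative distance:", min(neg))
```

Under these assumptions the positive distances are zero up to rounding, since relabeling leaves the Laplacian spectrum unchanged. The negative distances are strictly positive because toggling an odd number of edges changes the edge count, and with it the sum of the eigenvalues.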