Let correlation endpoint accept only one dataset#93

Draft
asizemore wants to merge 3 commits into master from improvement-92-allow-correlations-one-dataset

@asizemore (Member) commented Sep 22, 2025

Resolves #92

For WGCNA, we'd like the user to be able to run correlation on either one dataset (self-correlation) or two datasets. Currently, self-correlation has its own separate endpoint.

This PR does a few things:

  1. Renames hasSecondCollection to hasTwoCollections.
  2. Removes the restrictions against data2 being null and against data2 == data1.
  3. Allows self-correlation to be requested either by passing data2 == data1 or by passing data1 with data2 = null.
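
The rule in item 3 can be sketched as follows. This is a minimal illustration, not the backend's actual code: the function name `is_self_correlation` and the use of plain dicts for `data1` and `data2` are assumptions made for the example.

```python
from typing import Optional


def is_self_correlation(data1: dict, data2: Optional[dict]) -> bool:
    """Return True when a request should be treated as a self-correlation.

    Hypothetical helper: it mirrors the rule described in the PR text,
    not the real service implementation.
    """
    # Case 1: data2 is null, so only data1 was supplied.
    if data2 is None:
        return True
    # Case 2: data2 describes exactly the same collection as data1.
    return data1 == data2
```

Under this sketch, both request shapes in this PR (data2 = null, and data2 equal to data1) resolve to the same self-correlation path, while two different collections do not.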

It looks like the frontend will send data2 == data1. Still, both that route and data2 = null work and seem like reasonable ways to use the endpoint, so I'm partial to keeping both options, but could be persuaded otherwise!

To test, use eda-inc. Here's an example request that passes only data1:

{
	"config": {
		"prefilterThresholds": {
			"proportionNonZero": 0.05,
			"variance": 0,
			"standardDeviation": 0
		},
		"data1": {
			"dataType": "collection",
			"collectionSpec": {
				"entityId": "EUPATH_0000813",
				"collectionId": "EUPATH_0009252"
			}
		},
		"correlationMethod": "spearman"
	},
	"derivedVariables": [],
	"filters": [],
	"studyId": "Bangladesh_healthy_5yr-1"
}

To test data1 == data2:

{
	"config": {
		"prefilterThresholds": {
			"proportionNonZero": 0.05,
			"variance": 0,
			"standardDeviation": 0
		},
		"data1": {
			"dataType": "collection",
			"collectionSpec": {
				"entityId": "EUPATH_0000813",
				"collectionId": "EUPATH_0009252"
			}
		},
		"data2": {
			"dataType": "collection",
			"collectionSpec": {
				"entityId": "EUPATH_0000813",
				"collectionId": "EUPATH_0009252"
			}
		},
		"correlationMethod": "spearman"
	},
	"derivedVariables": [],
	"filters": [],
	"studyId": "Bangladesh_healthy_5yr-1"
}
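
The two request bodies above differ only in whether `data2` is present. A small sketch for generating both from one shared definition is shown here; the helper name `build_request` and its `include_data2` flag are hypothetical, and no endpoint URL is assumed, so the script only builds and prints the JSON bodies.

```python
import copy
import json

# Collection spec copied from the example requests in this PR.
COLLECTION = {
    "dataType": "collection",
    "collectionSpec": {
        "entityId": "EUPATH_0000813",
        "collectionId": "EUPATH_0009252",
    },
}


def build_request(include_data2: bool) -> dict:
    """Build a test request body (hypothetical helper for local testing)."""
    config = {
        "prefilterThresholds": {
            "proportionNonZero": 0.05,
            "variance": 0,
            "standardDeviation": 0,
        },
        "data1": copy.deepcopy(COLLECTION),
    }
    if include_data2:
        # Self-correlation expressed as data2 == data1.
        config["data2"] = copy.deepcopy(COLLECTION)
    config["correlationMethod"] = "spearman"
    return {
        "config": config,
        "derivedVariables": [],
        "filters": [],
        "studyId": "Bangladesh_healthy_5yr-1",
    }


if __name__ == "__main__":
    # Print both variants so they can be pasted into a request tool.
    print(json.dumps(build_request(include_data2=False), indent=2))
    print(json.dumps(build_request(include_data2=True), indent=2))
```

Building both bodies from one definition keeps the thresholds, study ID, and collection identical, so any difference in the responses can be attributed to the presence of `data2`.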

@asizemore asizemore marked this pull request as ready for review October 6, 2025 18:33
@asizemore asizemore requested a review from bobular October 6, 2025 18:58
@asizemore (Member, Author) commented:

@bobular I'd like to merge this quickly if possible and come back to fix things in a new PR. I'm trying to work on the frontend simultaneously, but running the backend locally is slowing development.
Since we'll be testing with local frontend dev in the next week, I don't think extensive testing is required.

@asizemore (Member, Author) commented:

Don't review! I've introduced a bug and I must squash it first. I can see that mbio correlations no longer work.

Development

Successfully merging this pull request may close these issues.

allow correlation to accept only one dataset