Open
Labels
Type: question ❔ Questions for all to consider.
Description
When calculating pearson.dist() (which is basically a scaled correlation between rows/spectra) with a perfectly flat spectrum, the result is NaN.
This is caused by the standardization of the data matrix: the variance within the flat spectrum is 0, so a division by 0 occurs.
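The division by zero can be reproduced in base R without hyperSpec. This is only a sketch, assuming the standardization centers each row and divides it by its norm; the toy matrix `m` and the variable names are made up for illustration, and the actual `pearson.dist()` code may differ in detail.

```r
# Two toy "spectra" as rows: one with structure, one perfectly flat.
m <- rbind(varying = c(1, 2, 4, 3),
           flat    = c(200, 200, 200, 200))

centered <- m - rowMeans(m)            # the flat row becomes all zeros
norms    <- sqrt(rowSums(centered^2))  # the norm of the flat row is 0
scaled   <- centered / norms           # 0 / 0 gives NaN for the flat row

print(norms)
print(scaled)
```

Every cross product involving the `flat` row then propagates `NaN` into the distance matrix.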
- The behaviour is consistent with cor(x, y), which returns NA in this case.
- OTOH, we may say that since the covariance with a flat spectrum is always 0, the correlation should also be 0 (and the Pearson distance 0.5).
Besides making pearson.dist() work smoothly with flat spectra, this would let users distinguish the Pearson distance to a flat spectrum from situations where, e.g., NAs in the spectra cause the distance to be NA.
Opinions?
library(hyperSpec)
x <- flu - flu [3] + 200
plot(x)
pearson.dist (x)
#>              1            2            3            4            5
#> 2 0.0008858704
#> 3          NaN          NaN
#> 4 0.9967988590 0.9950559547          NaN
#> 5 0.9984690049 0.9968275493          NaN 0.0014374616
#> 6 0.9990662021 0.9977563018          NaN 0.0016757621 0.0006331176
cor (t(x[[1]]), t(x[[3]]))
#> Warning in cor(t(x[[1]]), t(x[[3]])): the standard deviation is zero
#> [,1]
#> [1,] NA
cov (t(x[[1]]), t(x[[3]]))
#> [,1]
#> [1,] 0
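To make the alternative concrete, here is a rough base-R sketch of a Pearson distance that treats the correlation with a flat row as 0. `pearson_dist_flat0()` is a hypothetical helper written for this discussion, not an existing hyperSpec function, and it assumes the distance is defined as (1 - r) / 2. It also assumes flat rows center to exactly 0; with floating-point noise a tolerance would be needed instead of `== 0`.

```r
# Hypothetical helper, not part of hyperSpec: Pearson distance (1 - r) / 2
# in which zero-variance rows get correlation 0 instead of NaN.
pearson_dist_flat0 <- function(m) {
  m <- as.matrix(m)
  centered <- m - rowMeans(m)
  norms <- sqrt(rowSums(centered^2))
  flat <- !is.na(norms) & norms == 0  # constant rows; rows containing NA stay NA
  norms[flat] <- 1                    # centered flat rows are all 0, so r becomes 0
  scaled <- centered / norms
  as.dist(0.5 - tcrossprod(scaled) / 2)
}

m <- rbind(a       = c(1, 2, 4, 3),
           flat    = c(200, 200, 200, 200),
           with_na = c(1, NA, 2, 5))
d <- as.matrix(pearson_dist_flat0(m))

d["a", "flat"]     # 0.5: distance to the flat row
d["a", "with_na"]  # NA: missing values still propagate
```

With this variant, a distance of 0.5 to a flat row stays distinguishable from an NA caused by missing values in the spectra.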