Description
For some models, e.g. prcomp(), hyperSpec objects can be used to fit the model, but not for prediction.
The underlying "mechanism" of the issue is that prcomp() calls as.data.frame() and then model.matrix() to extract the relevant columns of the training data, whereas predict.prcomp() expects a data.frame or matrix.
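The mechanism can be reproduced without hyperSpec. The following is a minimal base-R sketch, assuming that a data.frame with a matrix column named `spc` is a fair stand-in for what `as.data.frame()` returns for a hyperSpec object; the object `toy` and its values are made up for illustration.

```r
# Base-R stand-in for a hyperSpec object (assumption): a data.frame with a
# matrix column `spc` whose column names are the wavelengths.
set.seed(42)
wl <- c("405", "405.5", "406", "406.5")
toy <- data.frame(c = seq_len(8))
toy$spc <- matrix(rnorm(8 * length(wl)), nrow = 8, dimnames = list(NULL, wl))

# Fitting goes through model.frame()/model.matrix(), which expands the matrix
# column into one column per wavelength, named <term><column name>.
fit <- prcomp(~ spc, data = toy)
fit_names <- rownames(fit$rotation)
print(fit_names)

# predict.prcomp() only compares these names against colnames(newdata).
# The data.frame itself has the columns "c" and "spc", so the check fails.
res <- try(predict(fit, toy), silent = TRUE)
failed <- inherits(res, "try-error")
print(failed)
```

So the fit sees the expanded, prefixed column names, while prediction only sees the top-level column names of whatever is passed as `newdata`.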
PCA <- prcomp (~ spc, flu)
Prediction needs a data.frame or matrix with column names spc<wl>:
head(rownames(PCA$rotation))
#> [1] "spc405" "spc405.5" "spc406" "spc406.5" "spc407" "spc407.5"
Thus,
predict (PCA, flu)
#> Error in predict.prcomp(PCA, flu): 'newdata' must be a matrix or data frame
Neither as.data.frame() nor as.matrix() works:
predict(PCA, as.data.frame(flu))
#> Error in predict.prcomp(PCA, as.data.frame(flu)): 'newdata' does not have named columns matching one or more of the original columns
predict(PCA, as.matrix(flu))
#> Error in predict.prcomp(PCA, as.matrix(flu)): 'newdata' does not have named columns matching one or more of the original columns
as.wide.df() can produce the right type of data.frame, though:
predict(PCA, as.wide.df(flu, wl.prefix = "spc"))
#> PC1 PC2 PC3 PC4 PC5 PC6
#> 1 -2981.2355 3.3656437 -3.036947 5.186403 9.1582295 -3.301036e-13
#> 2 -1847.6202 0.1474497 -6.973606 1.617421 -11.5188178 4.123699e-13
#> 3 -572.9198 4.3060755 6.771751 -15.374293 0.5490714 4.745072e-13
#> 4 613.5676 -5.4921238 15.612317 8.056681 -1.9075938 7.245697e-14
#> 5 1758.9646 -18.2575816 -7.749809 -2.473730 2.9604848 1.027175e-15
#> 6 3029.2434 15.9305364 -4.623706 2.987518 0.7586259 -2.891355e-14
Slightly different colnames for as.matrix() would make that work as well:
matrix_flu <- flu[[]]
colnames(matrix_flu) <- paste0("spc", colnames(matrix_flu))
predict (PCA, matrix_flu)
#> PC1 PC2 PC3 PC4 PC5 PC6
#> [1,] -2981.2355 3.3656437 -3.036947 5.186403 9.1582295 -3.301036e-13
#> [2,] -1847.6202 0.1474497 -6.973606 1.617421 -11.5188178 4.123699e-13
#> [3,] -572.9198 4.3060755 6.771751 -15.374293 0.5490714 4.745072e-13
#> [4,] 613.5676 -5.4921238 15.612317 8.056681 -1.9075938 7.245697e-14
#> [5,] 1758.9646 -18.2575816 -7.749809 -2.473730 2.9604848 1.027175e-15
#> [6,] 3029.2434 15.9305364 -4.623706 2.987518 0.7586259 -2.891355e-14
What to do?
- Should we change the default in as.wide.df() to wl.prefix = "spc", so predict(PCA, as.wide.df(flu)) works by default?
- For models that are trained only on the spectra matrix, making as.matrix() and x[[]] output colnames prefixed with spc would make predict(PCA, flu[[]]) work.
- This is slightly related to #87 (names of wavelength vector are lost while collapsing two hyperSpec objects).
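Until a default is settled, the second option can be approximated on the user side. The sketch below is only an illustration: `add_wl_prefix()` is a hypothetical helper, not a hyperSpec function, and the plain matrix `spc` is an assumed stand-in for `flu[[]]`.

```r
# Hypothetical helper (not part of hyperSpec): prefix the wavelength column
# names so that they match rownames(PCA$rotation).
add_wl_prefix <- function(x, prefix = "spc") {
  stopifnot(is.matrix(x), !is.null(colnames(x)))
  colnames(x) <- paste0(prefix, colnames(x))
  x
}

# Stand-in for flu[[]] (assumption): a matrix with wavelengths as colnames.
set.seed(1)
spc <- matrix(rnorm(24), nrow = 6,
              dimnames = list(NULL, c("405", "405.5", "406", "406.5")))
train <- data.frame(c = seq_len(6))
train$spc <- spc

PCA <- prcomp(~ spc, data = train)

# Predicting on the prefixed training matrix should reproduce the scores
# stored in the fitted object.
scores <- predict(PCA, add_wl_prefix(spc))
same_scores <- isTRUE(all.equal(scores, PCA$x, check.attributes = FALSE))
print(same_scores)
```

A wrapper like this keeps the prefix in one place, so it could be dropped again if `as.matrix()` or `x[[]]` later returned prefixed colnames themselves.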