fixedLassoInf: working with lambda value from cv.glmnet

I am working on a simulation study that looks at performance when lambda is *not* determined a priori, but is instead calculated by cross-validation. I am doing this as an independent verification of the results found in [Taylor & Tibshirani (2018)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6097808/) that show using cross-validation yields valid inferential statistics. (I know that [Loftus](https://arxiv.org/abs/1511.08866) also proposed a way to deal with a lambda determined by cross-validation, but it doesn't appear to be in the package yet, and the simulations in the 2018 paper performed well enough for me.)

I see that in the documentation it says that `{glmnet}` uses the 1/n parameterization, whereas `{selectiveInference}` uses the common parameterization. The documentation shows how to go from common lambda and transform it to something that `{glmnet}` can use. I need to do the opposite: Go from something `cv.glmnet()` gives me, and turn it into the lambda on the common scale that `fixedLassoInf()` wants.

Specifically, the `{glmnet}` documentation reads:

> Note also that for "gaussian", glmnet standardizes y to have unit variance (using 1/n rather than 1/(n-1) formula) before computing its lambda sequence (and then unstandardizes the resulting coefficients); if you wish to reproduce/compare results with other software, best to supply a standardized y

While `{selectiveInference}` says:

> Estimated lasso coefficients (e.g., from glmnet). This is of length p (so the intercept is not included as the first component). Be careful! This function uses the "standard" lasso objective... In contrast, glmnet multiplies the first term by a factor of 1/n. So after running glmnet, to extract the beta corresponding to a value lambda, you need to use beta = coef(obj,s=lambda/n)[-1]...

For a reproducible example, see the code below. My question specifically concerns how to adjust this line: `si_lambda <- glmnet_lambda`. That is, what transformation do I do to go from a lambda `cv.glmnet()` gives me (I assign this to `glmnet_lambda`) into a lambda that `{selectiveInference}` will use (which I call `si_lambda`)?

My original thought was that, since the documentation says to divide by n, my thinking would be to multiply what `cv.glmnet()` gives me by my sample size. That runs without throwing a warning or an error, but it gives me a lambda of `188.5121`, which feels wrong. Apologies if that is the answer and I'm just being dense—but I wanted to make sure I am going from one software to the other in an appropriate manner.

```
library(glmnet)
library(selectiveInference)
library(tidyverse)
set.seed(1839)

n <- 1000       # sample size
B <- c(0, 1, 0) # intercept 0, beta1 = 1, beta2 = 0
eps_sd <- 1     # sd of the error

# make data
X <- cbind(1, replicate(length(B) - 1, rnorm(n, 0, 1)))
y <- X %*% B + rnorm(n, 0, eps_sd)
dat <- as.data.frame(X[, -1])
dat <- as_tibble(cbind(dat, y))

# get lambda by way of cross-validation
glmnet_lambda <- cv.glmnet(
  x = as.matrix(select(dat, -y)),
  y = dat$y
) %>% 
  getElement("lambda.1se")

# run glmnet with that lambda
m1 <- glmnet(
  x = as.matrix(select(dat, -y)),
  y = dat$y,
  lambda = glmnet_lambda
)

# get coefs from that model, dropping intercept, per the docs
m1_coefs <- coef(m1)[-1]

# what reparameterization do I do here?
si_lambda <- glmnet_lambda

# do post-selection inference with m1
# runs with warning, so I assume parameterized incorrectly -- how to fix?
m2 <- fixedLassoInf(
  x = as.matrix(select(dat, -y)),
  y = dat$y,
  beta = m1_coefs,
  lambda = si_lambda
)
```

And session information:

```
> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.4

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] forcats_0.5.1            stringr_1.4.0            dplyr_1.0.6             
 [4] purrr_0.3.4              readr_1.4.0              tidyr_1.1.3             
 [7] tibble_3.1.2             ggplot2_3.3.3            tidyverse_1.3.1         
[10] selectiveInference_1.2.5 MASS_7.3-54              adaptMCMC_1.4           
[13] coda_0.19-4              survival_3.2-11          intervals_0.15.2        
[16] glmnet_4.1-1             Matrix_1.3-3            
```

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fixedLassoInf: working with lambda value from cv.glmnet #54

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

fixedLassoInf: working with lambda value from cv.glmnet #54

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions