Application to large sparse design matrix #51

@GabrielHoffman

I have a lasso problem with ~10K samples and ~20K features where most of the entries in the design matrix x are zero. I can store x as a dgCMatrix sparse matrix and fit the lasso with glmnet, which takes advantage of the sparsity. This can be over 100x faster than using a standard matrix object.

Is there a way to extend coef, fixedLassoInf and estimateSigma to handle sparse matrices?
This would require handling the centering and scaling of features without converting x to a dense matrix, but it seems doable.
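
To illustrate why the centering and scaling seems doable, here is a minimal sketch in plain Python. It is not code from glmnet or selectiveInference. The function names `column_stats` and `standardized_matvec` are hypothetical, the arrays mimic the compressed-column layout of a dgCMatrix (column pointers, row indices, values), and the 1/n variance convention is an assumption. The idea is that the standardized matrix Z = (X - 1 mu^T) diag(1/sd) never has to be formed: a product Z v equals X (v / sd) minus a constant shift, so only the nonzero entries are touched.

```python
import math

def column_stats(n_rows, indptr, data):
    """Per-column mean and standard deviation (1/n convention, an assumption)
    computed from compressed-column arrays, visiting only stored nonzeros."""
    means, sds = [], []
    for j in range(len(indptr) - 1):
        vals = data[indptr[j]:indptr[j + 1]]
        mean = sum(vals) / n_rows
        var = sum(val * val for val in vals) / n_rows - mean * mean
        means.append(mean)
        sds.append(math.sqrt(max(var, 0.0)))
    return means, sds

def standardized_matvec(n_rows, indptr, indices, data, means, sds, v):
    """Compute Z @ v for Z = (X - 1 mu^T) diag(1/sd) without forming Z.
    Constant columns (sd == 0) are given zero weight."""
    w = [vj / sd if sd > 0 else 0.0 for vj, sd in zip(v, sds)]
    out = [0.0] * n_rows
    for j, wj in enumerate(w):
        for k in range(indptr[j], indptr[j + 1]):
            out[indices[k]] += data[k] * wj      # sparse part: X @ w
    shift = sum(m * wj for m, wj in zip(means, w))  # centering part: mu . w
    return [o - shift for o in out]

# Toy 4 x 3 matrix with four nonzeros, stored column by column:
# [[1, 0, 0],
#  [0, 2, 0],
#  [3, 0, 0],
#  [0, 0, 4]]
n_rows = 4
indptr = [0, 2, 3, 4]
indices = [0, 2, 1, 3]
data = [1.0, 3.0, 2.0, 4.0]

means, sds = column_stats(n_rows, indptr, data)
v = [1.0, -2.0, 0.5]
result = standardized_matvec(n_rows, indptr, indices, data, means, sds, v)
```

The same identity applies to transposed products and to the Gram matrix, which is presumably where most of the work in coef, fixedLassoInf and estimateSigma would go.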

Also, fixedLassoInf is very slow in the high-dimensional setting in both the sparse and non-sparse cases. If I add a pre-filtering step, will the p-values and FDR control still be accurate?
