Segfault when training elastic net / lasso with wide problems #1

@ogrisel

When the number of features is much larger than the number of samples, fitting segfaults. The following script reproduces the problem:

import numpy as np
from glmnet.elastic_net import Lasso

# problem dim
n_samples = 100
n_features = 100000
n_informative_features = 10

# normally distributed input signal
X = np.random.randn(n_samples, n_features)

# generate a ground truth model in which only the first 10 features are
# non-zero (the other features are uncorrelated with Y and should be ignored
# by the L1 regularizer)
true_coef = np.zeros(n_features)
true_coef[:n_informative_features] = np.random.randn(n_informative_features)

# generate the ground truth Y from the reference model and X + label noise
Y = np.dot(X, true_coef) + np.random.normal(scale=0.1, size=n_samples)

print(Lasso(alpha=1).fit(X, Y))
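
To narrow down the width at which the crash starts, one option is to run each fit in a separate interpreter so that a segfault kills only the child process. The sketch below is a rough idea rather than a tested diagnosis: `run_snippet`, `describe_exit`, `scan_widths` and `TEMPLATE` are hypothetical helper names, the template simplifies the script above (random `Y`, no ground-truth coefficients), and the list of widths is arbitrary. It assumes Python 3.7+ and a POSIX system, where a negative return code means the child was killed by a signal.

```python
import signal
import subprocess
import sys


def run_snippet(source, timeout=600):
    """Run Python source in a fresh interpreter and return its exit code."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        timeout=timeout,
    )
    return result.returncode


def describe_exit(returncode):
    """Translate an exit code into a short human-readable description."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal.
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode


# Hypothetical template: a simplified version of the script above,
# parameterised by the number of features.
TEMPLATE = """
import numpy as np
from glmnet.elastic_net import Lasso
X = np.random.randn(100, {n_features})
Y = np.random.randn(100)
Lasso(alpha=1).fit(X, Y)
"""


def scan_widths(widths=(100, 1000, 10000, 100000)):
    """Fit once per width, each in its own process, and report the outcome."""
    for n_features in widths:
        code = run_snippet(TEMPLATE.format(n_features=n_features))
        print(n_features, describe_exit(code))
```

Calling `scan_widths()` prints one line per width. A segfault should show up as `killed by SIGSEGV`, while a missing `glmnet` install would show up as `exited with status 1`.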
