
Conversation

@david-cortes-intel (Contributor) commented Oct 27, 2025

Description

Logistic regression calls the L-BFGS-B solver from SciPy with mostly default parameters, mimicking scikit-learn. Those defaults were chosen for a general-purpose solver aimed at non-convex, bound-constrained problems. For a comparatively easier problem like L2-regularized logistic regression they are not optimal: better approximations of the true Hessian, obtained by keeping more correction pairs in the underlying quasi-Newton approximation, are typically beneficial on high-dimensional datasets.
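The effect can be illustrated with SciPy directly: the L-BFGS-B solver exposes the number of stored correction pairs through the `maxcor` option (default 10). The sketch below is not the daal4py code; the synthetic data, the penalty strength `lam`, and the `maxcor=30` value are illustrative assumptions only.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

# Synthetic L2-regularized logistic regression problem (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
y = (X @ rng.standard_normal(50) > 0).astype(float)
lam = 1.0  # hypothetical L2 penalty strength

def loss_grad(w):
    """Negative log-likelihood plus L2 penalty, with its gradient."""
    z = X @ w
    loss = np.sum(np.logaddexp(0.0, z) - y * z) + 0.5 * lam * (w @ w)
    grad = X.T @ (expit(z) - y) + lam * w
    return loss, grad

w0 = np.zeros(X.shape[1])
# SciPy's default: 10 correction pairs for the quasi-Newton approximation.
res_default = minimize(loss_grad, w0, jac=True, method="L-BFGS-B",
                       options={"maxcor": 10})
# More correction pairs give a richer limited-memory Hessian model, which
# often reduces iteration counts on high-dimensional convex problems.
res_more = minimize(loss_grad, w0, jac=True, method="L-BFGS-B",
                    options={"maxcor": 30})
```

Since the problem is strictly convex, both runs reach essentially the same optimum; the difference shows up in the iteration count (`res.nit`), not in the final loss.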

Benchmarks on an Ice Lake machine (note: comparison is main branch against this, so relative numbers below 1 mean improvement in this PR):
[screenshot 1: benchmark results table]

[screenshot 2: benchmark results table]

Note: in many of these benchmarks the solver does not reach convergence, so the comparison is not always apples-to-apples. Cases where neither branch reached convergence are highlighted in yellow in the screenshots above. Non-converged runs emit a warning like:

/export/users/dcortes/repos/sklex-benchenv/daal4py/sklearn/linear_model/logistic_path.py:266: ConvergenceWarning: lbfgs failed to converge after 200 iteration(s) (status=1):
STOP: TOTAL NO. OF ITERATIONS REACHED LIMIT

Increase the number of iterations to improve the convergence (max_iter=200).
You might also want to scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(

Hence, even though some cases might look like performance degradations here, those runs are actually reaching better solutions within the same iteration budget, so the timing comparison is not exact.
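When one run hits the iteration cap, wall time alone is misleading; the final objective value has to be compared alongside it. A small SciPy sketch of the phenomenon, using a hypothetical least-squares stand-in problem with an iteration cap chosen purely for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative convex problem: linear least squares.
rng = np.random.default_rng(1)
A = rng.standard_normal((100, 40))
b = rng.standard_normal(100)

def f(w):
    """0.5 * ||A w - b||^2 and its gradient."""
    r = A @ w - b
    return 0.5 * (r @ r), A.T @ r

# A run capped at a few iterations stops early (status 1: iteration limit
# reached, the same condition as the ConvergenceWarning above)...
capped = minimize(f, np.zeros(40), jac=True, method="L-BFGS-B",
                  options={"maxiter": 5})
# ...while an uncapped run spends more iterations but reaches a better
# (lower) objective value.
full = minimize(f, np.zeros(40), jac=True, method="L-BFGS-B")
```

Comparing `capped.fun` against `full.fun`, and checking `status` and `nit`, shows whether a "faster" run simply stopped at a worse solution.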

Note 2: the docs CI job is failing due to a rate-limiting issue with medium.com.


Checklist:

Completeness and readability

  • I have commented my code, particularly in hard-to-understand areas.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • All CI jobs are green or I have provided justification why they aren't.

Performance

  • I have measured performance for affected algorithms using scikit-learn_bench and provided at least a summary table with measured data, if performance change is expected.

@david-cortes-intel david-cortes-intel added the enhancement New feature or request label Oct 27, 2025
codecov bot commented Oct 27, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

Flag Coverage Δ
azure 80.47% <ø> (ø)
github 82.09% <ø> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown.
see 1 file with indirect coverage changes


@david-cortes-intel david-cortes-intel marked this pull request as ready for review October 27, 2025 08:41
@david-cortes-intel david-cortes-intel changed the title [Experiment, do NOT merge] ENH: Increase correction pairs used for quasi Newton approximations ENH: Increase correction pairs used for quasi Newton approximations Oct 27, 2025
@Vika-F (Contributor) left a comment

Approving, as this is really an improvement.
But we need a follow-up on the benchmarking methodology: should we fix the tolerance so that both sides reach an equally accurate solution, e.g. when comparing against stock scikit-learn?

@david-cortes-intel (Contributor, Author) commented:
> Approving, as this is really an improvement. But we need a follow-up on the benchmarking methodology: should we fix the tolerance so that both sides reach an equally accurate solution, e.g. when comparing against stock scikit-learn?

Yes, I'm working on a fix.

@david-cortes-intel (Contributor, Author) commented:

PR for the benchmarks: IntelPython/scikit-learn_bench#190

@david-cortes-intel david-cortes-intel merged commit 6d108d1 into uxlfoundation:main Oct 27, 2025
34 of 36 checks passed
