Adapting step_size and sag updates for sample_weight #31837

Draft
wants to merge 1 commit into
base: main

@antoinebaker antoinebaker commented Jul 25, 2025

Reference Issues/PRs

Helper for PR #31675 to fix sample_weight support in SAG(A) solvers #31536.

What does this implement/fix? Explain your changes.

In test_sag.py several fixes are made to properly handle sample_weight.

  1. get_step_size: the formula for the Lipschitz smoothness constant is corrected to take sample_weight into account
  2. sag: the updates for the weights and intercept are corrected
  3. sag: the number of seen elements (which converges to n_samples) is replaced by the weighted sum of seen elements (which converges to sample_weight.sum())
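Item 3 can be illustrated with a small standalone sketch. This is not the code from test_sag.py: the function name `seen_weight_trajectory` and its arguments are hypothetical, and it only assumes that indices are drawn uniformly at random, as in a basic SAG loop. It tracks both the count of distinct samples seen so far and the weighted sum of those samples, which is the quantity the PR description says should be used instead.

```python
import numpy as np


def seen_weight_trajectory(sample_weight, n_iter, seed=0):
    """Track (n_seen, weight_seen) while drawing sample indices uniformly.

    Hypothetical helper for illustration only; it is not part of scikit-learn.
    """
    rng = np.random.default_rng(seed)
    sample_weight = np.asarray(sample_weight, dtype=float)
    n_samples = sample_weight.shape[0]
    seen = np.zeros(n_samples, dtype=bool)
    n_seen, weight_seen = 0, 0.0
    history = []
    for _ in range(n_iter):
        i = rng.integers(n_samples)
        if not seen[i]:
            # First visit of sample i: update both counters.
            seen[i] = True
            n_seen += 1
            weight_seen += sample_weight[i]
        history.append((n_seen, weight_seen))
    return history


sample_weight = np.array([1.0, 2.0, 3.0, 0.5])
history = seen_weight_trajectory(sample_weight, n_iter=1000)
final_n_seen, final_weight_seen = history[-1]
```

Once every sample has been drawn at least once, `final_n_seen` equals n_samples and `final_weight_seen` equals sample_weight.sum(), matching the two limits mentioned in item 3.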

A true_weights argument was added to sag for visualisation purposes (to plot the convergence towards the true minimum). This is useful for the notebook below but can be removed in the final PR. A tol argument was also added so that sag uses the same stopping criterion as _sag_fast.pyx.

cc @snath-xoc

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: d117189. Link to the linter CI: here

@antoinebaker

See https://gist.github.com/antoinebaker/0fc40c94952a2da371bedb6caca53c10 : now both sag and saga converge towards the true minimum, for the weighted and repeated datasets. The weighted version no longer has convergence issues.

We illustrate the convergence on the blob dataset from test_classifier_matching.

We also illustrate with the dataset from our common test check_sample_weight_equivalence, which should pass after a similar fix in _sag_fast.pyx. Note that the sag and saga solvers, while being stochastic, actually yield a deterministic output once they have converged (the true minimum), so it is enough to test them with the deterministic repeated/weighted equivalence test.
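To make the repeated/weighted equivalence concrete, here is a minimal sketch that does not use sag or check_sample_weight_equivalence at all. It assumes a ridge objective, sum_i s_i (x_i·w - y_i)^2 / 2 + alpha/2 ||w||^2, whose minimum has a closed form; the helper `ridge_minimum` and the random data are hypothetical. Because the true minimum is the same for integer weights and for the dataset with rows repeated accordingly, any solver that has converged to it should give matching results in both cases.

```python
import numpy as np


def ridge_minimum(X, y, sample_weight, alpha):
    """Closed-form minimizer of the weighted ridge objective (illustrative helper)."""
    Xs = X * sample_weight[:, None]  # rows of X scaled by their weights
    A = X.T @ Xs + alpha * np.eye(X.shape[1])
    return np.linalg.solve(A, Xs.T @ y)


rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)
sample_weight = rng.integers(0, 4, size=20)  # integer weights, zeros allowed

# Weighted fit on the original rows.
w_weighted = ridge_minimum(X, y, sample_weight.astype(float), alpha=1.0)

# Unweighted fit on the dataset where row i is repeated sample_weight[i] times.
X_rep = np.repeat(X, sample_weight, axis=0)
y_rep = np.repeat(y, sample_weight)
w_repeated = ridge_minimum(X_rep, y_rep, np.ones(len(y_rep)), alpha=1.0)

assert np.allclose(w_weighted, w_repeated)
```

Note that alpha is kept identical in both fits; how the penalty is scaled relative to the number of samples or the weight sum is solver-specific and is not addressed by this sketch.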
