Skip to content

Conversation

@TheRealJon
Copy link
Member

Problem

When users are load-balanced across multiple console pods, each pod creates a session cookie with a unique name based on POD_NAME: openshift-session-token-<POD_NAME>.

With a 1-month cookie expiration, users accumulate cookies from different pods without old ones being removed. This is especially problematic with Azure AD SSO where new authentication tokens are issued on each login, causing OpenShift to create new session cookies instead of reusing old ones.

Over time, the cookie header can exceed 4096 bytes, causing errors.

Solution

This fix cleans up session cookies from other pods when creating a new session, ensuring only one active session cookie exists at a time.

Changes

  • Modified AddSession() to expire old pod cookies before creating new session in pkg/auth/sessions/combined_sessions.go:46-73
  • Updated DeleteSession() to use modern cookie expiration pattern (removed outdated Go loop variable capture)
  • Added test TestCombinedSessionStore_AddSession_CleansUpOldPodCookies to verify old pod cookies are properly expired

Testing

  • All existing session tests pass ✓
  • New test verifies old pod cookies are expired on new session creation ✓

Impact

  • Backward compatible: Existing sessions continue to work
  • Low risk: Only affects cookie cleanup during new session creation
  • Performance: Minimal overhead - iterates through cookies once per login

Fixes: OCPBUGS-65967

🤖 Generated with Claude Code

When users are load-balanced across multiple console pods, each pod
creates a session cookie with a unique name based on POD_NAME:
openshift-session-token-<POD_NAME>. With a 1-month cookie expiration,
users accumulate cookies from different pods without old ones being
removed, eventually causing the cookie header to exceed 4096 bytes.

This fix cleans up session cookies from other pods when creating a
new session, ensuring only one active session cookie exists at a time.

Changes:
- Modified AddSession() to expire old pod cookies before creating new session
- Updated DeleteSession() to use modern cookie expiration pattern
- Added test to verify old pod cookies are properly expired

Fixes: OCPBUGS-65967

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@openshift-ci-robot openshift-ci-robot added jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Dec 12, 2025
@openshift-ci-robot
Copy link
Contributor

@TheRealJon: This pull request references Jira Issue OCPBUGS-65967, which is invalid:

  • expected the bug to target the "4.22.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Problem

When users are load-balanced across multiple console pods, each pod creates a session cookie with a unique name based on POD_NAME: openshift-session-token-<POD_NAME>.

With a 1-month cookie expiration, users accumulate cookies from different pods without old ones being removed. This is especially problematic with Azure AD SSO where new authentication tokens are issued on each login, causing OpenShift to create new session cookies instead of reusing old ones.

Over time, the cookie header can exceed 4096 bytes, causing errors.

Solution

This fix cleans up session cookies from other pods when creating a new session, ensuring only one active session cookie exists at a time.

Changes

  • Modified AddSession() to expire old pod cookies before creating new session in pkg/auth/sessions/combined_sessions.go:46-73
  • Updated DeleteSession() to use modern cookie expiration pattern (removed outdated Go loop variable capture)
  • Added test TestCombinedSessionStore_AddSession_CleansUpOldPodCookies to verify old pod cookies are properly expired

Testing

  • All existing session tests pass ✓
  • New test verifies old pod cookies are expired on new session creation ✓

Impact

  • Backward compatible: Existing sessions continue to work
  • Low risk: Only affects cookie cleanup during new session creation
  • Performance: Minimal overhead - iterates through cookies once per login

Fixes: OCPBUGS-65967

🤖 Generated with Claude Code

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Copy link

coderabbitai bot commented Dec 12, 2025

Walkthrough

The changes update session cookie handling: AddSession now expires session cookies from other pods (by cookie-name prefix) while keeping the current pod's cookie, and DeleteSession stops mutating cookie objects in-place and instead constructs and sets new http.Cookie instances with MaxAge=-1 to invalidate cookies. Test code uses type-inferred utilptr.To calls.

Changes

Cohort / File(s) Summary
Cookie cleanup & invalidation
pkg/auth/sessions/combined_sessions.go
In AddSession, iterate request cookies and expire any cookies whose names start with OpenshiftAccessTokenCookieName but are not the current pod's SessionCookieName() by setting MaxAge=-1. In DeleteSession, remove in-place cookie mutation and instead create new http.Cookie objects (same Name, empty Value, configured Path, MaxAge=-1) and call http.SetCookie to invalidate them. Minor control-flow cleanup (removed unused local assignment).
Test literal construction
pkg/auth/sessions/combined_sessions_test.go
Three struct literals in CombinedSessionStore_AddSession tests changed from utilptr.To[string]("value") to utilptr.To("value") to rely on type inference; no behavioral changes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Verify cookie-name prefix detection correctly excludes the current pod's session cookie.
  • Confirm new explicit http.Cookie construction with MaxAge=-1 behaves identically across browsers/clients compared to previous mutation approach.
  • Check tests compile and still validate expected behavior after utilptr type-inference changes.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 96646c9 and ec0c7f3.

📒 Files selected for processing (1)
  • pkg/auth/sessions/combined_sessions.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/auth/sessions/combined_sessions.go

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot requested review from Leo6Leo and jhadvig December 12, 2025 14:36
@openshift-ci openshift-ci bot added component/backend Related to backend approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Dec 12, 2025
@TheRealJon
Copy link
Member Author

/hold

Needs testing

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 12, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 335ed77 and 96646c9.

📒 Files selected for processing (2)
  • pkg/auth/sessions/combined_sessions.go (2 hunks)
  • pkg/auth/sessions/combined_sessions_test.go (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • pkg/auth/sessions/combined_sessions_test.go
  • pkg/auth/sessions/combined_sessions.go
🧬 Code graph analysis (2)
pkg/auth/sessions/combined_sessions_test.go (2)
pkg/auth/sessions/combined_sessions.go (2)
  • NewSessionStore (31-44)
  • SessionCookieName (26-29)
pkg/auth/sessions/server_session.go (1)
  • OpenshiftAccessTokenCookieName (17-17)
pkg/auth/sessions/combined_sessions.go (1)
pkg/auth/sessions/server_session.go (1)
  • OpenshiftAccessTokenCookieName (17-17)
🔇 Additional comments (2)
pkg/auth/sessions/combined_sessions_test.go (2)

55-55: LGTM: Type inference improves readability.

The changes from utilptr.To[string] to utilptr.To leverage Go's type inference, making the code cleaner while maintaining equivalent functionality.

Also applies to: 67-67, 96-96


699-745: Verify Path attribute in expired cookies.

The test correctly verifies that MaxAge is set to -1 for old pod cookies, but it doesn't check whether the Path attribute is set correctly. Cookies must have matching Name, Domain, and Path to be deleted by the browser. If Path is not set correctly in the expired cookies, the browser won't delete them, which would defeat the purpose of this cleanup logic.

Consider adding an assertion like:

 		if c.Name == OpenshiftAccessTokenCookieName+"-console-old-pod-1" ||
 			c.Name == OpenshiftAccessTokenCookieName+"-console-old-pod-2" {
 			require.Equal(t, -1, c.MaxAge, "Old pod cookie %s should be expired", c.Name)
+			require.Equal(t, "/", c.Path, "Old pod cookie %s should have correct Path for deletion", c.Name)
 			expiredCookies++
 		}

When deleting cookies via HTTP response headers, browsers need the
deletion cookie to match the Name, Path, and Domain of the original
cookie. The previous implementation only set MaxAge=-1 on the existing
cookie object without explicitly setting the Path, which could prevent
proper cookie deletion.

This change creates a new cookie with the minimal required attributes
(Name, Path, Value="", MaxAge=-1) using the path from the session store
options, ensuring the browser properly recognizes and deletes the cookie.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@Leo6Leo
Copy link
Contributor

Leo6Leo commented Dec 15, 2025

Assign for my own review tracking purposes:
/assign @Leo6Leo

@TheRealJon
Copy link
Member Author

/retest

1 similar comment
@TheRealJon
Copy link
Member Author

/retest

@TheRealJon
Copy link
Member Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Dec 17, 2025
@openshift-ci-robot
Copy link
Contributor

@TheRealJon: This pull request references Jira Issue OCPBUGS-65967, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @jadhaj

Details

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested a review from jadhaj December 17, 2025 15:50
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 17, 2025

@TheRealJon: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@Leo6Leo
Copy link
Contributor

Leo6Leo commented Dec 17, 2025

From the code's perspective, I think there is no problem and it will solve the problem. Will need the help from QE to verify it on the real cluster if it actually solves the problem.

/lgtm

QE verification:
/assign @yapei

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 17, 2025
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 17, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Leo6Leo, TheRealJon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@yapei
Copy link
Contributor

yapei commented Dec 25, 2025

@Leo6Leo @TheRealJon here are my steps to verify the changes, I created a testing cluster with cluster-bot (launch 4.21.0-0.nightly-2025-12-22-230304,openshift/console#15837 aws)

  1. initial status, given below console pods, we only have one cookie:
    openshift-session-token-console-68bf8595f-zwrc9
$ oc get pods -n openshift-console
NAME                         READY   STATUS    RESTARTS   AGE
console-68bf8595f-h5jkx      1/1     Running   0          7m58s
console-68bf8595f-zwrc9      1/1     Running   0          5m24s
downloads-7dbb84fd5d-8fsjq   1/1     Running   0          78m
downloads-7dbb84fd5d-qnl29   1/1     Running   0          78m
  1. trigger a new rollout via oc rollout restart deployment/console -n openshift-console, we can see two openshift session cookies:
    openshift-session-token-console-68bf8595f-h5jkx
    openshift-session-token-console-68bf8595f-zwrc9
    [In each rollout, only one new pods will be re-created]
$ oc get pods -n openshift-console
NAME                         READY   STATUS    RESTARTS   AGE
console-68bf8595f-dhtq6      1/1     Running   0          83s
console-68bf8595f-h5jkx      1/1     Running   0          11m
downloads-7dbb84fd5d-8fsjq   1/1     Running   0          81m
downloads-7dbb84fd5d-qnl29   1/1     Running   0          81m
  1. wait several minutes, trigger several new rollouts but openshift session token cookies are always two:
    openshift-session-token-console-68bf8595f-h5jkx
    openshift-session-token-console-68bf8595f-zwrc9
$ oc get pods -n openshift-console
NAME                         READY   STATUS    RESTARTS   AGE
console-68bf8595f-5dhbx      1/1     Running   0          43s
console-68bf8595f-h5jkx      1/1     Running   0          13m
downloads-7dbb84fd5d-8fsjq   1/1     Running   0          83m
downloads-7dbb84fd5d-qnl29   1/1     Running   0          83m

$ oc get pods -n openshift-console
NAME                         READY   STATUS    RESTARTS   AGE
console-68bf8595f-h5jkx      1/1     Running   0          15m
console-68bf8595f-zhgfc      1/1     Running   0          28s
downloads-7dbb84fd5d-8fsjq   1/1     Running   0          85m
downloads-7dbb84fd5d-qnl29   1/1     Running   0          85m
  1. If I manually delete the oldest console pod console-68bf8595f-h5jkx, then we can see a new cookies is generated, there will be three cookies
    openshift-session-token-console-68bf8595f-h5jkx
    openshift-session-token-console-68bf8595f-zhgfc
    openshift-session-token-console-68bf8595f-zwrc9
$ oc get pods -n openshift-console
NAME                         READY   STATUS    RESTARTS   AGE
console-68bf8595f-7lwfj      1/1     Running   0          29s
console-68bf8595f-zhgfc      1/1     Running   0          4m49s
downloads-7dbb84fd5d-8fsjq   1/1     Running   0          90m
downloads-7dbb84fd5d-qnl29   1/1     Running   0          90m

So it appears to me that the old session cookies are still not cleared as we expect?
I did the exactly same steps on a 421 cluster and it's behaving the same as on the cluster with bug fix
Could you please help confirm?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. component/backend Related to backend do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants