Experimental/two stage #296
base: main
`CandidateRankingModel`
This changes the `get_train_with_targets_for_reranker` method so that sampled and unsampled candidates from the first-stage candidate generators are retrieved separately for the reranker.
…eSystems/RecTools into experimental/two_stage
Pull request overview
This PR introduces a two-stage recommendation pipeline through the CandidateRankingModel class, which combines first-stage candidate generation with second-stage reranking using gradient boosting models.
Key Changes:
- Implements a flexible two-stage ranking architecture with support for multiple candidate generators and various reranking models
- Adds specialized support for CatBoost models through the `CatBoostReranker` class
- Introduces helper classes for feature collection, negative sampling, and candidate generation
- Includes comprehensive tests and a detailed tutorial notebook
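
To make the two-stage flow in the overview concrete, here is a minimal sketch in plain Python. It does not use the RecTools API: `popular_candidates`, `rerank`, and the `scores` table are hypothetical names, and a fixed score lookup stands in for the gradient boosting reranker that the PR actually uses.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Interaction = Tuple[str, str]  # (user_id, item_id)


def popular_candidates(interactions: Sequence[Interaction], user: str, k: int) -> List[str]:
    """First stage (hypothetical): top-k popular items the user has not interacted with."""
    seen = {item for u, item in interactions if u == user}
    counts = Counter(item for _, item in interactions)
    # Sort by descending popularity, breaking ties alphabetically for determinism.
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    return [item for item in ranked if item not in seen][:k]


def rerank(candidates: Sequence[str], score_fn: Callable[[str], float], k: int) -> List[str]:
    """Second stage (hypothetical): reorder candidates by a reranker score."""
    return sorted(candidates, key=lambda item: (-score_fn(item), item))[:k]


interactions = [
    ("u1", "a"), ("u2", "a"), ("u2", "b"), ("u3", "b"),
    ("u3", "c"), ("u2", "c"), ("u3", "d"),
]
# Stand-in for a trained reranker: a fixed score per item.
scores = {"b": 0.1, "c": 0.9, "d": 0.5}

candidates = popular_candidates(interactions, "u1", k=3)
recommendations = rerank(candidates, lambda item: scores.get(item, 0.0), k=2)
print(candidates)       # ['b', 'c', 'd']
print(recommendations)  # ['c', 'd']
```

The first stage is cheap and recall-oriented, so it can run over the full catalogue; the second stage only scores the short candidate list, which is what makes a heavier model affordable there.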
Reviewed changes
Copilot reviewed 13 out of 15 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| `rectools/models/ranking/candidate_ranking.py` | Core implementation of the two-stage ranking model with candidate generation, feature collection, and reranking logic |
| `rectools/models/ranking/catboost_reranker.py` | Specialized reranker for CatBoost classifiers and rankers with pool preparation |
| `rectools/models/ranking/__init__.py` | Module exports with fallback imports for optional dependencies |
| `rectools/exceptions.py` | New `NotFittedForStageError` exception for stage-specific fitting requirements |
| `rectools/columns.py` | Added `Target` column constant for train/test target values |
| `rectools/compat.py` | Compatibility class for CatBoost when dependency is unavailable |
| `tests/models/ranking/test_candidate_ranking.py` | Comprehensive tests for all ranking components |
| `tests/models/ranking/test_catboost_reranker.py` | Tests for CatBoost-specific functionality |
| `tests/models/test_serialization.py` | Model serialization tests including `CandidateRankingModel` |
| `tests/test_compat.py` | Compatibility layer tests for `CatBoostReranker` |
| `pyproject.toml` | Added catboost dependency and updated black version |
| `README.md` | Documentation of new catboost extension |
| `examples/tutorials/candidate_ranking_model_tutorial.ipynb` | Detailed tutorial with multiple reranker examples |
| `.github/workflows/test.yml` | Removed trailing whitespace |
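
The "fallback imports" and "compatibility class" rows describe a common pattern for optional dependencies. The sketch below shows one generic way to do it and is not taken from the PR: `optional_class` and the error message are hypothetical, and the real `rectools/compat.py` may be structured differently.

```python
import importlib
from typing import Any


def optional_class(module_name: str, class_name: str, extra: str) -> Any:
    """Return the real class if its module imports, else a placeholder that fails on use.

    Hypothetical helper: the names and message are illustrative only.
    """
    try:
        module = importlib.import_module(module_name)
        return getattr(module, class_name)
    except ImportError:

        class _Missing:
            """Placeholder that raises as soon as someone tries to instantiate it."""

            def __init__(self, *args: Any, **kwargs: Any) -> None:
                raise ImportError(
                    f"Requirement for `{class_name}` is not satisfied. "
                    f"Install the `{extra}` extra to use it."
                )

        _Missing.__name__ = class_name
        return _Missing
```

With this shape, importing the package never fails when the optional dependency is absent; the error surfaces only when the placeholder class is instantiated, and it tells the user which extra to install.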
    Returns
    -------
    pd.DataFrame, pd.DataFrame, dict(str -> any)
Copilot AI, Dec 7, 2025
The docstring at line 581 states the return type as "pd.DataFrame, pd.DataFrame, dict(str -> any)" but the actual method returns a tuple of (Dataset, pd.DataFrame, dict). The first element is a Dataset object, not a pd.DataFrame.
Correct the docstring to reflect the actual return type: "Dataset, pd.DataFrame, dict(str -> any)"
Suggested change:

    - pd.DataFrame, pd.DataFrame, dict(str -> any)
    + Dataset, pd.DataFrame, dict(str -> any)
    if context is not None:
        context = None
        warnings.warn(
            "You are providing context to a model that does not require it. Context is set to 'None'",
            UserWarning,
        )
Copilot AI, Dec 7, 2025
[nitpick] The context parameter is being reassigned to None (line 832) and then a warning is issued (lines 833-836). This pattern is confusing because the reassignment has no effect - the variable is local to the function and not used afterwards.
Consider either removing the reassignment (line 832) or clarifying in the comment/warning that the context is being ignored.
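
One way to act on this comment is sketched below: drop the dead reassignment and let the warning state that the context is ignored. The function name `warn_if_context_unused` and the reworded message are hypothetical and are not part of the PR.

```python
import warnings
from typing import Any, Optional


def warn_if_context_unused(context: Optional[Any]) -> None:
    """Warn when context is passed to a model that does not use it (hypothetical helper)."""
    if context is not None:
        # No reassignment: the local variable is not read afterwards,
        # so the warning alone documents that the value is ignored.
        warnings.warn(
            "You are providing context to a model that does not require it. "
            "The context will be ignored.",
            UserWarning,
        )
```

Calling `warn_if_context_unused(None)` is silent, while any other value emits a single `UserWarning`.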