Experimental/two stage #296

feldlime · 2025-08-30T22:09:59Z

Description

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Optimization

How Has This Been Tested?

Before submitting a PR, please check yourself against the following list. It would save us quite a lot of time.

Have you read the contribution guide?
Have you updated the relevant docstrings? We're using Numpy format, please double-check yourself
Does your change require any new tests?
Have you updated the changelog file?

`CandidateRankingModel`

This change updates the `get_train_with_targets_for_reranker` method so that sampled and unsampled candidates from the first-stage candidate generators are retrieved separately for the reranker.
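As a rough illustration of the distinction between unsampled and sampled candidates, the sketch below labels first-stage candidates with targets and then downsamples negatives per user. It is only one reading of the description: `add_targets`, `sample_negatives`, and the toy data are hypothetical names made up for this example, not the RecTools implementation.

```python
import random
from typing import Dict, List, Set, Tuple

Row = Tuple[str, str, int]  # (user_id, item_id, target)


def add_targets(candidates: Dict[str, List[str]], positives: Dict[str, Set[str]]) -> List[Row]:
    """Label each candidate: 1 if the user interacted with it in the holdout, else 0."""
    return [
        (user, item, int(item in positives.get(user, set())))
        for user, items in candidates.items()
        for item in items
    ]


def sample_negatives(rows: List[Row], n_negatives: int, seed: int = 0) -> List[Row]:
    """Keep every positive and at most `n_negatives` random negatives per user."""
    rng = random.Random(seed)
    by_user: Dict[str, List[Row]] = {}
    for row in rows:
        by_user.setdefault(row[0], []).append(row)
    sampled: List[Row] = []
    for user_rows in by_user.values():
        pos = [r for r in user_rows if r[2] == 1]
        neg = [r for r in user_rows if r[2] == 0]
        sampled.extend(pos + rng.sample(neg, min(n_negatives, len(neg))))
    return sampled


# Toy first-stage output and holdout interactions (assumed data).
candidates = {"u1": ["i1", "i2", "i3", "i4"], "u2": ["i2", "i5"]}
positives = {"u1": {"i2"}, "u2": {"i9"}}

unsampled = add_targets(candidates, positives)        # full candidate set with targets
sampled = sample_negatives(unsampled, n_negatives=1)  # reduced set for reranker training
```

Keeping both lists available lets the caller train on the sampled rows while still inspecting or evaluating the full candidate set.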


Copilot

Pull request overview

This PR introduces a two-stage recommendation pipeline through the CandidateRankingModel class, which combines first-stage candidate generation with second-stage reranking using gradient boosting models.

Key Changes:

Implements a flexible two-stage ranking architecture with support for multiple candidate generators and various reranking models
Adds specialized support for CatBoost models through the CatBoostReranker class
Introduces helper classes for feature collection, negative sampling, and candidate generation
Includes comprehensive tests and a detailed tutorial notebook
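The two-stage idea summarized above can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not the `CandidateRankingModel` API: `popular_candidates`, `rerank`, the feature dictionary, and the linear scoring function are all hypothetical, and the scoring function merely stands in for a trained gradient boosting model.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Interactions = List[Tuple[str, str]]  # (user_id, item_id)


def popular_candidates(interactions: Interactions, user: str, k: int) -> List[str]:
    """First stage: the k most popular items that the user has not seen yet."""
    seen = {item for u, item in interactions if u == user}
    counts = Counter(item for _, item in interactions)
    return [item for item, _ in counts.most_common() if item not in seen][:k]


def rerank(
    candidates: List[str],
    features: Dict[str, Dict[str, float]],
    score: Callable[[Dict[str, float]], float],
    k: int,
) -> List[str]:
    """Second stage: order candidates by a score computed from their features."""
    ranked = sorted(candidates, key=lambda item: score(features[item]), reverse=True)
    return ranked[:k]


interactions = [
    ("u1", "a"), ("u2", "a"), ("u2", "b"), ("u3", "b"),
    ("u4", "b"), ("u2", "c"), ("u3", "c"), ("u4", "d"),
]
features = {
    "b": {"popularity": 3.0, "freshness": 0.1},
    "c": {"popularity": 2.0, "freshness": 0.9},
    "d": {"popularity": 1.0, "freshness": 0.5},
}

stage_one = popular_candidates(interactions, user="u1", k=3)
# A hand-written linear score; a real reranker would predict this value.
stage_two = rerank(stage_one, features, lambda f: 0.1 * f["popularity"] + f["freshness"], k=2)
```

The first stage only has to be cheap and reasonably high-recall; the second stage can afford richer features because it scores a short candidate list.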

Reviewed changes

Copilot reviewed 13 out of 15 changed files in this pull request and generated 9 comments.

Show a summary per file

| File | Description |
| --- | --- |
| `rectools/models/ranking/candidate_ranking.py` | Core implementation of the two-stage ranking model with candidate generation, feature collection, and reranking logic |
| `rectools/models/ranking/catboost_reranker.py` | Specialized reranker for CatBoost classifiers and rankers with pool preparation |
| `rectools/models/ranking/__init__.py` | Module exports with fallback imports for optional dependencies |
| `rectools/exceptions.py` | New `NotFittedForStageError` exception for stage-specific fitting requirements |
| `rectools/columns.py` | Added `Target` column constant for train/test target values |
| `rectools/compat.py` | Compatibility class for CatBoost when dependency is unavailable |
| `tests/models/ranking/test_candidate_ranking.py` | Comprehensive tests for all ranking components |
| `tests/models/ranking/test_catboost_reranker.py` | Tests for CatBoost-specific functionality |
| `tests/models/test_serialization.py` | Model serialization tests including CandidateRankingModel |
| `tests/test_compat.py` | Compatibility layer tests for CatBoostReranker |
| `pyproject.toml` | Added catboost dependency and updated black version |
| `README.md` | Documentation of new catboost extension |
| `examples/tutorials/candidate_ranking_model_tutorial.ipynb` | Detailed tutorial with multiple reranker examples |
| `.github/workflows/test.yml` | Removed trailing whitespace |
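The "fallback imports for optional dependencies" and "compatibility class" entries describe a common pattern: if an optional package is missing, export a placeholder that fails with a clear message only when it is used. The sketch below shows one way this might look. `MissingDependencyStub`, `optional_import`, and the module name `not_installed_pkg` are hypothetical and chosen for illustration; the actual code in `rectools/compat.py` may differ.

```python
import importlib


class MissingDependencyStub:
    """Placeholder class that raises as soon as someone tries to instantiate it."""

    requirement = "unknown"

    def __new__(cls, *args, **kwargs):
        raise ImportError(
            f"Requirement `{cls.requirement}` is not satisfied. "
            "Install the matching extra to use this class."
        )


def optional_import(module_name: str, attr: str, requirement: str):
    """Return `module_name.attr` if importable, otherwise a stub named `attr`."""
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr)
    except ImportError:
        # Importing the package stays cheap; the error surfaces only on use.
        return type(attr, (MissingDependencyStub,), {"requirement": requirement})


# `not_installed_pkg` is deliberately a module that does not exist.
Reranker = optional_import("not_installed_pkg.rerankers", "Reranker", requirement="boosting")
# A module that does exist resolves to the real class.
OrderedDict = optional_import("collections", "OrderedDict", requirement="stdlib")
```

With this approach, `import` statements in user code keep working without the optional package, and the failure message names the missing requirement at the point of use.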


Copilot · 2025-12-07T23:11:28Z

rectools/models/ranking/candidate_ranking.py

+
+        Returns
+        -------
+        pd.DataFrame, pd.DataFrame, dict(str -> any)


The docstring at line 581 states the return type as "pd.DataFrame, pd.DataFrame, dict(str -> any)" but the actual method returns a tuple of (Dataset, pd.DataFrame, dict). The first element is a Dataset object, not a pd.DataFrame.

Correct the docstring to reflect the actual return type: "Dataset, pd.DataFrame, dict(str -> any)"

Suggested change

-        pd.DataFrame, pd.DataFrame, dict(str -> any)
+        Dataset, pd.DataFrame, dict(str -> any)
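To illustrate the kind of mismatch this comment points at, here is a small sketch of a Numpy-style `Returns` section that lists each element of a returned tuple, plus a check that the first documented type agrees with the annotation. `ToyDataset`, `split_for_reranker`, and `first_documented_return` are hypothetical names for this example only; they do not come from the PR.

```python
import inspect
from typing import Any, Dict, List, Tuple, get_args, get_type_hints


class ToyDataset:
    """Stand-in for a dataset object (not the RecTools `Dataset`)."""

    def __init__(self, rows: List[dict]) -> None:
        self.rows = rows


def split_for_reranker(rows: List[dict]) -> Tuple[ToyDataset, List[dict], Dict[str, Any]]:
    """
    Split rows into first-stage data and reranker training rows.

    Parameters
    ----------
    rows : list of dict
        Interaction records with a boolean ``holdout`` key.

    Returns
    -------
    ToyDataset
        Rows left for fitting first-stage models.
    list of dict
        Rows held out for training the reranker.
    dict(str -> any)
        Extra information about the split.
    """
    first_stage = [r for r in rows if not r["holdout"]]
    held_out = [r for r in rows if r["holdout"]]
    return ToyDataset(first_stage), held_out, {"n_holdout": len(held_out)}


def first_documented_return(func) -> str:
    """Read the first type line of the Numpy-style `Returns` section."""
    lines = inspect.getdoc(func).splitlines()
    start = lines.index("Returns")
    return lines[start + 2].strip()  # skip the "-------" underline


first_annotated = get_args(get_type_hints(split_for_reranker)["return"])[0]
docs_match = first_documented_return(split_for_reranker) == first_annotated.__name__
```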


Copilot · 2025-12-07T23:11:29Z

rectools/models/ranking/candidate_ranking.py

+        if context is not None:
+            context = None
+            warnings.warn(
+                "You are providing context to a model that does not require it. Context is set to 'None'",
+                UserWarning,
+            )


[nitpick] The context parameter is being reassigned to None (line 832) and then a warning is issued (lines 833-836). This pattern is confusing because the reassignment has no effect - the variable is local to the function and not used afterwards.

Consider either removing the reassignment (line 832) or clarifying in the comment/warning that the context is being ignored.
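One way to make the "context is ignored" behavior explicit is to have a helper return the effective context, so the caller never relies on a silent reassignment. This is a minimal sketch under that assumption; `resolve_context` and its `requires_context` flag are hypothetical and not part of the reviewed code.

```python
import warnings
from typing import Optional


def resolve_context(context: Optional[dict], requires_context: bool) -> Optional[dict]:
    """Return the context a model should actually use."""
    if requires_context:
        if context is None:
            raise ValueError("This model requires `context`, but none was provided.")
        return context
    if context is not None:
        # The message states the outcome; the return value below enforces it.
        warnings.warn(
            "Context was provided to a model that does not require it; it will be ignored.",
            UserWarning,
        )
    return None


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    effective = resolve_context({"session": 1}, requires_context=False)
```

Here the warning text and the returned `None` say the same thing, which avoids the ambiguity raised in the comment.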


blondered and others added 15 commits December 24, 2024 17:32

Feature/twostage pandas (#234)

55a3b91

`CandidateRankingModel`

Fix get_train_with_targets_for_reranker method (#244)

0bbc87e

We make changes to the `get_train_with_targets_for_reranker` method to separate the retrieval of sampled candidates and unsampled candidates from first-stage candidate generators for the reranker.

Merge branch 'main' into experimental/two_stage

c710fc0

fixed pyproject.toml

07dd084

fixed import

1ee1b16

removed unused function

d8a5716

bumped black version

19b6f02

removed duplicated method

3e33aec

Merge branch 'main' into experimental/two_stage

bc9f6f9

fixed comments

2c80fc4

improved error handling

47f25c1

Merge branch 'experimental/two_stage' of https://github.com/MobileTeleSystems/RecTools into experimental/two_stage

852c320

fixed errors and warnings

80256c5

added ipykernel dependancy

05604f0

adjusted tutorial

c8faada

feldlime requested a review from Copilot December 7, 2025 23:06

Copilot started reviewing on behalf of feldlime December 7, 2025 23:06

small improvements in the tutorial

aef3135


4 participants
