Kaapi v1.0: Enhancing the test suite #488

nishika26 · 2025-12-12T03:04:09Z

Summary

Target issue is #483 and #408

Checklist

Before submitting a pull request, please ensure that you mark these task.

Ran fastapi run --reload app/main.py or docker compose up in the repository root and test.
If you've fixed a bug or added code that is tested and has test cases.

Notes

Test Infrastructure:

Separated seed data into development and testing pathways - removed testing-only data (documents, assistants, credentials) from development seeds
Reorganized test fixtures with improved database operations and type safety

Test Coverage:

Added comprehensive tests for evaluation processing workflows
Introduced evaluation dataset creation helpers and enhanced test utilities
Added test endpoints for scheduled job operations

Code Quality:

Applied strict type hints across all test utilities (dict[str, Any], proper generics)
Standardized imports and removed unused dependencies
Improved logging consistency with [seed_data] prefixes

coderabbitai · 2025-12-12T03:04:14Z

Walkthrough

Refactors seeding by removing credentials, assistants, and documents from the main application seed and introducing a separate, comprehensive test-only seed module under backend/app/tests/seed_data/. Many tests receive import reordering, type-annotation additions, small formatting/cleanup edits, and new/expanded test coverage for evaluations and cron endpoints.

Changes

Cohort / File(s)	Summary
Main seed simplification `backend/app/seed_data/seed_data.json`, `backend/app/seed_data/seed_data.py`	Removed `credentials`, `assistants`, and `documents` blocks from JSON and removed creation logic in the seed script. Seed now only creates organizations, projects, users, and API keys. Removed `encrypt_credentials` import; adjusted logging prefixes and `clear_database()` to no longer delete Credential/Assistant.
Test seed system (new) `backend/app/tests/seed_data/seed_data.json`, `backend/app/tests/seed_data/seed_data.py`	Added full test seed JSON and a Python seeder with Pydantic models (Org/Project/User/APIKey/Credential/Assistant/Document), JSON loader, create_* helpers, `clear_database`, and `seed_database` orchestrator for test DB.
Test conftest updates `backend/app/tests/conftest.py`	Switched seed import to `app.tests.seed_data.seed_data`; converted `seed_baseline()` and `client()` fixtures to generator-based fixtures and added return-type annotations.
Credential tests rewrite `backend/app/tests/api/routes/test_creds.py`	Reworked tests from POST/PATCH flows to read-oriented provider-specific GET endpoints; added provider coverage (OPENAI/LANGFUSE), DB-level verification of encrypted fields, and new create/delete test cases.
New/expanded tests `backend/app/tests/api/routes/test_cron.py`, `backend/app/tests/api/routes/test_evaluation.py`, various `backend/app/tests/crud/` and `backend/app/tests/core/` files	Added cron endpoint tests, expanded evaluation and processing tests, and added comprehensive OpenAI batch provider tests; many tests added or extended.
Imports / formatting cleanup many test files, e.g. `backend/app/tests/api/routes/.py`, `backend/app/tests/crud//test_.py`, `backend/app/tests/services//*.py`	Standardized import order (e.g., `unittest.mock` before `pytest`), removed unused imports, adjusted blank lines; mostly non-functional formatting.
Type-annotation & test-utility enhancements `backend/app/tests/utils/document.py`, `backend/app/tests/utils/openai.py`, `backend/app/tests/utils/test_data.py`, `backend/app/tests/utils/collection.py`, `backend/app/tests/utils/utils.py`	Added explicit return/type hints, new helper `create_mock_batch()` and `create_test_evaluation_dataset()`, removed unused exports (APIKey, ConfigBase), and tightened signatures for several helpers.
Dev dependencies `backend/pyproject.toml`	Added `pytest-asyncio>=0.23.8` to dev dependencies for async test support.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Kaapi v1.0: Database Comments #476 — Modifies Credential/Assistant/Document and APIKey model/schema changes; strongly related to the removal and test-seed adjustments in this PR.

Suggested labels

enhancement, ready-for-review

Suggested reviewers

Prajna1999
kartpop

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly describes the main change: enhancing the test suite for Kaapi v1.0. It accurately summarizes the primary focus across multiple test files and test infrastructure modifications.
Docstring Coverage	✅ Passed	Docstring coverage is 87.50% which is sufficient. The required threshold is 80.00%.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch enhancement/test_suite

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2025-12-12T03:09:09Z

Codecov Report

❌ Patch coverage is 95.55766% with 47 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
backend/app/tests/seed_data/seed_data.py	79.82%	46 Missing ⚠️
backend/app/tests/utils/test_data.py	90.00%	1 Missing ⚠️

📢 Thoughts on this report? Let us know!

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

backend/app/tests/seed_data/seed_data.py (2)
86-86: Remove unnecessary mode argument.

The "r" mode is the default for open() and can be omitted.
🔎 Proposed fix
-        with open(json_path, "r") as f:
+        with open(json_path) as f:
24-80: Consider extracting shared Pydantic models to reduce duplication.

The data models (OrgData, ProjectData, UserData, APIKeyData) and several helper functions are nearly identical between this file and backend/app/seed_data/seed_data.py. Consider extracting common models to a shared module to improve maintainability.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 579c38b and 29d98f9.

📒 Files selected for processing (5)

backend/app/seed_data/seed_data.json (1 hunks)
backend/app/seed_data/seed_data.py (2 hunks)
backend/app/tests/conftest.py (1 hunks)
backend/app/tests/seed_data/seed_data.json (1 hunks)
backend/app/tests/seed_data/seed_data.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always add type hints to all function parameters and return values in Python code
Prefix all log messages with the function name in square brackets: logger.info(f"[function_name] Message {mask_string(sensitive_value)}")
Use Python 3.11+ with type hints throughout the codebase

Files:

backend/app/tests/conftest.py
backend/app/seed_data/seed_data.py
backend/app/tests/seed_data/seed_data.py

backend/app/tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use factory pattern for test fixtures in backend/app/tests/

Files:

backend/app/tests/conftest.py
backend/app/tests/seed_data/seed_data.py

🧠 Learnings (3)

📓 Common learnings

Learnt from: CR
Repo: ProjectTech4DevAI/kaapi-backend PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T15:39:30.458Z
Learning: Seed database with test data using `uv run python -m app.seed_data.seed_data`

📚 Learning: 2025-12-17T15:39:30.457Z

Learnt from: CR
Repo: ProjectTech4DevAI/kaapi-backend PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T15:39:30.457Z
Learning: Applies to backend/app/tests/**/*.py : Use factory pattern for test fixtures in `backend/app/tests/`

Applied to files:

backend/app/tests/conftest.py

📚 Learning: 2025-12-17T15:39:30.458Z

Learnt from: CR
Repo: ProjectTech4DevAI/kaapi-backend PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T15:39:30.458Z
Learning: Seed database with test data using `uv run python -m app.seed_data.seed_data`

Applied to files:

backend/app/tests/conftest.py
backend/app/seed_data/seed_data.py
backend/app/tests/seed_data/seed_data.py

🧬 Code graph analysis (3)

backend/app/tests/conftest.py (2)

backend/app/seed_data/seed_data.py (1)

seed_database (200-268)

backend/app/tests/seed_data/seed_data.py (1)

seed_database (376-471)

backend/app/seed_data/seed_data.py (1)

backend/app/core/security.py (1)

get_password_hash (100-110)

backend/app/tests/seed_data/seed_data.py (9)

backend/app/tests/conftest.py (1)

db (28-45)

backend/app/core/security.py (2)

get_password_hash (100-110)

encrypt_credentials (151-168)

backend/app/models/api_key.py (1)

APIKey (48-89)

backend/app/models/organization.py (1)

Organization (44-82)

backend/app/models/project.py (1)

Project (51-107)

backend/app/models/user.py (1)

User (65-75)

backend/app/models/credentials.py (1)

Credential (67-131)

backend/app/models/assistants.py (1)

Assistant (34-116)

backend/app/models/document.py (1)

Document (28-69)

🪛 Ruff (0.14.8)

backend/app/tests/seed_data/seed_data.py

86-86: Unnecessary mode argument

Remove mode argument

(UP015)

🔇 Additional comments (5)

backend/app/seed_data/seed_data.json (1)

1-56: LGTM!

The JSON structure is valid and the simplification to core entities (organization, projects, users, apikeys) for production seeding is appropriate.

backend/app/tests/conftest.py (1)

24-24: LGTM!

The import path correctly points to the new test-specific seed module, which includes credentials, assistants, and documents needed for comprehensive test coverage.

backend/app/tests/seed_data/seed_data.json (1)

1-129: LGTM!

The test seed data file is well-structured and comprehensive, providing all entity types needed for thorough test coverage. The use of clearly fake credentials (e.g., sk-proj-GlificI3i5SCxN, sk-lf-test-*) is appropriate for test isolation.

backend/app/seed_data/seed_data.py (1)

200-268: LGTM!

The simplified seed function correctly focuses on core entities (organizations, users, projects, API keys) for production seeding. Exception handling with rollback is properly implemented.

backend/app/tests/seed_data/seed_data.py (1)

376-471: LGTM!

The test seed function is comprehensive and correctly seeds all entity types needed for thorough test coverage. The exception handling with rollback is properly implemented.

backend/app/tests/seed_data/seed_data.py

coderabbitai

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

backend/app/tests/services/response/response/test_process_response.py (1)
56-62: Add return type hint to comply with coding guidelines.

The make_request helper function is missing a return type hint. As per coding guidelines, all function parameters and return values must have type hints.
🔎 Proposed fix
-def make_request(assistant_id: str, previous_response_id: str | None = None):
+def make_request(assistant_id: str, previous_response_id: str | None = None) -> ResponsesAPIRequest:
     return ResponsesAPIRequest(
         assistant_id=assistant_id,
         question="What is the capital of France?",
         callback_url="http://example.com/callback",
         response_id=previous_response_id,
     )
Based on coding guidelines, all functions must include return type hints.
backend/app/tests/crud/config/test_config.py (1)
19-20: Add return type hint to fixture function.

The example_config_blob fixture is missing a return type hint. Per the coding guidelines, all function parameters and return values in Python code should have type hints.
🔎 Proposed fix
 @pytest.fixture
-def example_config_blob():
+def example_config_blob() -> ConfigBlob:
     return ConfigBlob(
As per coding guidelines, type hints are required for all functions in **/*.py files.
backend/app/tests/crud/collections/collection/test_crud_collection_read_all.py (1)
13-27: Add return type hint to helper function.

Per coding guidelines, all function parameters and return values must have type hints. This function returns a project ID (integer).
🔎 Proposed fix
-def create_collections(db: Session, n: int):
+def create_collections(db: Session, n: int) -> int:
As per coding guidelines, Python functions require type hints throughout the codebase.

♻️ Duplicate comments (1)

backend/app/tests/seed_data/seed_data.py (1)

346-357: Document deletion is already included.

The previous review comment about missing Document deletion is outdated - line 350 already includes session.exec(delete(Document)).

🧹 Nitpick comments (15)

backend/app/tests/api/routes/test_threads.py (1)
26-34: Consider adding type hints to test function parameters.

Per the coding guidelines, all function parameters and return values should have type hints. While the import changes look good, the test functions throughout this file would benefit from type annotations for better IDE support and maintainability.

For example:
def test_threads_endpoint(
    mock_process_run: MagicMock,
    mock_send_callback: MagicMock,
    mock_get_provider_credential: MagicMock,
    mock_configure_openai: MagicMock,
    client: TestClient,
    db: Session,
    user_api_key_header: dict[str, str],
) -> None:
This applies to most test functions in the file.

As per coding guidelines, type hints should be added to all function parameters and return values.
backend/app/tests/core/test_callback_ssrf.py (1)
13-317: Consider adding type hints to test methods.

The coding guidelines require type hints for all function parameters and return values. Test methods should include return type hints (-> None) and parameter type hints for mocked objects (e.g., mock_getaddrinfo: MagicMock).

Example for Line 118:
def test_reject_localhost_by_name(self, mock_getaddrinfo: MagicMock) -> None:
As per coding guidelines, type hints are required for **/*.py files.
backend/app/tests/crud/evaluations/test_dataset.py (4)
388-420: Add return type hint and consider factory pattern.

The test logic is excellent and correctly validates successful deletion. However, per coding guidelines:

Add -> None return type hint to the test method

Consider using factory pattern for test fixtures instead of direct CRUD calls (though existing tests in this file don't follow this pattern either)
🔎 Proposed fix for return type hint
-    def test_delete_dataset_success(self, db: Session):
+    def test_delete_dataset_success(self, db: Session) -> None:
         """Test successfully deleting a dataset."""
As per coding guidelines, all functions should have type hints for parameters and return values.

422-437: Add return type hint.

Test logic correctly validates the not-found scenario.
🔎 Proposed fix
-    def test_delete_dataset_not_found(self, db: Session):
+    def test_delete_dataset_not_found(self, db: Session) -> None:
         """Test deleting a non-existent dataset."""
As per coding guidelines, all functions should have type hints for return values.

439-470: Add return type hint.

Test logic correctly validates organization-level access control for deletion.
🔎 Proposed fix
-    def test_delete_dataset_wrong_org(self, db: Session):
+    def test_delete_dataset_wrong_org(self, db: Session) -> None:
         """Test deleting a dataset with wrong organization."""
As per coding guidelines, all functions should have type hints for return values.

472-519: Add return type hint.

Excellent test that validates referential integrity - datasets cannot be deleted when referenced by evaluation runs. The test correctly creates a dependent EvaluationRun and verifies deletion is blocked.
🔎 Proposed fix
-    def test_delete_dataset_with_evaluation_runs(self, db: Session):
+    def test_delete_dataset_with_evaluation_runs(self, db: Session) -> None:
         """Test that dataset cannot be deleted if it has evaluation runs."""
As per coding guidelines, all functions should have type hints for return values.
backend/app/tests/services/doctransformer/test_job/test_execute_job.py (1)
150-160: Consider using a more specific exception type.

Catching the broad Exception class makes the test less precise. If you know the expected exception type(s), using them would make the test more robust and self-documenting.
🔎 Suggested improvement

If the expected exception is known (e.g., ClientError from boto3 for missing S3 objects):
-            with pytest.raises(Exception):
+            with pytest.raises((ClientError, RetryError)):
backend/app/tests/api/routes/test_assistants.py (1)
14-28: Add return type hints to fixtures.

As per coding guidelines, all function parameters and return values should have type hints.
🔎 Suggested fix
 @pytest.fixture
-def assistant_create_payload():
+def assistant_create_payload() -> dict[str, str | float | int | list[str]]:
     return {
         "name": "Test Assistant",
         "instructions": "This is a test instruction.",
         "model": "gpt-4o",
         "vector_store_ids": ["vs_test_1", "vs_test_2"],
         "temperature": 0.5,
         "max_num_results": 10,
     }


 @pytest.fixture
-def assistant_id():
+def assistant_id() -> str:
     return str(uuid4())
backend/app/tests/crud/test_assistants.py (1)
465-472: Remove unused project variable.

The project variable is assigned but never used. The test only uses get_non_existent_id(db, Project).
🔎 Suggested fix
     def test_get_assistants_by_project_empty(self, db: Session):
         """Returns empty list when project has no assistants"""
-        project = get_project(db)
         non_existent_project_id = get_non_existent_id(db, Project)

         result = get_assistants_by_project(db, non_existent_project_id)

         assert result == []
backend/pyproject.toml (1)

49-49: Approve pytest-asyncio addition. Consider consolidating pytest to dev-dependencies only.

pytest-asyncio provides asyncio support for pytest, appropriately supporting the async test infrastructure in this PR. However, pytest appears in both main dependencies (line 27) and dev-dependencies (line 43). Testing frameworks like pytest need to be installed for development but aren't needed for the application to function in production—it should be moved to dev-dependencies only for proper dependency organization.
backend/app/tests/api/routes/test_evaluation.py (3)
699-709: Consider adding explicit return type hint.

While pytest fixtures provide implicit typing, the coding guidelines require type hints on all function parameters and return values. Consider adding -> EvaluationDataset to the fixture:
 @pytest.fixture
 def create_test_dataset(self, db, user_api_key):
     """Create a test dataset for evaluation runs."""
-    return create_test_evaluation_dataset(
+    return create_test_evaluation_dataset(  # -> EvaluationDataset
Or more explicitly:
@pytest.fixture
def create_test_dataset(self, db, user_api_key) -> EvaluationDataset:
As per coding guidelines, all functions should have type hints for return values.

869-879: Consider adding explicit return type hint to fixture.

Similar to the earlier fixture, consider adding -> EvaluationDataset return type annotation for consistency with coding guidelines.

928-938: Consider adding explicit return type hint to fixture.

For consistency with coding guidelines, add -> EvaluationDataset return type annotation to this fixture as well.
backend/app/tests/api/routes/test_cron.py (1)
9-184: Consider adding type hints to test functions.

While pytest conventions typically omit type hints on test functions, the coding guidelines specify that all function parameters and return values should have type hints. Consider adding explicit annotations:
def test_evaluation_cron_job_success(
    client: TestClient,
    superuser_api_key: TestAuthContext,
) -> None:
    """Test successful cron job execution."""
    ...
Apply this pattern to all test functions in the file for consistency with coding guidelines.
backend/app/tests/services/doctransformer/test_zerox_transformer.py (1)
43-66: Add return type hints to pytest fixtures.

For consistency with coding guidelines, add explicit return type annotations to the fixtures:
+from pathlib import Path
+from unittest.mock import Mock
+
 @pytest.fixture
-def temp_input_file(self, tmp_path):
+def temp_input_file(self, tmp_path) -> Path:
     """Create a temporary input file."""
     ...
 
 @pytest.fixture
-def temp_output_file(self, tmp_path):
+def temp_output_file(self, tmp_path) -> Path:
     """Create a temporary output file path."""
     ...
 
 @pytest.fixture
-def mock_zerox_result(self):
+def mock_zerox_result(self) -> Mock:
     """Create a mock zerox result with pages."""
     ...
As per coding guidelines, all function return values should have type hints.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 29d98f9 and 9a3cfed.

⛔ Files ignored due to path filters (1)

backend/uv.lock is excluded by !**/*.lock

📒 Files selected for processing (56)

backend/app/tests/api/routes/documents/test_route_document_permanent_remove.py (1 hunks)
backend/app/tests/api/routes/documents/test_route_document_remove.py (1 hunks)
backend/app/tests/api/routes/test_assistants.py (1 hunks)
backend/app/tests/api/routes/test_creds.py (2 hunks)
backend/app/tests/api/routes/test_cron.py (1 hunks)
backend/app/tests/api/routes/test_evaluation.py (9 hunks)
backend/app/tests/api/routes/test_fine_tuning.py (1 hunks)
backend/app/tests/api/routes/test_llm.py (1 hunks)
backend/app/tests/api/routes/test_login.py (0 hunks)
backend/app/tests/api/routes/test_responses.py (1 hunks)
backend/app/tests/api/routes/test_threads.py (1 hunks)
backend/app/tests/api/routes/test_users.py (0 hunks)
backend/app/tests/api/test_deps.py (1 hunks)
backend/app/tests/core/batch/test_openai.py (1 hunks)
backend/app/tests/core/test_callback_ssrf.py (1 hunks)
backend/app/tests/core/test_providers.py (1 hunks)
backend/app/tests/core/test_security.py (1 hunks)
backend/app/tests/crud/collections/collection/test_crud_collection_delete.py (0 hunks)
backend/app/tests/crud/collections/collection/test_crud_collection_read_all.py (2 hunks)
backend/app/tests/crud/collections/collection/test_crud_collection_read_one.py (0 hunks)
backend/app/tests/crud/collections/collection_job/test_collection_jobs.py (1 hunks)
backend/app/tests/crud/config/test_config.py (1 hunks)
backend/app/tests/crud/config/test_version.py (1 hunks)
backend/app/tests/crud/documents/documents/test_crud_document_delete.py (0 hunks)
backend/app/tests/crud/documents/documents/test_crud_document_read_many.py (0 hunks)
backend/app/tests/crud/documents/documents/test_crud_document_read_one.py (0 hunks)
backend/app/tests/crud/documents/documents/test_crud_document_update.py (0 hunks)
backend/app/tests/crud/documents/test_doc_transformation_job.py (0 hunks)
backend/app/tests/crud/evaluations/test_batch.py (1 hunks)
backend/app/tests/crud/evaluations/test_core.py (1 hunks)
backend/app/tests/crud/evaluations/test_cron.py (1 hunks)
backend/app/tests/crud/evaluations/test_dataset.py (3 hunks)
backend/app/tests/crud/evaluations/test_embeddings.py (3 hunks)
backend/app/tests/crud/evaluations/test_langfuse.py (5 hunks)
backend/app/tests/crud/evaluations/test_processing.py (1 hunks)
backend/app/tests/crud/test_api_key.py (1 hunks)
backend/app/tests/crud/test_assistants.py (1 hunks)
backend/app/tests/crud/test_credentials.py (0 hunks)
backend/app/tests/crud/test_jobs.py (1 hunks)
backend/app/tests/crud/test_openai_conversation.py (1 hunks)
backend/app/tests/crud/test_thread_result.py (1 hunks)
backend/app/tests/seed_data/seed_data.py (1 hunks)
backend/app/tests/services/collections/test_create_collection.py (1 hunks)
backend/app/tests/services/doctransformer/test_job/test_execute_job.py (1 hunks)
backend/app/tests/services/doctransformer/test_job/test_execute_job_errors.py (0 hunks)
backend/app/tests/services/doctransformer/test_job/test_integration.py (0 hunks)
backend/app/tests/services/doctransformer/test_job/test_start_job.py (1 hunks)
backend/app/tests/services/doctransformer/test_zerox_transformer.py (1 hunks)
backend/app/tests/services/llm/test_jobs.py (0 hunks)
backend/app/tests/services/response/response/test_process_response.py (1 hunks)
backend/app/tests/services/response/test_jobs_response.py (1 hunks)
backend/app/tests/utils/document.py (1 hunks)
backend/app/tests/utils/openai.py (2 hunks)
backend/app/tests/utils/test_data.py (2 hunks)
backend/app/tests/utils/utils.py (0 hunks)
backend/pyproject.toml (1 hunks)

💤 Files with no reviewable changes (14)

backend/app/tests/crud/documents/test_doc_transformation_job.py
backend/app/tests/crud/collections/collection/test_crud_collection_delete.py
backend/app/tests/crud/documents/documents/test_crud_document_delete.py
backend/app/tests/services/doctransformer/test_job/test_execute_job_errors.py
backend/app/tests/services/llm/test_jobs.py
backend/app/tests/services/doctransformer/test_job/test_integration.py
backend/app/tests/crud/collections/collection/test_crud_collection_read_one.py
backend/app/tests/crud/test_credentials.py
backend/app/tests/crud/documents/documents/test_crud_document_update.py
backend/app/tests/api/routes/test_login.py
backend/app/tests/crud/documents/documents/test_crud_document_read_one.py
backend/app/tests/api/routes/test_users.py
backend/app/tests/utils/utils.py
backend/app/tests/crud/documents/documents/test_crud_document_read_many.py

✅ Files skipped from review due to trivial changes (14)

backend/app/tests/crud/test_jobs.py
backend/app/tests/crud/collections/collection_job/test_collection_jobs.py
backend/app/tests/services/collections/test_create_collection.py
backend/app/tests/api/routes/test_llm.py
backend/app/tests/crud/test_api_key.py
backend/app/tests/api/routes/documents/test_route_document_permanent_remove.py
backend/app/tests/crud/test_thread_result.py
backend/app/tests/crud/evaluations/test_embeddings.py
backend/app/tests/api/routes/documents/test_route_document_remove.py
backend/app/tests/api/routes/test_fine_tuning.py
backend/app/tests/crud/config/test_version.py
backend/app/tests/api/test_deps.py
backend/app/tests/services/doctransformer/test_job/test_start_job.py
backend/app/tests/crud/test_openai_conversation.py

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always add type hints to all function parameters and return values in Python code
Prefix all log messages with the function name in square brackets: logger.info(f"[function_name] Message {mask_string(sensitive_value)}")
Use Python 3.11+ with type hints throughout the codebase

Files:

backend/app/tests/api/routes/test_assistants.py
backend/app/tests/crud/evaluations/test_processing.py
backend/app/tests/crud/collections/collection/test_crud_collection_read_all.py
backend/app/tests/services/doctransformer/test_zerox_transformer.py
backend/app/tests/crud/evaluations/test_core.py
backend/app/tests/crud/config/test_config.py
backend/app/tests/core/batch/test_openai.py
backend/app/tests/services/doctransformer/test_job/test_execute_job.py
backend/app/tests/utils/test_data.py
backend/app/tests/utils/openai.py
backend/app/tests/crud/evaluations/test_langfuse.py
backend/app/tests/api/routes/test_cron.py
backend/app/tests/crud/evaluations/test_cron.py
backend/app/tests/core/test_callback_ssrf.py
backend/app/tests/api/routes/test_responses.py
backend/app/tests/api/routes/test_evaluation.py
backend/app/tests/crud/evaluations/test_batch.py
backend/app/tests/core/test_security.py
backend/app/tests/api/routes/test_threads.py
backend/app/tests/core/test_providers.py
backend/app/tests/crud/test_assistants.py
backend/app/tests/services/response/test_jobs_response.py
backend/app/tests/seed_data/seed_data.py
backend/app/tests/crud/evaluations/test_dataset.py
backend/app/tests/utils/document.py
backend/app/tests/services/response/response/test_process_response.py
backend/app/tests/api/routes/test_creds.py

backend/app/tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use factory pattern for test fixtures in backend/app/tests/

Files:

backend/app/tests/api/routes/test_assistants.py
backend/app/tests/crud/evaluations/test_processing.py
backend/app/tests/crud/collections/collection/test_crud_collection_read_all.py
backend/app/tests/services/doctransformer/test_zerox_transformer.py
backend/app/tests/crud/evaluations/test_core.py
backend/app/tests/crud/config/test_config.py
backend/app/tests/core/batch/test_openai.py
backend/app/tests/services/doctransformer/test_job/test_execute_job.py
backend/app/tests/utils/test_data.py
backend/app/tests/utils/openai.py
backend/app/tests/crud/evaluations/test_langfuse.py
backend/app/tests/api/routes/test_cron.py
backend/app/tests/crud/evaluations/test_cron.py
backend/app/tests/core/test_callback_ssrf.py
backend/app/tests/api/routes/test_responses.py
backend/app/tests/api/routes/test_evaluation.py
backend/app/tests/crud/evaluations/test_batch.py
backend/app/tests/core/test_security.py
backend/app/tests/api/routes/test_threads.py
backend/app/tests/core/test_providers.py
backend/app/tests/crud/test_assistants.py
backend/app/tests/services/response/test_jobs_response.py
backend/app/tests/seed_data/seed_data.py
backend/app/tests/crud/evaluations/test_dataset.py
backend/app/tests/utils/document.py
backend/app/tests/services/response/response/test_process_response.py
backend/app/tests/api/routes/test_creds.py

🧠 Learnings (3)

📓 Common learnings

Learnt from: CR
Repo: ProjectTech4DevAI/kaapi-backend PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T15:39:30.458Z
Learning: Seed database with test data using `uv run python -m app.seed_data.seed_data`

📚 Learning: 2025-12-17T15:39:30.457Z

Learnt from: CR
Repo: ProjectTech4DevAI/kaapi-backend PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T15:39:30.457Z
Learning: Applies to backend/app/tests/**/*.py : Use factory pattern for test fixtures in `backend/app/tests/`

Applied to files:

backend/app/tests/api/routes/test_assistants.py

📚 Learning: 2025-12-17T15:39:30.458Z

Learnt from: CR
Repo: ProjectTech4DevAI/kaapi-backend PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T15:39:30.458Z
Learning: Seed database with test data using `uv run python -m app.seed_data.seed_data`

Applied to files:

backend/app/tests/seed_data/seed_data.py

🧬 Code graph analysis (13)

backend/app/tests/crud/evaluations/test_processing.py (1)

backend/app/crud/evaluations/processing.py (5)

check_and_process_evaluation (443-648)

parse_evaluation_output (45-176)

process_completed_embedding_batch (319-440)

process_completed_evaluation (179-316)

poll_all_pending_evaluations (651-812)

backend/app/tests/crud/collections/collection/test_crud_collection_read_all.py (2)

backend/app/crud/collection/collection.py (1)

delete (103-111)

backend/app/tests/conftest.py (1)

db (28-45)

backend/app/tests/services/doctransformer/test_zerox_transformer.py (2)

backend/app/services/doctransform/zerox_transformer.py (1)

ZeroxTransformer (12-70)

backend/app/services/doctransform/transformer.py (1)

Transformer (5-14)

backend/app/tests/core/batch/test_openai.py (2)

backend/app/core/batch/openai.py (1)

OpenAIBatchProvider (14-254)

backend/app/tests/utils/openai.py (1)

create_mock_batch (141-177)

backend/app/tests/utils/test_data.py (2)

backend/app/models/evaluation.py (1)

EvaluationDataset (73-167)

backend/app/tests/utils/utils.py (1)

random_lower_string (17-18)

backend/app/tests/crud/evaluations/test_langfuse.py (1)

backend/app/core/langfuse/langfuse.py (1)

flush (110-111)

backend/app/tests/api/routes/test_cron.py (3)

backend/app/tests/utils/auth.py (1)

TestAuthContext (9-22)

backend/app/tests/conftest.py (2)

superuser_api_key (96-98)

user_api_key (102-104)

backend/app/tests/utils/document.py (1)

get (117-121)

backend/app/tests/crud/evaluations/test_cron.py (3)

backend/app/crud/evaluations/cron.py (2)

process_all_pending_evaluations (20-143)

process_all_pending_evaluations_sync (146-158)

backend/app/crud/evaluations/core.py (1)

create_evaluation_run (14-61)

backend/app/tests/utils/test_data.py (1)

create_test_evaluation_dataset (341-376)

backend/app/tests/api/routes/test_evaluation.py (3)

backend/app/tests/utils/test_data.py (1)

create_test_evaluation_dataset (341-376)

backend/app/models/evaluation.py (1)

EvaluationRun (170-319)

backend/app/tests/crud/evaluations/test_dataset.py (3)

TestDeleteDataset (385-519)

test_delete_dataset_success (388-420)

test_delete_dataset_not_found (422-437)

backend/app/tests/crud/evaluations/test_batch.py (1)

backend/app/crud/evaluations/batch.py (3)

build_evaluation_jsonl (62-115)

fetch_dataset_items (24-59)

start_evaluation_batch (118-212)

backend/app/tests/crud/evaluations/test_dataset.py (4)

backend/app/core/util.py (1)

now (11-12)

backend/app/models/evaluation.py (1)

EvaluationRun (170-319)

backend/app/api/routes/evaluation.py (1)

delete_dataset (390-418)

backend/app/crud/evaluations/dataset.py (2)

delete_dataset (323-387)

get_dataset_by_id (107-140)

backend/app/tests/utils/document.py (1)

backend/app/tests/utils/utils.py (1)

SequentialUuidGenerator (147-160)

backend/app/tests/api/routes/test_creds.py (4)

backend/app/core/providers.py (1)

Provider (9-14)

backend/app/models/credentials.py (1)

Credential (67-131)

backend/app/core/security.py (1)

decrypt_credentials (171-188)

backend/app/tests/utils/utils.py (1)

generate_random_string (21-22)

🪛 Ruff (0.14.8)

backend/app/tests/services/doctransformer/test_zerox_transformer.py

13-13: Unused function argument: args

(ARG001)

13-13: Unused function argument: kwargs

(ARG001)

122-122: Unused function argument: args

(ARG001)

122-122: Unused function argument: kwargs

(ARG001)

426-426: Unused function argument: args

(ARG001)

426-426: Unused function argument: kwargs

(ARG001)

backend/app/tests/crud/evaluations/test_cron.py

13-13: Redefinition of unused Organization from line 12

Remove definition: Organization

(F811)

13-13: Redefinition of unused Project from line 12

Remove definition: Project

(F811)

backend/app/tests/seed_data/seed_data.py

85-85: Unnecessary mode argument

Remove mode argument

(UP015)

325-325: Local variable users is assigned to but never used

Remove assignment to unused variable users

(F841)

backend/app/tests/api/routes/test_creds.py

170-170: Avoid equality comparisons to True; use Credential.is_active: for truth checks

Replace with Credential.is_active

(E712)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: checks (3.11.7, 6)

🔇 Additional comments (33)

backend/app/tests/api/routes/test_threads.py (1)

5-6: LGTM! Import consolidation improves code organization.

Moving the OpenAI imports to the top-level follows Python best practices and eliminates redundant inline imports.

backend/app/tests/services/response/response/test_process_response.py (6)

4-6: LGTM: Import additions support enhanced test structure.

The new imports enable direct OpenAI client instantiation and explicit Session typing, which align well with the test refactoring objectives.

24-53: LGTM: Well-structured test fixture.

The fixture properly sets up test data with appropriate type hints. Using a hardcoded API key on Line 39 is acceptable since all OpenAI calls are mocked in the tests.

65-101: LGTM: Comprehensive success case test.

The test properly mocks external dependencies (OpenAI, Langfuse) and verifies both the API response and job status updates. Type hints are correctly applied.

104-123: LGTM: Proper error handling test.

The test correctly verifies the behavior when an assistant is not found, checking both the API response and job status update.

126-158: LGTM: Proper failure case test.

The test correctly verifies error handling when generate_response fails, ensuring the error propagates to the API response and job status.

161-186: LGTM: Robust exception handling test.

The test verifies that unexpected exceptions are caught and handled gracefully, with appropriate error messages and job status updates.

backend/app/tests/core/test_callback_ssrf.py (1)

3-5: LGTM: Import reordering improves organization.

The import reordering is clean and maintains proper grouping (standard library, third-party, local imports).

backend/app/tests/crud/config/test_config.py (1)

33-494: Excellent test coverage!

The test suite provides comprehensive coverage of Config CRUD operations including:

Creation with various scenarios (duplicate names, different projects)

Reading (single, all, pagination, ordering)

Updates (various fields, duplicate name handling)

Soft deletion

Edge cases and error conditions

The tests are well-structured, properly documented, and follow pytest best practices.

backend/app/tests/crud/evaluations/test_dataset.py (2)

17-19: LGTM!

The new imports are appropriate for the deletion tests and properly organized.

385-519: Excellent test coverage for dataset deletion!

The TestDeleteDataset class provides comprehensive coverage of deletion scenarios:

✅ Successful deletion

✅ Non-existent dataset handling

✅ Organization-level access control

✅ Referential integrity with evaluation runs

The test assertions are robust with flexible message matching, and the logic correctly validates the expected behavior from the delete_dataset function.

backend/app/tests/crud/collections/collection/test_crud_collection_read_all.py (2)

4-4: LGTM: Import addition supports the new deletion pattern.

The addition of delete to the SQLModel imports is necessary and correct for the updated deletion approach used in the fixture and test method.

40-40: LGTM: Deletion pattern updated to use SQLModel expression.

The change from ORM-based deletion to SQLModel's delete() expression is an improvement and aligns with the broader pattern being adopted across the test suite. The absence of an explicit commit is correct here, as the db fixture from conftest.py manages transactions with automatic rollback.

backend/app/tests/services/doctransformer/test_job/test_execute_job.py (1)

1-18: LGTM on import cleanup.

The removal of the unused UUID import is appropriate since only uuid4 is used throughout the file.

backend/app/tests/api/routes/test_assistants.py (1)

1-8: LGTM on import reorganization.

The import ordering now follows the standard convention: stdlib (unittest.mock, uuid) before third-party packages (pytest, sqlmodel, fastapi).

backend/app/tests/crud/test_assistants.py (1)

1-6: LGTM on import reorganization.

The import ordering is consistent with other test files in this PR, placing stdlib imports before third-party packages.

backend/app/tests/api/routes/test_responses.py (1)

1-5: LGTM on import grouping.

The blank line correctly separates third-party imports (unittest.mock, fastapi) from local application imports (app.models), following PEP 8 conventions.

backend/app/tests/utils/document.py (1)

19-19: LGTM! Import path made more explicit.

The change from relative to absolute import improves clarity and consistency across the test suite.

backend/app/tests/core/test_security.py (1)

3-9: LGTM! Unused imports removed.

Cleaning up the unused get_password_hash and verify_password imports improves code maintainability.

backend/app/tests/core/test_providers.py (1)

2-6: LGTM! Unused import removed.

Removing the unused Provider import keeps the test module clean.

backend/app/tests/services/response/test_jobs_response.py (1)

2-8: LGTM! Unused import removed.

The Job model import was not needed since jobs are accessed via JobCrud.

backend/app/tests/utils/test_data.py (1)

341-376: LGTM! Well-structured test data factory.

The new create_test_evaluation_dataset helper follows the established factory pattern for test fixtures and includes proper type hints for all parameters and return type, as required by the coding guidelines.

backend/app/tests/api/routes/test_evaluation.py (1)

780-984: Excellent test coverage additions!

The new test cases significantly improve coverage by validating authentication requirements, error handling for non-existent resources, and edge cases. The tests follow consistent patterns and will help catch regressions.

backend/app/tests/api/routes/test_cron.py (1)

9-184: Excellent comprehensive test coverage!

The test suite thoroughly covers the cron evaluation endpoint including success scenarios, error handling, authentication, authorization, and OpenAPI schema validation. The tests are well-structured and will effectively catch regressions.

backend/app/tests/services/doctransformer/test_zerox_transformer.py (1)

19-438: Outstanding comprehensive test coverage!

This test suite provides exceptional coverage of the ZeroxTransformer including:

Initialization scenarios

Successful transformations with various content types

Error handling (timeouts, missing files, Poppler errors)

Edge cases (empty pages, None content, mixed content)

Content formatting and encoding (unicode, multiline)

Large document handling

Integration-style end-to-end flows

The tests are well-organized, use appropriate mocking strategies, and will effectively prevent regressions.
backend/app/tests/api/routes/test_creds.py (2)
2-2: LGTM!

The addition of Session and select imports is appropriate for the new encryption verification tests that require direct database access.

161-181: Add type hints to function parameters.

Per coding guidelines, all function parameters should have type hints. Additionally, the static analysis hint about using Credential.is_active instead of == True on line 170 is valid for more idiomatic code.
🔎 Proposed fix
 def test_credential_encryption(
-    db: Session,
-    user_api_key: TestAuthContext,
+    db: Session,
+    user_api_key: TestAuthContext,
 ):
     """Verify credentials are encrypted in database."""
     db_credential = db.exec(
         select(Credential).where(
             Credential.organization_id == user_api_key.organization_id,
             Credential.project_id == user_api_key.project_id,
-            Credential.is_active == True,
+            Credential.is_active,
             Credential.provider == Provider.OPENAI.value,
         )
     ).first()
As per coding guidelines, type hints are required for all function parameters.

Likely an incorrect or invalid review comment.
backend/app/tests/crud/evaluations/test_processing.py (1)

1-805: LGTM! Comprehensive test coverage.

The test suite is well-structured with:

Clear test organization into logical classes

Appropriate use of fixtures and mocking

Good coverage of success, error, and edge cases

Proper async test handling with @pytest.mark.asyncio

Clear, descriptive test names and docstrings

backend/app/tests/utils/openai.py (1)

141-177: LGTM! Well-designed test utility.

The create_mock_batch function is well-implemented with:

Clear type hints on all parameters and return type

Comprehensive docstring explaining usage

Sensible default values for common scenarios

Proper MagicMock configuration

backend/app/tests/crud/evaluations/test_core.py (1)

1-647: LGTM! Comprehensive CRUD test coverage.

The test suite provides excellent coverage of evaluation run CRUD operations:

Creation (minimal and complete configurations)

Listing (empty, multiple, pagination, org/project scoping)

Retrieval (success, not-found, cross-org scenarios)

Updates (individual and multiple fields)

Score operations (caching, fetching, force-refetch)

All tests are well-structured with appropriate fixtures, clear names, and good assertions.

backend/app/tests/core/batch/test_openai.py (1)

1-441: LGTM! Thorough batch provider testing.

The test suite provides excellent coverage of the OpenAI batch provider:

Initialization and basic operations

Batch creation (success, defaults, errors)

Status retrieval (completed, in_progress, failed, expired, cancelled)

Result downloading (success, errors, malformed JSON, empty files)

File operations (upload/download with various scenarios)

JSONL formatting validation

All tests use appropriate mocking and have clear assertions.

backend/app/tests/crud/evaluations/test_langfuse.py (1)

1-471: LGTM! Comprehensive Langfuse integration tests.

The test suite provides thorough coverage of Langfuse integration:

Dataset run creation (success, missing items, errors, empty results, cost tracking)

Trace updates with cosine scores (success, missing trace_id, error handling)

Dataset uploads (success, duplication, metadata, empty items, errors)

All tests properly mock external dependencies and verify expected behaviors.

backend/app/tests/crud/evaluations/test_batch.py (1)

18-488: Excellent test coverage and organization.

The test suite comprehensively covers:

Success scenarios with various configurations

Error handling (dataset not found, empty datasets, fetch failures)

Edge cases (missing metadata, empty questions, minimal config)

Integration with tools configuration

The tests appropriately use mocking to isolate units under test while maintaining realistic database interactions where needed.

backend/app/tests/crud/collections/collection/test_crud_collection_read_all.py

backend/app/tests/crud/evaluations/test_batch.py

backend/app/tests/crud/evaluations/test_cron.py

backend/app/tests/seed_data/seed_data.py

backend/app/tests/services/doctransformer/test_zerox_transformer.py

coderabbitai

Actionable comments posted: 8

♻️ Duplicate comments (4)

backend/app/tests/seed_data/seed_data.py (4)

319-327: Critical: IndexError risk remains unresolved.

Despite being marked as addressed in a previous review, accessing users[1] without validating list length will raise IndexError if fewer than 2 users exist. The check on line 326 is unreachable because the error occurs first.

🔎 Proposed fix

         users = session.exec(
             select(User)
             .join(APIKey, APIKey.user_id == User.id)
             .where(APIKey.organization_id == organization.id)
         ).all()

-        user = users[1]
-        if not user:
+        if len(users) < 2:
             raise ValueError(f"No user found in organization '{organization.name}'")
+        user = users[1]

357-418: Correct log prefixes.

🔎 Proposed fix

 def seed_database(session: Session) -> None:
     """
     Seed the database with initial test data.

     This function creates a complete test environment including:
     - Organizations (Project Tech4dev)
     - Projects (Glific, Dalgo)
     - Users (superuser, test user)
     - API Keys for both users
     - OpenAI Credentials for both projects (ensures all tests have credentials)
     - Langfuse Credentials for both projects (for tracing and observability tests)
     - Test Assistants for both projects
     - Sample Documents

     This seed data is used by the test suite and ensures that all tests
     can rely on both OpenAI and Langfuse credentials being available without manual setup.
     """
-    logging.info("[tests.seed_data] Starting database seeding")
+    logging.info("[seed_database] Starting database seeding")
     try:
         clear_database(session)

         seed_data = load_seed_data()

         for org_data in seed_data["organization"]:
             create_organization(session, org_data)

         for user_data in seed_data["users"]:
             if user_data["email"] == "{{SUPERUSER_EMAIL}}":
                 user_data["email"] = settings.FIRST_SUPERUSER
             elif user_data["email"] == "{{ADMIN_EMAIL}}":
                 user_data["email"] = settings.EMAIL_TEST_USER

         for user_data in seed_data["users"]:
             create_user(session, user_data)

         for project_data in seed_data["projects"]:
             create_project(session, project_data)

         for api_key_data in seed_data["apikeys"]:
             if api_key_data["user_email"] == "{{SUPERUSER_EMAIL}}":
                 api_key_data["user_email"] = settings.FIRST_SUPERUSER
             elif api_key_data["user_email"] == "{{ADMIN_EMAIL}}":
                 api_key_data["user_email"] = settings.EMAIL_TEST_USER

         for api_key_data in seed_data["apikeys"]:
             create_api_key(session, api_key_data)

         for credential_data in seed_data["credentials"]:
             create_credential(session, credential_data)

         for assistant_data in seed_data.get("assistants", []):
             create_assistant(session, assistant_data)

         for document_data in seed_data.get("documents", []):
             create_document(session, document_data)

         session.commit()
-        logging.info("[tests.seed_data] Database seeded successfully")
+        logging.info("[seed_database] Database seeded successfully")
     except Exception as e:
-        logging.error(f"[tests.seed_data] Error seeding database: {e}")
+        logging.error(f"[seed_database] Error seeding database: {e}")
         session.rollback()
         raise

As per coding guidelines, log messages must be prefixed with the function name.

99-109: Add complete type hints and correct log prefix.

The parameter type hint and log prefix need updates per coding guidelines.

🔎 Proposed fix

-def create_organization(session: Session, org_data_raw: dict) -> Organization:
+def create_organization(session: Session, org_data_raw: dict[str, Any]) -> Organization:
     """Create an organization from data."""
     try:
         org_data = OrgData.model_validate(org_data_raw)
         organization = Organization(name=org_data.name, is_active=org_data.is_active)
         session.add(organization)
         session.flush()  # Ensure ID is assigned
         return organization
     except Exception as e:
-        logging.error(f"[tests.seed_data]Error creating organization: {e}")
+        logging.error(f"[create_organization] Error creating organization: {e}")
         raise

As per coding guidelines.

81-96: Fix type hints, log prefixes, and file mode.

Multiple issues need attention:

Return type should be dict[str, Any] instead of dict
Log messages must use [load_seed_data] prefix (not [tests.seed_data])
The "r" mode argument is unnecessary (it's the default)

🔎 Proposed fix

+from typing import Any
+
-def load_seed_data() -> dict:
+def load_seed_data() -> dict[str, Any]:
     """Load seed data from JSON file."""
     json_path = Path(__file__).parent / "seed_data.json"
     try:
-        with open(json_path, "r") as f:
+        with open(json_path) as f:
             return json.load(f)
     except FileNotFoundError:
         logging.error(
-            f"[tests.seed_data]Error: Seed data file not found at {json_path}"
+            f"[load_seed_data] Seed data file not found at {json_path}"
         )
         raise
     except json.JSONDecodeError as e:
         logging.error(
-            f"[tests.seed_data]Error: Failed to decode JSON from {json_path}: {e}"
+            f"[load_seed_data] Failed to decode JSON from {json_path}: {e}"
         )
         raise

As per coding guidelines, all functions must have complete type hints and log messages must be prefixed with the function name.

🧹 Nitpick comments (8)

backend/app/tests/crud/collections/collection/test_crud_collection_read_all.py (3)
13-27: Add return type hint to create_collections function.

Per coding guidelines, all function parameters and return values require type hints. The function returns crud.project_id which is an int.
🔎 Proposed fix
-def create_collections(db: Session, n: int):
+def create_collections(db: Session, n: int) -> int:
     crud = None
     project = get_project(db)
33-40: Add return type hints to test methods.

Per coding guidelines, all functions require return type hints. Test methods should have -> None return type.
🔎 Proposed fix
-    def test_number_read_is_expected(self, db: Session):
+    def test_number_read_is_expected(self, db: Session) -> None:
         db.exec(delete(Collection))
42-45: Add return type hint.
🔎 Proposed fix
-    def test_deleted_docs_are_excluded(self, db: Session):
+    def test_deleted_docs_are_excluded(self, db: Session) -> None:
         owner = create_collections(db, self._ncollections)
backend/app/tests/crud/evaluations/test_cron.py (1)
43-52: Avoid mutating the db session's exec method directly.

Replacing db.exec with a MagicMock modifies the session object in a way that could leak to other tests if the session is reused. Consider using patch.object for safer, scoped mocking.
🔎 Proposed fix
     @pytest.mark.asyncio
     async def test_process_all_pending_evaluations_no_orgs(self, db: Session):
         """Test processing when there are no organizations."""
-        # This is unlikely in practice but tests the edge case
-        # We can't actually remove all orgs due to seed data, so we'll just check
-        # that the function handles it gracefully by mocking
-        with patch("app.crud.evaluations.cron.select") as mock_select:
-            mock_stmt = MagicMock()
-            mock_select.return_value = mock_stmt
-            db.exec = MagicMock(return_value=MagicMock(all=MagicMock(return_value=[])))
-
-            result = await process_all_pending_evaluations(session=db)
+        with patch.object(db, 'exec', return_value=MagicMock(all=MagicMock(return_value=[]))):
+            result = await process_all_pending_evaluations(session=db)
 
             assert result["status"] == "success"
             assert result["organizations_processed"] == 0
             assert result["total_processed"] == 0
backend/app/seed_data/seed_data.py (2)
49-60: Add return type hint to load_seed_data.

Per coding guidelines, all functions require return type hints.

🔎 Proposed fix

The function signature already has -> dict on line 49, so this is correct. No change needed.

56-60: Consider using function name in log prefixes for consistency.

The coding guidelines specify prefixing log messages with the function name in square brackets. Currently using [seed_data] (module name) instead of function-specific prefixes like [load_seed_data].
🔎 Example
-        logging.error(f"[seed_data] Seed data file not found at {json_path}")
+        logging.error(f"[load_seed_data] Seed data file not found at {json_path}")
As per coding guidelines, log messages should use [function_name] prefix.
backend/app/tests/seed_data/seed_data.py (2)
5-5: Use Python 3.11+ union syntax instead of Optional.

Per coding guidelines, Python 3.11+ should use the | operator for optional types instead of importing Optional.
🔎 Proposed fix
-from typing import Optional
Then update lines 50 and 59 to use str | None instead of Optional[str].
As per coding guidelines, use Python 3.11+ features throughout the codebase.

383-402: Consider consolidating email placeholder replacement.

The current two-pass approach (lines 383-387 for mutation, 389-390 for creation) works but mutates the seed data and loops twice. A single-pass approach would be more efficient.
🔎 Alternative approach
-        for user_data in seed_data["users"]:
-            if user_data["email"] == "{{SUPERUSER_EMAIL}}":
-                user_data["email"] = settings.FIRST_SUPERUSER
-            elif user_data["email"] == "{{ADMIN_EMAIL}}":
-                user_data["email"] = settings.EMAIL_TEST_USER
-
         for user_data in seed_data["users"]:
+            if user_data["email"] == "{{SUPERUSER_EMAIL}}":
+                user_data["email"] = settings.FIRST_SUPERUSER
+            elif user_data["email"] == "{{ADMIN_EMAIL}}":
+                user_data["email"] = settings.EMAIL_TEST_USER
             create_user(session, user_data)

         for project_data in seed_data["projects"]:
             create_project(session, project_data)

-        for api_key_data in seed_data["apikeys"]:
-            if api_key_data["user_email"] == "{{SUPERUSER_EMAIL}}":
-                api_key_data["user_email"] = settings.FIRST_SUPERUSER
-            elif api_key_data["user_email"] == "{{ADMIN_EMAIL}}":
-                api_key_data["user_email"] = settings.EMAIL_TEST_USER
-
         for api_key_data in seed_data["apikeys"]:
+            if api_key_data["user_email"] == "{{SUPERUSER_EMAIL}}":
+                api_key_data["user_email"] = settings.FIRST_SUPERUSER
+            elif api_key_data["user_email"] == "{{ADMIN_EMAIL}}":
+                api_key_data["user_email"] = settings.EMAIL_TEST_USER
             create_api_key(session, api_key_data)

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9a3cfed and 4586c04.

📒 Files selected for processing (7)

backend/app/seed_data/seed_data.py
backend/app/tests/conftest.py
backend/app/tests/crud/collections/collection/test_crud_collection_read_all.py
backend/app/tests/crud/evaluations/test_core.py
backend/app/tests/crud/evaluations/test_cron.py
backend/app/tests/crud/evaluations/test_langfuse.py
backend/app/tests/seed_data/seed_data.py

🚧 Files skipped from review as they are similar to previous changes (1)

backend/app/tests/conftest.py

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always add type hints to all function parameters and return values in Python code
Prefix all log messages with the function name in square brackets: logger.info(f"[function_name] Message {mask_string(sensitive_value)}")
Use Python 3.11+ with type hints throughout the codebase

Files:

backend/app/tests/crud/evaluations/test_core.py
backend/app/seed_data/seed_data.py
backend/app/tests/crud/evaluations/test_cron.py
backend/app/tests/crud/collections/collection/test_crud_collection_read_all.py
backend/app/tests/crud/evaluations/test_langfuse.py
backend/app/tests/seed_data/seed_data.py

backend/app/tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use factory pattern for test fixtures in backend/app/tests/

Files:

backend/app/tests/crud/evaluations/test_core.py
backend/app/tests/crud/evaluations/test_cron.py
backend/app/tests/crud/collections/collection/test_crud_collection_read_all.py
backend/app/tests/crud/evaluations/test_langfuse.py
backend/app/tests/seed_data/seed_data.py

🧠 Learnings (2)

📓 Common learnings

Learnt from: CR
Repo: ProjectTech4DevAI/kaapi-backend PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T15:39:30.469Z
Learning: Seed database with test data using `uv run python -m app.seed_data.seed_data`

📚 Learning: 2025-12-17T15:39:30.469Z

Learnt from: CR
Repo: ProjectTech4DevAI/kaapi-backend PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T15:39:30.469Z
Learning: Seed database with test data using `uv run python -m app.seed_data.seed_data`

Applied to files:

backend/app/seed_data/seed_data.py
backend/app/tests/seed_data/seed_data.py

🧬 Code graph analysis (3)

backend/app/tests/crud/evaluations/test_core.py (1)

backend/app/crud/evaluations/core.py (5)

create_evaluation_run (14-61)

get_evaluation_run_by_id (103-141)

get_or_fetch_score (199-256)

save_score (259-295)

update_evaluation_run (144-196)

backend/app/tests/crud/evaluations/test_cron.py (3)

backend/app/crud/evaluations/cron.py (2)

process_all_pending_evaluations (20-143)

process_all_pending_evaluations_sync (146-158)

backend/app/crud/evaluations/core.py (1)

create_evaluation_run (14-61)

backend/app/models/batch_job.py (1)

BatchJob (15-136)

backend/app/tests/seed_data/seed_data.py (3)

backend/app/core/security.py (2)

get_password_hash (100-110)

encrypt_credentials (151-168)

backend/app/models/api_key.py (1)

APIKey (48-89)

backend/app/models/credentials.py (1)

Credential (67-131)

🪛 Ruff (0.14.8)

backend/app/tests/crud/evaluations/test_core.py

27-27: Undefined name test_dataset

(F821)

28-28: Undefined name test_dataset

(F821)

30-30: Undefined name test_dataset

(F821)

31-31: Undefined name test_dataset

(F821)

36-36: Undefined name test_dataset

(F821)

37-37: Undefined name test_dataset

(F821)

40-40: Undefined name test_dataset

(F821)

41-41: Undefined name test_dataset

(F821)

backend/app/tests/seed_data/seed_data.py

85-85: Unnecessary mode argument

Remove mode argument

(UP015)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: checks (3.11.7, 6)

🔇 Additional comments (13)

backend/app/tests/crud/evaluations/test_langfuse.py (5)

311-321: LGTM!

The test correctly validates that items without trace_id are skipped. The assertion mock_langfuse.score.call_count == 2 confirms only valid items are processed.

378-385: LGTM!

The assertions correctly reflect the duplication logic: 3 items × 5 duplication factor = 15 total items.

409-411: LGTM!

The duplicate number distribution is correct: with 3 original items and duplication_factor=3, each duplicate_number (1, 2, 3) appears exactly 3 times.

446-448: LGTM!

Correctly asserts 3 total items when duplication_factor=1.

457-470: LGTM!

The test correctly validates error handling during item creation. The side_effect list simulates one success, one failure, and one success, resulting in total_items == 2.

backend/app/tests/crud/evaluations/test_core.py (3)

71-90: LGTM!

The test_dataset fixture is properly defined with all required parameters and uses the factory pattern appropriately.

576-611: LGTM!

The test appropriately documents the limitation with save_score creating its own session, and uses update_evaluation_run as a surrogate to test the core functionality.

613-627: save_score correctly handles non-existent evaluation runs without interfering with test sessions.

The implementation creates its own database session via Session(engine), ensuring complete isolation from the test's transactional session. When the evaluation run doesn't exist, get_evaluation_run_by_id returns None, which the function correctly returns as specified by its -> EvaluationRun | None return type. The test expectation of result is None is properly satisfied.

backend/app/tests/crud/evaluations/test_cron.py (3)

106-134: LGTM!

The test properly creates a second organization and project to validate multi-org processing. The fixture setup is correct and the assertion organizations_processed >= 2 accounts for seed data orgs.

275-328: LGTM!

The integration test properly sets up a complete scenario with a batch job and evaluation run, then verifies cron processing handles it correctly. The mock setup for OpenAI and Langfuse clients is appropriate.

330-348: LGTM!

Good edge case coverage - verifying that completed evaluations are not reprocessed by the cron job.

backend/app/seed_data/seed_data.py (2)

191-238: LGTM!

The seed flow is simplified by removing unnecessary list accumulation. The function properly handles transactions with commit on success and rollback on error.

241-244: LGTM!

The main guard is simplified appropriately. Based on learnings, this seed module is invoked via uv run python -m app.seed_data.seed_data.

backend/app/tests/crud/evaluations/test_core.py

backend/app/tests/seed_data/seed_data.py

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

backend/app/tests/utils/document.py (3)
25-26: Add missing return type annotation.

The httpx_to_standard function is missing a return type annotation.

As per coding guidelines: Always add type hints to all function parameters and return values in Python code.
🔎 Proposed fix
-def httpx_to_standard(response: Response):
+def httpx_to_standard(response: Response) -> APIResponse:
     return APIResponse(**response.json())
30-52: Add missing type annotations to DocumentMaker methods.

The DocumentMaker class methods are missing return type annotations.

As per coding guidelines: Always add type hints to all function parameters and return values in Python code.
🔎 Proposed fix
-    def __init__(self, project_id: int, session: Session):
+    def __init__(self, project_id: int, session: Session) -> None:
         self.project_id = project_id
         self.session = session
         self.project: Project = get_project_by_id(
             session=self.session, project_id=self.project_id
         )
         self.index = SequentialUuidGenerator()
 
-    def __iter__(self):
+    def __iter__(self) -> "DocumentMaker":
         return self
 
-    def __next__(self):
+    def __next__(self) -> Document:
         doc_id = next(self.index)
158-167: Add missing return type annotation.

The to_public_dict method is missing a return type annotation.

As per coding guidelines: Always add type hints to all function parameters and return values in Python code.
🔎 Proposed fix
-    def to_public_dict(self) -> dict:
+    def to_public_dict(self) -> dict[str, Any]:
         """Convert Document to dict matching DocumentPublic schema."""

🧹 Nitpick comments (1)

backend/app/tests/api/routes/test_evaluation.py (1)
868-879: Consider consolidating duplicate fixtures.

The create_test_dataset fixture is duplicated across TestGetEvaluationRunStatus (lines 698-709), TestGetDataset (lines 868-879), and TestDeleteDataset (lines 927-938). The only differences are the name and description values.

You could consolidate these into a single shared fixture (e.g., in conftest.py) or parameterize them, reducing duplication.
🔎 Example consolidation approach

Move to conftest.py with a more generic name:
@pytest.fixture
def test_evaluation_dataset(db, user_api_key):
    """Create a test evaluation dataset for evaluation-related tests."""
    return create_test_evaluation_dataset(
        db=db,
        organization_id=user_api_key.organization_id,
        project_id=user_api_key.project_id,
    )
Then use this shared fixture across all test classes.
Also applies to: 927-938

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4586c04 and dd2648d.

📒 Files selected for processing (8)

backend/app/tests/api/routes/test_evaluation.py
backend/app/tests/conftest.py
backend/app/tests/utils/auth.py
backend/app/tests/utils/collection.py
backend/app/tests/utils/document.py
backend/app/tests/utils/openai.py
backend/app/tests/utils/test_data.py
backend/app/tests/utils/utils.py

💤 Files with no reviewable changes (1)

backend/app/tests/utils/auth.py

🚧 Files skipped from review as they are similar to previous changes (2)

backend/app/tests/conftest.py
backend/app/tests/utils/openai.py

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always add type hints to all function parameters and return values in Python code
Prefix all log messages with the function name in square brackets: logger.info(f"[function_name] Message {mask_string(sensitive_value)}")
Use Python 3.11+ with type hints throughout the codebase

Files:

backend/app/tests/utils/collection.py
backend/app/tests/utils/test_data.py
backend/app/tests/utils/document.py
backend/app/tests/utils/utils.py
backend/app/tests/api/routes/test_evaluation.py

backend/app/tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use factory pattern for test fixtures in backend/app/tests/

Files:

backend/app/tests/utils/collection.py
backend/app/tests/utils/test_data.py
backend/app/tests/utils/document.py
backend/app/tests/utils/utils.py
backend/app/tests/api/routes/test_evaluation.py

🧠 Learnings (1)

📓 Common learnings

Learnt from: CR
Repo: ProjectTech4DevAI/kaapi-backend PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T15:39:30.469Z
Learning: Applies to backend/app/tests/**/*.py : Use factory pattern for test fixtures in `backend/app/tests/`

🧬 Code graph analysis (4)

backend/app/tests/utils/collection.py (1)

backend/app/models/project.py (1)

Project (51-107)

backend/app/tests/utils/test_data.py (2)

backend/app/models/evaluation.py (1)

EvaluationDataset (73-167)

backend/app/tests/utils/utils.py (1)

random_lower_string (17-18)

backend/app/tests/utils/document.py (7)

backend/app/tests/crud/documents/documents/test_crud_document_update.py (1)

documents (11-14)

backend/app/tests/api/routes/documents/test_route_document_upload.py (2)

put (25-46)

route (75-76)

backend/app/tests/api/routes/documents/test_route_document_permanent_remove.py (1)

route (28-29)

backend/app/tests/api/routes/documents/test_route_document_remove.py (1)

route (20-21)

backend/app/tests/api/routes/documents/test_route_document_info.py (1)

route (16-17)

backend/app/tests/api/routes/documents/test_route_document_list.py (1)

route (23-24)

backend/app/tests/crud/documents/documents/test_crud_document_delete.py (1)

document (13-22)

backend/app/tests/api/routes/test_evaluation.py (1)

backend/app/tests/utils/test_data.py (1)

create_test_evaluation_dataset (341-376)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: checks (3.12, 6)

🔇 Additional comments (12)

backend/app/tests/utils/collection.py (1)

11-11: LGTM! Type hints correctly added throughout.

The addition of the Project import and explicit type annotations enhances type safety and aligns with the coding guidelines. All function parameters and return values now have complete type hints.

Also applies to: 21-21, 28-28, 53-53, 77-77

backend/app/tests/api/routes/test_evaluation.py (4)

9-9: Good adoption of the factory pattern.

The import of create_test_evaluation_dataset from the test utilities aligns with the factory pattern for test fixtures. Based on learnings, this is the preferred approach for backend/app/tests/.

698-709: Fixture correctly uses the shared factory.

The fixture now delegates dataset creation to the centralized create_test_evaluation_dataset factory, which improves maintainability and consistency.

780-862: Comprehensive test coverage for evaluation run edge cases.

The new tests cover important scenarios:

404 for non-existent evaluation runs

Successful retrieval without trace info

Validation that resync_score requires get_trace_info

401 for unauthenticated access

940-962: Good test design with verification of deletion.

The delete test properly verifies that the dataset is actually removed by making a follow-up GET request and asserting 404. This is a thorough approach to testing delete operations.

backend/app/tests/utils/test_data.py (2)

21-21: Import addition for new model usage.

The EvaluationDataset import is correctly added to support the new factory function.

341-376: Well-structured factory function following established patterns.

The create_test_evaluation_dataset function:

Has complete type hints on all parameters and return value (per coding guidelines)

Follows the factory pattern (per learnings for backend/app/tests/)

Uses sensible defaults for optional parameters

Properly handles name generation with random_lower_string()

Follows the same db persistence pattern (add → commit → refresh) as other factories in this file
backend/app/tests/utils/utils.py (4)
21-22: LGTM!

Type annotations added correctly for both parameter and return type.

148-152: LGTM!

Type annotations added correctly for __init__ and __iter__ methods.

154-157: Add missing return type annotation.

The __next__ method is missing a return type annotation.

As per coding guidelines: Always add type hints to all function parameters and return values in Python code.
🔎 Proposed fix
-    def __next__(self) -> UUID:
+    def __next__(self) -> UUID:
Wait, actually looking at the code more carefully, it appears the return type -> UUID is already present on line 154. Let me reconsider.

Actually, I need to look at the annotated code more carefully. The lines shown are:
154    def __next__(self) -> UUID:
155        uu_id = UUID(int=self.start)
156        self.start += 1
157        return uu_id
This already has the return type. So this is fine. Let me skip this comment.
159-160: Add missing return type annotation.

The peek method is missing a return type annotation.

As per coding guidelines: Always add type hints to all function parameters and return values in Python code.
🔎 Proposed fix
-    def peek(self) -> UUID:
+    def peek(self) -> UUID:
Wait, I see -> UUID is already there on line 159. Let me reconsider the annotated code.

Looking at the code:
159    def peek(self) -> UUID:
160        return UUID(int=self.start)
This already has the return type annotation. So this is actually fine. Let me skip this comment.
backend/app/tests/utils/document.py (1)

3-4: LGTM!

Type annotations have been added correctly throughout the file, improving type safety and code clarity.

Also applies to: 56-82, 89-112, 120-130, 138-149, 151-156, 171-172

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

backend/app/tests/seed_data/seed_data.py (1)
321-329: Fix IndexError risk with direct list access.

Accessing users[1] without first checking the list length will raise an IndexError if fewer than 2 users exist. The if not user: check on line 328 is unreachable because the IndexError occurs first.
🔎 Proposed fix
         users = session.exec(
             select(User)
             .join(APIKey, APIKey.user_id == User.id)
             .where(APIKey.organization_id == organization.id)
         ).all()
 
-        user = users[1]
-        if not user:
+        if len(users) < 2:
             raise ValueError(f"No user found in organization '{organization.name}'")
+        user = users[1]
 
         document = Document(

🧹 Nitpick comments (13)

backend/app/tests/utils/document.py (3)
3-3: Consider importing Generator from collections.abc for Python 3.11+.

Per static analysis and PEP 585, Generator should be imported from collections.abc rather than typing in Python 3.9+. Since the codebase uses Python 3.11+, this aligns with modern Python best practices.
🔎 Proposed fix
-from typing import Any, Generator
+from typing import Any
+from collections.abc import Generator
157-166: Add return type hint to to_public_dict method.

The to_public_dict method is missing a return type annotation. Per coding guidelines, all function parameters and return values should have type hints.
🔎 Proposed fix
-    def to_public_dict(self) -> dict:
+    def to_public_dict(self) -> dict[str, Any]:
Note: You'll need to ensure Any is imported from typing at the top of the file (it already is on line 3).
As per coding guidelines.

29-51: Consider adding type hints to DocumentMaker class for consistency.

While the rest of the file received type hint enhancements, the DocumentMaker class methods (__init__, __iter__, and __next__) remain untyped. For consistency with the coding guidelines and the rest of this file, consider adding type annotations.
🔎 Proposed fix
-    def __init__(self, project_id: int, session: Session):
+    def __init__(self, project_id: int, session: Session) -> None:
         self.project_id = project_id
         self.session = session
         self.project: Project = get_project_by_id(
             session=self.session, project_id=self.project_id
         )
         self.index = SequentialUuidGenerator()
 
-    def __iter__(self):
+    def __iter__(self) -> "DocumentMaker":
         return self
 
-    def __next__(self):
+    def __next__(self) -> Document:
         doc_id = next(self.index)
As per coding guidelines.
backend/app/tests/conftest.py (1)
10-10: Consider using collections.abc.Generator for Python 3.9+ compatibility.

The import works correctly, but importing from collections.abc is preferred in modern Python versions.
🔎 Proposed fix
-from typing import Any, Generator
+from typing import Any
+from collections.abc import Generator
backend/app/seed_data/seed_data.py (1)
49-60: Remove unnecessary mode argument.

The "r" mode is the default for open() and can be omitted for cleaner code.
🔎 Proposed fix
-        with open(json_path, "r") as f:
+        with open(json_path) as f:
             return json.load(f)
backend/app/tests/seed_data/seed_data.py (8)
85-85: Remove unnecessary mode argument.

The "r" mode is the default for open() and can be omitted.
🔎 Proposed fix
-        with open(json_path, "r") as f:
+        with open(json_path) as f:
             return json.load(f)
88-95: Add space after log prefix for consistency.

The log messages are missing a space after the closing bracket of the prefix, making them harder to read.
🔎 Proposed fix
         logging.error(
-            f"[tests.seed_data]Error: Seed data file not found at {json_path}"
+            f"[tests.seed_data] Seed data file not found at {json_path}"
         )
         raise
     except json.JSONDecodeError as e:
         logging.error(
-            f"[tests.seed_data]Error: Failed to decode JSON from {json_path}: {e}"
+            f"[tests.seed_data] Failed to decode JSON from {json_path}: {e}"
         )
Based on learnings, the file-based prefix [tests.seed_data] is correct for this test seed data module.

108-108: Add space after log prefix.
-        logging.error(f"[tests.seed_data]Error creating organization: {e}")
+        logging.error(f"[tests.seed_data] Error creating organization: {e}")
Based on learnings, the file-based prefix is correct for test seed data modules.

135-135: Add space after log prefix.
-        logging.error(f"[tests.seed_data]Error creating project: {e}")
+        logging.error(f"[tests.seed_data] Error creating project: {e}")
         raise
-        logging.error(f"[tests.seed_data]Error creating user: {e}")
+        logging.error(f"[tests.seed_data] Error creating user: {e}")
         raise
Based on learnings.

Also applies to: 155-155

194-195: Initialize CryptContext as a module-level constant.

Creating a new CryptContext instance on every call is inefficient. Define it once at the module level.
🔎 Proposed fix

Add at module level (after imports, around line 22):
# Module-level password context for API key hashing
_pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")
Then update the function:
         key_prefix = key_portion[:12]
 
-        pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")
-        key_hash = pwd_context.hash(key_portion[12:])
+        key_hash = _pwd_context.hash(key_portion[12:])
 
         api_key = APIKey(
210-210: Add space after log prefix.
-        logging.error(f"[tests.seed_data]Error creating API key: {e}")
+        logging.error(f"[tests.seed_data] Error creating API key: {e}")
252-252: Add space after log prefix.
-        logging.error(f"[tests.seed_data]Error creating credential: {e}")
+        logging.error(f"[tests.seed_data] Error creating credential: {e}")
         raise
-        logging.error(f"[tests.seed_data]Error creating assistant: {e}")
+        logging.error(f"[tests.seed_data] Error creating assistant: {e}")
         raise
Also applies to: 291-291

342-342: Add space after log prefix.
-        logging.error(f"[tests.seed_data]Error creating document: {e}")
+        logging.error(f"[tests.seed_data] Error creating document: {e}")

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dd2648d and 38834e8.

📒 Files selected for processing (4)

backend/app/seed_data/seed_data.py
backend/app/tests/conftest.py
backend/app/tests/seed_data/seed_data.py
backend/app/tests/utils/document.py

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always add type hints to all function parameters and return values in Python code
Prefix all log messages with the function name in square brackets: logger.info(f"[function_name] Message {mask_string(sensitive_value)}")
Use Python 3.11+ with type hints throughout the codebase

Files:

backend/app/seed_data/seed_data.py
backend/app/tests/seed_data/seed_data.py
backend/app/tests/utils/document.py
backend/app/tests/conftest.py

backend/app/tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use factory pattern for test fixtures in backend/app/tests/

Files:

backend/app/tests/seed_data/seed_data.py
backend/app/tests/utils/document.py
backend/app/tests/conftest.py

🧠 Learnings (4)

📓 Common learnings

Learnt from: CR
Repo: ProjectTech4DevAI/kaapi-backend PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T15:39:30.469Z
Learning: Applies to backend/app/tests/**/*.py : Use factory pattern for test fixtures in `backend/app/tests/`

Learnt from: CR
Repo: ProjectTech4DevAI/kaapi-backend PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T15:39:30.469Z
Learning: Seed database with test data using `uv run python -m app.seed_data.seed_data`

Learnt from: nishika26
Repo: ProjectTech4DevAI/kaapi-backend PR: 488
File: backend/app/tests/seed_data/seed_data.py:344-354
Timestamp: 2025-12-23T03:57:00.894Z
Learning: In backend/app/tests/seed_data/seed_data.py, log messages should use the file name as the prefix (e.g., `[tests.seed_data]`) rather than the function name prefix. This file-based logging prefix convention is preferred for test seed data modules.

📚 Learning: 2025-12-17T15:39:30.469Z

Learnt from: CR
Repo: ProjectTech4DevAI/kaapi-backend PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T15:39:30.469Z
Learning: Seed database with test data using `uv run python -m app.seed_data.seed_data`

Applied to files:

backend/app/seed_data/seed_data.py
backend/app/tests/seed_data/seed_data.py
backend/app/tests/conftest.py

📚 Learning: 2025-12-23T03:57:00.894Z

Learnt from: nishika26
Repo: ProjectTech4DevAI/kaapi-backend PR: 488
File: backend/app/tests/seed_data/seed_data.py:344-354
Timestamp: 2025-12-23T03:57:00.894Z
Learning: In backend/app/tests/seed_data/seed_data.py, log messages should use the file name as the prefix (e.g., `[tests.seed_data]`) rather than the function name prefix. This file-based logging prefix convention is preferred for test seed data modules.

Applied to files:

backend/app/tests/seed_data/seed_data.py
backend/app/tests/conftest.py

📚 Learning: 2025-12-17T15:39:30.469Z

Learnt from: CR
Repo: ProjectTech4DevAI/kaapi-backend PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T15:39:30.469Z
Learning: Applies to backend/app/tests/**/*.py : Use factory pattern for test fixtures in `backend/app/tests/`

Applied to files:

backend/app/tests/seed_data/seed_data.py
backend/app/tests/conftest.py

🧬 Code graph analysis (3)

backend/app/tests/seed_data/seed_data.py (6)

backend/app/core/security.py (2)

get_password_hash (100-110)

encrypt_credentials (151-168)

backend/app/models/api_key.py (1)

APIKey (48-89)

backend/app/models/organization.py (1)

Organization (44-82)

backend/app/models/project.py (1)

Project (51-107)

backend/app/models/credentials.py (1)

Credential (67-131)

backend/app/models/document.py (1)

Document (28-69)

backend/app/tests/utils/document.py (2)

backend/app/models/document.py (1)

Document (28-69)

backend/app/models/project.py (1)

Project (51-107)

backend/app/tests/conftest.py (2)

backend/app/seed_data/seed_data.py (1)

seed_database (191-238)

backend/app/tests/seed_data/seed_data.py (1)

seed_database (359-420)

🪛 Ruff (0.14.10)

backend/app/seed_data/seed_data.py

53-53: Unnecessary mode argument

Remove mode argument

(UP015)

backend/app/tests/seed_data/seed_data.py

85-85: Unnecessary mode argument

Remove mode argument

(UP015)

backend/app/tests/utils/document.py

3-3: Import from collections.abc instead: Generator

Import from collections.abc

(UP035)

backend/app/tests/conftest.py

10-10: Import from collections.abc instead: Generator

Import from collections.abc

(UP035)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: checks (3.12, 6)

🔇 Additional comments (11)

backend/app/tests/utils/document.py (5)

55-81: Excellent type hint additions to DocumentStore class.

All methods now have proper type annotations:

__init__ and clear correctly typed as returning None

project property returns concrete Project type

put returns Document

extend properly typed as Generator[Document, None, None]

fill returns list[Document]

The type hints are accurate and follow Python 3.11+ conventions. As per coding guidelines.

88-111: Well-typed Route class methods.

The type annotations are accurate:

__init__ properly types endpoint as str | Path union and uses Any for **qs_args

__str__ and to_url return correct types

append uses forward reference "Route" appropriately for self-referencing return type

As per coding guidelines.

119-129: LGTM - WebCrawler methods properly typed.

Both get and delete methods correctly return Response type from httpx library.

137-153: Proper type hints for DocumentComparator methods.

The singledispatch pattern is correctly typed:

Base to_string method uses Any -> Any

Overloads for UUID and datetime correctly return str

__init__ and __eq__ have accurate type annotations

As per coding guidelines.

170-171: Fixture properly typed.

The crawler fixture now has an explicit WebCrawler return type, improving IDE support and type checking.

backend/app/tests/conftest.py (1)

24-24: LGTM!

The import path correctly points to the test-specific seed module, and the updated type annotations properly reflect the generator-based fixture patterns. The docstrings accurately describe the comprehensive test seed data including credentials, assistants, and documents.

Also applies to: 49-61, 65-68

backend/app/seed_data/seed_data.py (3)

1-17: LGTM!

The imports correctly reflect the simplified scope, removing credentials, assistants, and documents support. Type hints are properly imported.

63-178: LGTM!

The creator functions have proper type hints (dict[str, Any]) and consistent log prefixes. The logic correctly handles organization, project, user, and API key creation with appropriate error handling.

181-244: LGTM!

The simplified seed workflow correctly handles only core entities (organizations, users, projects, API keys). The docstring accurately reflects the reduced scope, and logging is consistent throughout.

backend/app/tests/seed_data/seed_data.py (2)

1-79: LGTM!

The imports and data models are well-structured for the comprehensive test seed module. All necessary entities are included with proper type annotations.

346-420: LGTM!

The clear_database and seed_database functions correctly handle the comprehensive test seed workflow, including credentials, assistants, and documents. The logic properly sequences entity creation with appropriate email placeholder substitution and error handling.

Prajna1999 · 2025-12-24T03:38:37Z

backend/app/tests/api/routes/test_creds.py

-    assert len(data) >= 1
-
-
-def test_read_credentials_not_found(client: TestClient, user_api_key: TestAuthContext):


we can skip not_found credential status safely I think.

Prajna1999 · 2025-12-24T03:39:29Z

backend/app/tests/api/routes/test_creds.py

+    assert "host" in data


 def test_read_provider_credential_not_found(


same as above. Not found credential can be tested with APIs directly to see if 404 appears. Can reduce test cases from here.

Prajna1999 · 2025-12-24T03:48:32Z

backend/app/tests/api/routes/test_creds.py

+    user_api_key: TestAuthContext,
 ):
-    # Ensure none
+    """Test reading credentials for non-existent provider returns 404."""


same as above

Prajna1999 · 2025-12-24T03:49:56Z

backend/app/tests/api/routes/test_creds.py


-def test_delete_provider_credential_not_found(
-    client: TestClient, user_api_key: TestAuthContext
+def test_credential_encryption(


This would be self evident. This will never break imo.

Prajna1999 · 2025-12-24T03:50:53Z

backend/app/tests/api/routes/test_creds.py


-def test_delete_all_credentials(
+
+def test_update_nonexistent_provider_returns_404(


this test is guaranteed to give 404 I think. Test cases can be reduced

Prajna1999 · 2025-12-24T03:56:07Z

backend/app/tests/api/routes/test_cron.py

+    assert data["total_processed"] == 3
+
+
+def test_evaluation_cron_job_exception_handling(


is this required? Isnt it self evident?

Prajna1999 · 2025-12-24T03:58:46Z

backend/app/tests/api/routes/test_cron.py

+    assert response.status_code == 401
+
+
+def test_evaluation_cron_job_invalid_api_key(


this is for all routes I think. So may be have one test case checking if authenticated routes are working or not Globally. Can be skipped for similar tests for other modules as well.

Prajna1999 · 2025-12-24T04:10:00Z

backend/app/tests/api/routes/test_evaluation.py

        assert data["status"] == "completed"
        assert "traces" in data["score"]
+
+    def test_get_evaluation_run_not_found(self, client, user_api_key_header):


this we can test with APIs directly

Prajna1999 · 2025-12-24T04:11:00Z

backend/app/tests/api/routes/test_evaluation.py

+            and "get_trace_info" in error_str.lower()
+        )
+
+    def test_get_evaluation_run_without_authentication(self, client):


can be a global auth failure check instead

Prajna1999 · 2025-12-24T06:10:40Z

backend/app/tests/api/routes/test_evaluation.py

+        assert data["langfuse_dataset_id"].startswith("langfuse")
+        assert data["object_store_url"].startswith("s3://test/")
+
+    def test_get_dataset_not_found(self, client, user_api_key_header):


same as above

making seperate seed data for testing

78d2157

nishika26 self-assigned this Dec 12, 2025

nishika26 linked an issue Dec 12, 2025 that may be closed by this pull request

Kaapi v1.0: Enhancing the whole test suite #483

Open

nishika26 changed the title ~~Enhancing the test suite~~ Kaapi v1.0: Enhancing the test suite Dec 12, 2025

nishika26 and others added 3 commits December 15, 2025 11:52

seperating test seed data and main seed data

f38c715

Merge branch 'main' into enhancement/test_suite

8422841

Merge branch 'main' into enhancement/test_suite

29d98f9

nishika26 marked this pull request as ready for review December 19, 2025 05:46

coderabbitai bot reviewed Dec 19, 2025

View reviewed changes

backend/app/tests/seed_data/seed_data.py Show resolved Hide resolved

backend/app/tests/seed_data/seed_data.py Outdated Show resolved Hide resolved

pushing modified test cases together

9a3cfed

coderabbitai bot reviewed Dec 21, 2025

View reviewed changes

coderabbit review

4586c04

coderabbitai bot reviewed Dec 22, 2025

View reviewed changes

add type checks

dd2648d

coderabbitai bot reviewed Dec 23, 2025

View reviewed changes

nishika26 and others added 2 commits December 23, 2025 09:25

Merge branch 'main' into enhancement/test_suite

71e59aa

coderabbit pr reviews

38834e8

coderabbitai bot reviewed Dec 23, 2025

View reviewed changes

nishika26 requested review from Prajna1999 and avirajsingh7 December 23, 2025 04:26

nishika26 added enhancement New feature or request ready-for-review labels Dec 23, 2025

Prajna1999 reviewed Dec 24, 2025

View reviewed changes

		assert len(data) >= 1


		def test_read_credentials_not_found(client: TestClient, user_api_key: TestAuthContext):

		assert "host" in data


		def test_read_provider_credential_not_found(


		def test_delete_all_credentials(

		def test_update_nonexistent_provider_returns_404(

		assert data["total_processed"] == 3


		def test_evaluation_cron_job_exception_handling(

		assert response.status_code == 401


		def test_evaluation_cron_job_invalid_api_key(

Kaapi v1.0: Enhancing the test suite #488

Are you sure you want to change the base?

Kaapi v1.0: Enhancing the test suite #488

Uh oh!

Conversation

nishika26 commented Dec 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Checklist

Notes

Uh oh!

coderabbitai bot commented Dec 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Pre-merge checks and finishing touches

Uh oh!

codecov bot commented Dec 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

nishika26 commented Dec 12, 2025 •

edited

Loading

coderabbitai bot commented Dec 12, 2025 •

edited

Loading

codecov bot commented Dec 12, 2025 •

edited

Loading