-
Notifications
You must be signed in to change notification settings - Fork 7
Graphql graph CTL command #711
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: stable
Are you sure you want to change the base?
Conversation
WalkthroughThis change introduces a new GraphQL query analysis system for Infrahub SDK. A new Pre-merge checks❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Deploying infrahub-sdk-python with
|
| Latest commit: |
e40b48b
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://9c789d97.infrahub-sdk-python.pages.dev |
| Branch Preview URL: | https://bgi-grahql-check-command.infrahub-sdk-python.pages.dev |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (1)
infrahub_sdk/query_analyzer.py (1)
417-424: Consider simplifying the schema_branch type annotation.The
schema_branchparameter is typed asSchemaBranchProtocol | BranchSchema, but sinceBranchSchemaimplementsSchemaBranchProtocol, the union type is redundant. You could simplify to justSchemaBranchProtocol.🔎 Proposed refactor
def __init__( self, query: str, - schema_branch: SchemaBranchProtocol | BranchSchema, + schema_branch: SchemaBranchProtocol, schema: GraphQLSchema | None = None, query_variables: dict[str, Any] | None = None, operation_name: str | None = None, ) -> None:This maintains flexibility while being more precise about the interface requirement.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
infrahub_sdk/ctl/graphql.pyinfrahub_sdk/query_analyzer.pyinfrahub_sdk/schema/main.pytests/unit/sdk/test_infrahub_query_analyzer.py
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Use type hints on all function signatures
Never mix async/sync inappropriately
Never bypass type checking without justification
Files:
infrahub_sdk/schema/main.pyinfrahub_sdk/query_analyzer.pyinfrahub_sdk/ctl/graphql.pytests/unit/sdk/test_infrahub_query_analyzer.py
infrahub_sdk/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Follow async/sync dual pattern for new features in the Python SDK
Files:
infrahub_sdk/schema/main.pyinfrahub_sdk/query_analyzer.pyinfrahub_sdk/ctl/graphql.py
infrahub_sdk/ctl/**/*.py
📄 CodeRabbit inference engine (infrahub_sdk/ctl/AGENTS.md)
infrahub_sdk/ctl/**/*.py: Use@catch_exception(console=console)decorator on all async CLI commands
IncludeCONFIG_PARAMin all CLI command function parameters, even if unused
Useinitialize_client()orinitialize_client_sync()from ctl.client for client creation in CLI commands
Use Rich library for output formatting in CLI commands (tables, panels, console.print) instead of plain print() statements
Do not instantiate InfrahubClient directly; always use initialize_client() or initialize_client_sync() helper functions
Do not use plain print() statements in CLI commands; use Rich console.print() instead
Files:
infrahub_sdk/ctl/graphql.py
tests/**/*.py
📄 CodeRabbit inference engine (tests/AGENTS.md)
tests/**/*.py: Usehttpx_mockfixture for HTTP mocking in tests instead of making real HTTP requests
Do not add@pytest.mark.asynciodecorator to async test functions (async auto-mode is globally enabled)
Files:
tests/unit/sdk/test_infrahub_query_analyzer.py
tests/unit/**/*.py
📄 CodeRabbit inference engine (tests/AGENTS.md)
Unit tests must be fast, mocked, and have no external dependencies
Files:
tests/unit/sdk/test_infrahub_query_analyzer.py
🧬 Code graph analysis (3)
infrahub_sdk/query_analyzer.py (2)
infrahub_sdk/analyzer.py (2)
GraphQLQueryAnalyzer(32-121)operations(55-65)infrahub_sdk/schema/main.py (15)
GenericSchemaAPI(288-292)NodeSchemaAPI(312-314)ProfileSchemaAPI(317-318)TemplateSchemaAPI(321-322)BranchSchema(361-460)node_names(368-370)generic_names(373-375)profile_names(378-380)get(382-395)get_node(397-407)get_generic(409-419)get_profile(421-431)kind(279-280)attribute_names(223-224)get_relationship_or_none(193-197)
infrahub_sdk/ctl/graphql.py (5)
infrahub_sdk/ctl/client.py (1)
initialize_client(10-25)infrahub_sdk/ctl/utils.py (1)
catch_exception(78-106)infrahub_sdk/graphql/utils.py (2)
insert_fragments_inline(13-31)remove_fragment_import(34-40)infrahub_sdk/query_analyzer.py (3)
InfrahubQueryAnalyzer(407-760)only_has_unique_targets(386-404)query_report(460-464)infrahub_sdk/schema/main.py (1)
BranchSchema(361-460)
tests/unit/sdk/test_infrahub_query_analyzer.py (1)
infrahub_sdk/query_analyzer.py (4)
GraphQLQueryReport(289-404)only_has_unique_targets(386-404)query_report(460-464)top_level_kinds(354-355)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: integration-tests-latest-infrahub
- GitHub Check: Cloudflare Pages
🔇 Additional comments (3)
infrahub_sdk/schema/main.py (1)
367-431: LGTM! Well-structured schema accessors.The new properties and getter methods provide clean, type-safe access to different schema types. The error handling with KeyError and TypeError is appropriate, and the docstrings clearly document the API compatibility with backend SchemaBranch.
tests/unit/sdk/test_infrahub_query_analyzer.py (1)
1-170: LGTM! Comprehensive test coverage.The test suite effectively validates the query analyzer's behavior across various scenarios:
- Empty queries
- Single/multi-target detection with required, optional, and static filters
- Top-level kinds extraction
- Unknown model handling
The fixtures provide isolated, fast test execution without external dependencies.
infrahub_sdk/query_analyzer.py (1)
1-760: LGTM! Solid architecture for GraphQL query analysis.The implementation provides comprehensive query parsing with:
- Fragment handling with circular dependency detection
- Hierarchical query node tree construction
- Model and permission reasoning
- Clean separation of concerns with dataclasses and protocols
The code is well-structured with proper type hints throughout.
| def _print_query_result(console: Console, report: object, results: CheckResults) -> None: | ||
| """Print the result for a single query analysis.""" | ||
| if report.only_has_unique_targets: | ||
| console.print("[green] Result: Single-target query (good)[/green]") | ||
| console.print(" This query targets unique nodes, enabling selective artifact regeneration.") | ||
| results.single_target_count += 1 | ||
| else: | ||
| console.print("[yellow] Result: Multi-target query[/yellow]") | ||
| console.print(" May cause excessive artifact regeneration. Fix: filter by ID or unique attribute.") | ||
| results.multi_target_count += 1 | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Improve type hint for report parameter.
The report parameter is typed as object, which bypasses type checking when accessing report.only_has_unique_targets on line 42. This should be GraphQLQueryReport for proper type safety.
🔎 Proposed fix
Add the import at the top of the file:
from ..query_analyzer import InfrahubQueryAnalyzer
+from ..query_analyzer import GraphQLQueryReport
from ..schema import BranchSchemaThen update the type hint:
-def _print_query_result(console: Console, report: object, results: CheckResults) -> None:
+def _print_query_result(console: Console, report: GraphQLQueryReport, results: CheckResults) -> None:
"""Print the result for a single query analysis."""As per coding guidelines, type hints are required on all function signatures.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def _print_query_result(console: Console, report: object, results: CheckResults) -> None: | |
| """Print the result for a single query analysis.""" | |
| if report.only_has_unique_targets: | |
| console.print("[green] Result: Single-target query (good)[/green]") | |
| console.print(" This query targets unique nodes, enabling selective artifact regeneration.") | |
| results.single_target_count += 1 | |
| else: | |
| console.print("[yellow] Result: Multi-target query[/yellow]") | |
| console.print(" May cause excessive artifact regeneration. Fix: filter by ID or unique attribute.") | |
| results.multi_target_count += 1 | |
| from ..query_analyzer import InfrahubQueryAnalyzer | |
| from ..query_analyzer import GraphQLQueryReport | |
| from ..schema import BranchSchema | |
| def _print_query_result(console: Console, report: GraphQLQueryReport, results: CheckResults) -> None: | |
| """Print the result for a single query analysis.""" | |
| if report.only_has_unique_targets: | |
| console.print("[green] Result: Single-target query (good)[/green]") | |
| console.print(" This query targets unique nodes, enabling selective artifact regeneration.") | |
| results.single_target_count += 1 | |
| else: | |
| console.print("[yellow] Result: Multi-target query[/yellow]") | |
| console.print(" May cause excessive artifact regeneration. Fix: filter by ID or unique attribute.") | |
| results.multi_target_count += 1 |
🤖 Prompt for AI Agents
In infrahub_sdk/ctl/graphql.py around lines 40 to 50, the function
_print_query_result currently types its report parameter as object which
disables type checking when accessing report.only_has_unique_targets; change the
parameter type to GraphQLQueryReport and add the corresponding import at the top
of the file (import GraphQLQueryReport from the module where that
dataclass/typing is defined in the codebase), then update the function signature
to use GraphQLQueryReport so static type checkers can validate attribute access.
| client = initialize_client() | ||
|
|
||
| schema_data = await client.schema.all(branch=branch) | ||
| branch_schema = BranchSchema(hash="", nodes=schema_data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, let's find the BranchSchema class definition
rg -n "class BranchSchema" --type=pyRepository: opsmill/infrahub-sdk-python
Length of output: 133
🏁 Script executed:
# Find the schema.all() method implementation
rg -n "def all\(" --type=py -A 10 -B 2 | grep -A 10 "def all"Repository: opsmill/infrahub-sdk-python
Length of output: 8906
🏁 Script executed:
# Look for other usages of BranchSchema to understand the hash pattern
rg -n "BranchSchema\(" --type=py -B 3 -A 3Repository: opsmill/infrahub-sdk-python
Length of output: 2290
🏁 Script executed:
# Check the specific file mentioned in the review
cat -n infrahub_sdk/ctl/graphql.py | sed -n '280,310p'Repository: opsmill/infrahub-sdk-python
Length of output: 1287
🏁 Script executed:
# Look at the full schema.all() implementation
sed -n '281,320p' infrahub_sdk/schema/__init__.pyRepository: opsmill/infrahub-sdk-python
Length of output: 1855
🏁 Script executed:
# Search for schema_hash or hash-related methods in the schema module
rg -n "schema_hash|schema.*hash" --type=py -B 2 -A 2 infrahub_sdk/schema/Repository: opsmill/infrahub-sdk-python
Length of output: 4010
🏁 Script executed:
# Look for other BranchSchema usages in production code (not tests)
rg -n "BranchSchema(" --type=py -B 3 -A 3 | grep -v "tests/"Repository: opsmill/infrahub-sdk-python
Length of output: 147
🏁 Script executed:
# Check what SchemaCacheManager provides
rg -n "class SchemaCacheManager" --type=py -A 20Repository: opsmill/infrahub-sdk-python
Length of output: 53
🏁 Script executed:
# Fix the regex for BranchSchema search
rg -n "BranchSchema\\\(" --type=py -B 3 -A 3 | grep -v "tests/"Repository: opsmill/infrahub-sdk-python
Length of output: 151
🏁 Script executed:
# Look at the SchemaCacheManager/SchemaManager class to see if cache is accessible
sed -n '1,100p' infrahub_sdk/schema/__init__.py | head -60Repository: opsmill/infrahub-sdk-python
Length of output: 1446
🏁 Script executed:
# Check what the _fetch method returns and how it creates BranchSchema
rg -n "async def _fetch" --type=py -A 30 infrahub_sdk/schema/__init__.pyRepository: opsmill/infrahub-sdk-python
Length of output: 1397
🏁 Script executed:
# See if there's a way to access the schema cache from client
rg -n "self.cache\[" --type=py -B 2 -A 2 infrahub_sdk/schema/__init__.py | head -40Repository: opsmill/infrahub-sdk-python
Length of output: 1514
🏁 Script executed:
# Check if there's a method to get the full BranchSchema (not just nodes) from the schema manager
rg -n "def.*schema" --type=py infrahub_sdk/schema/__init__.py | head -20Repository: opsmill/infrahub-sdk-python
Length of output: 1798
🏁 Script executed:
# Look at the InfrahubSchema class structure to see all available methods
sed -n '100,250p' infrahub_sdk/schema/__init__.pyRepository: opsmill/infrahub-sdk-python
Length of output: 5835
🏁 Script executed:
# Check if there's a cache property or way to access the BranchSchema with hash
rg -n "cache\[" --type=py -B 3 -A 1 infrahub_sdk/schema/__init__.py | head -50Repository: opsmill/infrahub-sdk-python
Length of output: 2072
🏁 Script executed:
# Look at how BranchSchema.from_api_response works to understand how hash is populated
sed -n '400,500p' infrahub_sdk/schema/main.pyRepository: opsmill/infrahub-sdk-python
Length of output: 2314
🏁 Script executed:
# Verify cache is public and check if there are other usages accessing cache directly
rg -n "client\.schema\.cache" --type=py -B 2 -A 2Repository: opsmill/infrahub-sdk-python
Length of output: 2160
🏁 Script executed:
# Check if branch variable is available at that point in the function
sed -n '270,300p' infrahub_sdk/ctl/graphql.pyRepository: opsmill/infrahub-sdk-python
Length of output: 1177
Use the schema from cache instead of constructing with empty hash.
After calling await client.schema.all(branch=branch), the full BranchSchema with the proper hash is available in client.schema.cache[branch]. Replace the manual construction with:
branch_schema = client.schema.cache[branch]This avoids the empty hash placeholder and uses the actual schema hash from the API response.
🤖 Prompt for AI Agents
In infrahub_sdk/ctl/graphql.py around lines 293 to 296, the code calls
client.schema.all(branch=branch) then constructs BranchSchema with an empty hash
which loses the real schema hash; replace the manual construction and instead
obtain the full BranchSchema from the client's cache
(client.schema.cache[branch]) so the real hash and nodes from the API response
are used.
| def _populate_inline_fragment_node( | ||
| self, node: InlineFragmentNode, query_node: GraphQLQueryNode | ||
| ) -> GraphQLQueryNode: | ||
| infrahub_model = self.schema_branch.get(name=node.type_condition.name.value, duplicate=False) | ||
| context_type = ContextType.DIRECT | ||
| current_node = GraphQLQueryNode( | ||
| parent=query_node, | ||
| path=node.type_condition.name.value, | ||
| context_type=context_type, | ||
| infrahub_model=infrahub_model, | ||
| ) | ||
| if node.selection_set: | ||
| selections = self._get_selections(selection_set=node.selection_set) | ||
| for field_node in selections.field_nodes: | ||
| current_node.children.append(self._populate_field_node(node=field_node, query_node=current_node)) | ||
| for inline_fragment_node in selections.inline_fragment_nodes: | ||
| current_node.children.append( | ||
| self._populate_inline_fragment_node(node=inline_fragment_node, query_node=current_node) | ||
| ) | ||
|
|
||
| return current_node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add error handling for unknown inline fragment types.
In _populate_inline_fragment_node (line 648), the call to self.schema_branch.get() can raise KeyError if the type is not found in the schema. Unlike _populate_named_fragments (lines 582-584) which handles this with try-except, this code will propagate the exception.
🔎 Proposed fix
def _populate_inline_fragment_node(
self, node: InlineFragmentNode, query_node: GraphQLQueryNode
) -> GraphQLQueryNode:
- infrahub_model = self.schema_branch.get(name=node.type_condition.name.value, duplicate=False)
+ try:
+ infrahub_model = self.schema_branch.get(name=node.type_condition.name.value, duplicate=False)
+ except KeyError:
+ infrahub_model = None
context_type = ContextType.DIRECT
current_node = GraphQLQueryNode(
parent=query_node,
path=node.type_condition.name.value,
context_type=context_type,
infrahub_model=infrahub_model,
)This ensures consistent behavior when encountering unknown types in inline fragments, similar to how named fragments handle missing types.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def _populate_inline_fragment_node( | |
| self, node: InlineFragmentNode, query_node: GraphQLQueryNode | |
| ) -> GraphQLQueryNode: | |
| infrahub_model = self.schema_branch.get(name=node.type_condition.name.value, duplicate=False) | |
| context_type = ContextType.DIRECT | |
| current_node = GraphQLQueryNode( | |
| parent=query_node, | |
| path=node.type_condition.name.value, | |
| context_type=context_type, | |
| infrahub_model=infrahub_model, | |
| ) | |
| if node.selection_set: | |
| selections = self._get_selections(selection_set=node.selection_set) | |
| for field_node in selections.field_nodes: | |
| current_node.children.append(self._populate_field_node(node=field_node, query_node=current_node)) | |
| for inline_fragment_node in selections.inline_fragment_nodes: | |
| current_node.children.append( | |
| self._populate_inline_fragment_node(node=inline_fragment_node, query_node=current_node) | |
| ) | |
| return current_node | |
| def _populate_inline_fragment_node( | |
| self, node: InlineFragmentNode, query_node: GraphQLQueryNode | |
| ) -> GraphQLQueryNode: | |
| try: | |
| infrahub_model = self.schema_branch.get(name=node.type_condition.name.value, duplicate=False) | |
| except KeyError: | |
| infrahub_model = None | |
| context_type = ContextType.DIRECT | |
| current_node = GraphQLQueryNode( | |
| parent=query_node, | |
| path=node.type_condition.name.value, | |
| context_type=context_type, | |
| infrahub_model=infrahub_model, | |
| ) | |
| if node.selection_set: | |
| selections = self._get_selections(selection_set=node.selection_set) | |
| for field_node in selections.field_nodes: | |
| current_node.children.append(self._populate_field_node(node=field_node, query_node=current_node)) | |
| for inline_fragment_node in selections.inline_fragment_nodes: | |
| current_node.children.append( | |
| self._populate_inline_fragment_node(node=inline_fragment_node, query_node=current_node) | |
| ) | |
| return current_node |
🤖 Prompt for AI Agents
In infrahub_sdk/query_analyzer.py around lines 645 to 665, wrap the call to
self.schema_branch.get(...) in a try/except KeyError to match the named-fragment
handling: if the type is missing catch KeyError, log a warning (using the same
logger used elsewhere in this class) including the missing type name, and return
the parent query_node (i.e., skip creating/populating this inline fragment)
instead of allowing the exception to propagate.
Codecov Report❌ Patch coverage is
@@ Coverage Diff @@
## stable #711 +/- ##
===========================================
- Coverage 76.03% 34.27% -41.77%
===========================================
Files 113 114 +1
Lines 9744 10303 +559
Branches 1491 1621 +130
===========================================
- Hits 7409 3531 -3878
- Misses 1840 6459 +4619
+ Partials 495 313 -182
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 83 files with indirect coverage changes 🚀 New features to boost your workflow:
|
GraphQL Query Analyzer - Migration & Implementation
Overview
The
InfrahubQueryAnalyzerenables static analysis of GraphQL queries to determine if they target single or multiple objects. This is exposed viainfrahubctl graphql check [query-path].Architecture
Components
infrahub_sdk/query_analyzer.pyInfrahubQueryAnalyzerclass and data modelsinfrahub_sdk/schema/main.pyBranchSchemawithnode_names,generic_names,profile_namesproperties and getter methodsinfrahub_sdk/ctl/graphql.pycheckcommand implementationType Adaptations from Backend
infrahub.core.constants.RelationshipCardinalityinfrahub_sdk.schema.RelationshipCardinalityinfrahub.core.schema.GenericSchemainfrahub_sdk.schema.GenericSchemaAPIinfrahub.core.schema.MainSchemaTypesNodeSchemaAPI | GenericSchemaAPI | ProfileSchemaAPI | TemplateSchemaAPISchemaBranchSchemaBranchProtocol(satisfied byBranchSchema)SchemaNotFoundErrorKeyErrorImplementation Details
Schema Data Flow
client.schema.all()returnsMutableMapping[str, MainSchemaTypesAPI]— a flat dict with kind names as keys and schema objects as values. This is not the raw API response format.Single-Target Detection Logic
A query is considered "single-target" when:
infrahub_modelwithuniqueness_constraints$name: String!)name__value: "my-tag")idsargument with a required variableCritical: The uniqueness constraint must match the GraphQL argument name format (e.g.,
name__value, notname).Edge Cases Handled
only_has_unique_targetsreturnsFalsetop_level_kindsemptyFalse(multi-target).gqlfiles in pathUsage
The
checkcommand accepts a file or directory path. If a directory is provided, it recursively finds all.gqlfiles. If no path is provided, it defaults to the current directory.Tests
Unit tests in
tests/unit/sdk/test_infrahub_query_analyzer.pycover:Known Limitations & Future Improvements
1. BranchSchema Construction Workaround
Issue: Direct instantiation bypasses proper factory method.
Improvement: Add a factory method to
BranchSchemathat accepts theclient.schema.all()output format:Or modify
from_api_responseto detect the input format.2. Uniqueness Constraint Format Mismatch
Issue: The analyzer compares
argument.name(e.g.,name__value) againstuniqueness_constraints(e.g.,[["name"]]). These use different formats.Current behavior: Only works if
uniqueness_constraintsare defined with the GraphQL argument format (name__value).Improvement: Normalize the comparison — either strip
__valuesuffix from arguments or expand constraint names to include the suffix.3. Template Schema Handling
Issue:
_get_operations()checksnode_names,generic_names,profile_namesbut not templates.Improvement: Add
template_namesproperty and corresponding handling.4. Schema Hash Not Populated
Issue:
BranchSchemais constructed with empty hash.Improvement: Fetch and pass the actual schema hash for cache validation purposes.
5. GraphQL Schema Fetched Separately
Issue: Two API calls required — one for Infrahub schema, one for GraphQL schema.
Improvement: Consider caching or combining these calls if performance becomes an issue.