
@the-david-oy
Contributor

Summary

Enables Quick search mode to optimize ensemble and BLS models with composing models that have different resource requirements (e.g., CPU tokenizers with high instance counts alongside GPU models with limited instances).

Previously, Quick search mode rejected any model with parameter ranges, preventing users from optimizing composing models independently. This implementation allows composing models to specify instance_group count ranges while maintaining the restriction for top-level models.

Changes

Config Validation (config_command.py)

  • Add _is_composing_model() helper to identify BLS and CPU-only composing models
  • Update _check_quick_search_model_config_parameters_combinations() to allow composing models to have parameter ranges
  • Update _check_per_model_model_config_parameters() to permit max_batch_size and instance_group ranges for composing models
  • Maintain existing restrictions for top-level models
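The validation rule above can be sketched roughly as follows. This is a minimal illustration, not the code in config_command.py: the standalone function names, the `composing_names` argument, and the plain-dict config shape are all assumptions made for the example.

```python
from typing import Any, Dict, List, Set


def has_ranges(params: Any) -> bool:
    """Return True if any leaf list in the parameters holds more than one value."""
    if isinstance(params, dict):
        return any(has_ranges(value) for value in params.values())
    if isinstance(params, list):
        # A list of nested structures (e.g. instance_group entries) is
        # searched recursively; a list of scalars is a range if len > 1.
        if any(isinstance(value, (dict, list)) for value in params):
            return any(has_ranges(value) for value in params)
        return len(params) > 1
    return False


def find_invalid_quick_search_models(
    profile_models: Dict[str, Dict[str, Any]],
    composing_names: Set[str],  # hypothetical: names treated as composing models
) -> List[str]:
    """List top-level models that specify ranges, which Quick mode still rejects."""
    offenders = []
    for name, model in profile_models.items():
        params = model.get("model_config_parameters", {})
        if has_ranges(params) and name not in composing_names:
            offenders.append(name)
    return offenders


# Hypothetical config mirroring the shape of the YAML example in this PR.
example_models = {
    "tokenizer": {
        "model_config_parameters": {
            "instance_group": [{"kind": "KIND_CPU", "count": [1, 2, 4, 8]}],
            "dynamic_batching": {"max_queue_delay_microseconds": [0]},
        }
    },
    "ensemble_model": {
        "model_config_parameters": {
            "dynamic_batching": {"max_queue_delay_microseconds": [0]},
        }
    },
}
```

With `composing_names={"tokenizer"}` the sketch reports no offenders; with an empty set it flags `tokenizer`, which matches the pre-PR behavior described in the Summary.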

SearchDimensions Construction (run_config_generator_factory.py)

  • Add _get_instance_count_list() to extract user-specified count lists from model configs
  • Add _create_instance_dimension_from_list() to create constrained SearchDimensions
  • Support two sequence types:
    • Powers of 2: [1, 2, 4, 8, 16, 32] → EXPONENTIAL dimensions
    • Contiguous sequences: [1, 2, 3, 4, 5] → LINEAR dimensions
  • Add _is_powers_of_two() and _is_linear_sequence() validators
  • Update _get_dimensions_for_model() to use user-specified lists when available
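A minimal sketch of how the two sequence checks might look. The function bodies, the `classify_count_list` helper, and the string labels are assumptions for illustration; the PR's actual validators and SearchDimension constants may differ. The sketch also assumes that powers of two must be consecutive and that the exponential check takes precedence for lists such as [1, 2] that satisfy both patterns.

```python
from typing import List, Tuple


def is_powers_of_two(counts: List[int]) -> bool:
    """True for consecutive powers of two, e.g. [1, 2, 4, 8]."""
    if not counts:
        return False
    first = counts[0]
    # A positive integer is a power of two when it has exactly one bit set.
    if first < 1 or first & (first - 1):
        return False
    return all(b == 2 * a for a, b in zip(counts, counts[1:]))


def is_linear_sequence(counts: List[int]) -> bool:
    """True for contiguous ascending integers, e.g. [1, 2, 3, 4, 5]."""
    if not counts or counts[0] < 1:
        return False
    return all(b == a + 1 for a, b in zip(counts, counts[1:]))


def classify_count_list(counts: List[int]) -> Tuple[str, int, int]:
    """Return (dimension label, min count, max count) or raise ValueError."""
    if is_powers_of_two(counts):
        return ("EXPONENTIAL", counts[0], counts[-1])
    if is_linear_sequence(counts):
        return ("LINEAR", counts[0], counts[-1])
    raise ValueError(
        f"Unsupported instance count list {counts}: expected powers of two "
        "(e.g. [1, 2, 4, 8]) or a contiguous sequence (e.g. [1, 2, 3])."
    )
```

Under these assumptions, `classify_count_list([1, 2, 4, 8, 16, 32])` yields an EXPONENTIAL dimension spanning 1 to 32, and a list such as [1, 3, 7] is rejected with an explanatory error.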

Coordinate Mapping (quick_run_config_generator.py)

  • Add _extract_instance_group_kind() to preserve user-specified KIND (CPU/GPU)
  • Update _get_next_model_config_variant() to extract kind before removing searchable parameters
  • Remove restrictive assertion that blocked composing models with multiple parameter combinations
  • Maintain single-combination requirement after removing searchable parameters
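The kind-preservation step might look something like the sketch below. It is illustrative only: the standalone functions, the plain-dict parameter shape, the handling of list-wrapped kind values, and the KIND_GPU fallback are assumptions, not the implementation in quick_run_config_generator.py.

```python
from typing import Any, Dict, List, Optional


def extract_instance_group_kind(
    model_config_parameters: Optional[Dict[str, Any]],
) -> Optional[str]:
    """Return the first user-specified instance_group kind, if any."""
    groups = (model_config_parameters or {}).get("instance_group") or []
    for group in groups:
        if not isinstance(group, dict):
            continue
        kind = group.get("kind")
        # Assumption: a parsed config may wrap the kind in a one-element list.
        if isinstance(kind, list):
            kind = kind[0] if kind else None
        if kind:
            return kind
    return None


def build_instance_group(
    model_config_parameters: Optional[Dict[str, Any]],
    count: int,
    default_kind: str = "KIND_GPU",  # hypothetical fallback for this sketch
) -> List[Dict[str, Any]]:
    """Combine a searched instance count with the preserved kind."""
    # The kind is read before the searchable 'count' range is discarded,
    # so a CPU-only composing model is not silently switched to GPU.
    kind = extract_instance_group_kind(model_config_parameters) or default_kind
    return [{"kind": kind, "count": count}]
```

For the tokenizer in the example configuration, a searched count of 4 would then produce a single KIND_CPU instance group with count 4.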

Documentation (docs/config_search.md)

  • Add "Ensemble Composing Model Parameter Ranges" subsection under Quick Search Mode
  • Provide complete YAML example with CPU tokenizer and GPU inference model
  • Document supported patterns and limitations
  • Add cross-references in Ensemble and BLS sections

Testing

Tests cover:

  • Valid patterns (powers of 2, contiguous sequences)
  • Invalid patterns with helpful error messages
  • Mixed scenarios (ranged + fixed parameters)
  • Edge cases (empty lists, single values, nested structures)

Example Configuration

model_repository: /path/to/model/repository/
run_config_search_mode: quick

cpu_only_composing_models:
  - tokenizer

profile_models:
  tokenizer:
    model_config_parameters:
      instance_group:
        - kind: KIND_CPU
          count: [1, 2, 4, 8, 16, 32]  # Search CPU instances
      dynamic_batching:
        max_queue_delay_microseconds: [0]

  inference_model:
    model_config_parameters:
      instance_group:
        - kind: KIND_GPU
          count: [1, 2, 4, 8]  # Search GPU instances
      dynamic_batching:
        max_queue_delay_microseconds: [0]

  ensemble_model:
    model_config_parameters:
      dynamic_batching:
        max_queue_delay_microseconds: [0]

Impact

Enables optimization of ensemble models with heterogeneous composing models (e.g., CPU tokenizers + GPU inference).

Copilot AI left a comment

Pull request overview

This PR enables Quick search mode to optimize ensemble and BLS models where composing models have different resource requirements (e.g., CPU tokenizers with high instance counts alongside GPU models with limited instances). Previously, Quick search mode rejected any model with parameter ranges.

Key changes:

  • Allow composing models to specify instance_group count ranges in Quick mode
  • Add validation logic to distinguish between top-level and composing models
  • Support powers-of-2 and contiguous sequence patterns for instance counts
  • Preserve user-specified instance_group KIND (CPU/GPU) during coordinate mapping

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/test_search_dimensions_factory.py | New test suite validating SearchDimension creation from user-specified instance count lists |
| tests/test_quick_run_config_generator.py | Updated copyright header and test expectation for dynamic batching configuration |
| tests/test_quick_coordinate_mapping.py | New test suite for instance_group KIND extraction helper |
| tests/test_ensemble_composing_model_integration.py | Integration tests for ensemble models with mixed CPU/GPU composing models |
| tests/test_config_composing_model_validation.py | Validation tests ensuring composing models can use ranges while top-level models cannot |
| model_analyzer/config/input/config_command.py | Updated validation logic to permit parameter ranges for composing models |
| model_analyzer/config/generate/run_config_generator_factory.py | Added methods to create SearchDimensions from user-specified count lists |
| model_analyzer/config/generate/quick_run_config_generator.py | Enhanced coordinate mapping to preserve instance_group KIND and handle composing models |
| docs/config_search.md | Documentation for ensemble composing model parameter ranges feature |

@the-david-oy force-pushed the doy-ensemble branch 5 times, most recently from 7850677 to 275e9fb (December 5, 2025 00:09)
@the-david-oy force-pushed the doy-ensemble branch 2 times, most recently from 209e349 to db7fa66 (December 5, 2025 23:24)
Fix copyrights

Fix tests

Update tests

Allow user to specify composing models instead of relying on auto-discovery

Warn when there is a non-existent composing model.

Update copyrights

Update copyrights

Fix model name YAML

Correctly get kind

Properly set CPU/GPU kind

Address Copilot feedback

Address Copilot feedback

Fix regex for CI, add config in test