@briangreunke briangreunke commented Dec 21, 2025

Summary

This pull request refactors the dataset versioning system by replacing the custom VersionInfo class with the Version class from the packaging library (packaging.version.Version). Version parsing, validation, and comparison are now delegated to packaging instead of hand-written string handling. It also adds utility helpers for parsing and validating version strings.

Changes

Refactoring

  • Replace VersionInfo with packaging.version.Version:

    • The VersionInfo-based implementation is replaced with packaging.version.Version, which parses and compares versions according to PEP 440 rather than relying on custom string logic.
    • Update dataset metadata to use DatasetVersion, an alias for version objects from the packaging library.
  • Modification of Functionality:

    • Enhance DatasetMetadata to handle version bumping using the new bump_patch_version method.
    • Adapt DatasetManager and related classes to work with new version types, thereby removing the need for custom version comparison logic.
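
As a rough illustration of the refactoring above, the sketch below shows how an alias and a patch bump could look. It is not the code from this PR: the shape of DatasetMetadata, its fields, and the default version are assumptions made for the example. Only packaging.version.Version and its major, minor, and micro attributes are real library API.

```python
from dataclasses import dataclass, field

from packaging.version import Version

# Assumed alias: the PR names DatasetVersion but does not show its definition.
DatasetVersion = Version


@dataclass
class DatasetMetadata:
    """Hypothetical, stripped-down stand-in for the real metadata class."""

    name: str
    version: DatasetVersion = field(default_factory=lambda: Version("0.1.0"))

    def bump_patch_version(self) -> None:
        # Version objects are immutable, so build a new one from the
        # release components instead of mutating in place.
        v = self.version
        self.version = Version(f"{v.major}.{v.minor}.{v.micro + 1}")


meta = DatasetMetadata(name="example")
meta.bump_patch_version()
print(meta.version)  # 0.1.1
```

Rebuilding the version from its release components drops any pre-release or local segment, which may or may not match the behaviour chosen in the actual implementation.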

Features

  • Utility Functions:
    • Implement parse_version in dreadnode.util, a function to safely convert version strings to Version objects.
    • Enhance valid_version to work with both strings and Version objects.
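
A minimal sketch of what these two helpers might look like follows. The signatures and the choice to return None on failure are assumptions; the PR only states that parse_version converts strings safely and that valid_version accepts both strings and Version objects.

```python
from typing import Optional, Union

from packaging.version import InvalidVersion, Version


def parse_version(value: str) -> Optional[Version]:
    """Return a Version, or None if the string is not a valid PEP 440 version.

    Returning None (rather than raising) is an assumption for this sketch.
    """
    try:
        return Version(value)
    except InvalidVersion:
        return None


def valid_version(value: Union[str, Version]) -> bool:
    """Accept either a version string or an already-parsed Version."""
    if isinstance(value, Version):
        return True
    return parse_version(value) is not None


print(valid_version("1.2.3"))          # True
print(valid_version(Version("2.0")))   # True
print(valid_version("not-a-version"))  # False
```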

Code Removal

  • Remove VersionInfo Class:

    • Eliminate the VersionInfo class and its associated methods that handled manual version string operations and comparisons.
  • Remove compare_versions Methods:

    • Remove compare_versions and compare_versions_from_paths from DatasetManager. Callers now compare Version objects directly with Python's built-in comparison operators.
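
To illustrate why a dedicated comparison method is no longer needed, the short example below uses only the packaging library's Version class. The version numbers are made up for the example.

```python
from packaging.version import Version

versions = [Version(s) for s in ["1.9.0", "1.10.0", "1.2.0"]]

# Version objects support <, <=, ==, >, >=, so max() and sorted() work directly.
latest = max(versions)
print(latest)  # 1.10.0

# Plain strings compare character by character, which gives the wrong order here.
print("1.9.0" < "1.10.0")  # False

# Version objects compare numerically, component by component.
print(Version("1.9.0") < Version("1.10.0"))  # True
```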

Code Adjustments

  • Adjust Dataset Operations:
    • Amend dataset saving, loading, and version bumping logic to accommodate the usage of packaging.Version.
    • Update comments and structures related to dataset manifest creation and management to align with new version handling.
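
One way the save and load paths could handle Version objects is to store the version as a string in the manifest and parse it again on load. The sketch below assumes a JSON manifest with hypothetical "name" and "version" keys; the real manifest layout is not shown in this PR.

```python
import json
from typing import Tuple

from packaging.version import Version


def dump_manifest(name: str, version: Version) -> str:
    # Version is not JSON-serializable, so store its string form.
    return json.dumps({"name": name, "version": str(version)})


def load_manifest(raw: str) -> Tuple[str, Version]:
    data = json.loads(raw)
    # Parsing on load validates the stored string; an invalid value
    # raises packaging.version.InvalidVersion.
    return data["name"], Version(data["version"])


raw = dump_manifest("example", Version("1.2.3"))
name, version = load_manifest(raw)
print(name, version)  # example 1.2.3
```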

The refactor makes version handling across the dataset code easier to maintain and more reliable: parsing, validation, and comparison now go through a single, widely used library instead of custom string manipulation.


Generated Summary:

  • Removed the VersionInfo class and replaced it with DatasetVersion, simplifying our versioning model.
  • Extended valid_version to accept both string and Version types.
  • Updated version handling logic across the codebase for consistency and correctness, including:
    • Bumping versions through a dedicated method in DatasetMetadata.
    • Replacing the custom version comparison methods in DatasetManager with direct Version comparisons.
  • Changed the way version strings are parsed to utilize the new parse_version function.
  • Updated dataset save and load functions to work with Version objects.
  • This adjustment promotes a more consistent approach to versioning, reducing potential bugs related to string manipulation and version comparisons.
  • Overall, these changes aim to enhance robustness and maintainability of version management in the dataset implementation.

This summary was generated with ❤️ by rigging

@briangreunke briangreunke merged commit cdf2c90 into feat/datasets Dec 21, 2025
1 check passed
@briangreunke briangreunke deleted the brian/eng-3788-refactor-versioning-functionality branch December 21, 2025 17:00