-
Notifications
You must be signed in to change notification settings - Fork 3
Consistency check script #114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
utils/validate_yaml.py
Outdated
| "textual description", | ||
| ] | ||
|
|
||
| UNIQUE_FIELDS = ["name", "reference", "implementation"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This triggers an error when:
- reference is
''(empty), which is probably not what we want. Thinking about it: A reference can also introduce multiple problems, so does not need to be unique in any case? What do you think? - Similarly, I expect the same happens for the implementation field, which may also not need to be unique, because a single package may implement multiple problems/benchmarks. (I guess it might depend a bit on how specific we want the reference to the implementation to be, but it probably cannot always be specific enough to be unique.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I added it since I was hoping this could weed out instances where the same problem was added multiple times just with a different name. But you are right, there are several valid reasons for why the reference/implementation could be the same. I am going to change it to a warning instead.
utils/validate_yaml.py
Outdated
| import sys | ||
|
|
||
| # Define the required fields your YAML must have | ||
| REQUIRED_FIELDS = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have the same list of fields in yaml_to_html.py (called default_columns). We should probably maintain it in a single place, and/or let one inherit the other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch, I am going to import from the file
utils/validate_yaml.py
Outdated
|
|
||
|
|
||
| def check_fields(data): | ||
| if len(data) != len(REQUIRED_FIELDS): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should test that there are at least this many fields. I would explicitly want people to add new fields for interesting properties we do not collect yet. Then:
- Properties not in the
REQUIRED_FIELDSshould then be checked against a (to be created)NOT_REQUIRED_FIELDSwhich would contain all other fields (might be empty for now). - A message should be returned listing the new fields (found in neither the required or not-required lists), to be verified by an OPL maintainer as actually new (not just a new name for an existing property), and then either added to the not required list or fixed (or requested as change on a PR) to have the correct existing name.
- Ideally all other checks are still done before such a list is returned, so we know everything else already passes the checks, and verifying new fields (or maybe other similar things) is all that needs to be done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I removed the length check because it doesn't really add anything in this case.
I am now adding warnings for unknown fields that the reviewers can check.
Adding to the optional fields variable will be done on merge in the merging script, so this is not part of this PR. For now there are just no optional fields.
|
Something else that came to mind that I did not think about or try yesterday:
|
Yes, very good point. I changed this now to test for each entry and not assume it is only one. Note: I also changed the format of the prints. with the new syntax, they should be automatically picked up by the GitHub Action. This is to be tested for PR #127 |
Added a preliminary script for checking consistency, along with a README detailing the envisioned github workflow.
Main question for review: Should I be checking for anything else? Is there anything that is too strict?
Closes #123