-
Notifications
You must be signed in to change notification settings - Fork 724
Defer merging of common stanzas #11277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
19bf4b9 to
d24351e
Compare
492ad9e to
d8a1fcc
Compare
|
Here are the benchmark results by timing the
|
|
I'm not sure why validate is used for benchmarking parser changes. Validate is likely dominated by completely unrelated factors (e. g. solver). Would it be possible to just download a bunch of cabal files from Hackage (see clc-stackage for a curated set of packages) and run only the parser on them? |
The options
I thought it would be important to test out whether I broke the solver by changing the representation to GPD too, which is why I tested everything :) |
these are tests, which is different from benchmarking. You seem to be using validate for benchmarking, which makes no sense to me, as explained above. On the other hand, using validate for tests may have some value although it is preferable to use the actual test like |
This is pretty much what the hackage-test test does. Do you think timing this alone would be more acceptable (compare to ./validate) even though it's not a benchmark? |
|
If timing hackage-tests would mean timing download times too, then no, it doesn't sound right. I don't know if you can arrange for separate download, which would make it more reasonable. Generally, if you want to have an estimate for performance footprint, you have to develop a benchmark, you can't just hijack test executables whole-sale. You could probably reuse some of the testing infrastructure (tasty-bench exists for a reason). If this sounds too much, you should just stop trying to achieve it because currently you're posting numbers for validate that most likely have no connection to the actual performance change, and as such are actively misleading. It'd be great to have some very simple benchmark that shows that we don't degrade performance significantly but other than that we should focus on design rather than performance, I think: the design issue has been a die-hard for a long time... |
hackage-tests looks for
I totally understand. I can make sure Otherwise, do you have an example of using tasty-bench to benchmark something similar in cabal? We wanted to add benchmark because in the tech proposal phase people were concerned about performance. |
|
How does the bidirectional pattern synonym work? If the accessor flattens the structure then the setter won't know whether to update the common stanza or the specific component. This seems like a big foot-gun in the API if the goal is to allow people to manipulate It would be helpful if you could lay out some comments pointing out how to review the patch since it is rather large. |
The part before the The pattern synonym was introduced to maintain backward compatibility of the record syntax. We managed to not change the majority of the existing code (except traversals that touch the condtrees for example). Without the record syntax pattern synonym, one would need to pattern match on the entire
Either the accessors or the caller of the accessors has to decide whether to update the common stanza or the specific component, this is a hard problem to solve. Should we choose to not flatten the structure, the call-sites of the accessors would then have to decide where to make the modification to the data (e.g. the common stanza or the component). This modification must also not create missing imports in any component. This is done by
c.f. semantics of accessors using bidirectional PatternSynonyms 1 Further more, the first part and the second part of an explicitly bidirectional pattern synonym must have the same type. If we want to be able to construct We could also choose to not have a right hand side, but due to how pattern synonym works 1, we won't able to use these fields as setters, only accessors. This is also a valid option. As a side note, -- This doesn't work
foo
{ internalField = bar -- a field from the internal description
, compatibilityField = baz -- a field from the new "compatibility" pattern
}
-- We have to do this because these two fields don't come from the same pattern.
foo
{ internalField = bar -- a field from the internal description
}
{ compatibilityField = baz -- a field from the new "compatibility" pattern
}What do you think about this trade off for backward compatibilty? Are there different solutions/details that I might be missing?
I grepped with
I will come back later for this :) Footnotes
|
We annotate each node of a In the parser test common-conditional you can compare the cabal file with its parsed expression and see this. The merging logic is done in Previously, merging was done in The types Let me know if I can elaborate on some details! |
proof of concept retain imports for foreignLib section retain imports for executable section retain imports for test-suite retain imports for benchmark attempt to insert used imports and not all imports use newtype change import data type; fix issues with import propagation & decoration fix missing import names in testSuite fix missing import names in benchmark run fourmolu run hlint defer merging prototype Currently we have achieve the following: - Stop merging, the merging function "endo" is id - CondTree are completly retained in bigger types such as libarry and executable We will need to do the following: - Allow merging in the accessor We broke: - A bunch of Read and Ord instances Revert "defer merging prototype" This reverts commit 21636db. stop merging retain common stanza in GenericPackageDescription experiment: use WithImport in gpd add function to merge imports mergeLibrary fix transitive imports retaintion and merging run fourmolu deferred merging for sublibraries simplification; remove todos retain foreignlib imports retain executable imports retain TestSuiteStanza imports We isolated the type TestSuiteStanza and the logic to infer test type retain BenchmarkStanza imports We isolated the BenchmarkStanza type and the logic to infer benchmark type. clean up remove benchmark import field We now use the WithImports type to tag imports introduce type alias in GenericPackageDescription add GenericPackageDescription pattern to hide internal implementation backward compatible accessors fix compiler errors add todo fix compiler errors for integration test run fourmolu don't expose intemediary accessors remove early experiment "import" fields in TestSuite and Benchmark move TestSuiteStanza validation to its module clean up {TestSuite,Benchmark}Stanza exports from FieldGrammar restore old behaviour in code working with PackageDescription we fixed GenericPackageDescription's constructor patch {TestSuite,Benchmark}Stanza type when using accessor remove accessors tests fix accessor dropping common stanza map when non it is not required This guarantees that unmerged internal representation is correct run fourmolu remove FieldGrammar export {TestSuite,Benchmark}Stanza fix(GenericPackageDescription): "pattern" keyword deprecated after 914 test: remove new test files test: add tests for gpd accessors test: check equality on each field test: improve import list test: use @? operator test: use a tuple to store all gpd fields test: define ToExpr tuple instance manually test: update expected test: use Rec constructor to annotate field names test: update expected test: remove comment not in scope for this PR test: use field equality in hackage tests test: add new test files test: fix test build test: newtype to guide GPD ToExpr instance test: add internal accessors test test: update expected for internal accessors test test: update expected test: fix hackage tests ToExpr instance
6ddbcc4 to
413bb83
Compare
|
I downloaded all latest packages on hackage using nh2/hackage-download and grepped to see how much downstream code is using You can also view it here using serokell's tool, though I don't know how exhaustive it is. The regex pattern I used: 60 occurrences of accessor usages in latest hackage |
This is the second part of the exact print parser. We changed the parser so that it doesn't merge the common stanzas during the parsing stage.
Depends on #11285 to update the test suite (after which the diff will be significantly smaller).
Background
Common stanzas are merged into their corresponding section during the parsing stage. This merging process doesn't retain the definition of the imports (i.e.
common <name> ...) nor where they are used (import: <name>). To achieve exact printing and to be able to programmatically update the import list, we need to retain this information.This PR is to remedy exactly that by deferring the merging process.
Implementation
We insert a new field
gpdCommonStanza :: Map ImportName (CondTree ConfVar [Dependency] BuildInfo)in theGenericPackageDescriptiondata type to track all the defined common stanzas. The merging is run only when the new accessors are called on aGenericPackageDescription.We also added a bidirectional
PatternSynonymthat uses the old record syntax. The impact on existing code should be very low. Comparing against my PR to update the test suite is promising leana8959/cabal@accessor-tests...leana8959:cabal:no-flatten-package-description.Please let me know your thoughts :)
Checklist below:
This PR modifies behaviour or interface
Include the following checklist in your PR:
significance: significantin the changelog file.