I love this comprehensive overview! What do you think should be our action points on this?
I'm creating this discussion as a means to document my findings and to open up discussion with others on this. All ideas, suggestions and contributions are welcome.
I have found a few key issues that AI chat models have when generating properly formatted diff patches, mainly incorrect line counts and mismatched white space and line breaks.
I have found that the AI will usually generate correct code inside the diff patch, code that compiles and runs. However, if one or more of the issues listed above are present, the patch cannot be applied.
I have been working on improving the success rate of applying patches generated by AI models and will be posting my results here.
Line Counts
The `--recount` flag with `git apply` will usually allow a patch with incorrect line counts to apply.

White Space and Line Breaks
For example:
Original code:
`var test_string = some_variable + ' ' + some_other_var`
Patch code:
`test_string=some_variable + ' ' +some_other_var`

The patch may also contain `+ \n`, which should be `+\n`. This will cause `git apply` to fail even when `--whitespace=fix --ignore-whitespace` are passed to `git apply`. If we do a bit of pre-processing on the generated patch before passing it to `git apply`, we can clean up issues like this.

Using wiggle to apply rejected patches or as a replacement for git apply

`--reject` with `git apply` will produce .rej files for any rejected hunks that it could not apply. Using `wiggle --replace` to apply the rejected patches then works, but it can fail on large patches.

`wiggle` as a replacement for `git apply`: This may work better than the solution below; not sure yet. Wiggle always wants line endings that match the original code, so normalizing the diff patch to `\n` causes issues with wiggle. I decided to go for the custom patch processor approach.

Writing a custom diff patch processor [Currently testing]
Update: I have implemented a custom patch processor and so far it's working quite well (much better than using `git apply` or `wiggle`). Still testing and will update.

This may end up being the final answer, as it will allow for better control over cleaning up, parsing and applying the patches when there are formatting issues with the generated patch.
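To make the "cleaning up" part concrete, here is a minimal Python sketch of a pre-processing pass. It is not the processor described in this post, only an illustration of what such a step could look like. The names `clean_patch` and `HUNK_HEADER` are hypothetical, and the rules it uses (how a new file section is detected, treating an empty line inside a hunk as blank context) are assumptions that may not hold for every generated patch. It normalizes line endings to `\n`, turns `+ ` lines that contain only white space into a bare `+`, and rewrites each hunk header with the line counts actually found in the hunk.

```python
import re

# Matches a unified-diff hunk header such as "@@ -12,7 +12,9 @@ optional text".
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,\d+)? \+(\d+)(?:,\d+)? @@(.*)$")


def clean_patch(patch_text: str) -> str:
    """Hypothetical helper: normalize a generated patch before applying it."""
    # Normalize CRLF and lone CR line endings to "\n".
    lines = patch_text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    # Drop trailing empty lines so they are not mistaken for context lines.
    while lines and lines[-1] == "":
        lines.pop()

    out = []
    hunks = []  # one [header_index, old_start, new_start, suffix, old_n, new_n] per hunk
    in_hunk = False

    for i, line in enumerate(lines):
        match = HUNK_HEADER.match(line)
        if match:
            hunks.append([len(out), match.group(1), match.group(2), match.group(3), 0, 0])
            in_hunk = True
            out.append(line)
            continue

        # Assumption: "diff ...", or "--- x" directly followed by "+++ y",
        # starts a new file section and therefore ends the current hunk.
        next_line = lines[i + 1] if i + 1 < len(lines) else ""
        if line.startswith("diff ") or (
            line.startswith("--- ") and next_line.startswith("+++ ")
        ):
            in_hunk = False

        if in_hunk:
            # "+ " or "- " followed only by white space becomes a bare "+" or "-".
            if line[:1] in ("+", "-") and not line[1:].strip():
                line = line[:1]
            if line.startswith("+"):
                hunks[-1][5] += 1
            elif line.startswith("-"):
                hunks[-1][4] += 1
            elif line.startswith(" ") or line == "":
                # Assumption: an empty line inside a hunk is a blank context line.
                line = line or " "
                hunks[-1][4] += 1
                hunks[-1][5] += 1
        out.append(line)

    # Rewrite every hunk header with the line counts actually found.
    for idx, old_start, new_start, suffix, old_n, new_n in hunks:
        out[idx] = f"@@ -{old_start},{old_n} +{new_start},{new_n} @@{suffix}"
    return "\n".join(out) + "\n"
```

Recounting the headers here plays a similar role to `--recount`, but it happens before the patch reaches any apply step, so the same cleaned text can be handed to whichever tool is used afterwards.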
One idea is to normalize a copy of both the original code and the diff patch, and then treat the diff as a search and replace, ignoring the line counts in the generated patch. Something like this might work?
Use an algorithm like Levenshtein distance to calculate the similarity between the search text and code lines in the original code. Not needed unless the diff patch is really badly malformed, and in that case we shouldn't even bother to fix it.
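The normalize-then-search-and-replace idea can be sketched roughly as follows in Python. This is an illustration under stated assumptions, not the actual implementation: `normalize` and `apply_hunk` are hypothetical names, a hunk is passed in as a plain list of body lines starting with a space, `-` or `+`, and "normalizing" is taken to mean removing all white space for comparison only. The hunk header and its line counts are never consulted; the context and removed lines act as the search text, and the first matching position in the original code is rewritten.

```python
def normalize(line: str) -> str:
    """Remove all white space so spacing differences cannot block a match."""
    return "".join(line.split())


def apply_hunk(source_lines: list[str], hunk_lines: list[str]) -> list[str]:
    """Hypothetical helper: apply one hunk body by search and replace."""
    # Context (" ") and removed ("-") lines are what must be found in the source.
    search = [normalize(l[1:]) for l in hunk_lines if l[:1] in (" ", "-")]
    if not search:
        raise ValueError("hunk has no context or removed lines to search for")

    normalized_source = [normalize(l) for l in source_lines]
    for start in range(len(source_lines) - len(search) + 1):
        if normalized_source[start:start + len(search)] != search:
            continue
        # Match found: rebuild this region, keeping the original text of
        # context lines and taking only the added lines from the patch.
        result = list(source_lines[:start])
        pos = start
        for l in hunk_lines:
            if l[:1] == " ":
                result.append(source_lines[pos])
                pos += 1
            elif l[:1] == "-":
                pos += 1
            elif l[:1] == "+":
                result.append(l[1:])
        result.extend(source_lines[pos:])
        return result
    raise ValueError("hunk does not match the original code")


source = [
    "function demo() {",
    "  var test_string = some_variable + ' ' + some_other_var",
    "  return test_string",
    "}",
]
hunk = [
    " function demo() {",
    "-  var test_string=some_variable + ' ' +some_other_var",
    "+  var test_string = some_variable + '-' + some_other_var",
    "   return test_string",
]
print("\n".join(apply_hunk(source, hunk)))
```

Copying context lines from the original code rather than from the patch means a patch with sloppy spacing cannot reformat lines it was not meant to change. Removing all white space is a deliberately loose comparison and could in principle match the wrong place in files with many near-identical lines, so a stricter normalization may be preferable.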