AutoCodeRover could be tested on other benchmarks to see whether the updated codebase holds up on memorization-resistant benchmarks: https://github.com/livebench/liveswebench, https://www.kprize.ai/, https://livebench.ai/
Cross-reference: smallcloudai/refact#796, from a newer repo with similar SWE-bot goals.