I was very excited to read your work on Magicoder. I strongly believe that OSS-Instruct will push the boundaries of instruction tuning for code LLMs.
I have a question about Magicoder. It seems that you do not test the correctness of the solutions generated from the seed code snippets. I am curious why a code validity check is not necessary. Here are my guesses:
- Most of the generated solutions turn out to be correct under manual checking, and LLMs are robust to some incorrect code during fine-tuning.
- OSS-Instruct creates new data that is more like a combination of seed code snippets, and the LLMs (GPT-3.5/GPT-4) used to generate the solutions can handle such combinations easily because they can see the correct seed code snippets.
What is your opinion on this? I look forward to your reply, and thanks for your help!