Idea for tuning LLMs for translation & code / RLAIF

Give one instance of an LLM some rich context and have it produce a description of the code it would like.

Pass that description to a second instance and have it generate the code.

Pass the code back to the first instance and have the two iterate until the code fits the context.

You now have (long context, short description, code) triples, which can be used to enrich the training data.
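A rough sketch of the loop above, in Python. Everything named here is hypothetical and only for illustration: the `LLM` type (any function from a prompt string to a completion string), the `Example` record, `collect_example`, the `max_rounds` cap, and the convention that the first instance replies `ACCEPT` once the code fits. The prompts are placeholders, not tested wording.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumption: an "LLM instance" is any callable from prompt text to completion text.
LLM = Callable[[str], str]


@dataclass
class Example:
    """One (long context, short description, code) training triple."""
    context: str
    description: str
    code: str


def collect_example(
    context: str,
    describer: LLM,  # first instance: sees the rich context
    coder: LLM,      # second instance: sees only the short description
    max_rounds: int = 3,
) -> Optional[Example]:
    # Step 1: the first instance turns the context into a short description.
    description = describer(
        f"Context:\n{context}\n\nDescribe what you would like written."
    )

    code, feedback = "", ""
    for _ in range(max_rounds):
        # Step 2: the second instance writes (or revises) the code.
        code = coder(
            f"Description:\n{description}\n\n"
            f"Previous attempt:\n{code}\n\nFeedback:\n{feedback}\n\nWrite the code."
        )
        # Step 3: the first instance checks the code against the context.
        verdict = describer(
            f"Context:\n{context}\n\nCandidate:\n{code}\n\n"
            "Reply ACCEPT if this fits the context, otherwise give feedback."
        )
        if verdict.strip().upper().startswith("ACCEPT"):
            return Example(context, description, code)
        feedback = verdict

    # No accepted code within the round limit: drop this context.
    return None
```

Returning `None` when the round limit is hit keeps unconverged attempts out of the collected data; whether to keep them as negative examples instead is a separate design choice.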