Pre-translation vs. direct inference in multilingual LLM applications

Large language models (LLMs) are becoming ubiquitous tools for solving a wide range of problems. However, their effectiveness in handling diverse languages has been hampered by inherent limitations in training data, which is often skewed toward English. To address this, pre-translation, where inputs are translated to English before being fed to the LLM, has become a standard practice.
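The contrast between the two strategies can be sketched in a few lines. This is a minimal illustration, not an implementation from the paper: `translate_to_english` and `llm_answer` are hypothetical placeholders standing in for a real machine-translation system and LLM endpoint.

```python
def translate_to_english(text: str, source_lang: str) -> str:
    # Placeholder: a real pipeline would call an MT system here.
    return f"[{source_lang}->en] {text}"

def llm_answer(prompt: str) -> str:
    # Placeholder: a real pipeline would query the LLM here.
    return f"answer({prompt})"

def pre_translation_inference(text: str, source_lang: str) -> str:
    # Pre-translation: translate the input to English first,
    # then run the LLM on the translated text.
    return llm_answer(translate_to_english(text, source_lang))

def direct_inference(text: str) -> str:
    # Direct inference: feed the original-language input straight to the LLM.
    return llm_answer(text)
```

Pre-translation adds a translation hop (extra latency, cost, and a point where meaning can be lost), while direct inference relies entirely on the LLM's own multilingual capability.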

Prior research has demonstrated the effectiveness of pre-translation for optimal LLM performance for GPT-3/3.5/4, ChatGPT, PaLM, and other models. While pre-translation helps address the language bias issue, it introduces complexity and inefficiency, and it may lead to information loss. With the introduction of new powerful LLMs trained on massive multilingual datasets, it is time to revisit the assumed necessity of pre-translation.

In our recent work "Breaking the Language Barrier: Can Direct Inference Outperform Pre-Translation in Multilingual LLM Applications?", to be presented at NAACL 2024, we re-evaluate the need for pre-translation using PaLM2, which has been established as highly performant on multilingual tasks. Our findings challenge the pre-translation paradigm established in prior research and highlight the advantages of direct inference in PaLM2. Specifically, we demonstrate that PaLM2-L consistently outperforms pre-translation in 94 out of 108 languages, offering a more efficient and effective approach in multilingual settings while unlocking linguistic authenticity and alleviating the limitations of pre-translation.