CodeTransOcean
CodeTransOcean is a large-scale, comprehensive benchmark that supports the largest variety of programming languages for code translation. It consists of three novel multilingual datasets: MultilingualTrans, which supports translation between multiple popular programming languages; NicheTrans, for translating between niche programming languages and popular ones; and LLMTrans, for evaluating the executability of code translated by large language models (LLMs). CodeTransOcean also includes a novel cross-framework dataset, DLTrans, for translating deep learning code across different frameworks.
MultilingualTrans
Rank | Model | Organization | Date | BLEU
1 | CodeT5+ (One-to-Many) | Salesforce Research | 2023-05 | 6.42
2 | CodeT5+ (Many-to-Many) | Salesforce Research | 2023-05 | 5.81
3 | CodeT5+ (Many-to-One) | Salesforce Research | 2023-05 | 5.31
4 | CodeT5+ (One-to-One) | Salesforce Research | 2023-05 | 5.19
NicheTrans (Many-to-Many)
Rank | Model | Organization | Date | BLEU
1 | CodeT5+ | Salesforce Research | 2023-05 | 3.92
2 | Naive | CodeTransOcean Team | 2023-06 | 2.76
LLMTrans
Rank | Model | Organization | Date | DSR
1 | GPT-3.5 (3rd Debug) | OpenAI | 2023-03 | 52.57%
2 | GPT-3.5 (One-shot) | OpenAI | 2023-03 | 50.29%
3 | GPT-3.5 (Zero-shot) | OpenAI | 2023-03 | 48.57%
4 | GPT-3.5 (CoT) | OpenAI | 2023-03 | 48.29%
DLTrans (Many-to-Many)
Rank | Model | Organization | Date | EM | BLEU | CodeBLEU
1 | CodeT5+ | Salesforce Research | 2023-05 | 35.35 | 75.53 | 75.29
2 | Naive | CodeTransOcean Team | 2023-06 | 28.48 | 68.44 | 72.57
CodeTransOcean Submission Instructions

Once you have built a language model that meets your expectations, you can submit its test results for official reporting on the leaderboard. To submit your model's evaluation results on CodeTransOcean, please provide the following information:

1. The evaluation results of your language model on each task of the CodeTransOcean benchmark, computed with the evaluation metrics described in our paper. [Required]

2. Individual/Team Organization: The name of the individual's or team's organization as it should appear on the leaderboard. [Required]

3. Training code for your large language model. [Recommended]

4. Information about your large language model: The name of the model/method as it should appear on the leaderboard. [Required]

5. Your paper information: If the model is from a published work, the name and URL of the paper will appear on the leaderboard. [Recommended]

Submit the above information by email to yanweixiang.ywx@gmail.com, and we will respond within 72 hours.

How to cite