Parrot

https://github.com/AIDC-AI/Parrot


🔬 Research Notes

Stats

  • ⭐ Stars: 77
  • 🍴 Forks: 3
  • 📝 Language: Python
  • 📅 Created: 2024-08-02
  • 🔄 Updated: 2026-01-15
  • 🏷️ Latest Release: No releases
  • Description

    🎉 The code repository for "Parrot: Multilingual Visual Instruction Tuning" in PyTorch.

    Topics

    mixture-of-experts, multilingual, multimodal-large-language-models, vision-language-model

    Research Summary

    Parrot is a multilingual multimodal large language model (MLLM) that uses textual guidance and a Mixture-of-Experts module to align visual tokens across languages, accompanied by the MMMB benchmark (6 languages, 15 categories, 12,000 questions). The paper was accepted at ICML 2025.

    Key Features

  • Text-guided visual token alignment at the language level
  • Mixture-of-Experts (MoE) routing to promote alignment of multilingual tokens
  • MMMB benchmark: 6 languages, 15 categories, 12,000 questions
  • Parrot-7B and Parrot-14B checkpoints released on Hugging Face

  • Architecture
  • Qwen-1.5-7B/14B-Chat base LLM paired with a CLIP-ViT-Large-patch14-336 vision encoder
  • Two-stage training: modality alignment (pretraining) followed by multilingual instruction tuning (SFT)

  • Use Cases
  • Multilingual visual question answering and instruction following
  • Benchmarking multilingual MLLMs via MMMB and Multilingual MMBench in VLMEvalKit

  • Assessment
  • Maturity: research code backed by an ICML 2025 paper; no tagged releases
  • Documentation: README covers install, models, training, datasets, evaluation, and quick start
  • Community: small (77 stars, 3 forks)
  • Recommendation: worth evaluating for multilingual MLLM research; expect research-grade tooling rather than a production library
  • README Excerpt

    # 🦜 Parrot: Multilingual Visual Instruction Tuning

    🎉 Introduction

    📰 What's New

    ☄️ Install

    🦜 Model

    🔥 Train

    🌟 Datasets

    🎄 MMMB

    🔑 Evaluation

    📍 Quick Start

    👨‍🏫 Acknowledgement

    🤗 Contact

    ---

    🎉 Introduction

    Welcome to Parrot [[paper](https://arxiv.org/abs/2406.02539)], a novel method that uses textual guidance to drive visual token alignment at the language level. Parrot conditions the visual tokens on diverse language inputs and uses a Mixture-of-Experts (MoE) module to promote the alignment of multilingual tokens. Moreover, given the current lack of benchmarks for evaluating multilingual capabilities in the field, we collect and release the Massive Multilingual Multimodal Benchmark (MMMB), which covers 6 languages, 15 categories, and 12,000 questions.
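The routing idea described above, visual tokens conditioned on the input language via a text-driven MoE gate, can be illustrated with a toy NumPy sketch. The shapes, the softmax gate, and the per-language linear experts here are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def moe_align(visual_tokens, text_embedding, experts, gate_w):
    """Toy MoE: a text-conditioned gate mixes per-language expert projections.

    visual_tokens: (n_tokens, d) CLIP-style visual features
    text_embedding: (d,) pooled embedding of the (language-bearing) prompt
    experts: (n_experts, d, d) one linear projection per language expert
    gate_w: (d, n_experts) gating weights
    """
    gate = softmax(text_embedding @ gate_w)                        # (E,)
    expert_out = np.einsum("td,edk->etk", visual_tokens, experts)  # (E, T, d)
    # Gate-weighted mixture of expert outputs -> language-aligned tokens.
    return np.einsum("e,etk->tk", gate, expert_out), gate

d, n_tokens, n_experts = 16, 8, 6   # 6 experts, mirroring MMMB's 6 languages
vis = rng.normal(size=(n_tokens, d))
txt = rng.normal(size=(d,))
experts = rng.normal(size=(n_experts, d, d))
gate_w = rng.normal(size=(d, n_experts))

aligned, gate = moe_align(vis, txt, experts, gate_w)
```

Changing the text embedding changes the gate, so the same visual tokens are projected differently per input language, which is the intuition behind the multilingual alignment.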

    If you find Parrot useful for your research and applications, please cite using this BibTeX:

    ```bibtex

    @inproceedings{sun2025parrot,

    title={Parrot: Multilingual Visual Instruction Tuning},

    author={Sun, Hai-Long and Zhou, Da-Wei and Li, Yang and Lu, Shiyin and Yi, Chao and Chen, Qing-Guo and Xu, Zhao and Luo, Weihua and Zhang, Kaifu and Zhan, De-Chuan and others},

    booktitle={ICML},

    year={2025}

    }

    ```

    📰 What's New

  • [05/01] 🔥 Parrot was accepted at ICML 2025.
  • [08/21] 🔥 Our multilingual MLLM Parrot is now supported in [VLMEvalKit](https://github.com/open-compass/VLMEvalKit), so you can evaluate Parrot easily. Give it a try!
  • [08/20] 🔥 MMMB and Multilingual MMBench are now supported in [VLMEvalKit](https://github.com/open-compass/VLMEvalKit); use the dataset names MMMB and MTL_MMBench_DEV to get results for all 6 languages in a single run. Give them a try!
  • [08/02] 🔥 We released the [code](https://github.com/AIDC-AI/Parrot), our in-house multilingual [dataset](https://huggingface.co/datasets/AIDC-AI/Parrot-dataset/tree/main/sharegpt_4v), the benchmark [MMMB](https://huggingface.co/datasets/AIDC-AI/Parrot-dataset/tree/main/mmmb), and the [model](https://huggingface.co/AIDC-AI/Parrot-7B). Welcome to try them out!
  • [06/05] 🔥 Parrot is coming! We release the [paper](https://arxiv.org/abs/2406.02539)!
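The VLMEvalKit integration above is driven from that toolkit's run.py launcher. A minimal sketch of composing such an invocation follows; the dataset names MMMB and MTL_MMBENCH_DEV come from the announcement, but the registered model identifier "Parrot" is an assumption, so check VLMEvalKit's supported-model list:

```python
import shlex

def vlmeval_cmd(model: str, datasets: list[str]) -> str:
    """Compose a VLMEvalKit run.py invocation for a model and dataset list.

    Dataset names MMMB / MTL_MMBench_DEV come from the What's New entry above;
    the model name is hypothetical and must match VLMEvalKit's registry.
    """
    args = ["python", "run.py", "--model", model, "--data", *datasets]
    return shlex.join(args)

# Evaluate Parrot on both multilingual benchmarks in one run.
cmd = vlmeval_cmd("Parrot", ["MMMB", "MTL_MMBench_DEV"])
print(cmd)
```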
    ☄️ Install

    Please follow the instructions below to install the required packages.

    1. Clone this repository and navigate to the Parrot folder

    ```bash

    git clone https://github.com/AIDC-AI/Parrot.git

    cd Parrot

    ```

    2. Install Package

    ```shell

    conda create -n parrot python=3.10 -y

    conda activate parrot

    pip install --upgrade pip

    pip install -e .

    ```

    Upgrade to the latest code base

    ```shell

    git pull

    pip install -e . --no-deps

    ```

    🦜 Model

    Parrot is a multilingual multimodal large language model. We provide our fully fine-tuned models below:

    | Model | Base LLM | Vision Encoder | Stage | Download |
    | --- | --- | :---: | :---: | :---: |
    | Parrot-7B | Qwen-1.5-7B-Chat | CLIP-ViT-Large-patch14-336 | SFT | [ckpt](https://huggingface.co/AIDC-AI/Parrot-7B) |
    | Parrot-14B | Qwen-1.5-14B-Chat | CLIP-ViT-Large-patch14-336 | SFT | [ckpt](https://huggingface.co/AIDC-AI/Parrot-14B) |
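The checkpoints in the table can be pulled from Hugging Face. A small helper mapping checkpoint names to the repo ids listed above; the suggestion to fetch them with `huggingface_hub.snapshot_download` is an assumption about your workflow, since Parrot may also ship its own loading code:

```python
# Repo ids taken from the model table above.
PARROT_CHECKPOINTS = {
    "Parrot-7B": "AIDC-AI/Parrot-7B",
    "Parrot-14B": "AIDC-AI/Parrot-14B",
}

def checkpoint_repo(name: str) -> str:
    """Return the Hugging Face repo id for a Parrot checkpoint name."""
    try:
        return PARROT_CHECKPOINTS[name]
    except KeyError:
        raise ValueError(
            f"unknown checkpoint {name!r}; choose from {sorted(PARROT_CHECKPOINTS)}"
        ) from None

# To download, e.g.: huggingface_hub.snapshot_download(checkpoint_repo("Parrot-7B"))
```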

    🔥 Train

    Parrot is trained in two stages: modality alignment, then instruction tuning for multilingual alignment. The training script for each stage is provided in the scripts folder. Before starting training, make sure the ROOT variable in the training script is set correctly. The commands to train Parrot for each stage are:

    ```shell

    bash scripts/train/pretrain.sh

    bash scripts/train/finetune.sh

    ```
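The two-stage launch above can be wrapped in a tiny driver. This sketch passes ROOT through the environment; the text says ROOT is set inside the training scripts, so exporting it this way is an assumption about how the scripts consume it:

```python
import os
import subprocess

# Stage order from the text: modality alignment, then multilingual tuning.
STAGES = ["scripts/train/pretrain.sh", "scripts/train/finetune.sh"]

def stage_env(root: str) -> dict:
    """Environment for a training stage, with ROOT pointing at the data/repo root."""
    env = dict(os.environ)
    env["ROOT"] = root
    return env

def run_stages(root: str, runner=subprocess.run):
    """Run pretraining, then finetuning, failing fast if a stage errors."""
    for script in STAGES:
        runner(["bash", script], env=stage_env(root), check=True)
```

The injectable `runner` keeps the driver testable without actually launching training jobs.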

    Hyperparameters

    We use a set of hyperparameters similar to Vicuna's for finetuning. The hyperparameters used in both pretraining and finetuning are provided below.

    1. Pretraining


    ---

*Researched: 2026-03-28 · Generated: 2026-03-28*