Tu Vu


I am an Assistant Professor at Virginia Tech and a Research Scientist at Google DeepMind. Previously, I received my PhD in Computer Science from the University of Massachusetts Amherst, where I was advised by Mohit Iyyer. My research aims to develop effective and efficient methods for advancing and democratizing artificial intelligence in the era of large language models (LLMs). Specific areas of focus include:

  • In-context learning and tool-use LLMs: injecting knowledge into LLM prompts and augmenting LLMs with external tools
  • Instruction tuning: enhancing LLMs’ instruction-following capabilities
  • Parameter-efficient transfer learning: efficiently transferring knowledge across tasks, languages, and modalities
  • Advanced planning and reasoning: improving LLMs’ ability to solve complex reasoning problems
  • Long-context modeling: designing efficient model architectures for long sequences

For prospective PhD students

I plan to recruit one or two new PhD students for Fall 2025. If you are interested in joining my lab, please apply to the Virginia Tech Graduate School and list me as a potential advisor. Please also check out the application deadlines and information for prospective students.


Recent news

Oct. 2024 :speaking_head: Invited talk at Mila / McGill NLP seminar
Oct. 2024 :speaking_head: Invited talk at University of Toronto / Ontario Tech University
Sep. 2024 :page_facing_up: One paper to appear at EMNLP 2024 on foundational autoraters (FLAMe)! :tada:
Aug. 2024 :briefcase: I started my professorship at Virginia Tech
Jul. 2024 :page_facing_up: New preprint on Foundational Autoraters (FLAMe)
May 2024 :page_facing_up: FreshLLMs got accepted to ACL 2024 Findings! :tada:
Feb. 2024 :briefcase: I am now serving as an Area Chair for ACL Rolling Review (ARR)
Jan. 2024 :page_facing_up: Flan-MoE got accepted to ICLR 2024! :tada:
Nov. 2023 :speaking_head: Invited talk at Graph Neural Networks Reading Group, Google
Oct. 2023 :page_facing_up: New preprint on LLM factuality (FreshLLMs)
Aug. 2023 :briefcase: I joined Google in Mountain View, CA as a Research Scientist
Jul. 2023 :mortar_board: I successfully defended my PhD thesis! :tada: :champagne:

Advisees

Group:
Pin-Jie (Linus) Lin (1st year PhD student @ Virginia Tech)
Quyet Do (1st year PhD student @ Virginia Tech)
Rishab Balasubramanian (1st year PhD student @ Virginia Tech)
Thinh Pham (1st year PhD student @ Virginia Tech)
Others:
Prateek Yadav (Research Intern @ Google Gemini, Summer — Fall 2024)
Simeng (Shirley) Han (Student Researcher @ Google DeepMind, Summer — Fall 2024)
Salaheddin Alzubi (Master's student @ UMass Amherst, Fall 2022 — Spring 2023)
Dheeraj Mekala (PhD student @ UCSD, Spring — Summer 2022)

Selected publications

For an up-to-date list of my research papers, please see my Google Scholar profile. * denotes equal contribution.
  1. EMNLP
    Foundational Autoraters: Taming Large Language Models for Better Automatic Evaluation
    Tu Vu*, Kalpesh Krishna*, Salaheddin Alzubi, Chris Tar, Manaal Faruqui, and Yun-Hsuan Sung
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
    // The top-performing generative model on RewardBench trained solely on publicly available data
  2. Preprint
    Gemini: A Family of Highly Capable Multimodal Models
    Google Gemini Team: Rohan Anil, Sebastian Borgeaud, Yonghui Wu, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew Dai, Anja Hauth, and others including Tu Vu
    In arXiv preprint arXiv:2312.11805, 2023
    // Google AI Blog
  3. ACL
    FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation
    Tu Vu, Mohit Iyyer, Xuezhi Wang, Noah Constant, Jerry Wei, Jason Wei, Chris Tar, Yun-Hsuan Sung, Denny Zhou, Quoc Le, and Thang Luong
    In Findings of the Association for Computational Linguistics: ACL 2024, 2024
    // Our dataset and method have inspired or been used for the development of Google’s Gemini, Perplexity.AI’s Online LLMs, You.com, and Contextual AI’s RAG 2.0
  4. ICML
    The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
    Shayne Longpre, Le Hou, Tu Vu, Albert Webson, Hyung Won Chung, Yi Tay, Denny Zhou, Quoc V. Le, Barret Zoph, Jason Wei, and Adam Roberts
    In Proceedings of the 40th International Conference on Machine Learning, 2023
    // Google Research Blog
  5. ICLR
    Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models
    Sheng Shen, Le Hou, Yanqi Zhou, Nan Du, Shayne Longpre, Jason Wei, Hyung Won Chung, Barret Zoph, William Fedus, Xinyun Chen, Tu Vu, Yuexin Wu, Wuyang Chen, Albert Webson, Yunxuan Li, Vincent Zhao, Hongkun Yu, Kurt Keutzer, Trevor Darrell, and Denny Zhou
    In Proceedings of the 12th International Conference on Learning Representations, 2024
  6. ACL
    SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer
    Tu Vu, Brian Lester, Noah Constant, Rami Al-Rfou, and Daniel Cer
    In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
    // Featured in Google AI’s Natural Language Accelerated newsletter, Q1 2022
  7. EMNLP
    Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation
    Tu Vu, Aditya Barua, Brian Lester, Daniel Cer, Mohit Iyyer, and Noah Constant
    In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
  8. EMNLP
    STraTA: Self-Training with Task Augmentation for Better Few-shot Learning
    Tu Vu, Thang Luong, Quoc Le, Grady Simon, and Mohit Iyyer
    In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
  9. EMNLP
    Exploring and Predicting Transferability across NLP Tasks
    Tu Vu, Tong Wang, Tsendsuren Munkhdalai, Alessandro Sordoni, Adam Trischler, Andrew Mattarella-Micke, Subhransu Maji, and Mohit Iyyer
    In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020