Rishi Veerapaneni

Hello! I am a PhD student in the School of Computer Science at Carnegie Mellon University. I work with Professors Maxim Likhachev and Jiaoyang Li and am supported by the NSF Graduate Research Fellowship. Previously, I double majored in EECS and Applied Math at UC Berkeley where I conducted research with Professor Sergey Levine in Berkeley AI Research and was very active in teaching (EE16A, CS188, CS170 x2).

My research develops scalable methods for long-horizon decision making at the intersection of reinforcement learning, imitation learning, and search-based planning. My solutions combine learning with structured planning and leverage test-time compute. While much of my work is motivated by large-scale multi-robot coordination problems, my broader goal is building learning-and-reasoning architectures for complex decision making. My work on scalable imitation learning combined with search received the Best Multi-Agent Systems Paper and Best Student Paper awards at ICRA 2025. Looking ahead, I am particularly interested in working on LLMs, robot learning, or foundation models.

Selected Publications

Deploying Ten Thousand Robots: Scalable Imitation Learning for Lifelong Multi-Agent Path Finding

He Jiang*, Yutong Wang*, Rishi Veerapaneni, Tanishq Duhan, Guillaume Sartoretti, and Jiaoyang Li

In 2025 IEEE International Conference on Robotics and Automation (ICRA), 2025

Best Student and Best Multi-Agent Systems Paper ⭐

Lifelong Multi-Agent Path Finding (LMAPF) is a variant of MAPF where agents are continually assigned new goals, necessitating frequent re-planning to accommodate these dynamic changes. Recently, this field has embraced learning-based methods, which reactively generate single-step actions based on individual local observations. However, it is still challenging for them to match the performance of the best search-based algorithms, especially in large-scale settings. This work proposes an imitation-learning-based LMAPF solver that introduces a novel communication module and systematic single-step collision resolution and global guidance techniques. Our proposed solver, Scalable Imitation Learning for LMAPF (SILLM), inherits the fast reasoning speed of learning-based methods and the high solution quality of search-based methods with the help of modern GPUs. Across six large-scale maps with up to 10,000 agents and varying obstacle structures, SILLM surpasses the best learning- and search-based baselines, achieving average throughput improvements of 137.7% and 16.0%, respectively. Furthermore, SILLM also beats the winning solution of the 2023 League of Robot Runners, an international LMAPF competition sponsored by Amazon Robotics. Finally, we validated SILLM with 10 real robots and 100 virtual robots in a mockup warehouse environment.
@inproceedings{jiang2025scalable_il_lmapf,
  author    = {Jiang, He and Wang, Yutong and Veerapaneni, Rishi and Duhan, Tanishq and Sartoretti, Guillaume and Li, Jiaoyang},
  booktitle = {2025 IEEE International Conference on Robotics and Automation (ICRA)},
  title     = {Deploying Ten Thousand Robots: Scalable Imitation Learning for Lifelong Multi-Agent Path Finding},
  year      = {2025},
  pages     = {1--7},
  doi       = {10.1109/ICRA55743.2025.11127445},
}
Improving Learnt Local MAPF Policies with Heuristic Search

Rishi Veerapaneni*, Qian Wang*, Kevin Ren*, Arthur Jakobsson*, Jiaoyang Li, and Maxim Likhachev

In International Conference on Automated Planning and Scheduling (ICAPS), 2024

Multi-agent path finding (MAPF) is the problem of finding collision-free paths for a team of agents to reach their goal locations. State-of-the-art classical MAPF solvers typically employ heuristic search to find solutions for hundreds of agents but are typically centralized and can struggle to scale when run with short timeouts. Machine learning (ML) approaches that learn policies for each agent are appealing as these could enable decentralized systems and scale well while maintaining good solution quality. Current ML approaches to MAPF have proposed methods that have started to scratch the surface of this potential. However, state-of-the-art ML approaches produce “local” policies that only plan for a single timestep and have poor success rates and scalability. Our main idea is that we can improve an ML local policy by using heuristic search methods on the output probability distribution to resolve deadlocks and enable full horizon planning. We show several model-agnostic ways to use heuristic search with learnt policies that significantly improve the policies’ success rates and scalability. To our best knowledge, we demonstrate the first time ML-based MAPF approaches have scaled to high congestion scenarios (e.g., 20% agent density).
@article{veerapaneni2024improving_mapf_policies_with_search,
  title   = {Improving Learnt Local MAPF Policies with Heuristic Search},
  volume  = {34},
  url     = {https://ojs.aaai.org/index.php/ICAPS/article/view/31522},
  doi     = {10.1609/icaps.v34i1.31522},
  number  = {1},
  journal = {International Conference on Automated Planning and Scheduling (ICAPS)},
  author  = {Veerapaneni, Rishi and Wang, Qian and Ren, Kevin and Jakobsson, Arthur and Li, Jiaoyang and Likhachev, Maxim},
  year    = {2024},
  pages   = {597--606},
}
Entity Abstraction in Visual Model-Based Reinforcement Learning

Rishi Veerapaneni*, John D. Co-Reyes*, Michael Chang*, Michael Janner, Chelsea Finn, Jiajun Wu, Joshua Tenenbaum, and Sergey Levine

In Conference on Robot Learning (CoRL), 2020

We present OP3, a framework for model-based reinforcement learning that acquires object representations from raw visual observations without supervision and uses them to predict and plan. To ground these abstract representations of entities to actual objects in the world, we formulate an interactive inference algorithm which incorporates dynamic information in the scene. Our model can handle a variable number of entities by symmetrically processing each object representation with the same locally-scoped function. On block-stacking tasks, OP3 can generalize to novel block configurations and more objects than seen during training, outperforming both a model that assumes access to object supervision and a state-of-the-art video prediction model.
@inproceedings{veerapaneni20entity_abstraction,
  title     = {Entity Abstraction in Visual Model-Based Reinforcement Learning},
  author    = {Veerapaneni, Rishi and Co-Reyes, John D. and Chang, Michael and Janner, Michael and Finn, Chelsea and Wu, Jiajun and Tenenbaum, Joshua and Levine, Sergey},
  booktitle = {Conference on Robot Learning (CoRL)},
  pages     = {1439--1456},
  year      = {2020},
  editor    = {Kaelbling, Leslie Pack and Kragic, Danica and Sugiura, Komei},
  volume    = {100},
  series    = {Proceedings of Machine Learning Research},
  publisher = {PMLR},
  url       = {https://proceedings.mlr.press/v100/veerapaneni20a},
}