
I am a first-year Master’s student in Computer Science at the University of Maryland, College Park. I work on audio-visual understanding at the Gamma Lab and the PIRL Lab under the guidance of Prof. Dinesh Manocha and Prof. Ramani Duraiswami.
I completed my undergrad at IIT Madras, where I had the opportunity to work with Prof. Umesh at the SPRING Lab on multilingual ASR systems for Indian languages. I was also fortunate to work with Prof. Hema Murthy on the Bhashini Project in collaboration with the Govt. of India.
After my undergrad, I worked as a Member of Technical Staff at Oracle on various edge-computing solutions and their integration into the Oracle Cloud Infrastructure (OCI). Meanwhile, I also collaborated with students from the University of Wisconsin Madison on understanding the cooperative reasoning capabilities of LLMs through the Hanabi game.
CV / Resume, Google Scholar, Email: kaousheik@gmail.com
Updates
| April 2026: | Sparks of Cooperative Reasoning: LLMs as Strategic Hanabi Agents accepted at ICML 2026 |
| April 2026: | Releasing Audio-Visual Flamingo: Open Audio-Visual Intelligence for Long and Complex Videos in collaboration with NVIDIA |
| April 2026: | Released Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music in collaboration with NVIDIA |
| March 2026: | Released MMOU - Massive Multi-Task Omni Understanding and Reasoning Benchmark for Long and Complex Real-World Videos in collaboration with NVIDIA |
| February 2026: | First-author paper on Audio-Visual Interpretability accepted at CVPR 2026 Findings |
| January 2026: | Released preprint on Multi-turn RL analysing the cooperative capabilities of 17 SOTA LLMs |
| October 2025: | 1 paper accepted at NeurIPS Multi-turn intelligence workshop |
| October 2025: | 1 paper accepted at NeurIPS Workshop on Scaling Environments for Agents |
| October 2025: | 1 paper accepted at NeurIPS 2025 Workshop on Bridging Language, Agent, and World Models for Reasoning and Planning |
| Sept 2025: | 1 paper accepted at Knowledge Intensive Multimodal Reasoning workshop ICCV 2025, Hawaii |
| August 2025: | Started my Master’s in CS at UMD as part of the Gamma Lab |
| June 2025: | 1 paper accepted at Multi-Agent Systems workshop ICML 2025, Vancouver |
| August 2023: | Presented a paper on multilingual ASR systems for Indian languages at Interspeech 2023, Dublin |
| Sep 2018: | Started working at Oracle as a Member of Technical Staff - Private Cloud Appliance team |
| July 2023: | Graduated from IIT Madras |
| Spring 2023: | Teaching Assistant for Signals and Systems |
| Fall 2022: | Teaching Assistant for Digital Signal Processing |
| Summer 2022: | Project Intern at Oracle |