Amir Aghdam

Researcher in VLMs and Computer Vision @ Temple University


📍 Philadelphia, PA, USA

Hey, thanks for stopping by! 👋

I’m a Master’s student in the CS Department at Temple University, graduating in Summer 2025. My research interests include Vision-Language Models (VLMs), Multimodal Learning, and Computer Vision.

My current research centers on zero-shot adaptation of VLMs, with a particular focus on fine-grained video understanding by leveraging the open-set recognition power of image-language models. I’m especially interested in how we can harness the capabilities of LLMs and VLMs responsibly, equipping them with effective workflows to solve high-impact problems.

Previously, I worked on active fine-tuning of vision foundation models like DINO, and I bring over two years of hands-on research experience in image segmentation, active learning, and VLMs.

I’m always open to new ideas, collaborations, or just a good conversation. Feel free to reach out! 📬



news

Jul 01, 2025 📢 My most recent work, in collaboration with LMU Munich, is now published on arXiv. Project Page.
Jul 01, 2025 🎉 I will complete my Master’s degree in Computer Science at Temple University this summer!