Amir Aghdam
Researcher in VLMs and Computer Vision @ Temple University

📍 Philadelphia, PA, USA
Hey, thanks for stopping by! 👋
I’m a Master’s student in the CS Department at Temple University, wrapping up my degree in Summer 2025. My research interests include Vision-Language Models (VLMs), Multimodal Learning, and Computer Vision.
My current research centers on zero-shot adaptation of VLMs, with a particular focus on fine-grained video understanding that leverages the open-set recognition power of image-language models. I’m especially interested in how we can harness the capabilities of LLMs and VLMs responsibly, equipping them with effective workflows to solve high-impact problems.
Previously, I worked on active fine-tuning of vision foundation models such as DINO, and I bring over two years of hands-on research experience in image segmentation, active learning, and VLMs.
I’m always open to new ideas, collaborations, or just a good conversation. Feel free to reach out! 📬
news
Jul 01, 2025 — 📢 My most recent work, a collaboration with LMU Munich, is now on arXiv. Project Page

Jul 01, 2025 — 🎉 I will finish my Master’s degree in Computer Science at Temple University this summer!