ActAlign: Zero-Shot Fine-Grained Video Classification
A zero-shot framework that uses LLM-generated sub-action scripts and sequence alignment to classify fine-grained actions in video without any video–text supervision.
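To make the idea of aligning a video against a sub-action script concrete, here is a minimal sketch. It is not the authors' implementation: it assumes frame and sub-action embeddings already live in a shared space (for example, from a pretrained vision-language encoder), and it swaps in a plain monotonic dynamic-programming alignment as one plausible form of sequence alignment. The function names `alignment_score` and `classify`, the class labels, and the toy one-hot embeddings are all hypothetical.

```python
import numpy as np

def cosine_similarity_matrix(frames, steps):
    """Cosine similarity between every frame and every sub-action, shape (T, K)."""
    f = frames / np.linalg.norm(frames, axis=1, keepdims=True)
    s = steps / np.linalg.norm(steps, axis=1, keepdims=True)
    return f @ s.T

def alignment_score(frames, steps):
    """Score of the best monotonic alignment of T frames to K ordered sub-actions.

    Assumed constraints (illustrative, not taken from the source): every frame
    is assigned to exactly one sub-action, sub-actions are visited in script
    order, and each sub-action covers at least one frame.
    """
    sim = cosine_similarity_matrix(frames, steps)
    T, K = sim.shape
    if T < K:
        raise ValueError("need at least as many frames as sub-actions")
    dp = np.full((T, K), -np.inf)
    dp[0, 0] = sim[0, 0]
    for t in range(1, T):
        for k in range(min(t + 1, K)):
            stay = dp[t - 1, k]                              # remain in sub-action k
            advance = dp[t - 1, k - 1] if k > 0 else -np.inf  # move on from k-1
            dp[t, k] = sim[t, k] + max(stay, advance)
    return dp[T - 1, K - 1] / T  # mean similarity along the best path

def classify(frames, scripts):
    """Pick the class whose sub-action script aligns best with the video."""
    scores = {label: alignment_score(frames, steps) for label, steps in scripts.items()}
    return max(scores, key=scores.get), scores

# Toy data: three orthogonal "sub-action" embeddings (hypothetical).
a, b, c = np.eye(3)
video = np.stack([a, a, b, b, c, c])   # frames show a, then b, then c
scripts = {
    "class_x": np.stack([a, b, c]),    # same sub-actions in the same order
    "class_y": np.stack([c, b, a]),    # same sub-actions in reverse order
}
label, scores = classify(video, scripts)
print(label, scores)
```

The two toy classes contain the same sub-actions and differ only in order, which is the kind of fine-grained distinction a bag-of-features similarity would miss and an ordered alignment can capture.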