Research
2025
Mask Tokens as Prophet: Fine-Grained Cache Eviction for Efficient dLLM Inference
MaskKV exploits mask-token attention signals to evict low-utility KV pairs in diffusion LLMs, shrinking cache budgets while preserving long-context accuracy and increasing throughput.
FreqCa: Accelerating Diffusion Models via Frequency-Aware Caching
FreqCa accelerates diffusion models by analyzing the frequency dynamics of features across timesteps, reusing low-frequency components and interpolating high-frequency ones, which also substantially reduces cache memory.
A Survey of Data Attribution: Methods, Applications, and Evaluation in the Era of Generative AI
A comprehensive survey of data attribution methods, applications, and evaluation protocols, addressing how training data influences the behavior of generative AI systems.
Do Vision-Language Models Have Internal World Models? Towards an Atomic Evaluation
This paper evaluates whether modern Vision-Language Models (VLMs) such as GPT-4o and Gemini can act as internal world models (WMs), that is, systems that understand and predict the world.
2024
Contrasting Adversarial Perturbations: The Space of Harmless Perturbations
Defining and extracting generalizable interaction primitives from DNNs
Given different DNNs trained for the same task, this work develops a new method to extract the interactions that these DNNs share. Experiments show that the extracted interactions can better reflect common ...