#Explainability

Research

Defining and extracting generalizable interaction primitives from DNNs
Lu Chen, Siyu Lou, Benhao Huang, Quanshi Zhang
ICLR 2024
#LLM
#Explainability

Given different DNNs trained for the same task, we developed a new method to extract the interactions shared by these DNNs. Experiments show that the extracted interactions better reflect the common knowledge shared by the different DNNs.

paper
GitHub