Abstract: Multi-modal image fusion aims to integrate complementary cues from different modalities into a single image, facilitating downstream tasks such as object detection. However, input image ...