Compared with ordinary news, fake news spreads faster and costs less to produce, and it therefore causes great social harm. For these reasons, detecting fake news efficiently and accurately has attracted considerable attention in the research community. We propose a Two-Round Inconsistency-based Multi-modal fusion Network (TRIMOON) for fake news detection, which consists of three main components: a multi-modal feature extraction module, a multi-modal feature fusion module, and a classification module. To filter the noise generated during fusion, we perform inconsistency detection twice, once before and once after the fusion process. Experimental results show that this two-round design is effective. We evaluate TRIMOON on both Chinese and English datasets, where it outperforms state-of-the-art approaches on several classification evaluation metrics.
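The data flow summarized above (feature extraction, an inconsistency check before fusion, fusion, a second inconsistency check after fusion, then classification) can be illustrated with a minimal sketch. This is not the TRIMOON implementation: the abstract does not specify how inconsistency is measured or how features are fused, so the cosine-based score, the weighted-sum fusion, the logistic classifier, and every function name below are assumptions chosen only to make the two-round structure concrete. Feature vectors are assumed to be already extracted and of equal length.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def inconsistency(u, v):
    """Hypothetical inconsistency score in [0, 1]: 0 = aligned, 1 = opposite."""
    return (1.0 - cosine(u, v)) / 2.0


def fuse(text_feat, image_feat, image_weight):
    """Hypothetical fusion: add the image features, scaled by image_weight."""
    return [t + image_weight * i for t, i in zip(text_feat, image_feat)]


def classify(features, weights, bias=0.0):
    """Hypothetical logistic classifier returning a probability in (0, 1)."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def two_round_sketch(text_feat, image_feat, clf_weights):
    """Illustrative two-round pipeline (assumed design, not the paper's)."""
    # Round 1: measure text/image inconsistency before fusion and use it
    # to down-weight the image modality when the two disagree.
    pre = inconsistency(text_feat, image_feat)
    fused = fuse(text_feat, image_feat, image_weight=1.0 - pre)

    # Round 2: measure inconsistency between the fused representation and
    # the text features, and attenuate the fused features accordingly.
    post = inconsistency(fused, text_feat)
    filtered = [(1.0 - post) * f for f in fused]

    # The classifier sees the filtered features plus both scores.
    prob = classify(filtered + [pre, post], clf_weights)
    return prob, pre, post


if __name__ == "__main__":
    # Toy 2-dimensional features; the weights are arbitrary, not learned.
    prob, pre, post = two_round_sketch([1.0, 0.0], [-1.0, 0.0],
                                       [0.5, 0.5, 1.0, 1.0])
    print(f"pre={pre:.2f} post={post:.2f} prob={prob:.3f}")
```

In this sketch the two scores act both as filters (they scale features) and as classifier inputs; whether the actual model uses them in either way is not stated in the abstract.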