Motor imagery electroencephalogram (MI-EEG) decoding algorithms face several challenges: incomplete feature extraction, attention mechanisms that are easily distracted at low signal-to-noise ratios, and limited capture of long-range temporal dependencies. To address these issues, this paper proposes a multi-branch differential attention temporal network (MDAT-Net). First, a multi-branch feature fusion module extracts and fuses diverse spatio-temporal features at different scales. Next, to suppress noise and stabilize attention, a novel multi-head differential attention mechanism enhances key signal dynamics by computing the difference between attention maps. Finally, an adaptive residual separable temporal convolutional network efficiently captures long-range dependencies within the feature sequence for precise classification. Experimental results show that the proposed method achieves average classification accuracies of 85.73%, 90.04%, and 96.30% on the public datasets BCI-IV-2a, BCI-IV-2b, and HGD, respectively, significantly outperforming several baseline models. This work offers an effective new approach for developing high-precision motor imagery brain-computer interface systems.
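
As an illustration of what "the difference between attention maps" can mean in practice, the following is a minimal single-head NumPy sketch. It assumes one common formulation, in which two softmax attention maps are computed from separate query/key projections and the second is subtracted from the first with a scalar weight. The function name `differential_attention`, the weight `lam`, the projection matrices, and all dimensions are hypothetical choices made for this sketch; the abstract does not specify the exact formulation used in MDAT-Net, which may differ.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def differential_attention(x, wq1, wk1, wq2, wk2, wv, lam=0.5):
    """Single-head differential attention (illustrative assumption).

    x    : (T, d_model) feature sequence
    wq*, wk*, wv : (d_model, d_head) projection matrices
    lam  : scalar weight on the second attention map (hypothetical)
    """
    d_head = wq1.shape[1]
    scale = np.sqrt(d_head)
    # Two independent attention maps over the same sequence.
    a1 = softmax((x @ wq1) @ (x @ wk1).T / scale)
    a2 = softmax((x @ wq2) @ (x @ wk2).T / scale)
    # Attention weight shared by both maps cancels in the difference.
    diff_map = a1 - lam * a2
    return diff_map @ (x @ wv), diff_map


# Toy input: 8 time steps, 16 features, head size 4 (all assumed values).
rng = np.random.default_rng(0)
T, d_model, d_head = 8, 16, 4
x = rng.standard_normal((T, d_model))
weights = [rng.standard_normal((d_model, d_head)) for _ in range(5)]
out, diff_map = differential_attention(x, *weights, lam=0.5)
```

Because each softmax map has rows that sum to 1, every row of `diff_map` sums to `1 - lam` in this sketch, and individual entries can be negative. A multi-head variant would apply the same operation per head and concatenate the outputs.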