Accurate classification of seismic signals is a key step in constructing seismic catalogs and is of great significance for catalog cleaning, earthquake monitoring and early warning, and seismological research. To address the low accuracy and high computational overhead of existing seismic event classification algorithms, this paper proposes CL-MobileViT, a deep learning network for the automatic classification of seismic events. To balance performance and efficiency, CL-MobileViT uses MobileViT as its backbone, adds an attention mechanism to make the network more sensitive to informative features, and applies large-kernel convolution decomposition to reduce computational overhead. The network is trained with the AdamW optimization strategy so that the final model makes full use of the network's capacity. Specifically, first, Coordinate Attention is added to the skip connection of the MobileViT block so that the network attends closely to position-specific information, which strengthens the modeling of interactions between long-range seismic phase features and improves classification accuracy. Second, the standard convolution in the local feature extraction part of the MobileViT block is replaced by multiple small convolution kernels obtained by decomposing a large-kernel convolution, which improves the network's nonlinear fitting ability while reducing computation and parameter count, and thereby improves the accuracy of seismic event classification. Finally, the AdamW optimizer is used to reduce overfitting and improve training. In a comparison with 11 existing mainstream deep learning classification models, CL-MobileViT reaches 97.3% accuracy in recognizing three types of seismic events (natural earthquakes, collapses, and blasting), outperforming the compared methods.
Moreover, CL-MobileViT has only 1.19 M parameters, far fewer than the compared methods. These results indicate that the proposed method classifies seismic events more accurately and at lower cost.
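
As a rough illustration of why decomposing a large kernel can reduce the parameter count, the sketch below compares a single k x k convolution with a stack of 3 x 3 convolutions covering the same receptive field. This is an assumption made for illustration only: the abstract does not specify which decomposition CL-MobileViT uses, and the function names and the channel width of 64 are hypothetical.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Number of weights in a standard k x k 2-D convolution (bias omitted)."""
    return c_in * c_out * k * k


def stacked_3x3_params(channels: int, k: int) -> int:
    """Weights in a stack of 3 x 3 convolutions with a k x k receptive field.

    n stacked 3 x 3 layers (stride 1) see a (2n + 1) x (2n + 1) window,
    so matching an odd kernel size k needs n = (k - 1) // 2 layers.
    Assumed decomposition scheme; not necessarily the one used in the paper.
    """
    if k < 3 or k % 2 == 0:
        raise ValueError("k must be an odd integer >= 3")
    n_layers = (k - 1) // 2
    return n_layers * conv_params(channels, channels, 3)


channels = 64  # hypothetical channel width
large = conv_params(channels, channels, 7)   # one 7 x 7 convolution
small = stacked_3x3_params(channels, 7)      # three stacked 3 x 3 convolutions

print(f"7x7 convolution:    {large} weights")
print(f"3 stacked 3x3:      {small} weights")
print(f"ratio (small/large): {small / large:.3f}")
```

Under these assumptions the stacked version needs 27/49 of the weights of the single large kernel. Each additional layer in the stack also allows an extra activation function between convolutions, which is one common explanation for the improved nonlinear fitting ability mentioned above.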