Motion-Guided Global-Local Aggregation Transformer Network for Precipitation Nowcasting
Abstract
Deep learning-based weather radar echo extrapolation methods have substantially improved nowcasting quality. However, current extrapolation pipelines built on pure convolutional or convolutional recurrent neural networks inherently struggle to capture global and local spatiotemporal interactions simultaneously, which limits nowcasting performance: they tend to underestimate both the spatial coverage and the intensity of heavy rainfall, and they fail to precisely predict nonlinear motion patterns. Furthermore, the commonly adopted pixel-wise objective functions lead to blurry predictions. To address these issues, we propose a novel motion-guided global-local aggregation Transformer network that effectively combines spatiotemporal cues at different time scales, strengthening the global-local spatiotemporal aggregation required by the extrapolation task. First, we divide the existing observations into short- and long-term sequences to represent echo dynamics at different time scales. Then, to introduce reasonable motion guidance into the Transformer, we design an end-to-end module that jointly extracts motion representations of the short- and long-term echo sequences (MRS and MRL) while estimating optical flow. Subsequently, within the Transformer architecture, MRS serves as queries to retrieve the most useful information from MRL, effectively aggregating global long-term and local short-term cues. Finally, the fused feature is used to predict future echoes. To address the blurry-prediction problem, we train the model with an adversarial regularization; the resulting predictions outperform existing methods not only in nowcasting skill scores but also in precipitation detail and image clarity. Extensive experiments on two challenging radar echo datasets demonstrate the effectiveness of the proposed method.
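The core aggregation step described above, using MRS as queries to retrieve relevant information from MRL, amounts to cross-attention between the two motion representations. The following is a minimal NumPy sketch of that idea (scaled dot-product cross-attention over flattened token sequences); the function names, token counts, and feature dimension are illustrative assumptions, not the paper's implementation:

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def aggregate_mrs_mrl(mrs, mrl):
    """Cross-attention sketch: short-term tokens (MRS) act as queries
    that retrieve cues from long-term tokens (MRL).

    mrs: (n_short, d) short-term motion representation (queries)
    mrl: (n_long, d)  long-term motion representation (keys/values)
    returns: (n_short, d) fused feature aggregating both time scales
    """
    d = mrs.shape[-1]
    # Attention scores between every short-term query and long-term token.
    scores = mrs @ mrl.T / np.sqrt(d)          # (n_short, n_long)
    weights = softmax(scores, axis=-1)         # rows sum to 1
    # Weighted sum of long-term cues for each short-term query.
    return weights @ mrl                       # (n_short, d)


# Toy usage: 4 short-term tokens query 16 long-term tokens, d = 8.
rng = np.random.default_rng(0)
fused = aggregate_mrs_mrl(rng.normal(size=(4, 8)), rng.normal(size=(16, 8)))
```

In the full model, learned query/key/value projections and multiple heads would replace the raw dot products, but the retrieval pattern, local short-term features attending over the global long-term context, is the same.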