Video frame prediction is a very challenging problem. Many recent neural network based solution-attempts trained using a mean squared error lead to blurry predictions. My most recent paper currently under review proposes to use Phase correlation and the Fourier-Shift theorem estimate changes and transform current images into predictions. A demo is shown below. The video shows ground truth (left), shift prediction (middle) and an off the shelf GRU prediction (right).
Source code is available on github .
A more detailed description is available in the paper .