Most rerecording emulators work this way, rather than showing the input being pressed right now.
This also fixes bugs in autoholding.
Instead of using the full 2*100 bytes for each subframe of movie data, pack it in a controller-dependent way, reducing memory usage to 7-20 bytes per subframe (a 90-96% reduction).
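
As a rough illustration of controller-dependent packing, here is a minimal Python sketch. It is not the emulator's actual format: the layout table, the single flag byte, and the function names (packed_size, pack_subframe, unpack_subframe) are all assumptions made for this example. The idea is only that each port stores as many button bits and axis values as its controller type needs, instead of a fixed-size slot.

```python
import struct

# Hypothetical controller layouts: (number of buttons, number of 16-bit axes).
# These values are assumptions for illustration, not the real movie format.
LAYOUTS = {
    "gamepad": (12, 0),
    "mouse": (2, 2),
}

def packed_size(ports):
    """Bytes needed for one subframe: 1 flag byte plus each port's payload."""
    size = 1
    for kind in ports:
        nbuttons, naxes = LAYOUTS[kind]
        size += (nbuttons + 7) // 8 + 2 * naxes
    return size

def pack_subframe(flags, ports, states):
    """Pack one subframe; states holds one (buttons, axes) pair per port."""
    out = bytearray([flags & 0xFF])
    for kind, (buttons, axes) in zip(ports, states):
        nbuttons, naxes = LAYOUTS[kind]
        assert len(buttons) == nbuttons and len(axes) == naxes
        # Buttons become a little-endian bitfield, one bit per button.
        bits = sum(1 << i for i, pressed in enumerate(buttons) if pressed)
        out += bits.to_bytes((nbuttons + 7) // 8, "little")
        # Axes are stored as signed 16-bit little-endian integers.
        out += struct.pack("<%dh" % naxes, *axes)
    return bytes(out)

def unpack_subframe(data, ports):
    """Inverse of pack_subframe for the same port configuration."""
    flags = data[0]
    pos = 1
    states = []
    for kind in ports:
        nbuttons, naxes = LAYOUTS[kind]
        nbytes = (nbuttons + 7) // 8
        bits = int.from_bytes(data[pos:pos + nbytes], "little")
        pos += nbytes
        buttons = [bool((bits >> i) & 1) for i in range(nbuttons)]
        axes = list(struct.unpack_from("<%dh" % naxes, data, pos))
        pos += 2 * naxes
        states.append((buttons, axes))
    return flags, states
```

With these assumed layouts, a gamepad plus a mouse needs 1 + 2 + 5 = 8 bytes per subframe instead of a fixed 200.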