using threads or GPU decoding, the real time spent in each step goes down. For example, when I use vdpau for decoding and display, I get zero time reported for video decode and display. Only audio handling still shows any time.

Dan