Hi, I'd like to know whether a 2-pass copy can ever run faster than a direct copy. I've seen you say that a 2-pass copy means source -> L1 cache, then L1 cache -> destination. But I don't think that helps reduce cache misses on either the source or the destination, does it?
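
To make the question concrete, here is a rough sketch of the two variants as I understand them. The function names and `BLOCK_BYTES` are placeholders I made up, and the staging buffer is only assumed to stay resident in L1; nothing in the code guarantees that, and a compiler may well optimize the intermediate buffer away.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical block size, assumed small enough to stay resident in L1. */
#define BLOCK_BYTES 4096

/* Direct copy: a single pass, source straight to destination. */
static void copy_direct(unsigned char *dst, const unsigned char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);
}

/* 2-pass copy: each block goes source -> small staging buffer,
 * then staging buffer -> destination. */
static void copy_two_pass(unsigned char *dst, const unsigned char *src, size_t n)
{
    unsigned char block[BLOCK_BYTES];

    assert(dst != NULL && src != NULL);
    for (size_t off = 0; off < n; off += BLOCK_BYTES) {
        size_t len = (n - off < BLOCK_BYTES) ? (n - off) : BLOCK_BYTES;
        memcpy(block, src + off, len);   /* pass 1: source -> buffer */
        memcpy(dst + off, block, len);   /* pass 2: buffer -> destination */
    }
}
```

Both functions touch every source and destination byte exactly once, which is why I don't see where the 2-pass version would save any misses.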
Thanks~