While working on this demo, which uses two different shader programs to draw a scene, I was getting a hard GPU hang on real hardware (but not on Citra). I narrowed the cause down to one of the shader programs having a geometry shader as well as a vertex shader, while the other has only a vertex shader. If you alternate which shader program is used from one frame to the next, there is no issue.
Here's a demonstration based on the composite_scene example; it exhibits the same behavior I was running into. However, I also made another one based on the lenny example, and that one doesn't seem to hang. Is something citro2d does the culprit? Or does the lenny-based example not hang because its shader programs are basically the same aside from the geometry shader?
I was told to submit an issue here. Hopefully this helps out!