Currently there's textual inversion, but it has its limitations and does not retain identity.
Or is this something totally different that, despite being a diffusion model, can't be applied to SD? From what I see in the paper, the results are pretty spectacular and the identity retention is off the charts. Will this be made public?
Do you have the original 1024 output images somewhere? The results are downscaled to 796 in the resuts.png. Also, do you have results on cartoon styles and on human faces, to see how well it retains identity and cartoon style?