
Conversation

@jantic
Contributor

@jantic jantic commented Oct 13, 2022

Removing unnecessary additional memory usage by replacing the separate Dreambooth db_pipe assignment with pipe, and then deleting that pipe before running the "Looking inside the pipeline" section.

Models in "Looking inside the pipeline" set to fp16 to further improve memory efficiency.

Combined, these changes allow running the notebook from beginning to end on an 11 GB 1080 Ti GPU.


@pcuenca
Contributor

pcuenca commented Oct 13, 2022

Thanks for this @jantic, I should have been more considerate and tested on my own 11 GB GPU!

One minor question: is the autocast necessary? Can we maybe create the latents in fp16 too?

@jantic
Contributor Author

jantic commented Oct 13, 2022

@pcuenca I've made the update you've requested. Can you expand upon your thinking about favoring fp16 over the autocast? I just want to know how you are thinking about it going forward so I don't make the same sort of mistake.

> I should have been more considerate and tested on my own 11 GB GPU!

Hey, my previous attempt on this pull request was dumping a whole bunch of changed images into the commit history. And that's just the start of a long list of things I regularly screw up as I stumble like a drunkard towards software that works :). We'll get there; that's what I keep saying.

@jantic
Contributor Author

jantic commented Oct 13, 2022

I did omit one thing from the description that might be important: the change from db_pipe to pipe means the output under "Latents and callbacks" will be different. It still looks decent from my perspective, but I'm not sure if that's acceptable. Now that I look at it, it appears the safety filter is blanking out the last image. I can change the seed and find a better result if you'd like.

From this: [image: index2]

To this: [image: index]

@jph00
Member

jph00 commented Oct 13, 2022

Yeah maybe you or someone in a future PR could try to find a pic of me that's not NSFW... ;)

Many thanks for this PR @jantic !

@jph00 jph00 merged commit 8bdb983 into fastai:master Oct 13, 2022
@jantic
Contributor Author

jantic commented Oct 14, 2022

Thanks @jph00! Here's that pull request for a non-NSFW pic: #8

@pcuenca
Contributor

pcuenca commented Oct 14, 2022

> @pcuenca I've made the update you've requested. Can you expand upon your thinking about favoring fp16 over the autocast? I just want to know how you are thinking about it going forward so I don't make the same sort of mistake.

I'm just repeating what I saw here :) Apparently the overhead to copy and cast the tensors adds up to something not negligible. So if inference works in fp16 (which apparently it does), it's worthwhile using it according to those tests. It probably won't work for training, though.
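To make the dtype point concrete, here's a minimal sketch (not code from the notebook or from diffusers) of creating the latents directly in half precision. The idea is that a single upfront cast replaces the repeated per-op conversions autocast performs, and fp16 also halves the memory footprint of every tensor kept around; the shape `(1, 4, 64, 64)` is just an illustrative latent shape.

```python
import numpy as np

# Hypothetical sketch: make the latents fp16 from the start instead of
# relying on autocast to convert tensors on every operation.
rng = np.random.default_rng(seed=42)

latents_fp32 = rng.standard_normal((1, 4, 64, 64), dtype=np.float32)
latents_fp16 = latents_fp32.astype(np.float16)  # one upfront cast

# Half precision halves the memory footprint of the latents.
assert latents_fp16.nbytes == latents_fp32.nbytes // 2
```

The same pattern applies with torch tensors in an fp16 pipeline: create or cast the latents once to match the model weights, rather than letting autocast insert casts around each call.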

> I should have been more considerate and tested on my own 11 GB GPU!

> Hey, my previous attempt on this pull request was dumping a whole bunch of changed images into the commit history. And that's just the start of a long list of things I regularly screw up as I stumble like a drunkard towards software that works :). We'll get there; that's what I keep saying.

Isn't that what we all do? :)

@jantic
Contributor Author

jantic commented Oct 14, 2022

@pcuenca Thanks for the explanation! I definitely noticed the slowdown as well when trying it elsewhere in the notebooks. It was a bit surprising actually.
