626 Commits (ba04a87d104ca73d8ed8e8423706edcdf5e209a8)

Author SHA1 Message Date
comfyanonymous ba04a87d10 Refactor and improve the sag node.
Moved all the sag related code to comfy_extras/nodes_sag.py
1 year ago
Rafie Walker 6761233e9d
Implement Self-Attention Guidance (#2201)
* First SAG test

* need to put extra options on the model instead of patcher

* no errors and results seem not-broken

* Use @ashen-uncensored formula, which works better!!!

* Fix a crash when using weird resolutions. Remove an unnecessary UNet call

* Improve comments, optimize memory in blur routine

* SAG works with sampler_cfg_function
1 year ago
comfyanonymous b454a67bb9 Support segmind vega model. 1 year ago
comfyanonymous 824e4935f5 Add dtype parameter to VAE object. 1 year ago
comfyanonymous 32b7e7e769 Add manual cast to controlnet. 1 year ago
comfyanonymous 3152023fbc Use inference dtype for unet memory usage estimation. 1 year ago
comfyanonymous 77755ab8db Refactor comfy.ops
comfy.ops -> comfy.ops.disable_weight_init

This should make it clearer what they actually do.

Some unused code has also been removed.
1 year ago
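The `disable_weight_init` name refers to a common pattern: skip the random weight initialization that normally runs at construction, since a checkpoint's state dict will overwrite the weights anyway. A minimal pure-Python sketch of the idea (hypothetical toy classes, not ComfyUI's actual ops):

```python
import random

class Linear:
    """Toy layer that randomizes its weights on construction."""
    def __init__(self, n):
        self.weight = None
        self.reset_parameters(n)

    def reset_parameters(self, n):
        # Expensive random init -- wasted work when checkpoint
        # weights will overwrite these values immediately after.
        self.weight = [random.gauss(0.0, 0.02) for _ in range(n)]

class LinearNoInit(Linear):
    """Same layer with weight initialization disabled."""
    def reset_parameters(self, n):
        # No-op: weights are filled in later from the state dict.
        self.weight = None

layer = LinearNoInit(4)
print(layer.weight)  # None: no init work was done
```

In PyTorch the equivalent trick is subclassing a module and overriding `reset_parameters` to do nothing.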
comfyanonymous b0aab1e4ea Add an option --fp16-unet to force using fp16 for the unet. 1 year ago
comfyanonymous ba07cb748e Use faster manual cast for fp8 in unet. 1 year ago
comfyanonymous 57926635e8 Switch text encoder to manual cast.
Use fp16 text encoder weights for CPU inference to lower memory usage.
1 year ago
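"Manual cast" here means keeping weights in a compact storage dtype and widening them to the compute dtype only for the duration of each operation, rather than converting the whole model up front. A stdlib-only sketch of the pattern (the class name is hypothetical; Python's `struct` half-float round-trip stands in for fp16 storage):

```python
import struct

def to_fp16(x):
    # Round-trip through IEEE half precision: the storage dtype.
    return struct.unpack('e', struct.pack('e', x))[0]

class ManualCastLinear:
    """Weights stored in fp16; cast to full precision per call."""
    def __init__(self, weights):
        self.weights = [to_fp16(w) for w in weights]  # compact storage

    def forward(self, xs):
        # Manual cast: widen to Python float (standing in for the
        # compute dtype) only while this op runs.
        w = [float(v) for v in self.weights]
        return sum(a * b for a, b in zip(xs, w))

layer = ManualCastLinear([0.1, 0.2, 0.3])
print(layer.forward([1.0, 1.0, 1.0]))  # close to 0.6, within fp16 error
```

The memory saving comes from the storage dtype; the per-op cast keeps numerics in the higher-precision compute dtype.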
comfyanonymous 340177e6e8 Disable non blocking on mps. 1 year ago
comfyanonymous 614b7e731f Implement GLora. 1 year ago
comfyanonymous cb63e230b4 Make lora code a bit cleaner. 1 year ago
comfyanonymous 174eba8e95 Use own clip vision model implementation. 1 year ago
comfyanonymous 97015b6b38 Cleanup. 1 year ago
comfyanonymous a4ec54a40d Add linear_start and linear_end to model_config.sampling_settings 1 year ago
comfyanonymous 9ac0b487ac Make --gpu-only put intermediate values in GPU memory instead of cpu. 1 year ago
comfyanonymous efb704c758 Support attention masking in CLIP implementation. 1 year ago
comfyanonymous fbdb14d4c4 Cleaner CLIP text encoder implementation.
Use a simple CLIP model implementation instead of the one from
transformers.

This will allow some interesting things that would be too hackish to
implement using the transformers implementation.
1 year ago
comfyanonymous 2db86b4676 Slightly faster lora applying. 1 year ago
comfyanonymous 1bbd65ab30 Missed this one. 1 year ago
comfyanonymous 9b655d4fd7 Fix memory issue with control loras. 1 year ago
comfyanonymous 26b1c0a771 Fix control lora on fp8. 1 year ago
comfyanonymous be3468ddd5 Less useless downcasting. 1 year ago
comfyanonymous ca82ade765 Use .itemsize to get dtype size for fp8. 1 year ago
comfyanonymous 31b0f6f3d8 UNET weights can now be stored in fp8.
--fp8_e4m3fn-unet and --fp8_e5m2-unet select between the two fp8
formats supported by pytorch.
1 year ago
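The two fp8 layouts trade range for precision: e4m3fn keeps an extra mantissa bit, e5m2 an extra exponent bit. A small sketch computing each format's largest finite value from its bit layout (the bias and reserved-encoding rules are standard for these formats; the helper name is illustrative):

```python
def max_finite(exp_bits, man_bits, bias, reserve_top_exp):
    """Largest finite value for an fp8 exponent/mantissa layout."""
    # e5m2 reserves the all-ones exponent for inf/NaN (IEEE style);
    # e4m3fn does not, so its top exponent field is usable.
    top_exp_field = (2**exp_bits - 1) - (1 if reserve_top_exp else 0)
    exponent = top_exp_field - bias
    # Largest mantissa: all ones -- except e4m3fn, which reserves the
    # all-ones mantissa at the top exponent for NaN.
    frac = 1 + sum(2**-i for i in range(1, man_bits + 1))
    if not reserve_top_exp:
        frac -= 2**-man_bits
    return frac * 2**exponent

print(max_finite(4, 3, 7, reserve_top_exp=False))  # 448.0  (e4m3fn)
print(max_finite(5, 2, 15, reserve_top_exp=True))  # 57344.0 (e5m2)
```

The much smaller range but finer mantissa of e4m3fn is consistent with the later note that it tends to give better results for weight storage.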
comfyanonymous af365e4dd1 All the unet ops with weights are now handled by comfy.ops 1 year ago
comfyanonymous 61a123a1e0 A different way of handling multiple images passed to SVD.
Previously when a list of 3 images [0, 1, 2] was used for a 6 frame video
they were concatenated like this:
[0, 1, 2, 0, 1, 2]

now they are concatenated like this:
[0, 0, 1, 1, 2, 2]
1 year ago
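The difference between the two orderings is tiling the whole list versus repeating each element in place. A small sketch with plain lists (function names are illustrative; with tensors this would be the difference between tiling and `repeat_interleave`-style repetition):

```python
def old_order(images, frames):
    # [0, 1, 2, 0, 1, 2]: cycle through the whole list repeatedly.
    return [images[i % len(images)] for i in range(frames)]

def new_order(images, frames):
    # [0, 0, 1, 1, 2, 2]: repeat each image for a contiguous run,
    # so neighbouring frames share the same source image.
    per = frames // len(images)
    return [img for img in images for _ in range(per)]

print(old_order([0, 1, 2], 6))  # [0, 1, 2, 0, 1, 2]
print(new_order([0, 1, 2], 6))  # [0, 0, 1, 1, 2, 2]
```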
comfyanonymous c97be4db91 Support SD2.1 turbo checkpoint. 1 year ago
comfyanonymous 983ebc5792 Use smart model management for VAE to decrease latency. 1 year ago
comfyanonymous c45d1b9b67 Add a function to load a unet from a state dict. 1 year ago
comfyanonymous f30b992b18 .sigma and .timestep now return tensors on the same device as the input. 1 year ago
comfyanonymous 13fdee6abf Try to free memory for both cond+uncond before inference. 1 year ago
comfyanonymous be71bb5e13 Tweak memory inference calculations a bit. 1 year ago
comfyanonymous 39e75862b2 Fix regression from last commit. 1 year ago
comfyanonymous 50dc39d6ec Clean up the extra_options dict for the transformer patches.
Now everything in transformer_options gets put in extra_options.
1 year ago
comfyanonymous 5d6dfce548 Fix importing diffusers unets. 1 year ago
comfyanonymous 3e5ea74ad3 Make buggy xformers fall back on pytorch attention. 1 year ago
comfyanonymous 871cc20e13 Support SVD img2vid model. 1 year ago
comfyanonymous 410bf07771 Make VAE memory estimation take dtype into account. 1 year ago
comfyanonymous 32447f0c39 Add sampling_settings so models can specify specific sampling settings. 1 year ago
comfyanonymous c3ae99a749 Allow controlling downscale and upscale methods in PatchModelAddDownscale. 1 year ago
comfyanonymous 72741105a6 Remove useless code. 1 year ago
comfyanonymous 6a491ebe27 Allow model config to preprocess the vae state dict on load. 1 year ago
comfyanonymous cd4fc77d5f Add taesd and taesdxl to VAELoader node.
They will show up when the taesd_encoder and taesd_decoder (or the
corresponding taesdxl) model files are present in the models/vae_approx
directory.
1 year ago
comfyanonymous ce67dcbcda Make it easy for models to process the unet state dict on load. 1 year ago
comfyanonymous d9d8702d8d percent_to_sigma now returns a float instead of a tensor. 1 year ago
comfyanonymous 0cf4e86939 Add some command line arguments to store text encoder weights in fp8.
Pytorch supports two variants of fp8:
--fp8_e4m3fn-text-enc (the one that seems to give better results)
--fp8_e5m2-text-enc
1 year ago
comfyanonymous 107e78b1cb Add support for loading SSD1B diffusers unet version.
Improve diffusers model detection.
1 year ago
comfyanonymous 7e3fe3ad28 Make deep shrink behave like it should. 1 year ago