ComfyUI-Repo

Commit Graph

Author	SHA1	Message	Date
comfyanonymous	31b0f6f3d8	UNET weights can now be stored in fp8. --fp8_e4m3fn-unet and --fp8_e5m2-unet are the two different formats supported by pytorch.	1 year ago
comfyanonymous	0cf4e86939	Add some command line arguments to store text encoder weights in fp8. Pytorch supports two variants of fp8: --fp8_e4m3fn-text-enc (the one that seems to give better results) --fp8_e5m2-text-enc	1 year ago
comfyanonymous	7339479b10	Disable xformers when it can't load properly.	1 year ago
comfyanonymous	dd4ba68b6e	Allow different models to estimate memory usage differently.	1 year ago
comfyanonymous	8594c8be4d	Empty the cache when torch cache is more than 25% free mem.	1 year ago
comfyanonymous	c8013f73e5	Add some Quadro cards to the list of cards with broken fp16.	1 year ago
comfyanonymous	fd4c5f07e7	Add a --bf16-unet to test running the unet in bf16.	1 year ago
comfyanonymous	9a55dadb4c	Refactor code so model can be a dtype other than fp32 or fp16.	1 year ago
comfyanonymous	88733c997f	pytorch_attention_enabled can now return True when xformers is enabled.	1 year ago
comfyanonymous	20d3852aa1	Pull some small changes from the other repo.	1 year ago
Simon Lui	eec449ca8e	Allow Intel GPUs to LoRA cast on GPU since it supports BF16 natively.	1 year ago
comfyanonymous	1cdfb3dba4	Only do the cast on the device if the device supports it.	1 year ago
comfyanonymous	321c5fa295	Enable pytorch attention by default on xpu.	1 year ago
comfyanonymous	0966d3ce82	Don't run text encoders on xpu because there are issues.	1 year ago
comfyanonymous	1938f5c5fe	Add a force argument to soft_empty_cache to force a cache empty.	1 year ago
Simon Lui	4a0c4ce4ef	Some fixes to generalize CUDA specific functionality to Intel or other GPUs.	1 year ago
comfyanonymous	b8c7c770d3	Enable bf16-vae by default on ampere and up.	2 years ago
comfyanonymous	a57b0c797b	Fix lowvram model merging.	2 years ago
comfyanonymous	f72780a7e3	The new smart memory management makes this unnecessary.	2 years ago
comfyanonymous	30eb92c3cb	Code cleanups.	2 years ago
comfyanonymous	51dde87e97	Try to free enough vram for control lora inference.	2 years ago
comfyanonymous	cc44ade79e	Always shift text encoder to GPU when the device supports fp16.	2 years ago
comfyanonymous	a6ef08a46a	Even with forced fp16 the cpu device should never use it.	2 years ago
comfyanonymous	f081017c1a	Save memory by storing text encoder weights in fp16 in most situations. Do inference in fp32 to make sure quality stays the exact same.	2 years ago
comfyanonymous	0d7b0a4dc7	Small cleanups.	2 years ago
Simon Lui	9225465975	Further tuning and fix mem_free_total.	2 years ago
Simon Lui	2c096e4260	Add ipex optimize and other enhancements for Intel GPUs based on recent memory changes.	2 years ago
comfyanonymous	e9469e732d	--disable-smart-memory now disables loading model directly to vram.	2 years ago
comfyanonymous	3aee33b54e	Add --disable-smart-memory for those that want the old behaviour.	2 years ago
comfyanonymous	2be2742711	Fix issue with regular torch version.	2 years ago
comfyanonymous	89a0767abf	Smarter memory management. Try to keep models on the vram when possible. Better lowvram mode for controlnets.	2 years ago
comfyanonymous	1ce0d8ad68	Add CMP 30HX card to the nvidia_16_series list.	2 years ago
comfyanonymous	4a77fcd6ab	Only shift text encoder to vram when CPU cores are under 8.	2 years ago
comfyanonymous	3cd31d0e24	Lower CPU thread check for running the text encoder on the CPU vs GPU.	2 years ago
comfyanonymous	22f29d66ca	Try to fix memory issue with lora.	2 years ago
comfyanonymous	4760c29380	Merge branch 'fix-AttributeError-module-'torch'-has-no-attribute-'mps'' of https://github.com/KarryCharon/ComfyUI	2 years ago
comfyanonymous	18885f803a	Add MX450 and MX550 to list of cards with broken fp16.	2 years ago
comfyanonymous	ff6b047a74	Fix device print on old torch version.	2 years ago
comfyanonymous	1679abd86d	Add a command line argument to enable backend:cudaMallocAsync	2 years ago
comfyanonymous	5f57362613	Lower lora ram usage when in normal vram mode.	2 years ago
comfyanonymous	490771b7f4	Speed up lora loading a bit.	2 years ago
KarryCharon	3e2309f149	fix mps miss import	2 years ago
comfyanonymous	0ae81c03bb	Empty cache after model unloading for normal vram and lower.	2 years ago
comfyanonymous	e7bee85df8	Add arguments to run the VAE in fp16 or bf16 for testing.	2 years ago
comfyanonymous	ddc6f12ad5	Disable autocast in unet for increased speed.	2 years ago
comfyanonymous	8d694cc450	Fix issue with OSX.	2 years ago
comfyanonymous	dc9d1f31c8	Improvements for OSX.	2 years ago
comfyanonymous	2c4e0b49b7	Switch to fp16 on some cards when the model is too big.	2 years ago
comfyanonymous	6f3d9f52db	Add a --force-fp16 argument to force fp16 for testing.	2 years ago
comfyanonymous	1c1b0e7299	--gpu-only now keeps the VAE on the device.	2 years ago
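
Note on the two fp8 variants named in commits 31b0f6f3d8 and 0cf4e86939: the flags refer to the PyTorch dtypes torch.float8_e4m3fn and torch.float8_e5m2. The sketch below is not code from this repository. It is a small pure-Python illustration, with a hypothetical Fp8Format helper, of how the two bit layouts trade precision for range. That trade-off may be part of why the commit author found e4m3fn to give better results for weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fp8Format:
    """Hypothetical helper describing an 8-bit float layout (1 sign bit)."""
    name: str
    exp_bits: int
    man_bits: int
    ieee_specials: bool  # True: the top exponent is reserved for inf/NaN

    @property
    def bias(self) -> int:
        return 2 ** (self.exp_bits - 1) - 1

    def max_finite(self) -> float:
        top_exp = 2 ** self.exp_bits - 1
        if self.ieee_specials:
            # IEEE-style: largest finite value uses the second-highest exponent.
            exp, man = top_exp - 1, 2 ** self.man_bits - 1
        else:
            # "fn" style: no infinities; only the all-ones mantissa at the
            # top exponent is NaN, so the next mantissa down is still finite.
            exp, man = top_exp, 2 ** self.man_bits - 2
        return 2.0 ** (exp - self.bias) * (1 + man / 2 ** self.man_bits)

    def min_subnormal(self) -> float:
        return 2.0 ** (1 - self.bias - self.man_bits)


E4M3FN = Fp8Format("e4m3fn", exp_bits=4, man_bits=3, ieee_specials=False)
E5M2 = Fp8Format("e5m2", exp_bits=5, man_bits=2, ieee_specials=True)

for fmt in (E4M3FN, E5M2):
    print(fmt.name, fmt.max_finite(), fmt.min_subnormal())
```

e4m3fn keeps one more mantissa bit (finer steps between values) at the cost of a much smaller range; e5m2 covers a wider range with coarser steps.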

114 Commits (31b0f6f3d8034371e95024d6bba5c193db79bd9d)