Version 0.8.6 #1212

Merged on Apr 7, 2024 (98 commits)
Changes from 4 commits
Commits
dfe08f3
support deepspeed
BootsofLagrangian Feb 3, 2024
64873c1
fix offload_optimizer_device typo
BootsofLagrangian Feb 5, 2024
2824312
fix vae type error during training sdxl
BootsofLagrangian Feb 5, 2024
4295f91
fix all trainer about vae
BootsofLagrangian Feb 5, 2024
3970bf4
maybe fix branch to run offloading
BootsofLagrangian Feb 5, 2024
7d2a926
apply offloading method runable for all trainer
BootsofLagrangian Feb 5, 2024
6255661
fix full_fp16 compatible and train_step
BootsofLagrangian Feb 7, 2024
2445a5b
remove test requirements
BootsofLagrangian Feb 7, 2024
a98feca
forgot setting mixed_precision for deepspeed. sorry
BootsofLagrangian Feb 7, 2024
03f0816
the reason not working grad accum steps found. it was becasue of my a…
BootsofLagrangian Feb 9, 2024
4d5186d
refactored codes, some function moved into train_utils.py
BootsofLagrangian Feb 22, 2024
577e991
add some new dataset settings
kohya-ss Feb 26, 2024
f2c727f
add minimal impl for masked loss
kohya-ss Feb 26, 2024
1751936
update readme
kohya-ss Feb 26, 2024
4a5546d
fix typo
kohya-ss Feb 26, 2024
074d32a
Merge branch 'main' into dev
kohya-ss Feb 27, 2024
eefb3cc
Merge branch 'deep-speed' into deepspeed
kohya-ss Feb 27, 2024
0e4a573
Merge pull request #1101 from BootsofLagrangian/deepspeed
kohya-ss Feb 27, 2024
e3ccf8f
make deepspeed_utils
kohya-ss Feb 27, 2024
a9b64ff
support masked loss in sdxl_train ref #589
kohya-ss Feb 27, 2024
14c9372
add doc about Colab/rich issue
kohya-ss Mar 3, 2024
124ec45
Add "encoding='utf-8'"
Horizon1704 Mar 10, 2024
095b803
save state on train end
gesen2egee Mar 10, 2024
d282c45
Update train_network.py
gesen2egee Mar 11, 2024
74c266a
Merge branch 'dev' into masked-loss
kohya-ss Mar 12, 2024
97524f1
Merge branch 'dev' into deep-speed
kohya-ss Mar 12, 2024
948029f
random ip_noise_gamma strength
KohakuBlueleaf Mar 12, 2024
8639940
random noise_offset strength
KohakuBlueleaf Mar 12, 2024
53954a1
use correct settings for parser
KohakuBlueleaf Mar 12, 2024
0a8ec52
Merge branch 'main' into dev
kohya-ss Mar 15, 2024
443f029
fix doc
kohya-ss Mar 15, 2024
0ef4fe7
Merge branch 'dev' into masked-loss
kohya-ss Mar 17, 2024
7081a0c
extension of src image could be different than target image
kohya-ss Mar 17, 2024
3419c3d
common masked loss func, apply to all training script
kohya-ss Mar 17, 2024
86e40fa
Merge branch 'dev' into deep-speed
kohya-ss Mar 17, 2024
a7dff59
Update tag_images_by_wd14_tagger.py
sdbds Mar 18, 2024
5410a8c
Update requirements.txt
sdbds Mar 18, 2024
a71c35c
Update requirements.txt
sdbds Mar 18, 2024
6c51c97
fix typo
sdbds Mar 20, 2024
e281e86
Merge branch 'main' into dev
kohya-ss Mar 20, 2024
7da41be
Merge pull request #1192 from sdbds/main
kohya-ss Mar 20, 2024
80dbbf5
tagger now stores model under repo_id subdir
kohya-ss Mar 20, 2024
cf09c6a
Merge pull request #1177 from KohakuBlueleaf/random-strength-noise
kohya-ss Mar 20, 2024
46331a9
English Translation of config_README-ja.md (#1175)
darkstorm2150 Mar 20, 2024
5f6196e
update readme
kohya-ss Mar 20, 2024
119cc99
Merge pull request #1167 from Horizon1704/patch-1
kohya-ss Mar 20, 2024
3b0db0f
update readme
kohya-ss Mar 20, 2024
bf6cd4b
Merge pull request #1168 from gesen2egee/save_state_on_train_end
kohya-ss Mar 20, 2024
855add0
update option help and readme
kohya-ss Mar 20, 2024
9b6b39f
Merge branch 'dev' into masked-loss
kohya-ss Mar 20, 2024
fbb98f1
Merge branch 'dev' into deep-speed
kohya-ss Mar 20, 2024
d945602
Fix most of ZeRO stage uses optimizer partitioning
BootsofLagrangian Mar 20, 2024
a35e7bd
Merge pull request #1200 from BootsofLagrangian/deep-speed
kohya-ss Mar 20, 2024
d17c0f5
update dataset config doc
kohya-ss Mar 20, 2024
863c7f7
format by black
kohya-ss Mar 23, 2024
f4a4c11
support multiline captions ref #1155
kohya-ss Mar 23, 2024
0c7baea
register reg images with correct subset
feffy380 Mar 23, 2024
79d1c12
disable sample_every_n_xxx if value less than 1 ref #1202
kohya-ss Mar 24, 2024
691f043
update readme
kohya-ss Mar 24, 2024
ad97410
Merge pull request #1205 from feffy380/patch-1
kohya-ss Mar 24, 2024
381c449
update readme and typing hint
kohya-ss Mar 24, 2024
ae97c8b
[Experimental] Add cache mechanism for dataset groups to avoid long w…
KohakuBlueleaf Mar 24, 2024
0253472
refactor metadata caching for DreamBooth dataset
kohya-ss Mar 24, 2024
8d58588
Merge branch 'dev' into masked-loss
kohya-ss Mar 24, 2024
993b2ab
Merge branch 'dev' into deep-speed
kohya-ss Mar 24, 2024
1648ade
format by black
kohya-ss Mar 24, 2024
9bbb28c
update PyTorch version and reorganize dependencies
kohya-ss Mar 24, 2024
9c4492b
fix pytorch version 2.1.1 to 2.1.2
kohya-ss Mar 24, 2024
c24422f
Merge branch 'dev' into deep-speed
kohya-ss Mar 25, 2024
a2b8531
make each script consistent, fix to work w/o DeepSpeed
kohya-ss Mar 25, 2024
ea05e3f
Merge pull request #1139 from kohya-ss/deep-speed
kohya-ss Mar 26, 2024
ab1e389
Merge branch 'dev' into masked-loss
kohya-ss Mar 26, 2024
5a2afb3
Merge pull request #1207 from kohya-ss/masked-loss
kohya-ss Mar 26, 2024
c86e356
Merge branch 'dev' into dataset-cache
kohya-ss Mar 26, 2024
78e0a76
Merge pull request #1206 from kohya-ss/dataset-cache
kohya-ss Mar 26, 2024
6c08e97
update readme
kohya-ss Mar 26, 2024
6f7e93d
Add OpenVINO and ROCm ONNX Runtime for WD14
Disty0 Mar 27, 2024
b86af67
Merge pull request #1213 from Disty0/dev
kohya-ss Mar 27, 2024
dd9763b
Rating support for WD Tagger
Disty0 Mar 27, 2024
954731d
fix typo
Disty0 Mar 27, 2024
4012fd2
IPEX fix pin_memory
Disty0 Mar 28, 2024
bc586ce
Add --use_rating_tags and --character_tags_first for WD Tagger
Disty0 Mar 29, 2024
f1f30ab
fix to work with num_beams>1 closes #1149
kohya-ss Mar 30, 2024
ae3f625
Merge branch 'dev' of https://github.com/kohya-ss/sd-scripts into dev
kohya-ss Mar 30, 2024
434dc40
update readme
kohya-ss Mar 30, 2024
6ba8428
Merge pull request #1216 from Disty0/dev
kohya-ss Mar 30, 2024
cae5aa0
update wd14 tagger and doc
kohya-ss Mar 30, 2024
f5323e3
update tagger doc
kohya-ss Mar 30, 2024
2c2ca9d
update tagger doc
kohya-ss Mar 30, 2024
059ee04
fix typo
kohya-ss Mar 30, 2024
2258a1b
add save/load hook to remove U-Net/TEs from state
kohya-ss Mar 31, 2024
b748b48
fix attention couple deep shink cause error in some reso
kohya-ss Apr 3, 2024
cd587ce
verify command line args if wandb is enabled
kohya-ss Apr 4, 2024
921036d
Merge pull request #1240 from kohya-ss/verify-command-line-args
kohya-ss Apr 7, 2024
089727b
update readme
kohya-ss Apr 7, 2024
90b1879
Add option to use Scheduled Huber Loss in all training pipelines to i…
kabachuha Apr 7, 2024
d30ebb2
update readme, add metadata for network module
kohya-ss Apr 7, 2024
dfa3079
update readme
kohya-ss Apr 7, 2024
4 changes: 4 additions & 0 deletions docs/config_README-en.md
@@ -177,6 +177,7 @@ Options related to the configuration of DreamBooth subsets.
 | `image_dir` | `'C:\hoge'` | - | - | o (required) |
 | `caption_extension` | `".txt"` | o | o | o |
 | `class_tokens` | `"sks girl"` | - | - | o |
+| `cache_info` | `false` | o | o | o |
 | `is_reg` | `false` | - | - | o |
 
 Firstly, note that for `image_dir`, the path to the image files must be specified as being directly in the directory. Unlike the previous DreamBooth method, where images had to be placed in subdirectories, this is not compatible with that specification. Also, even if you name the folder something like "5_cat", the number of repeats of the image and the class name will not be reflected. If you want to set these individually, you will need to explicitly specify them using `num_repeats` and `class_tokens`.
@@ -187,6 +188,9 @@ Firstly, note that for `image_dir`, the path to the image files must be specifie
 * `class_tokens`
     * Sets the class tokens.
     * Only used during training when a corresponding caption file does not exist. The determination of whether or not to use it is made on a per-image basis. If `class_tokens` is not specified and a caption file is not found, an error will occur.
+* `cache_info`
+    * Specifies whether to cache the image size and caption. If not specified, it is set to `false`. The cache is saved in `metadata_cache.json` in `image_dir`.
+    * Caching speeds up the loading of the dataset after the first time. It is effective when dealing with thousands of images or more.
 * `is_reg`
     * Specifies whether the subset images are for normalization. If not specified, it is set to `false`, meaning that the images are not for normalization.
 
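To make the `cache_info` description above concrete, here is a minimal sketch of what a `metadata_cache.json` file might contain. The layout (image path mapped to `caption` and `resolution` as `[width, height]`) follows the comment in the `library/train_util.py` diff; the sample paths, captions and the helper name `build_cache` are hypothetical.

```python
import json

# Hypothetical sample data; real paths and captions come from the dataset.
entries = [
    ("train/img001.png", "sks girl, smiling", (512, 768)),
    ("train/img002.png", "", (1024, 1024)),
]


def build_cache(entries):
    """Map each image path to its caption and [width, height]."""
    return {
        path: {"caption": caption, "resolution": list(size)}
        for path, caption, size in entries
    }


cache = build_cache(entries)
# ensure_ascii=False keeps non-ASCII captions readable in the file.
text = json.dumps(cache, ensure_ascii=False, indent=2)
print(text)
```

Because the cache stores captions as well as sizes, edits to caption files are presumably not picked up until the cache file is deleted and rebuilt.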
4 changes: 4 additions & 0 deletions docs/config_README-ja.md
@@ -173,6 +173,7 @@ DreamBooth 方式のサブセットの設定に関わるオプションです。
 | `image_dir` | `‘C:\hoge’` | - | - | o(必須) |
 | `caption_extension` | `".txt"` | o | o | o |
 | `class_tokens` | `“sks girl”` | - | - | o |
+| `cache_info` | `false` | o | o | o |
 | `is_reg` | `false` | - | - | o |
 
 まず注意点として、 `image_dir` には画像ファイルが直下に置かれているパスを指定する必要があります。従来の DreamBooth の手法ではサブディレクトリに画像を置く必要がありましたが、そちらとは仕様に互換性がありません。また、`5_cat` のようなフォルダ名にしても、画像の繰り返し回数とクラス名は反映されません。これらを個別に設定したい場合、`num_repeats` と `class_tokens` で明示的に指定する必要があることに注意してください。
@@ -183,6 +184,9 @@ DreamBooth 方式のサブセットの設定に関わるオプションです。
 * `class_tokens`
     * クラストークンを設定します。
     * 画像に対応する caption ファイルが存在しない場合にのみ学習時に利用されます。利用するかどうかの判定は画像ごとに行います。`class_tokens` を指定しなかった場合に caption ファイルも見つからなかった場合にはエラーになります。
+* `cache_info`
+    * 画像サイズ、キャプションをキャッシュするかどうかを指定します。指定しなかった場合は `false` になります。キャッシュは `image_dir` に `metadata_cache.json` というファイル名で保存されます。
+    * キャッシュを行うと、二回目以降のデータセット読み込みが高速化されます。数千枚以上の画像を扱う場合には有効です。
 * `is_reg`
     * サブセットの画像が正規化用かどうかを指定します。指定しなかった場合は `false` として、つまり正規化画像ではないとして扱います。
 
4 changes: 4 additions & 0 deletions library/config_util.py
@@ -85,6 +85,7 @@ class DreamBoothSubsetParams(BaseSubsetParams):
     is_reg: bool = False
     class_tokens: Optional[str] = None
     caption_extension: str = ".caption"
+    cache_info: bool = False
 
 
 @dataclass
@@ -96,6 +97,7 @@ class FineTuningSubsetParams(BaseSubsetParams):
 class ControlNetSubsetParams(BaseSubsetParams):
     conditioning_data_dir: str = None
     caption_extension: str = ".caption"
+    cache_info: bool = False
 
 
 @dataclass
@@ -205,6 +207,7 @@ def __validate_and_convert_scalar_or_twodim(klass, value: Union[float, Sequence]
     DB_SUBSET_ASCENDABLE_SCHEMA = {
         "caption_extension": str,
         "class_tokens": str,
+        "cache_info": bool,
     }
     DB_SUBSET_DISTINCT_SCHEMA = {
         Required("image_dir"): str,
@@ -217,6 +220,7 @@ def __validate_and_convert_scalar_or_twodim(klass, value: Union[float, Sequence]
     }
     CN_SUBSET_ASCENDABLE_SCHEMA = {
         "caption_extension": str,
+        "cache_info": bool,
     }
     CN_SUBSET_DISTINCT_SCHEMA = {
         Required("image_dir"): str,
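The schema change above registers `cache_info` as an "ascendable" option, which appears to mean it can also be set at a broader level than the subset (the three `o` columns in the README table). The sketch below shows one way such a lookup could resolve, assuming the most specific level wins. The `SubsetParams` class and `resolve` function are hypothetical stand-ins, not the `ConfigSanitizer` implementation.

```python
from dataclasses import dataclass


@dataclass
class SubsetParams:
    # Defaults mirror the fields shown in the diff.
    caption_extension: str = ".caption"
    cache_info: bool = False


def resolve(general: dict, dataset: dict, subset: dict) -> SubsetParams:
    """Merge option dicts; later (more specific) levels override earlier ones."""
    merged = {**general, **dataset, **subset}
    # Ignore keys that are not fields of SubsetParams.
    known = {k: v for k, v in merged.items() if k in SubsetParams.__dataclass_fields__}
    return SubsetParams(**known)


# cache_info is set once at the general level and inherited by the subset.
params = resolve({"cache_info": True}, {}, {"caption_extension": ".txt"})
print(params)
```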
99 changes: 77 additions & 22 deletions library/train_util.py
@@ -63,6 +63,7 @@
 from huggingface_hub import hf_hub_download
 import numpy as np
 from PIL import Image
+import imagesize
 import cv2
 import safetensors.torch
 from library.lpw_stable_diffusion import StableDiffusionLongPromptWeightingPipeline
@@ -410,6 +411,7 @@ def __init__(
         is_reg: bool,
         class_tokens: Optional[str],
         caption_extension: str,
+        cache_info: bool,
         num_repeats,
         shuffle_caption,
         caption_separator: str,
@@ -458,6 +460,7 @@ def __init__(
         self.caption_extension = caption_extension
         if self.caption_extension and not self.caption_extension.startswith("."):
             self.caption_extension = "." + self.caption_extension
+        self.cache_info = cache_info
 
     def __eq__(self, other) -> bool:
         if not isinstance(other, DreamBoothSubset):
@@ -527,6 +530,7 @@ def __init__(
         image_dir: str,
         conditioning_data_dir: str,
         caption_extension: str,
+        cache_info: bool,
         num_repeats,
         shuffle_caption,
         caption_separator,
@@ -574,6 +578,7 @@ def __init__(
         self.caption_extension = caption_extension
         if self.caption_extension and not self.caption_extension.startswith("."):
             self.caption_extension = "." + self.caption_extension
+        self.cache_info = cache_info
 
     def __eq__(self, other) -> bool:
         if not isinstance(other, ControlNetSubset):
@@ -1081,8 +1086,7 @@ def cache_text_encoder_outputs(
             )
 
     def get_image_size(self, image_path):
-        image = Image.open(image_path)
-        return image.size
+        return imagesize.get(image_path)
 
     def load_image_with_face_info(self, subset: BaseSubset, image_path: str):
         img = load_image(image_path)
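The hunk above swaps `Image.open(...).size` for `imagesize.get(...)`, which, as far as I understand, reads only the file header to obtain `(width, height)`. The following standard-library sketch illustrates that idea for PNG only. It is not how `imagesize` is implemented; `png_size_from_header` and `make_png_header` are hypothetical helpers written for this demo.

```python
import struct
import zlib
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size_from_header(data: bytes) -> Tuple[int, int]:
    """Return (width, height) from the first 24 bytes of a PNG stream."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    # IHDR stores width and height as big-endian 32-bit unsigned integers.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def make_png_header(width: int, height: int) -> bytes:
    """Build only the signature and IHDR chunk (no pixel data) for the demo."""
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    chunk = b"IHDR" + ihdr
    crc = struct.pack(">I", zlib.crc32(chunk))
    return PNG_SIGNATURE + struct.pack(">I", len(ihdr)) + chunk + crc


header = make_png_header(640, 480)
print(png_size_from_header(header))  # (640, 480)
```

Only 24 bytes are inspected here, which is why header-based size lookups stay cheap even across thousands of files.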
@@ -1411,6 +1415,8 @@ def get_item_for_caching(self, bucket, bucket_batch_size, image_index):
 
 
 class DreamBoothDataset(BaseDataset):
+    IMAGE_INFO_CACHE_FILE = "metadata_cache.json"
+
     def __init__(
         self,
         subsets: Sequence[DreamBoothSubset],
@@ -1485,26 +1491,54 @@ def load_dreambooth_dir(subset: DreamBoothSubset):
                 logger.warning(f"not directory: {subset.image_dir}")
                 return [], []
 
-            img_paths = glob_images(subset.image_dir, "*")
-            logger.info(f"found directory {subset.image_dir} contains {len(img_paths)} image files")
-
-            # 画像ファイルごとにプロンプトを読み込み、もしあればそちらを使う
-            captions = []
-            missing_captions = []
-            for img_path in img_paths:
-                cap_for_img = read_caption(img_path, subset.caption_extension, subset.enable_wildcard)
-                if cap_for_img is None and subset.class_tokens is None:
+            info_cache_file = os.path.join(subset.image_dir, self.IMAGE_INFO_CACHE_FILE)
+            use_cached_info_for_subset = subset.cache_info
+            if use_cached_info_for_subset:
+                logger.info(
+                    f"using cached image info for this subset / このサブセットで、キャッシュされた画像情報を使います: {info_cache_file}"
+                )
+                if not os.path.isfile(info_cache_file):
                     logger.warning(
-                        f"neither caption file nor class tokens are found. use empty caption for {img_path} / キャプションファイルもclass tokenも見つかりませんでした。空のキャプションを使用します: {img_path}"
+                        f"image info file not found. You can ignore this warning if this is the first time to use this subset"
+                        f" / キャッシュファイルが見つかりませんでした。初回実行時はこの警告を無視してください: {info_cache_file}"
                     )
-                    captions.append("")
-                    missing_captions.append(img_path)
-                else:
-                    if cap_for_img is None:
-                        captions.append(subset.class_tokens)
+                    use_cached_info_for_subset = False
+
+            if use_cached_info_for_subset:
+                # json: {`img_path`:{"caption": "caption...", "resolution": [width, height]}, ...}
+                with open(info_cache_file, "r", encoding="utf-8") as f:
+                    metas = json.load(f)
+                img_paths = list(metas.keys())
+                sizes = [meta["resolution"] for meta in metas.values()]
+
+                # we may need to check image size and existence of image files, but it takes time, so user should check it before training
+            else:
+                img_paths = glob_images(subset.image_dir, "*")
+                sizes = [None] * len(img_paths)
+
+            logger.info(f"found directory {subset.image_dir} contains {len(img_paths)} image files")
+
+            if use_cached_info_for_subset:
+                captions = [meta["caption"] for meta in metas.values()]
+                missing_captions = [img_path for img_path, caption in zip(img_paths, captions) if caption is None or caption == ""]
+            else:
+                # 画像ファイルごとにプロンプトを読み込み、もしあればそちらを使う
+                captions = []
+                missing_captions = []
+                for img_path in img_paths:
+                    cap_for_img = read_caption(img_path, subset.caption_extension, subset.enable_wildcard)
+                    if cap_for_img is None and subset.class_tokens is None:
+                        logger.warning(
+                            f"neither caption file nor class tokens are found. use empty caption for {img_path} / キャプションファイルもclass tokenも見つかりませんでした。空のキャプションを使用します: {img_path}"
+                        )
+                        captions.append("")
                         missing_captions.append(img_path)
                     else:
-                        captions.append(cap_for_img)
+                        if cap_for_img is None:
+                            captions.append(subset.class_tokens)
+                            missing_captions.append(img_path)
+                        else:
+                            captions.append(cap_for_img)
 
             self.set_tag_frequency(os.path.basename(subset.image_dir), captions)  # タグ頻度を記録
 
@@ -1521,7 +1555,19 @@ def load_dreambooth_dir(subset: DreamBoothSubset):
                         logger.warning(missing_caption + f"... and {remaining_missing_captions} more")
                         break
                     logger.warning(missing_caption)
-            return img_paths, captions
+
+            if not use_cached_info_for_subset and subset.cache_info:
+                logger.info(f"cache image info for / 画像情報をキャッシュします : {info_cache_file}")
+                sizes = [self.get_image_size(img_path) for img_path in tqdm(img_paths, desc="get image size")]
+                matas = {}
+                for img_path, caption, size in zip(img_paths, captions, sizes):
+                    matas[img_path] = {"caption": caption, "resolution": list(size)}
+                with open(info_cache_file, "w", encoding="utf-8") as f:
+                    json.dump(matas, f, ensure_ascii=False, indent=2)
+                logger.info(f"cache image info done for / 画像情報を出力しました : {info_cache_file}")
+
+            # if sizes are not set, image size will be read in make_buckets
+            return img_paths, captions, sizes
 
         logger.info("prepare images.")
         num_train_images = 0
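Taken together, the two hunks above implement a load-or-build pattern: use `metadata_cache.json` when `cache_info` is set and the file exists, otherwise scan the directory and, if caching is enabled, write the file for next time. A minimal self-contained sketch of that flow follows. `load_or_build`, `fake_scan` and `fake_size` are hypothetical stand-ins for `load_dreambooth_dir`, `glob_images`/`read_caption` and `get_image_size`, and the class-token fallback and missing-caption warnings are omitted.

```python
import json
import os
import tempfile

CACHE_NAME = "metadata_cache.json"  # file name taken from the diff


def load_or_build(image_dir, cache_info, scan, get_size):
    """Return (img_paths, captions, sizes) for one subset directory."""
    cache_file = os.path.join(image_dir, CACHE_NAME)
    if cache_info and os.path.isfile(cache_file):
        # Fast path: trust the cache without touching the image files.
        with open(cache_file, "r", encoding="utf-8") as f:
            metas = json.load(f)
        paths = list(metas.keys())
        captions = [m["caption"] for m in metas.values()]
        sizes = [m["resolution"] for m in metas.values()]
        return paths, captions, sizes

    paths, captions = scan(image_dir)
    if not cache_info:
        # Sizes stay unknown here; the diff defers reading them to make_buckets.
        return paths, captions, [None] * len(paths)

    # First run with caching enabled: measure sizes and write the cache.
    sizes = [list(get_size(p)) for p in paths]
    metas = {p: {"caption": c, "resolution": s} for p, c, s in zip(paths, captions, sizes)}
    with open(cache_file, "w", encoding="utf-8") as f:
        json.dump(metas, f, ensure_ascii=False, indent=2)
    return paths, captions, sizes


calls = []


def fake_scan(image_dir):
    calls.append(image_dir)  # record how often the slow scan runs
    return ["a.png", "b.png"], ["sks girl", ""]


def fake_size(path):
    return (512, 512)


with tempfile.TemporaryDirectory() as d:
    first = load_or_build(d, True, fake_scan, fake_size)
    second = load_or_build(d, True, fake_scan, fake_size)
    print(first == second, len(calls))  # True 1
```

As the comment in the diff notes, the cached path does not verify that the listed files still exist or still have the recorded size, so the cache file should be removed whenever the image directory changes.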
@@ -1540,7 +1586,7 @@ def load_dreambooth_dir(subset: DreamBoothSubset):
                 )
                 continue
 
-            img_paths, captions = load_dreambooth_dir(subset)
+            img_paths, captions, sizes = load_dreambooth_dir(subset)
             if len(img_paths) < 1:
                 logger.warning(
                     f"ignore subset with image_dir='{subset.image_dir}': no images found / 画像が見つからないためサブセットを無視します"
@@ -1552,8 +1598,10 @@ def load_dreambooth_dir(subset: DreamBoothSubset):
             else:
                 num_train_images += subset.num_repeats * len(img_paths)
 
-            for img_path, caption in zip(img_paths, captions):
+            for img_path, caption, size in zip(img_paths, captions, sizes):
                 info = ImageInfo(img_path, subset.num_repeats, caption, subset.is_reg, img_path)
+                if size is not None:
+                    info.image_size = size
                 if subset.is_reg:
                     reg_infos.append((info, subset))
                 else:
@@ -1842,7 +1890,8 @@ def __init__(
                 subset.image_dir,
                 False,
                 None,
-                subset.caption_extension,
+                subset.caption_extension,
+                subset.cache_info,
                 subset.num_repeats,
                 subset.shuffle_caption,
                 subset.caption_separator,
@@ -3384,6 +3433,12 @@ def add_dataset_arguments(
     parser.add_argument(
         "--train_data_dir", type=str, default=None, help="directory for train images / 学習画像データのディレクトリ"
     )
+    parser.add_argument(
+        "--cache_info",
+        action="store_true",
+        help="cache meta information (caption and image size) for faster dataset loading. only available for DreamBooth"
+        " / メタ情報(キャプションとサイズ)をキャッシュしてデータセット読み込みを高速化する。DreamBooth方式のみ有効",
+    )
     parser.add_argument(
         "--shuffle_caption", action="store_true", help="shuffle separated caption / 区切られたcaptionの各要素をshuffleする"
     )
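The new `--cache_info` option above is a plain `store_true` flag, so it defaults to `False` and becomes `True` only when passed. A small sketch of that behaviour, using a stripped-down parser (`build_parser` is a hypothetical helper holding only the two arguments relevant here, not the full `add_dataset_arguments`):

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser with just the two dataset arguments needed for the demo."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--train_data_dir", type=str, default=None)
    parser.add_argument("--cache_info", action="store_true")
    return parser


args = build_parser().parse_args(["--train_data_dir", "data", "--cache_info"])
print(args.train_data_dir, args.cache_info)
```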
2 changes: 2 additions & 0 deletions requirements.txt
@@ -17,6 +17,8 @@ easygui==0.98.3
 toml==0.10.2
 voluptuous==0.13.1
 huggingface-hub==0.20.1
+# for Image utils
+imagesize==1.4.1
 # for BLIP captioning
 # requests==2.28.2
 # timm==0.6.12
7 changes: 1 addition & 6 deletions train_network.py
@@ -14,19 +14,14 @@
 import torch
 from library.device_utils import init_ipex, clean_memory_on_device
 
-
 init_ipex()
 
-from torch.nn.parallel import DistributedDataParallel as DDP
-
 from accelerate.utils import set_seed
 from diffusers import DDPMScheduler
 from library import deepspeed_utils, model_util
 
 import library.train_util as train_util
-from library.train_util import (
-    DreamBoothDataset,
-)
+from library.train_util import DreamBoothDataset
 import library.config_util as config_util
 from library.config_util import (
     ConfigSanitizer,