About preprocessing to oversample images in an ImageDatastore object

Question

NicknameAlpha on 14 Oct 2018


https://it.mathworks.com/matlabcentral/answers/423884-imagedatastore

Commented: Kenta on 11 Jul 2020

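After creating an ImageDatastore object and labeling each image according to the folder that contains it, is there a way to preprocess the datastore so that the files of each label are copied (duplicated) until every label has as many files as the label with the most files?

Reference URL: https://jp.mathworks.com/help/matlab/ref/datastore.counteachlabel.html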


Answer 1

Kazuya on 14 Oct 2018


https://it.mathworks.com/matlabcentral/answers/423884-imagedatastore#answer_341318


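I was looking for a way to do the same thing a while ago and used the following thread as a reference.

What is the best CNN for a small dataset?

The code below is from Alpha Bravo's answer in that thread, copied here for reference. The idea is to first make the number of images per label (roughly) equal by simply resampling more images from the under-represented labels, and then to use augmentedImageDatastore so that identical copies of the same image are not fed to training unchanged.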

trainStore = shuffle(trainStore); % i forgot to add the shuffle in the answer before
bootstrap_factor = 1; % how big do you want the new, balanced datastore to be, as a multiple of the size of the trainStore
alphabetical_labels = {'happy', 'sad'}; % labels in alphabetical order, to map label names to their indices, if using the foldernames as labels
labels = trainStore.Labels;
labelCounts = countEachLabel(trainStore);
labelCounts = labelCounts.Count;
weights = labelCounts/sum(labelCounts);
weights = weights.^(-1); % so less is more
weightVec = [];
for lab = 1:length(labels)
  for labidx = 1:length(alphabetical_labels)
    if labels(lab) == alphabetical_labels(labidx)
      weightVec(lab) = weights(labidx);
    end
  end
end
trainFiles = trainStore.Files;
bootstrapSize = round(length(trainFiles) * bootstrap_factor);
Bootstrap = datasample(trainFiles, bootstrapSize, 'Weights', weightVec);
bootStrapTrainStore = imageDatastore(Bootstrap, 'LabelSource', 'foldernames', 'IncludeSubfolders', true);

and then continue with:

augmentedResolution = [128 128]; % or whatever image resolution you want to use
augmenter = imageDataAugmenter('RandRotation', [-10 10]); % optional, used to augment data, see documentation for full options
trainStoreAug = augmentedImageDatastore(augmentedResolution, bootStrapTrainStore, 'DataAugmentation', augmenter);

That is the overall flow.
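To see why inverse-frequency weights balance the classes in expectation, here is a small worked sketch. The counts (90 'happy', 10 'sad') are made-up numbers for illustration, not from the original thread. Since datasample draws each item with probability proportional to its weight, each class ends up with the same total probability.

```matlab
% Hypothetical class counts, in the same order as alphabetical_labels
labelCounts = [90; 10];                            % 'happy', 'sad' (illustrative numbers)
weights = (labelCounts / sum(labelCounts)).^(-1);  % inverse class frequency

% Per-file weight vector: 90 files with the 'happy' weight, 10 with the 'sad' weight
weightVec = [repmat(weights(1), labelCounts(1), 1); ...
             repmat(weights(2), labelCounts(2), 1)];

% Sampling probability of each file (weights normalized to sum to 1)
p = weightVec / sum(weightVec);

% Total probability per class: each class gets about half of the draws
classShare = [sum(p(1:90)); sum(p(91:100))];
disp(classShare)
```

A single 'sad' file is therefore about nine times more likely to be drawn than a single 'happy' file, which is what evens out the classes on average.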

2 Comments

Kazuya on 20 Oct 2018


weightVec = [];
for lab = 1:length(labels)
  for labidx = 1:length(alphabetical_labels)
    if labels(lab) == alphabetical_labels(labidx)
      weightVec(lab) = weights(labidx);
    end
  end
end

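In the part above, which I quoted as-is, running a for loop over every file is inefficient and slow, so it is better to use logical indexing instead. Please keep this in mind. Here is the code rewritten, with comments added: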

trainStore = shuffle(trainStore); % the original imageDatastore, trainStore (order randomized)
bootstrap_factor = 1; % total number of images after balancing, as a multiple of the original total. With 1 the total does not change, so labels that had many images end up with fewer.
alphabetical_labels = {'happy', 'sad'}; % example labels, listed in alphabetical order
labels = trainStore.Labels; % list of labels (before balancing)
labelCounts = countEachLabel(trainStore); % table of per-label counts (before balancing)
labelCounts = labelCounts.Count; % per-label counts (before balancing)
weights = labelCounts/sum(labelCounts); % fraction of images per label
weights = weights.^(-1); % its inverse (how strongly each label is boosted)
trainFiles = trainStore.Files; % paths to the image files
bootstrapSize = round(length(trainFiles) * bootstrap_factor); % total number of images after balancing
weightVec = zeros(length(trainFiles),1); % one sampling weight per original file (labels with fewer images get larger values below)
for labidx = 1:length(alphabetical_labels)
    index = labels == alphabetical_labels(labidx); % logical index of the files with this label
    weightVec(index) = weights(labidx);
end
% weighted random sampling (with replacement, which is the datasample default)
Bootstrap = datasample(trainFiles, bootstrapSize, 'Weights', weightVec);
bootStrapTrainStore = imageDatastore(Bootstrap, 'LabelSource', 'foldernames', 'IncludeSubfolders', true);

I hope this helps.
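As a possible further simplification, the per-file weights can be looked up directly from the categorical labels, without listing the label names by hand. This is only a sketch and not part of the original answer: the labels and file names below are synthetic placeholders so that it runs without any images, and it relies on countcats and double returning counts and indices in the same category order for a categorical array.

```matlab
% Synthetic stand-ins for trainStore.Labels and trainStore.Files (hypothetical data)
labels = categorical([repmat({'happy'}, 90, 1); repmat({'sad'}, 10, 1)]);
trainFiles = arrayfun(@(k) sprintf('img_%03d.png', k), (1:numel(labels))', ...
                      'UniformOutput', false);

counts = countcats(labels);               % files per category, in category order
weightVec = 1 ./ counts(double(labels));  % one weight per file; rarer class, larger weight

bootstrapSize = numel(trainFiles);
rng(0);                                   % fixed seed so the sketch is repeatable
[Bootstrap, idx] = datasample(trainFiles, bootstrapSize, 'Weights', weightVec);

% Class counts after resampling: roughly equal, but not exactly
newLabels = labels(idx);
newCounts = countcats(newLabels(:));
disp(table(categories(labels), counts, newCounts, ...
    'VariableNames', {'Label', 'Before', 'After'}))
```

With a real datastore, Bootstrap would then be passed to imageDatastore with 'LabelSource', 'foldernames', exactly as in the code above.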

Kenta on 11 Jul 2020

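Here is another example. Whereas the method above makes the class counts equal only in a probabilistic sense, this one counts the images in the most frequent class and adjusts every class to that number of images. https://jp.mathworks.com/matlabcentral/fileexchange/78020-oversampling-for-deep-learning-classification-example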
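The deterministic idea can be sketched roughly as follows. This is not the code from the File Exchange submission, only a minimal illustration under assumptions: synthetic labels and placeholder file names stand in for imds.Labels and imds.Files, and smaller classes are topped up by cycling through their own files.

```matlab
% Synthetic stand-ins for imds.Labels and imds.Files (hypothetical data)
labels = categorical([repmat({'happy'}, 6, 1); repmat({'sad'}, 2, 1)]);
files  = arrayfun(@(k) sprintf('img_%02d.png', k), (1:numel(labels))', ...
                  'UniformOutput', false);

cats     = categories(labels);
maxCount = max(countcats(labels));        % size of the largest class

pick = [];                                % indices into files, with repeats
for c = 1:numel(cats)
    idx = find(labels == cats{c});        % files belonging to this class
    % Cycle through the class's files until maxCount entries are collected
    rep = idx(mod(0:maxCount-1, numel(idx)) + 1);
    pick = [pick; rep(:)];                %#ok<AGROW>
end

balancedFiles  = files(pick);
balancedLabels = labels(pick);
disp(countcats(balancedLabels(:))')       % every class now has maxCount files

% With real images, a balanced datastore could then be built with:
% balancedStore = imageDatastore(balancedFiles, 'Labels', balancedLabels);
```

Because the duplicated files are exact copies, wrapping the result in an augmentedImageDatastore, as in the answer above, still seems advisable.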
