Implementation - Find and Fill Method for Dropout Layer #3684
Conversation
Should we split this one into two implementations, like #3683? The numbers in #3662 show that this implementation is actually a slowdown, although the implementation here is general enough to be used with Bandicoot too.
(Requesting changes just because I don't want to merge a slowdown without a bit of discussion)
It is up to you. To my understanding, dropout happens once per epoch; while there is a slowdown, it is not going to affect the overall speed of training.
It will happen every step of the forward pass, so the slowdown may be noticeable. One quick test would just be to run the
Okay, it makes sense then to do the benchmark.
@rcurtin @shrit Absolutely, I'm on the same page. I was actually leaning towards running additional benchmarks for that exact reason. I've got a few ideas to potentially optimize the algorithm, and I'm eager to test them out in some upcoming benchmarks. Fingers crossed, I should be able to run them later today :)
@rcurtin @shrit The new algorithm appears slightly faster than the old one when processing small matrices, but significantly slower with larger matrices (also visible in the graph in the discussion above). If it's not necessary to replace it, I wouldn't. However, if we need to eliminate the transform function, this could be a viable alternative. For mnist_simple:
For mnist_cnn:
@MarkFischinger Okay, I see the benchmark; in this case it would make sense to have both implementations. Please update this PR following a structure similar to what was done in #3683.
@shrit In the latest commit, I've implemented both algorithms using a structure similar to what you did in #3683. I have a question about that: why do we wrap the entire function within the #ifdef directive? Could it be placed inside the function instead? Is there a specific reason for this that I might not know about yet? I'd appreciate understanding your reasoning behind that :)
@MarkFischinger The other implementation that you added is for the Bandicoot library. It is a linear algebra library that we are developing (it has the same API as Armadillo) to provide GPU acceleration for matrix operations.
@shrit Thanks for your response! :) Sorry if my initial question wasn't clear. I've understood the general concept, but I'm specifically curious about the decision to place the #ifdef around the entire function. My first thought was to implement it directly inside the function, like this (for #3683, for example):

template<typename MatType>
void LogSoftMaxType<MatType>::ForwardImpl(const MatType& input,
                                          MatType& output)
{
  MatType maxInput = repmat(max(input), input.n_rows, 1);
  output = (maxInput - input);
  ApplyExponentialFunction(output);
  maxInput.each_row() = log(sum(output));
  output = input - maxInput;
}

template<typename MatType>
void LogSoftMaxType<MatType>::ApplyExponentialFunction(MatType& output)
{
#ifdef MLPACK_HAS_COOT // <-- Inside the function like this
  if constexpr (coot::is_coot_type<MatType>::value)
  {
    output = exp(output * -1);
  }
  else
#endif
  {
    // Approximation of the base-e exponential function. Credits to Leon Bottou.
    output.transform([](double x)
    {
      static constexpr double A0 = 1.0, A1 = 0.125, A2 = 0.0078125,
                              A3 = 0.00032552083, A4 = 1.0172526e-5;
      if (x < 13.0)
      {
        double y = A0 + x * (A1 + x * (A2 + x * (A3 + x * A4)));
        y *= y; y *= y; y *= y;
        return 1 / y;
      }
      return 0.0;
    });
  }
}

I couldn't find any best practices in C++ on this topic on the web. Do you know which is the preferred implementation?
@MarkFischinger we cannot use
Looking good, just some comments on the multiple implementations whenever you have a chance. 👍
 * @param input Input data used for evaluating the specified function.
 * @param output Resulting output activation.
 */
void ForwardImpl(const MatType& input, MatType& output);
You'll have to use SFINAE here, e.g. add an argument like const typename std::enable_if_t<arma::is_arma_type<MatType>::value>::type* = 0
(I just did that from memory, it may not be exactly right).
@@ -74,10 +74,18 @@ DropoutType<MatType>::operator=(DropoutType&& other)
  return *this;
}
No need for an extra line 👍
Also, this one is still missing.
 */
void ForwardImpl(const MatType& input, MatType& output);

#ifdef MLPACK_HAS_COOT
I don't think we actually need this guard here: we want to use the optimized implementation when we have an Armadillo matrix, and the general implementation otherwise. So long as the second implementation of ForwardImpl() doesn't use the name coot:: inside of it (and it is possible to avoid that), we should be able to avoid the need for MLPACK_HAS_COOT, and just use SFINAE here too with a negated condition (i.e. std::enable_if_t<!arma::is_arma_type< ... >).
mask.randu(input.n_rows, input.n_cols);
arma::uvec indices = arma::find(mask > ratio);
mask.zeros();
mask.elem(indices).fill(1);
Suggested change:
- mask.elem(indices).fill(1);
+ mask.elem(find(mask > ratio)).fill(1);
Would this work? It avoids the use of arma:: here, and would make this function fully general. But I think then we would need to find a way to get rid of the .zeros() call... maybe we can just do mask = (mask > ratio)? Or something along those lines? (It would be interesting to see the speed of that approach.)
@rcurtin You can take a look at my implementation in the latest commit. It should be a bit faster and is fully functional. I avoided using arma:: to keep it more general.
Nice, it looks good to me! Thanks for doing that.
@@ -74,10 +74,18 @@ DropoutType<MatType>::operator=(DropoutType&& other)
  return *this;
}
Also, this one is still missing.
template<typename T = MatType, typename std::enable_if_t<arma::is_arma_type<T>::value, int> = 0>
void ForwardImpl(const T& input, T& output);

/**
 * General implementation of the forward pass of the dropout layer.
 *
 * @param input Input data used for evaluating the specified function.
 * @param output Resulting output activation.
 */
template<typename T = MatType, typename std::enable_if_t<!arma::is_arma_type<T>::value, int> = 0>
void ForwardImpl(const T& input, T& output);
@MarkFischinger Looks better; just use MatType directly instead of T. It is better for readability, because T does not mean a lot. Also, could you break the line? The mlpack code base usually keeps lines under 80 characters. The syntax is correct, but I would remove the int and instead add a * after the >, so the signature looks as follows:

typename std::enable_if_t<!arma::is_arma_type<T>::value>* = 0>

The reason for this is to have the same signature across the whole code base, making it easier for everyone.
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍
for (size_t i = 0; i < input.n_rows; ++i) {
  for (size_t j = 0; j < input.n_cols; ++j) {
    mask(i, j) = (mask(i, j) > this->ratio) ? 1.0 : 0.0;
  }
Suggested change:
- for (size_t i = 0; i < input.n_rows; ++i) {
-   for (size_t j = 0; j < input.n_cols; ++j) {
-     mask(i, j) = (mask(i, j) > this->ratio) ? 1.0 : 0.0;
-   }
+ for (size_t i = 0; i < input.n_rows; ++i)
+ {
+   for (size_t j = 0; j < input.n_cols; ++j)
+   {
+     mask(i, j) = (mask(i, j) > this->ratio) ? 1.0 : 0.0;
+   }
for (size_t i = 0; i < input.n_rows; ++i) {
  for (size_t j = 0; j < input.n_cols; ++j) {
    mask(i, j) = (mask(i, j) > this->ratio) ? 1.0 : 0.0;
  }
}
Could you update it as above to match the mlpack style?
mask.randu(input.n_rows, input.n_cols);
#pragma omp parallel for collapse(2)
for (size_t i = 0; i < input.n_rows; ++i)
{
  for (size_t j = 0; j < input.n_cols; ++j)
  {
    mask(i, j) = (mask(i, j) > this->ratio) ? 1.0 : 0.0;
  }
}
output = input % mask * this->scale;
}
}
@MarkFischinger It looks to me as if it is the same implementation when using Armadillo or Bandicoot. If this is the case, then it is fine, no worries; I would just keep one of them and remove the SFINAE and the other overload, since we no longer use the transform function. Let me know if I overlooked something.
This would be a really bad implementation for Bandicoot, but we can handle that some other time. (Elementwise accesses cause GPU/CPU transfers and are painful.)
Here are the benchmarks for the updated forward dropout layer implementation (OpenMP Find and Fill) on an MNIST benchmark:
Original Version
@MarkFischinger This is still failing; can you make sure that it compiles on your machine?
@MarkFischinger Can you check out master and pull the latest version, then go back to this branch and merge master into it? It will show you the conflicts with master, so you can resolve them and push here again.
Force-pushed (…and dropout_impl.hpp): compare 5167b23 to 5027de1
@rcurtin nothing to add on my end, let us get this one merged
Awesome, thanks for the fixes 👍
Following our discussion in issue #3662, I've implemented the new algorithm in the dropout layer.