During training, randomly zeroes some of the elements of the input
tensor with probability p
using samples from a Bernoulli
distribution. Each channel will be zeroed out independently on every forward
call.
Arguments
- p
probability of an element to be zeroed. Default: 0.5
- inplace
If set to
TRUE
, will do this operation in-place. Default:FALSE
.
Details
This has proven to be an effective technique for regularization and preventing the co-adaptation of neurons as described in the paper Improving neural networks by preventing co-adaptation of feature detectors.
Furthermore, the outputs are scaled by a factor of :math:\frac{1}{1-p}
during
training. This means that during evaluation the module simply computes an
identity function.
Shape
Input: \((*)\). Input can be of any shape
Output: \((*)\). Output is of the same shape as input
Examples
if (torch_is_installed()) {
m <- nn_dropout(p = 0.2)
input <- torch_randn(20, 16)
output <- m(input)
}