During training, randomly zeroes some of the elements of the input
tensor with probability `p` using samples from a Bernoulli
distribution. Each element is zeroed out independently on every forward
call.

nn_dropout(p = 0.5, inplace = FALSE)

Argument | Description |
---|---|
p | Probability of an element being zeroed. Default: 0.5 |
inplace | If set to `TRUE`, the operation is performed in-place. Default: `FALSE` |

This has proven to be an effective technique for regularization and preventing the co-adaptation of neurons as described in the paper Improving neural networks by preventing co-adaptation of feature detectors.

Furthermore, the outputs are scaled by a factor of \(\frac{1}{1-p}\) during
training. This means that during evaluation the module simply computes an
identity function.
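The scaling can be illustrated with a minimal base R sketch. It is not the torch implementation: `dropout_sketch` is a hypothetical helper written only for this illustration, and it assumes `p < 1` to avoid dividing by zero. Each element is kept with probability \(1-p\) and the kept elements are multiplied by \(\frac{1}{1-p}\), so the expected value of each output element is \((1-p) \cdot \frac{x}{1-p} + p \cdot 0 = x\). That is why no rescaling is needed at evaluation time.

```r
# Hypothetical helper (not part of torch): inverted dropout on a numeric vector.
dropout_sketch <- function(x, p = 0.5, training = TRUE) {
  stopifnot(p >= 0, p < 1)  # assumption of this sketch: p < 1
  if (!training || p == 0) {
    return(x)  # evaluation mode: identity function
  }
  # One Bernoulli sample per element: 1 = keep (probability 1 - p), 0 = drop
  keep <- rbinom(length(x), size = 1, prob = 1 - p)
  # Zero the dropped elements and rescale the kept ones by 1 / (1 - p)
  x * keep / (1 - p)
}

set.seed(1)
x <- rep(1, 1e5)
y <- dropout_sketch(x, p = 0.2)

print(mean(y))          # close to mean(x), since the expectation is preserved
print(sort(unique(y)))  # the two possible values: 0 and 1 / (1 - p)
print(identical(dropout_sketch(x, p = 0.2, training = FALSE), x))
```

Scaling during training rather than at evaluation keeps the evaluation path free of any dropout-specific arithmetic.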

Input: \((*)\). Input can be of any shape

Output: \((*)\). Output is of the same shape as input

if (torch_is_installed()) {
  m <- nn_dropout(p = 0.2)
  input <- torch_randn(20, 16)
  output <- m(input)
}
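As a further sketch, the difference between training and evaluation mode can be checked directly. This assumes the torch package and its backend are installed, and it uses an all-ones input (chosen here only so the scaling is easy to see) together with the module's `$train()` and `$eval()` methods.

```r
if (requireNamespace("torch", quietly = TRUE) && torch::torch_is_installed()) {
  library(torch)

  m <- nn_dropout(p = 0.2)
  input <- torch_ones(20, 16)  # all ones, so surviving elements show the scale factor

  # Training mode: elements are either zeroed or scaled by 1 / (1 - 0.2)
  m$train()
  out_train <- m(input)
  train_vals <- sort(unique(as.numeric(as_array(out_train))))
  print(train_vals)

  # Evaluation mode: the module acts as an identity function
  m$eval()
  out_eval <- m(input)
  print(torch_equal(out_eval, input))
}
```

In training mode the printed values should be limited to 0 and roughly 1.25, while in evaluation mode the output matches the input exactly.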