Parametric Function Classes
- class nnabla.experimental.parametric_function_class.affine.Affine(n_inmaps, n_outmaps, base_axis=1, w_init=None, b_init=None, fix_parameters=False, rng=None, with_bias=True)
The affine layer, also known as the fully connected layer. Computes
\[{\mathbf y} = {\mathbf A} {\mathbf x} + {\mathbf b},\]
where \({\mathbf x}, {\mathbf y}\) are the input and output, respectively, and \({\mathbf A}, {\mathbf b}\) are the weight matrix and bias vector held by the layer.
- Parameters
inp (Variable) – Input N-D array with shape (\(M_0 \times \ldots \times M_{B-1} \times D_B \times \ldots \times D_N\)). The dimensions before base_axis and the dimensions from base_axis onward are each flattened, so that the input is treated as a matrix.
n_outmaps (int or tuple of int) – Number of output neurons per data.
base_axis (int) – Dimensions up to base_axis are treated as the sample dimensions.
w_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for weight. By default, it is initialized with nnabla.initializer.UniformInitializer within the range determined by nnabla.initializer.calc_uniform_lim_glorot.
b_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for bias. By default, it is initialized with zeros if with_bias is True.
fix_parameters (bool) – When set to True, the weights and biases will not be updated.
rng (numpy.random.RandomState) – Random generator for Initializer.
with_bias (bool) – Specify whether to include the bias term.
- Returns
\((B + 1)\)-D array. (\(M_0 \times \ldots \times M_{B-1} \times L\))
- Return type
- nnabla.experimental.parametric_function_class.affine.Linear
alias of nnabla.experimental.parametric_function_class.affine.Affine
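The way base_axis splits the input of the affine layer into sample dimensions and flattened feature dimensions can be illustrated without NNabla. The following is a minimal pure-Python sketch, not the library's implementation: the helper names affine_shapes and affine_forward are hypothetical, the row-vector convention (y = x W + b, with W playing the role of the transposed A) is an assumption made for readability, and the handling of a tuple-valued n_outmaps is assumed rather than taken from this page.

```python
from math import prod


def affine_shapes(input_shape, n_outmaps, base_axis=1):
    """Return (flattened matrix shape, output shape) for an affine layer.

    Hypothetical helper for illustration only.
    """
    sample_dims = tuple(input_shape[:base_axis])
    n_samples = prod(sample_dims)             # rows of the flattened matrix
    n_inmaps = prod(input_shape[base_axis:])  # columns of the flattened matrix
    # Assumption: a tuple n_outmaps is appended to the sample dimensions as-is.
    out_dims = (n_outmaps,) if isinstance(n_outmaps, int) else tuple(n_outmaps)
    return (n_samples, n_inmaps), sample_dims + out_dims


def affine_forward(x_flat, weight, bias=None):
    """Compute y = x W + b for rows that are already flattened.

    x_flat: list of rows, each of length n_inmaps
    weight: n_inmaps rows, each of length n_outmaps
    bias:   list of length n_outmaps, or None (the with_bias=False case)
    """
    n_out = len(weight[0])
    out = []
    for row in x_flat:
        y = [sum(row[i] * weight[i][j] for i in range(len(row)))
             for j in range(n_out)]
        if bias is not None:
            y = [v + b for v, b in zip(y, bias)]
        out.append(y)
    return out


# A batch of 8 RGB images of 32x32 pixels mapped to 10 outputs.
matrix_shape, output_shape = affine_shapes((8, 3, 32, 32), 10, base_axis=1)
print(matrix_shape, output_shape)
```

With base_axis=2 the first two dimensions are both treated as sample dimensions, so the same input is flattened to a 24-row matrix and the output keeps the leading (8, 3).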
- class nnabla.experimental.parametric_function_class.convolution.Convolution(inmaps, outmaps, kernel, pad=None, stride=None, dilation=None, group=1, w_init=None, b_init=None, base_axis=1, fix_parameters=False, rng=None, with_bias=True)
N-D Convolution with a bias term.
For Dilated Convolution (a.k.a. Atrous Convolution), refer to:
Chen et al., DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. https://arxiv.org/abs/1606.00915
Yu et al., Multi-Scale Context Aggregation by Dilated Convolutions. https://arxiv.org/abs/1511.07122
Note
Convolution is a computationally intensive operation that should preferably be run with the cudnn backend. NNabla then uses CuDNN library functions to determine and cache the fastest algorithm for the given set of convolution parameters. This results in additional memory consumption, which may pose a problem for GPUs with insufficient memory. In that case, the NNABLA_CUDNN_WORKSPACE_LIMIT environment variable can be used to restrict the choice of algorithms to those that fit the given workspace memory limit, expressed in bytes. In some cases it may also be desirable to restrict the automatic search to algorithms that produce deterministic (reproducible) results. This can be requested by setting the environment variable NNABLA_CUDNN_DETERMINISTIC to a non-zero value.
- Parameters
inp (Variable) – N-D array.
outmaps (int) – Number of convolution kernels (which is equal to the number of output channels). For example, to apply convolution on an input with 16 types of filters, specify 16.
kernel (tuple of int) – Convolution kernel size. For example, to apply convolution on an image with a 3 (height) by 5 (width) two-dimensional kernel, specify (3,5).
group (int) – Number of groups of channels. This makes connections across channels more sparse by grouping connections along map direction.
w_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for weight. By default, it is initialized with nnabla.initializer.UniformInitializer within the range determined by nnabla.initializer.calc_uniform_lim_glorot.
b_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for bias. By default, it is initialized with zeros if with_bias is True.
base_axis (int) – Dimensions up to base_axis are treated as the sample dimensions.
fix_parameters (bool) – When set to True, the weights and biases will not be updated.
rng (numpy.random.RandomState) – Random generator for Initializer.
with_bias (bool) – Specify whether to include the bias term.
- Returns
N-D array. See convolution for the output shape.
- Return type
- nnabla.experimental.parametric_function_class.convolution.Conv1d
alias of nnabla.experimental.parametric_function_class.convolution.Convolution
- nnabla.experimental.parametric_function_class.convolution.Conv2d
alias of nnabla.experimental.parametric_function_class.convolution.Convolution
- nnabla.experimental.parametric_function_class.convolution.Conv3d
alias of nnabla.experimental.parametric_function_class.convolution.Convolution
- nnabla.experimental.parametric_function_class.convolution.ConvNd
alias of nnabla.experimental.parametric_function_class.convolution.Convolution
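The Convolution entry defers to the convolution function for the output shape. As a rough guide, the spatial output size can be estimated with the standard convolution arithmetic sketched below. This is an illustration in plain Python, not code from NNabla: the helper names are hypothetical, and treating pad=None, stride=None and dilation=None as zeros, ones and ones respectively is an assumption.

```python
def conv_output_size(size, kernel, pad=0, stride=1, dilation=1):
    """Output length along one spatial axis (standard convolution arithmetic)."""
    # A dilated kernel covers dilation * (kernel - 1) + 1 input positions.
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * pad - effective_kernel) // stride + 1


def conv_output_shape(spatial_shape, kernel, pad=None, stride=None, dilation=None):
    """Apply conv_output_size to every spatial axis.

    Assumption: None means no padding, unit stride and unit dilation.
    """
    n = len(kernel)
    pad = pad or (0,) * n
    stride = stride or (1,) * n
    dilation = dilation or (1,) * n
    return tuple(
        conv_output_size(s, k, p, st, d)
        for s, k, p, st, d in zip(spatial_shape, kernel, pad, stride, dilation)
    )


# The 3 (height) by 5 (width) kernel from the parameter description,
# applied to a 32x32 image without padding.
print(conv_output_shape((32, 32), (3, 5)))
```

Padding of (1, 2) with the same kernel keeps the 32x32 spatial size, and a dilation of 2 on a 3x3 kernel behaves like a 5x5 kernel for the purpose of the size calculation.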
- class nnabla.experimental.parametric_function_class.deconvolution.Deconvolution(inmaps, outmaps, kernel, pad=None, stride=None, dilation=None, group=1, w_init=None, b_init=None, base_axis=1, fix_parameters=False, rng=None, with_bias=True)
Deconvolution layer.
- Parameters
inp (Variable) – N-D array.
outmaps (int) – Number of deconvolution kernels (which is equal to the number of output channels). For example, to apply deconvolution on an input with 16 types of filters, specify 16.
kernel (tuple of int) – Convolution kernel size. For example, to apply deconvolution on an image with a 3 (height) by 5 (width) two-dimensional kernel, specify (3,5).
group (int) – Number of groups of channels. This makes connections across channels sparser by grouping connections along map direction.
w_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for weight. By default, it is initialized with nnabla.initializer.UniformInitializer within the range determined by nnabla.initializer.calc_uniform_lim_glorot.
b_init (nnabla.initializer.BaseInitializer or numpy.ndarray) – Initializer for bias. By default, it is initialized with zeros if with_bias is True.
base_axis (int) – Dimensions up to base_axis are treated as the sample dimensions.
fix_parameters (bool) – When set to True, the weights and biases will not be updated.
rng (numpy.random.RandomState) – Random generator for Initializer.
with_bias (bool) – Specify whether to include the bias term.
- Returns
N-D array. See deconvolution for the output shape.
- Return type
- nnabla.experimental.parametric_function_class.deconvolution.Deconv1d
alias of nnabla.experimental.parametric_function_class.deconvolution.Deconvolution
- nnabla.experimental.parametric_function_class.deconvolution.Deconv2d
alias of nnabla.experimental.parametric_function_class.deconvolution.Deconvolution
- nnabla.experimental.parametric_function_class.deconvolution.Deconv3d
alias of nnabla.experimental.parametric_function_class.deconvolution.Deconvolution
- nnabla.experimental.parametric_function_class.deconvolution.DeconvNd
alias of nnabla.experimental.parametric_function_class.deconvolution.Deconvolution
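The Deconvolution entry likewise defers to the deconvolution function for the output shape. The sketch below uses the usual transposed-convolution size arithmetic, which inverts the convolution size formula for unit-aligned strides. It is an illustration only: the helper names are hypothetical, and the interpretation of None defaults (no padding, unit stride, unit dilation) is an assumption.

```python
def deconv_output_size(size, kernel, pad=0, stride=1, dilation=1):
    """Output length along one spatial axis of a transposed convolution."""
    return stride * (size - 1) + dilation * (kernel - 1) + 1 - 2 * pad


def deconv_output_shape(spatial_shape, kernel, pad=None, stride=None, dilation=None):
    """Apply deconv_output_size to every spatial axis.

    Assumption: None means no padding, unit stride and unit dilation.
    """
    n = len(kernel)
    pad = pad or (0,) * n
    stride = stride or (1,) * n
    dilation = dilation or (1,) * n
    return tuple(
        deconv_output_size(s, k, p, st, d)
        for s, k, p, st, d in zip(spatial_shape, kernel, pad, stride, dilation)
    )


# Upsampling a 16x16 feature map by a factor of two with a 4x4 kernel.
print(deconv_output_shape((16, 16), (4, 4), pad=(1, 1), stride=(2, 2)))
```

Applying a 3 by 5 kernel without padding to a 30 by 28 map gives 32 by 32, which is the size a 3 by 5 convolution without padding would have started from.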
- class nnabla.experimental.parametric_function_class.batch_normalization.BatchNormalization(n_features, n_dims, axes=[1], decay_rate=0.9, eps=1e-05, batch_stat=True, output_stat=False, fix_parameters=False, param_init=None)
Batch normalization layer.
\[\begin{split}\begin{array}{lcl} \mu &=& \frac{1}{M} \sum x_i\\ \sigma^2 &=& \frac{1}{M} \sum \left(x_i - \mu\right)^2\\ \hat{x}_i &=& \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon }}\\ y_i &=& \hat{x}_i \gamma + \beta, \end{array}\end{split}\]
where \(x_i, y_i\) are the inputs and outputs, respectively. At test time, the mean and variance estimated by a moving average during training are used instead.
- Parameters
inp (Variable) – N-D array of input.
axes (tuple of int) – Mean and variance for each element in axes are calculated using the elements on the remaining axes. For example, if the input is 4-D and axes is [1], the batch mean is calculated as np.mean(inp.d, axis=(0, 2, 3), keepdims=True) (using a NumPy expression as an example).
decay_rate (float) – Decay rate of running mean and variance.
eps (float) – Tiny value to avoid zero division by std.
batch_stat (bool) – Use mini-batch statistics rather than running ones.
output_stat (bool) – Output batch mean and variance.
fix_parameters (bool) – When set to True, the beta and gamma will not be updated.
param_init (dict) – Parameter initializers can be set with a dict. A key of the dict must be 'beta', 'gamma', 'mean' or 'var'. A value of the dict must be an Initializer or a numpy.ndarray. E.g. {'beta': ConstantInitializer(0), 'gamma': np.ones(gamma_shape) * 2}.
- Returns
N-D array.
- Return type
References
Ioffe and Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. https://arxiv.org/abs/1502.03167
The parameters have the same number of dimensions as the input data. Along the axes listed in axes, each parameter has the same size as the input; along every other axis its size is 1. If the input is 4-D and axes=[1], the parameter shape is param_shape = np.mean(inp.d, axis=(0, 2, 3), keepdims=True).shape (using a NumPy expression as an example).
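The batch normalization equations can be traced by hand on a small batch. The following pure-Python sketch normalizes each feature of an (N, C) batch with mini-batch statistics, which corresponds to axes=[1] for a 2-D input. The function names are hypothetical and the code only mirrors the equations; it does not reproduce NNabla's implementation or its running-statistics update.

```python
from math import sqrt


def normalize_feature(xs, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize the M values of one feature; returns (ys, mean, variance)."""
    m = len(xs)
    mu = sum(xs) / m
    var = sum((x - mu) ** 2 for x in xs) / m  # biased variance, divides by M
    ys = [gamma * (x - mu) / sqrt(var + eps) + beta for x in xs]
    return ys, mu, var


def batch_norm_nc(batch, gammas, betas, eps=1e-5):
    """Normalize an (N, C) batch feature by feature.

    Statistics are taken over the N axis, one (gamma, beta) pair per feature.
    """
    columns = list(zip(*batch))  # C columns of N values each
    normed = [
        normalize_feature(list(col), g, b, eps)[0]
        for col, g, b in zip(columns, gammas, betas)
    ]
    return [list(row) for row in zip(*normed)]  # back to N rows of C values


ys, mu, var = normalize_feature([1.0, 2.0, 3.0, 4.0])
print(mu, var)
print(batch_norm_nc([[1.0, 10.0], [3.0, 30.0]], gammas=[1.0, 1.0], betas=[0.0, 0.0]))
```

Because eps is added under the square root, the normalized values have a variance slightly below 1; with gamma = 1 and beta = 0 their mean is zero up to rounding.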
- class nnabla.experimental.parametric_function_class.batch_normalization.BatchNorm1d(n_features, axes=[1], decay_rate=0.9, eps=1e-05, batch_stat=True, output_stat=False, fix_parameters=False, param_init=None)
Batch normalization layer for 3d-Array or 3d-Variable. This is typically used together with Conv1d.
\[\begin{split}\begin{array}{lcl} \mu &=& \frac{1}{M} \sum x_i\\ \sigma^2 &=& \frac{1}{M} \sum \left(x_i - \mu\right)^2\\ \hat{x}_i &=& \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon }}\\ y_i &=& \hat{x}_i \gamma + \beta, \end{array}\end{split}\]
where \(x_i, y_i\) are the inputs and outputs, respectively. At test time, the mean and variance estimated by a moving average during training are used instead.
- Parameters
inp (Variable) – N-D array of input.
axes (tuple of int) – Mean and variance for each element in axes are calculated using the elements on the remaining axes. For example, if the input is 4-D and axes is [1], the batch mean is calculated as np.mean(inp.d, axis=(0, 2, 3), keepdims=True) (using a NumPy expression as an example).
decay_rate (float) – Decay rate of running mean and variance.
eps (float) – Tiny value to avoid zero division by std.
batch_stat (bool) – Use mini-batch statistics rather than running ones.
output_stat (bool) – Output batch mean and variance.
fix_parameters (bool) – When set to True, the beta and gamma will not be updated.
param_init (dict) – Parameter initializers can be set with a dict. A key of the dict must be 'beta', 'gamma', 'mean' or 'var'. A value of the dict must be an Initializer or a numpy.ndarray. E.g. {'beta': ConstantInitializer(0), 'gamma': np.ones(gamma_shape) * 2}.
- Returns
N-D array.
- Return type
References
Ioffe and Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. https://arxiv.org/abs/1502.03167
The parameters have the same number of dimensions as the input data. Along the axes listed in axes, each parameter has the same size as the input; along every other axis its size is 1. If the input is 4-D and axes=[1], the parameter shape is param_shape = np.mean(inp.d, axis=(0, 2, 3), keepdims=True).shape (using a NumPy expression as an example).
- class nnabla.experimental.parametric_function_class.batch_normalization.BatchNorm2d(n_features, axes=[1], decay_rate=0.9, eps=1e-05, batch_stat=True, output_stat=False, fix_parameters=False, param_init=None)
Batch normalization layer for 4d-Array or 4d-Variable. This is typically used together with Conv2d.
\[\begin{split}\begin{array}{lcl} \mu &=& \frac{1}{M} \sum x_i\\ \sigma^2 &=& \frac{1}{M} \sum \left(x_i - \mu\right)^2\\ \hat{x}_i &=& \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon }}\\ y_i &=& \hat{x}_i \gamma + \beta, \end{array}\end{split}\]
where \(x_i, y_i\) are the inputs and outputs, respectively. At test time, the mean and variance estimated by a moving average during training are used instead.
- Parameters
inp (Variable) – N-D array of input.
axes (tuple of int) – Mean and variance for each element in axes are calculated using the elements on the remaining axes. For example, if the input is 4-D and axes is [1], the batch mean is calculated as np.mean(inp.d, axis=(0, 2, 3), keepdims=True) (using a NumPy expression as an example).
decay_rate (float) – Decay rate of running mean and variance.
eps (float) – Tiny value to avoid zero division by std.
batch_stat (bool) – Use mini-batch statistics rather than running ones.
output_stat (bool) – Output batch mean and variance.
fix_parameters (bool) – When set to True, the beta and gamma will not be updated.
param_init (dict) – Parameter initializers can be set with a dict. A key of the dict must be 'beta', 'gamma', 'mean' or 'var'. A value of the dict must be an Initializer or a numpy.ndarray. E.g. {'beta': ConstantInitializer(0), 'gamma': np.ones(gamma_shape) * 2}.
- Returns
N-D array.
- Return type
References
Ioffe and Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. https://arxiv.org/abs/1502.03167
The parameters have the same number of dimensions as the input data. Along the axes listed in axes, each parameter has the same size as the input; along every other axis its size is 1. If the input is 4-D and axes=[1], the parameter shape is param_shape = np.mean(inp.d, axis=(0, 2, 3), keepdims=True).shape (using a NumPy expression as an example).
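The relationship between axes, the axes that are reduced over, and the resulting parameter shape can be written down directly. The helpers below are a small pure-Python sketch with hypothetical names; they reproduce the shape that the NumPy expression above would yield without computing any statistics.

```python
def reduction_axes(ndim, axes=(1,)):
    """Axes over which mean and variance are computed (all axes not in `axes`)."""
    return tuple(i for i in range(ndim) if i not in axes)


def param_shape(input_shape, axes=(1,)):
    """Shape of beta, gamma, mean and var: input size on `axes`, 1 elsewhere."""
    return tuple(s if i in axes else 1 for i, s in enumerate(input_shape))


# 4-D input (batch, channels, height, width) as used with Conv2d.
print(reduction_axes(4))              # axes reduced when axes=[1]
print(param_shape((8, 3, 32, 32)))    # one value per channel, broadcastable
```

The same helpers cover the 3-D and 5-D inputs used with Conv1d and Conv3d, since only the number of trailing singleton dimensions changes.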
- class nnabla.experimental.parametric_function_class.batch_normalization.BatchNorm3d(n_features, axes=[1], decay_rate=0.9, eps=1e-05, batch_stat=True, output_stat=False, fix_parameters=False, param_init=None)
Batch normalization layer for 5d-Array or 5d-Variable. This is typically used together with Conv3d.
\[\begin{split}\begin{array}{lcl} \mu &=& \frac{1}{M} \sum x_i\\ \sigma^2 &=& \frac{1}{M} \sum \left(x_i - \mu\right)^2\\ \hat{x}_i &=& \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon }}\\ y_i &=& \hat{x}_i \gamma + \beta, \end{array}\end{split}\]
where \(x_i, y_i\) are the inputs and outputs, respectively. At test time, the mean and variance estimated by a moving average during training are used instead.
- Parameters
inp (Variable) – N-D array of input.
axes (tuple of int) – Mean and variance for each element in axes are calculated using the elements on the remaining axes. For example, if the input is 4-D and axes is [1], the batch mean is calculated as np.mean(inp.d, axis=(0, 2, 3), keepdims=True) (using a NumPy expression as an example).
decay_rate (float) – Decay rate of running mean and variance.
eps (float) – Tiny value to avoid zero division by std.
batch_stat (bool) – Use mini-batch statistics rather than running ones.
output_stat (bool) – Output batch mean and variance.
fix_parameters (bool) – When set to True, the beta and gamma will not be updated.
param_init (dict) – Parameter initializers can be set with a dict. A key of the dict must be 'beta', 'gamma', 'mean' or 'var'. A value of the dict must be an Initializer or a numpy.ndarray. E.g. {'beta': ConstantInitializer(0), 'gamma': np.ones(gamma_shape) * 2}.
- Returns
N-D array.
- Return type
References
Ioffe and Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. https://arxiv.org/abs/1502.03167
The parameters have the same number of dimensions as the input data. Along the axes listed in axes, each parameter has the same size as the input; along every other axis its size is 1. If the input is 4-D and axes=[1], the parameter shape is param_shape = np.mean(inp.d, axis=(0, 2, 3), keepdims=True).shape (using a NumPy expression as an example).
- class nnabla.experimental.parametric_function_class.embed.Embed(n_inputs, n_features, w_init=None, fix_parameters=False)
Embed.
Embed slices a matrix/tensor with an indexing array/tensor. Weights are initialized with nnabla.initializer.UniformInitializer within the range of \(-\sqrt{3}\) to \(\sqrt{3}\).
- Parameters
- Returns
Output with shape \((I_0, ..., I_N, W_1, ..., W_M)\)
- Return type
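The slicing that Embed performs amounts to a table lookup: every integer in the index array is replaced by the corresponding row of the weight matrix, so the index shape \((I_0, ..., I_N)\) is extended by the feature shape. The sketch below illustrates this with nested Python lists. The function names are hypothetical, and Python's random module stands in for the uniform initializer purely for illustration.

```python
import random
from math import sqrt


def init_embedding(n_inputs, n_features, seed=0):
    """Create an (n_inputs, n_features) table with values in [-sqrt(3), sqrt(3)].

    Stand-in for the default uniform initialization described above.
    """
    rng = random.Random(seed)
    limit = sqrt(3.0)
    return [[rng.uniform(-limit, limit) for _ in range(n_features)]
            for _ in range(n_inputs)]


def embed(indices, weights):
    """Recursively replace every integer index with a copy of its weight row."""
    if isinstance(indices, int):
        return list(weights[indices])
    return [embed(i, weights) for i in indices]


weights = init_embedding(n_inputs=5, n_features=3)
out = embed([[0, 4], [2, 2]], weights)  # index shape (2, 2)
print(len(out), len(out[0]), len(out[0][0]))  # output shape (2, 2, 3)
```

Repeated indices simply return the same row more than once, as with index 2 in the second row of the example.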