MXNet
-
- ndarray
- ndarray.CachedOp
- ndarray.NDArray
- ndarray.Activation
- ndarray.BatchNorm
- ndarray.BatchNorm_v1
- ndarray.BilinearSampler
- ndarray.BlockGrad
- ndarray.CTCLoss
- ndarray.Cast
- ndarray.Concat
- ndarray.Convolution
- ndarray.Convolution_v1
- ndarray.Correlation
- ndarray.Crop
- ndarray.Custom
- ndarray.Deconvolution
- ndarray.Dropout
- ndarray.ElementWiseSum
- ndarray.Embedding
- ndarray.Flatten
- ndarray.FullyConnected
- ndarray.GridGenerator
- ndarray.GroupNorm
- ndarray.IdentityAttachKLSparseReg
- ndarray.InstanceNorm
- ndarray.L2Normalization
- ndarray.LRN
- ndarray.LayerNorm
- ndarray.LeakyReLU
- ndarray.LinearRegressionOutput
- ndarray.LogisticRegressionOutput
- ndarray.MAERegressionOutput
- ndarray.MakeLoss
- ndarray.Pad
- ndarray.Pooling
- ndarray.Pooling_v1
- ndarray.RNN
- ndarray.ROIPooling
- ndarray.Reshape
- ndarray.SVMOutput
- ndarray.SequenceLast
- ndarray.SequenceMask
- ndarray.SequenceReverse
- ndarray.SliceChannel
- ndarray.Softmax
- ndarray.SoftmaxActivation
- ndarray.SoftmaxOutput
- ndarray.SpatialTransformer
- ndarray.SwapAxis
- ndarray.UpSampling
- ndarray.abs
- ndarray.adam_update
- ndarray.add_n
- ndarray.all_finite
- ndarray.amp_cast
- ndarray.amp_multicast
- ndarray.arccos
- ndarray.arccosh
- ndarray.arcsin
- ndarray.arcsinh
- ndarray.arctan
- ndarray.arctanh
- ndarray.argmax
- ndarray.argmax_channel
- ndarray.argmin
- ndarray.argsort
- ndarray.batch_dot
- ndarray.batch_take
- ndarray.broadcast_add
- ndarray.broadcast_axes
- ndarray.broadcast_axis
- ndarray.broadcast_div
- ndarray.broadcast_equal
- ndarray.broadcast_greater
- ndarray.broadcast_greater_equal
- ndarray.broadcast_hypot
- ndarray.broadcast_lesser
- ndarray.broadcast_lesser_equal
- ndarray.broadcast_like
- ndarray.broadcast_logical_and
- ndarray.broadcast_logical_or
- ndarray.broadcast_logical_xor
- ndarray.broadcast_maximum
- ndarray.broadcast_minimum
- ndarray.broadcast_minus
- ndarray.broadcast_mod
- ndarray.broadcast_mul
- ndarray.broadcast_not_equal
- ndarray.broadcast_plus
- ndarray.broadcast_power
- ndarray.broadcast_sub
- ndarray.broadcast_to
- ndarray.cast
- ndarray.cast_storage
- ndarray.cbrt
- ndarray.ceil
- ndarray.choose_element_0index
- ndarray.clip
- ndarray.col2im
- ndarray.concat
- ndarray.cos
- ndarray.cosh
- ndarray.crop
- ndarray.ctc_loss
- ndarray.cumsum
- ndarray.degrees
- ndarray.depth_to_space
- ndarray.diag
- ndarray.dot
- ndarray.elemwise_add
- ndarray.elemwise_div
- ndarray.elemwise_mul
- ndarray.elemwise_sub
- ndarray.erf
- ndarray.erfinv
- ndarray.exp
- ndarray.expand_dims
- ndarray.expm1
- ndarray.fill_element_0index
- ndarray.fix
- ndarray.flatten
- ndarray.flip
- ndarray.floor
- ndarray.ftml_update
- ndarray.ftrl_update
- ndarray.gamma
- ndarray.gammaln
- ndarray.gather_nd
- ndarray.hard_sigmoid
- ndarray.identity
- ndarray.im2col
- ndarray.khatri_rao
- ndarray.lamb_update_phase1
- ndarray.lamb_update_phase2
- ndarray.linalg_det
- ndarray.linalg_extractdiag
- ndarray.linalg_extracttrian
- ndarray.linalg_gelqf
- ndarray.linalg_gemm
- ndarray.linalg_gemm2
- ndarray.linalg_inverse
- ndarray.linalg_makediag
- ndarray.linalg_maketrian
- ndarray.linalg_potrf
- ndarray.linalg_potri
- ndarray.linalg_slogdet
- ndarray.linalg_sumlogdiag
- ndarray.linalg_syrk
- ndarray.linalg_trmm
- ndarray.linalg_trsm
- ndarray.log
- ndarray.log10
- ndarray.log1p
- ndarray.log2
- ndarray.log_softmax
- ndarray.logical_not
- ndarray.make_loss
- ndarray.max
- ndarray.max_axis
- ndarray.mean
- ndarray.min
- ndarray.min_axis
- ndarray.moments
- ndarray.mp_lamb_update_phase1
- ndarray.mp_lamb_update_phase2
- ndarray.mp_nag_mom_update
- ndarray.mp_sgd_mom_update
- ndarray.mp_sgd_update
- ndarray.multi_all_finite
- ndarray.multi_lars
- ndarray.multi_mp_sgd_mom_update
- ndarray.multi_mp_sgd_update
- ndarray.multi_sgd_mom_update
- ndarray.multi_sgd_update
- ndarray.multi_sum_sq
- ndarray.nag_mom_update
- ndarray.nanprod
- ndarray.nansum
- ndarray.negative
- ndarray.norm
- ndarray.normal
- ndarray.one_hot
- ndarray.ones_like
- ndarray.pad
- ndarray.pick
- ndarray.preloaded_multi_mp_sgd_mom_update
- ndarray.preloaded_multi_mp_sgd_update
- ndarray.preloaded_multi_sgd_mom_update
- ndarray.preloaded_multi_sgd_update
- ndarray.prod
- ndarray.radians
- ndarray.random_exponential
- ndarray.random_gamma
- ndarray.random_generalized_negative_binomial
- ndarray.random_negative_binomial
- ndarray.random_normal
- ndarray.random_pdf_dirichlet
- ndarray.random_pdf_exponential
- ndarray.random_pdf_gamma
- ndarray.random_pdf_generalized_negative_binomial
- ndarray.random_pdf_negative_binomial
- ndarray.random_pdf_normal
- ndarray.random_pdf_poisson
- ndarray.random_pdf_uniform
- ndarray.random_poisson
- ndarray.random_randint
- ndarray.random_uniform
- ndarray.ravel_multi_index
- ndarray.rcbrt
- ndarray.reciprocal
- ndarray.relu
- ndarray.repeat
- ndarray.reset_arrays
- ndarray.reshape
- ndarray.reshape_like
- ndarray.reverse
- ndarray.rint
- ndarray.rmsprop_update
- ndarray.rmspropalex_update
- ndarray.round
- ndarray.rsqrt
- ndarray.sample_exponential
- ndarray.sample_gamma
- ndarray.sample_generalized_negative_binomial
- ndarray.sample_multinomial
- ndarray.sample_negative_binomial
- ndarray.sample_normal
- ndarray.sample_poisson
- ndarray.sample_uniform
- ndarray.scatter_nd
- ndarray.sgd_mom_update
- ndarray.sgd_update
- ndarray.shape_array
- ndarray.shuffle
- ndarray.sigmoid
- ndarray.sign
- ndarray.signsgd_update
- ndarray.signum_update
- ndarray.sin
- ndarray.sinh
- ndarray.size_array
- ndarray.slice
- ndarray.slice_axis
- ndarray.slice_like
- ndarray.smooth_l1
- ndarray.softmax
- ndarray.softmax_cross_entropy
- ndarray.softmin
- ndarray.softsign
- ndarray.sort
- ndarray.space_to_depth
- ndarray.split
- ndarray.sqrt
- ndarray.square
- ndarray.squeeze
- ndarray.stack
- ndarray.stop_gradient
- ndarray.sum
- ndarray.sum_axis
- ndarray.swapaxes
- ndarray.take
- ndarray.tan
- ndarray.tanh
- ndarray.tile
- ndarray.topk
- ndarray.transpose
- ndarray.trunc
- ndarray.uniform
- ndarray.unravel_index
- ndarray.where
- ndarray.zeros_like
- ndarray.concatenate
- ndarray.ones
- ndarray.add
- ndarray.arange
- ndarray.linspace
- ndarray.eye
- ndarray.divide
- ndarray.equal
- ndarray.full
- ndarray.greater
- ndarray.greater_equal
- ndarray.imdecode
- ndarray.lesser
- ndarray.lesser_equal
- ndarray.logical_and
- ndarray.logical_or
- ndarray.logical_xor
- ndarray.maximum
- ndarray.minimum
- ndarray.moveaxis
- ndarray.modulo
- ndarray.multiply
- ndarray.not_equal
- ndarray.onehot_encode
- ndarray.power
- ndarray.subtract
- ndarray.true_divide
- ndarray.waitall
- ndarray.histogram
- ndarray.split_v2
- ndarray.to_dlpack_for_read
- ndarray.to_dlpack_for_write
- ndarray.from_dlpack
- ndarray.from_numpy
- ndarray.zeros
- ndarray.indexing_key_expand_implicit_axes
- ndarray.get_indexing_dispatch_code
- ndarray.get_oshape_of_gather_nd_op
- ndarray.empty
- ndarray.array
- ndarray.load
- ndarray.load_frombuffer
- ndarray.save
-
- ndarray.contrib
- ndarray.contrib.rand_zipfian
- ndarray.contrib.foreach
- ndarray.contrib.while_loop
- ndarray.contrib.cond
- ndarray.contrib.isinf
- ndarray.contrib.isfinite
- ndarray.contrib.isnan
- ndarray.contrib.AdaptiveAvgPooling2D
- ndarray.contrib.BilinearResize2D
- ndarray.contrib.CTCLoss
- ndarray.contrib.DeformableConvolution
- ndarray.contrib.DeformablePSROIPooling
- ndarray.contrib.ModulatedDeformableConvolution
- ndarray.contrib.MultiBoxDetection
- ndarray.contrib.MultiBoxPrior
- ndarray.contrib.MultiBoxTarget
- ndarray.contrib.MultiProposal
- ndarray.contrib.PSROIPooling
- ndarray.contrib.Proposal
- ndarray.contrib.ROIAlign
- ndarray.contrib.RROIAlign
- ndarray.contrib.SparseEmbedding
- ndarray.contrib.SyncBatchNorm
- ndarray.contrib.allclose
- ndarray.contrib.arange_like
- ndarray.contrib.backward_gradientmultiplier
- ndarray.contrib.backward_hawkesll
- ndarray.contrib.backward_index_copy
- ndarray.contrib.backward_quadratic
- ndarray.contrib.bipartite_matching
- ndarray.contrib.boolean_mask
- ndarray.contrib.box_decode
- ndarray.contrib.box_encode
- ndarray.contrib.box_iou
- ndarray.contrib.box_nms
- ndarray.contrib.box_non_maximum_suppression
- ndarray.contrib.calibrate_entropy
- ndarray.contrib.count_sketch
- ndarray.contrib.ctc_loss
- ndarray.contrib.dequantize
- ndarray.contrib.dgl_adjacency
- ndarray.contrib.dgl_csr_neighbor_non_uniform_sample
- ndarray.contrib.dgl_csr_neighbor_uniform_sample
- ndarray.contrib.dgl_graph_compact
- ndarray.contrib.dgl_subgraph
- ndarray.contrib.div_sqrt_dim
- ndarray.contrib.edge_id
- ndarray.contrib.fft
- ndarray.contrib.getnnz
- ndarray.contrib.gradientmultiplier
- ndarray.contrib.group_adagrad_update
- ndarray.contrib.hawkesll
- ndarray.contrib.ifft
- ndarray.contrib.index_array
- ndarray.contrib.index_copy
- ndarray.contrib.interleaved_matmul_encdec_qk
- ndarray.contrib.interleaved_matmul_encdec_valatt
- ndarray.contrib.interleaved_matmul_selfatt_qk
- ndarray.contrib.interleaved_matmul_selfatt_valatt
- ndarray.contrib.quadratic
- ndarray.contrib.quantize
- ndarray.contrib.quantize_v2
- ndarray.contrib.quantized_act
- ndarray.contrib.quantized_batch_norm
- ndarray.contrib.quantized_concat
- ndarray.contrib.quantized_conv
- ndarray.contrib.quantized_elemwise_add
- ndarray.contrib.quantized_elemwise_mul
- ndarray.contrib.quantized_embedding
- ndarray.contrib.quantized_flatten
- ndarray.contrib.quantized_fully_connected
- ndarray.contrib.quantized_pooling
- ndarray.contrib.requantize
- ndarray.contrib.round_ste
- ndarray.contrib.sign_ste
-
- ndarray.image
- ndarray.image.adjust_lighting
- ndarray.image.crop
- ndarray.image.flip_left_right
- ndarray.image.flip_top_bottom
- ndarray.image.normalize
- ndarray.image.random_brightness
- ndarray.image.random_color_jitter
- ndarray.image.random_contrast
- ndarray.image.random_flip_left_right
- ndarray.image.random_flip_top_bottom
- ndarray.image.random_hue
- ndarray.image.random_lighting
- ndarray.image.random_saturation
- ndarray.image.resize
- ndarray.image.to_tensor
-
- ndarray.linalg
- ndarray.linalg.det
- ndarray.linalg.extractdiag
- ndarray.linalg.extracttrian
- ndarray.linalg.gelqf
- ndarray.linalg.gemm
- ndarray.linalg.gemm2
- ndarray.linalg.inverse
- ndarray.linalg.makediag
- ndarray.linalg.maketrian
- ndarray.linalg.potrf
- ndarray.linalg.potri
- ndarray.linalg.slogdet
- ndarray.linalg.sumlogdiag
- ndarray.linalg.syevd
- ndarray.linalg.syrk
- ndarray.linalg.trmm
- ndarray.linalg.trsm
-
- ndarray.op
- ndarray.op.CachedOp
- ndarray.op.Activation
- ndarray.op.BatchNorm
- ndarray.op.BatchNorm_v1
- ndarray.op.BilinearSampler
- ndarray.op.BlockGrad
- ndarray.op.CTCLoss
- ndarray.op.Cast
- ndarray.op.Concat
- ndarray.op.Convolution
- ndarray.op.Convolution_v1
- ndarray.op.Correlation
- ndarray.op.Crop
- ndarray.op.Custom
- ndarray.op.Deconvolution
- ndarray.op.Dropout
- ndarray.op.ElementWiseSum
- ndarray.op.Embedding
- ndarray.op.Flatten
- ndarray.op.FullyConnected
- ndarray.op.GridGenerator
- ndarray.op.GroupNorm
- ndarray.op.IdentityAttachKLSparseReg
- ndarray.op.InstanceNorm
- ndarray.op.L2Normalization
- ndarray.op.LRN
- ndarray.op.LayerNorm
- ndarray.op.LeakyReLU
- ndarray.op.LinearRegressionOutput
- ndarray.op.LogisticRegressionOutput
- ndarray.op.MAERegressionOutput
- ndarray.op.MakeLoss
- ndarray.op.Pad
- ndarray.op.Pooling
- ndarray.op.Pooling_v1
- ndarray.op.RNN
- ndarray.op.ROIPooling
- ndarray.op.Reshape
- ndarray.op.SVMOutput
- ndarray.op.SequenceLast
- ndarray.op.SequenceMask
- ndarray.op.SequenceReverse
- ndarray.op.SliceChannel
- ndarray.op.Softmax
- ndarray.op.SoftmaxActivation
- ndarray.op.SoftmaxOutput
- ndarray.op.SpatialTransformer
- ndarray.op.SwapAxis
- ndarray.op.UpSampling
- ndarray.op.abs
- ndarray.op.adam_update
- ndarray.op.add_n
- ndarray.op.all_finite
- ndarray.op.amp_cast
- ndarray.op.amp_multicast
- ndarray.op.arccos
- ndarray.op.arccosh
- ndarray.op.arcsin
- ndarray.op.arcsinh
- ndarray.op.arctan
- ndarray.op.arctanh
- ndarray.op.argmax
- ndarray.op.argmax_channel
- ndarray.op.argmin
- ndarray.op.argsort
- ndarray.op.batch_dot
- ndarray.op.batch_take
- ndarray.op.broadcast_add
- ndarray.op.broadcast_axes
- ndarray.op.broadcast_axis
- ndarray.op.broadcast_div
- ndarray.op.broadcast_equal
- ndarray.op.broadcast_greater
- ndarray.op.broadcast_greater_equal
- ndarray.op.broadcast_hypot
- ndarray.op.broadcast_lesser
- ndarray.op.broadcast_lesser_equal
- ndarray.op.broadcast_like
- ndarray.op.broadcast_logical_and
- ndarray.op.broadcast_logical_or
- ndarray.op.broadcast_logical_xor
- ndarray.op.broadcast_maximum
- ndarray.op.broadcast_minimum
- ndarray.op.broadcast_minus
- ndarray.op.broadcast_mod
- ndarray.op.broadcast_mul
- ndarray.op.broadcast_not_equal
- ndarray.op.broadcast_plus
- ndarray.op.broadcast_power
- ndarray.op.broadcast_sub
- ndarray.op.broadcast_to
- ndarray.op.cast
- ndarray.op.cast_storage
- ndarray.op.cbrt
- ndarray.op.ceil
- ndarray.op.choose_element_0index
- ndarray.op.clip
- ndarray.op.col2im
- ndarray.op.concat
- ndarray.op.cos
- ndarray.op.cosh
- ndarray.op.crop
- ndarray.op.ctc_loss
- ndarray.op.cumsum
- ndarray.op.degrees
- ndarray.op.depth_to_space
- ndarray.op.diag
- ndarray.op.dot
- ndarray.op.elemwise_add
- ndarray.op.elemwise_div
- ndarray.op.elemwise_mul
- ndarray.op.elemwise_sub
- ndarray.op.erf
- ndarray.op.erfinv
- ndarray.op.exp
- ndarray.op.expand_dims
- ndarray.op.expm1
- ndarray.op.fill_element_0index
- ndarray.op.fix
- ndarray.op.flatten
- ndarray.op.flip
- ndarray.op.floor
- ndarray.op.ftml_update
- ndarray.op.ftrl_update
- ndarray.op.gamma
- ndarray.op.gammaln
- ndarray.op.gather_nd
- ndarray.op.hard_sigmoid
- ndarray.op.identity
- ndarray.op.im2col
- ndarray.op.khatri_rao
- ndarray.op.lamb_update_phase1
- ndarray.op.lamb_update_phase2
- ndarray.op.linalg_det
- ndarray.op.linalg_extractdiag
- ndarray.op.linalg_extracttrian
- ndarray.op.linalg_gelqf
- ndarray.op.linalg_gemm
- ndarray.op.linalg_gemm2
- ndarray.op.linalg_inverse
- ndarray.op.linalg_makediag
- ndarray.op.linalg_maketrian
- ndarray.op.linalg_potrf
- ndarray.op.linalg_potri
- ndarray.op.linalg_slogdet
- ndarray.op.linalg_sumlogdiag
- ndarray.op.linalg_syrk
- ndarray.op.linalg_trmm
- ndarray.op.linalg_trsm
- ndarray.op.log
- ndarray.op.log10
- ndarray.op.log1p
- ndarray.op.log2
- ndarray.op.log_softmax
- ndarray.op.logical_not
- ndarray.op.make_loss
- ndarray.op.max
- ndarray.op.max_axis
- ndarray.op.mean
- ndarray.op.min
- ndarray.op.min_axis
- ndarray.op.moments
- ndarray.op.mp_lamb_update_phase1
- ndarray.op.mp_lamb_update_phase2
- ndarray.op.mp_nag_mom_update
- ndarray.op.mp_sgd_mom_update
- ndarray.op.mp_sgd_update
- ndarray.op.multi_all_finite
- ndarray.op.multi_lars
- ndarray.op.multi_mp_sgd_mom_update
- ndarray.op.multi_mp_sgd_update
- ndarray.op.multi_sgd_mom_update
- ndarray.op.multi_sgd_update
- ndarray.op.multi_sum_sq
- ndarray.op.nag_mom_update
- ndarray.op.nanprod
- ndarray.op.nansum
- ndarray.op.negative
- ndarray.op.norm
- ndarray.op.normal
- ndarray.op.one_hot
- ndarray.op.ones_like
- ndarray.op.pad
- ndarray.op.pick
- ndarray.op.preloaded_multi_mp_sgd_mom_update
- ndarray.op.preloaded_multi_mp_sgd_update
- ndarray.op.preloaded_multi_sgd_mom_update
- ndarray.op.preloaded_multi_sgd_update
- ndarray.op.prod
- ndarray.op.radians
- ndarray.op.random_exponential
- ndarray.op.random_gamma
- ndarray.op.random_generalized_negative_binomial
- ndarray.op.random_negative_binomial
- ndarray.op.random_normal
- ndarray.op.random_pdf_dirichlet
- ndarray.op.random_pdf_exponential
- ndarray.op.random_pdf_gamma
- ndarray.op.random_pdf_generalized_negative_binomial
- ndarray.op.random_pdf_negative_binomial
- ndarray.op.random_pdf_normal
- ndarray.op.random_pdf_poisson
- ndarray.op.random_pdf_uniform
- ndarray.op.random_poisson
- ndarray.op.random_randint
- ndarray.op.random_uniform
- ndarray.op.ravel_multi_index
- ndarray.op.rcbrt
- ndarray.op.reciprocal
- ndarray.op.relu
- ndarray.op.repeat
- ndarray.op.reset_arrays
- ndarray.op.reshape
- ndarray.op.reshape_like
- ndarray.op.reverse
- ndarray.op.rint
- ndarray.op.rmsprop_update
- ndarray.op.rmspropalex_update
- ndarray.op.round
- ndarray.op.rsqrt
- ndarray.op.sample_exponential
- ndarray.op.sample_gamma
- ndarray.op.sample_generalized_negative_binomial
- ndarray.op.sample_multinomial
- ndarray.op.sample_negative_binomial
- ndarray.op.sample_normal
- ndarray.op.sample_poisson
- ndarray.op.sample_uniform
- ndarray.op.scatter_nd
- ndarray.op.sgd_mom_update
- ndarray.op.sgd_update
- ndarray.op.shape_array
- ndarray.op.shuffle
- ndarray.op.sigmoid
- ndarray.op.sign
- ndarray.op.signsgd_update
- ndarray.op.signum_update
- ndarray.op.sin
- ndarray.op.sinh
- ndarray.op.size_array
- ndarray.op.slice
- ndarray.op.slice_axis
- ndarray.op.slice_like
- ndarray.op.smooth_l1
- ndarray.op.softmax
- ndarray.op.softmax_cross_entropy
- ndarray.op.softmin
- ndarray.op.softsign
- ndarray.op.sort
- ndarray.op.space_to_depth
- ndarray.op.split
- ndarray.op.sqrt
- ndarray.op.square
- ndarray.op.squeeze
- ndarray.op.stack
- ndarray.op.stop_gradient
- ndarray.op.sum
- ndarray.op.sum_axis
- ndarray.op.swapaxes
- ndarray.op.take
- ndarray.op.tan
- ndarray.op.tanh
- ndarray.op.tile
- ndarray.op.topk
- ndarray.op.transpose
- ndarray.op.trunc
- ndarray.op.uniform
- ndarray.op.unravel_index
- ndarray.op.where
- ndarray.op.zeros_like
-
- ndarray.random
- ndarray.random.uniform
- ndarray.random.normal
- ndarray.random.randn
- ndarray.random.poisson
- ndarray.random.exponential
- ndarray.random.gamma
- ndarray.random.multinomial
- ndarray.random.negative_binomial
- ndarray.random.generalized_negative_binomial
- ndarray.random.shuffle
- ndarray.random.randint
- ndarray.random.exponential_like
- ndarray.random.gamma_like
- ndarray.random.generalized_negative_binomial_like
- ndarray.random.negative_binomial_like
- ndarray.random.normal_like
- ndarray.random.poisson_like
- ndarray.random.uniform_like
- ndarray.register
-
- ndarray.sparse
- ndarray.sparse.csr_matrix
- ndarray.sparse.row_sparse_array
- ndarray.sparse.add
- ndarray.sparse.subtract
- ndarray.sparse.multiply
- ndarray.sparse.divide
- ndarray.sparse.ElementWiseSum
- ndarray.sparse.Embedding
- ndarray.sparse.FullyConnected
- ndarray.sparse.LinearRegressionOutput
- ndarray.sparse.LogisticRegressionOutput
- ndarray.sparse.MAERegressionOutput
- ndarray.sparse.abs
- ndarray.sparse.adagrad_update
- ndarray.sparse.adam_update
- ndarray.sparse.add_n
- ndarray.sparse.arccos
- ndarray.sparse.arccosh
- ndarray.sparse.arcsin
- ndarray.sparse.arcsinh
- ndarray.sparse.arctan
- ndarray.sparse.arctanh
- ndarray.sparse.broadcast_add
- ndarray.sparse.broadcast_div
- ndarray.sparse.broadcast_minus
- ndarray.sparse.broadcast_mul
- ndarray.sparse.broadcast_plus
- ndarray.sparse.broadcast_sub
- ndarray.sparse.cast_storage
- ndarray.sparse.cbrt
- ndarray.sparse.ceil
- ndarray.sparse.clip
- ndarray.sparse.concat
- ndarray.sparse.cos
- ndarray.sparse.cosh
- ndarray.sparse.degrees
- ndarray.sparse.dot
- ndarray.sparse.elemwise_add
- ndarray.sparse.elemwise_div
- ndarray.sparse.elemwise_mul
- ndarray.sparse.elemwise_sub
- ndarray.sparse.exp
- ndarray.sparse.expm1
- ndarray.sparse.fix
- ndarray.sparse.floor
- ndarray.sparse.ftrl_update
- ndarray.sparse.gamma
- ndarray.sparse.gammaln
- ndarray.sparse.log
- ndarray.sparse.log10
- ndarray.sparse.log1p
- ndarray.sparse.log2
- ndarray.sparse.make_loss
- ndarray.sparse.mean
- ndarray.sparse.negative
- ndarray.sparse.norm
- ndarray.sparse.radians
- ndarray.sparse.relu
- ndarray.sparse.retain
- ndarray.sparse.rint
- ndarray.sparse.round
- ndarray.sparse.rsqrt
- ndarray.sparse.sgd_mom_update
- ndarray.sparse.sgd_update
- ndarray.sparse.sigmoid
- ndarray.sparse.sign
- ndarray.sparse.sin
- ndarray.sparse.sinh
- ndarray.sparse.slice
- ndarray.sparse.sqrt
- ndarray.sparse.square
- ndarray.sparse.stop_gradient
- ndarray.sparse.sum
- ndarray.sparse.tan
- ndarray.sparse.tanh
- ndarray.sparse.trunc
- ndarray.sparse.where
- ndarray.sparse.zeros_like
- ndarray.sparse.BaseSparseNDArray
- ndarray.sparse.CSRNDArray
- ndarray.sparse.RowSparseNDArray
-
- gluon.Block
- gluon.Block.apply
- gluon.Block.cast
- gluon.Block.collect_params
- gluon.Block.forward
- gluon.Block.hybridize
- gluon.Block.initialize
- gluon.Block.load_parameters
- gluon.Block.load_params
- gluon.Block.name_scope
- gluon.Block.register_child
- gluon.Block.register_forward_hook
- gluon.Block.register_forward_pre_hook
- gluon.Block.register_op_hook
- gluon.Block.save_parameters
- gluon.Block.save_params
- gluon.Block.summary
-
- gluon.HybridBlock
- gluon.HybridBlock.apply
- gluon.HybridBlock.cast
- gluon.HybridBlock.collect_params
- gluon.HybridBlock.export
- gluon.HybridBlock.forward
- gluon.HybridBlock.hybrid_forward
- gluon.HybridBlock.hybridize
- gluon.HybridBlock.infer_shape
- gluon.HybridBlock.infer_type
- gluon.HybridBlock.initialize
- gluon.HybridBlock.load_parameters
- gluon.HybridBlock.load_params
- gluon.HybridBlock.name_scope
- gluon.HybridBlock.optimize_for
- gluon.HybridBlock.register_child
- gluon.HybridBlock.register_forward_hook
- gluon.HybridBlock.register_forward_pre_hook
- gluon.HybridBlock.register_op_hook
- gluon.HybridBlock.save_parameters
- gluon.HybridBlock.save_params
- gluon.HybridBlock.summary
-
- gluon.SymbolBlock
- gluon.SymbolBlock.apply
- gluon.SymbolBlock.cast
- gluon.SymbolBlock.collect_params
- gluon.SymbolBlock.export
- gluon.SymbolBlock.forward
- gluon.SymbolBlock.hybrid_forward
- gluon.SymbolBlock.hybridize
- gluon.SymbolBlock.imports
- gluon.SymbolBlock.infer_shape
- gluon.SymbolBlock.infer_type
- gluon.SymbolBlock.initialize
- gluon.SymbolBlock.load_parameters
- gluon.SymbolBlock.load_params
- gluon.SymbolBlock.name_scope
- gluon.SymbolBlock.optimize_for
- gluon.SymbolBlock.register_child
- gluon.SymbolBlock.register_forward_hook
- gluon.SymbolBlock.register_forward_pre_hook
- gluon.SymbolBlock.register_op_hook
- gluon.SymbolBlock.save_parameters
- gluon.SymbolBlock.save_params
- gluon.SymbolBlock.summary
-
- gluon.Constant
- gluon.Constant.cast
- gluon.Constant.data
- gluon.Constant.grad
- gluon.Constant.initialize
- gluon.Constant.list_ctx
- gluon.Constant.list_data
- gluon.Constant.list_grad
- gluon.Constant.list_row_sparse_data
- gluon.Constant.reset_ctx
- gluon.Constant.row_sparse_data
- gluon.Constant.set_data
- gluon.Constant.var
- gluon.Constant.zero_grad
-
- gluon.Parameter
- gluon.Parameter.cast
- gluon.Parameter.data
- gluon.Parameter.grad
- gluon.Parameter.initialize
- gluon.Parameter.list_ctx
- gluon.Parameter.list_data
- gluon.Parameter.list_grad
- gluon.Parameter.list_row_sparse_data
- gluon.Parameter.reset_ctx
- gluon.Parameter.row_sparse_data
- gluon.Parameter.set_data
- gluon.Parameter.var
- gluon.Parameter.zero_grad
-
- gluon.ParameterDict
- gluon.ParameterDict.get
- gluon.ParameterDict.get_constant
- gluon.ParameterDict.initialize
- gluon.ParameterDict.list_ctx
- gluon.ParameterDict.load
- gluon.ParameterDict.load_dict
- gluon.ParameterDict.reset_ctx
- gluon.ParameterDict.save
- gluon.ParameterDict.setattr
- gluon.ParameterDict.update
- gluon.ParameterDict.zero_grad
- gluon.contrib
-
- gluon.data
- gluon.data.vision.datasets
- gluon.data.vision.transforms
- gluon.data.Dataset
- gluon.data.ArrayDataset
- gluon.data.RecordFileDataset
- gluon.data.SimpleDataset
- gluon.data.BatchSampler
- gluon.data.DataLoader
- gluon.data.FilterSampler
- gluon.data.RandomSampler
- gluon.data.Sampler
- gluon.data.SequentialSampler
-
- gluon.loss
- gluon.loss.Loss
- gluon.loss.L2Loss
- gluon.loss.L1Loss
- gluon.loss.SigmoidBinaryCrossEntropyLoss
- gluon.loss.SigmoidBCELoss
- gluon.loss.SoftmaxCrossEntropyLoss
- gluon.loss.SoftmaxCELoss
- gluon.loss.KLDivLoss
- gluon.loss.CTCLoss
- gluon.loss.HuberLoss
- gluon.loss.HingeLoss
- gluon.loss.SquaredHingeLoss
- gluon.loss.LogisticLoss
- gluon.loss.TripletLoss
- gluon.loss.PoissonNLLLoss
- gluon.loss.CosineEmbeddingLoss
- gluon.loss.SDMLLoss
- gluon.nn
- gluon.rnn
- initializer
- initializer.Bilinear
- initializer.Constant
- initializer.FusedRNN
- initializer.InitDesc
- initializer.Initializer
- initializer.LSTMBias
- initializer.Load
- initializer.MSRAPrelu
- initializer.Mixed
- initializer.Normal
- initializer.One
- initializer.Orthogonal
- initializer.Uniform
- initializer.Xavier
- initializer.Zero
- optimizer
- optimizer.AdaDelta
- optimizer.AdaGrad
- optimizer.Adam
- optimizer.Adamax
- optimizer.DCASGD
- optimizer.FTML
- optimizer.Ftrl
- optimizer.LARS
- optimizer.LBSGD
- optimizer.NAG
- optimizer.Nadam
- optimizer.Optimizer
- optimizer.RMSProp
- optimizer.SGD
- optimizer.SGLD
- optimizer.Signum
- optimizer.LAMB
- optimizer.Test
- optimizer.Updater
- optimizer.ccSGD
- metric
- metric.Accuracy
- metric.Caffe
- metric.CompositeEvalMetric
- metric.CrossEntropy
- metric.CustomMetric
- metric.EvalMetric
- metric.F1
- metric.Loss
- metric.MAE
- metric.MCC
- metric.MSE
- metric.NegativeLogLikelihood
- metric.PCC
- metric.PearsonCorrelation
- metric.Perplexity
- metric.RMSE
- metric.TopKAccuracy
- metric.Torch
- symbol
-
- symbol.contrib
- symbol.contrib.rand_zipfian
- symbol.contrib.foreach
- symbol.contrib.while_loop
- symbol.contrib.cond
- symbol.contrib.AdaptiveAvgPooling2D
- symbol.contrib.BilinearResize2D
- symbol.contrib.CTCLoss
- symbol.contrib.DeformableConvolution
- symbol.contrib.DeformablePSROIPooling
- symbol.contrib.ModulatedDeformableConvolution
- symbol.contrib.MultiBoxDetection
- symbol.contrib.MultiBoxPrior
- symbol.contrib.MultiBoxTarget
- symbol.contrib.MultiProposal
- symbol.contrib.PSROIPooling
- symbol.contrib.Proposal
- symbol.contrib.ROIAlign
- symbol.contrib.RROIAlign
- symbol.contrib.SparseEmbedding
- symbol.contrib.SyncBatchNorm
- symbol.contrib.allclose
- symbol.contrib.arange_like
- symbol.contrib.backward_gradientmultiplier
- symbol.contrib.backward_hawkesll
- symbol.contrib.backward_index_copy
- symbol.contrib.backward_quadratic
- symbol.contrib.bipartite_matching
- symbol.contrib.boolean_mask
- symbol.contrib.box_decode
- symbol.contrib.box_encode
- symbol.contrib.box_iou
- symbol.contrib.box_nms
- symbol.contrib.box_non_maximum_suppression
- symbol.contrib.calibrate_entropy
- symbol.contrib.count_sketch
- symbol.contrib.ctc_loss
- symbol.contrib.dequantize
- symbol.contrib.dgl_adjacency
- symbol.contrib.dgl_csr_neighbor_non_uniform_sample
- symbol.contrib.dgl_csr_neighbor_uniform_sample
- symbol.contrib.dgl_graph_compact
- symbol.contrib.dgl_subgraph
- symbol.contrib.div_sqrt_dim
- symbol.contrib.edge_id
- symbol.contrib.fft
- symbol.contrib.getnnz
- symbol.contrib.gradientmultiplier
- symbol.contrib.group_adagrad_update
- symbol.contrib.hawkesll
- symbol.contrib.ifft
- symbol.contrib.index_array
- symbol.contrib.index_copy
- symbol.contrib.interleaved_matmul_encdec_qk
- symbol.contrib.interleaved_matmul_encdec_valatt
- symbol.contrib.interleaved_matmul_selfatt_qk
- symbol.contrib.interleaved_matmul_selfatt_valatt
- symbol.contrib.quadratic
- symbol.contrib.quantize
- symbol.contrib.quantize_v2
- symbol.contrib.quantized_act
- symbol.contrib.quantized_batch_norm
- symbol.contrib.quantized_concat
- symbol.contrib.quantized_conv
- symbol.contrib.quantized_elemwise_add
- symbol.contrib.quantized_elemwise_mul
- symbol.contrib.quantized_embedding
- symbol.contrib.quantized_flatten
- symbol.contrib.quantized_fully_connected
- symbol.contrib.quantized_pooling
- symbol.contrib.requantize
- symbol.contrib.round_ste
- symbol.contrib.sign_ste
-
- symbol.image
- symbol.image.adjust_lighting
- symbol.image.crop
- symbol.image.flip_left_right
- symbol.image.flip_top_bottom
- symbol.image.normalize
- symbol.image.random_brightness
- symbol.image.random_color_jitter
- symbol.image.random_contrast
- symbol.image.random_flip_left_right
- symbol.image.random_flip_top_bottom
- symbol.image.random_hue
- symbol.image.random_lighting
- symbol.image.random_saturation
- symbol.image.resize
- symbol.image.to_tensor
-
- symbol.linalg
- symbol.linalg.det
- symbol.linalg.extractdiag
- symbol.linalg.extracttrian
- symbol.linalg.gelqf
- symbol.linalg.gemm
- symbol.linalg.gemm2
- symbol.linalg.inverse
- symbol.linalg.makediag
- symbol.linalg.maketrian
- symbol.linalg.potrf
- symbol.linalg.potri
- symbol.linalg.slogdet
- symbol.linalg.sumlogdiag
- symbol.linalg.syevd
- symbol.linalg.syrk
- symbol.linalg.trmm
- symbol.linalg.trsm
-
- symbol.op
- symbol.op.Activation
- symbol.op.BatchNorm
- symbol.op.BatchNorm_v1
- symbol.op.BilinearSampler
- symbol.op.BlockGrad
- symbol.op.CTCLoss
- symbol.op.Cast
- symbol.op.Concat
- symbol.op.Convolution
- symbol.op.Convolution_v1
- symbol.op.Correlation
- symbol.op.Crop
- symbol.op.Custom
- symbol.op.Deconvolution
- symbol.op.Dropout
- symbol.op.ElementWiseSum
- symbol.op.Embedding
- symbol.op.Flatten
- symbol.op.FullyConnected
- symbol.op.GridGenerator
- symbol.op.GroupNorm
- symbol.op.IdentityAttachKLSparseReg
- symbol.op.InstanceNorm
- symbol.op.L2Normalization
- symbol.op.LRN
- symbol.op.LayerNorm
- symbol.op.LeakyReLU
- symbol.op.LinearRegressionOutput
- symbol.op.LogisticRegressionOutput
- symbol.op.MAERegressionOutput
- symbol.op.MakeLoss
- symbol.op.Pad
- symbol.op.Pooling
- symbol.op.Pooling_v1
- symbol.op.RNN
- symbol.op.ROIPooling
- symbol.op.Reshape
- symbol.op.SVMOutput
- symbol.op.SequenceLast
- symbol.op.SequenceMask
- symbol.op.SequenceReverse
- symbol.op.SliceChannel
- symbol.op.Softmax
- symbol.op.SoftmaxActivation
- symbol.op.SoftmaxOutput
- symbol.op.SpatialTransformer
- symbol.op.SwapAxis
- symbol.op.UpSampling
- symbol.op.abs
- symbol.op.adam_update
- symbol.op.add_n
- symbol.op.all_finite
- symbol.op.amp_cast
- symbol.op.amp_multicast
- symbol.op.arccos
- symbol.op.arccosh
- symbol.op.arcsin
- symbol.op.arcsinh
- symbol.op.arctan
- symbol.op.arctanh
- symbol.op.argmax
- symbol.op.argmax_channel
- symbol.op.argmin
- symbol.op.argsort
- symbol.op.batch_dot
- symbol.op.batch_take
- symbol.op.broadcast_add
- symbol.op.broadcast_axes
- symbol.op.broadcast_axis
- symbol.op.broadcast_div
- symbol.op.broadcast_equal
- symbol.op.broadcast_greater
- symbol.op.broadcast_greater_equal
- symbol.op.broadcast_hypot
- symbol.op.broadcast_lesser
- symbol.op.broadcast_lesser_equal
- symbol.op.broadcast_like
- symbol.op.broadcast_logical_and
mxnet.module
A module is like a FeedForward model, but designed to be easier to compose, similar to Torch modules.
Classes

- BaseModule: The base class of a module.
- BucketingModule: This module helps to deal efficiently with varying-length inputs.
- Module: A basic module that wraps a Symbol.
- PythonLossModule: A convenient module class that implements many of the module APIs as empty functions.
- PythonModule: A convenient module class that implements many of the module APIs as empty functions.
- SequentialModule: A container module that can chain multiple modules together.
-
class mxnet.module.BaseModule(logger=<module 'logging'>)

Bases: object

The base class of a module.
A module represents a computation component. One can think of module as a computation machine. A module can execute forward and backward passes and update parameters in a model. We aim to make the APIs easy to use, especially in the case when we need to use the imperative API to work with multiple modules (e.g. stochastic depth network).
A module has several states:
Initial state: Memory is not allocated yet, so the module is not ready for computation yet.
Bound: Shapes for inputs, outputs, and parameters are all known, memory has been allocated, and the module is ready for computation.
Parameters are initialized: For modules with parameters, doing computation before initializing the parameters might result in undefined outputs.
Optimizer is installed: An optimizer can be installed to a module. After this, the parameters of the module can be updated according to the optimizer after gradients are computed (forward-backward).
Methods

- backward([out_grads]): Backward computation.
- bind(data_shapes[, label_shapes, …]): Binds the symbols to construct executors.
- fit(train_data[, eval_data, eval_metric, …]): Trains the module parameters.
- forward(data_batch[, is_train]): Forward computation.
- forward_backward(data_batch): A convenient function that calls both forward and backward.
- get_input_grads([merge_multi_context]): Gets the gradients to the inputs, computed in the previous backward computation.
- get_outputs([merge_multi_context]): Gets outputs of the previous forward computation.
- get_params(): Gets parameters, which are potentially copies of the actual parameters used to do computation on the device.
- get_states([merge_multi_context]): Gets states from all devices.
- init_optimizer([kvstore, optimizer, …]): Installs and initializes optimizers, as well as initializing the kvstore for distributed training.
- init_params([initializer, arg_params, …]): Initializes the parameters and auxiliary states.
- install_monitor(mon): Installs a monitor on all executors.
- iter_predict(eval_data[, num_batch, reset, …]): Iterates over predictions.
- load_params(fname): Loads model parameters from file.
- predict(eval_data[, num_batch, …]): Runs prediction and collects the outputs.
- prepare(data_batch[, sparse_row_id_fn]): Prepares the module for processing a data batch.
- save_params(fname): Saves model parameters to file.
- score(eval_data, eval_metric[, num_batch, …]): Runs prediction on eval_data and evaluates the performance according to the given eval_metric.
- set_params(arg_params, aux_params[, …]): Assigns parameter and aux state values.
- set_states([states, value]): Sets values for states.
- update(): Updates parameters according to the installed optimizer and the gradients computed in the previous forward-backward batch.
- update_metric(eval_metric, labels[, pre_sliced]): Evaluates and accumulates evaluation metric on outputs of the last forward computation.
Attributes

- data_names: A list of names for data required by this module.
- data_shapes: A list of (name, shape) pairs specifying the data inputs to this module.
- label_shapes: A list of (name, shape) pairs specifying the label inputs to this module.
- output_names: A list of names for the outputs of this module.
- output_shapes: A list of (name, shape) pairs specifying the outputs of this module.
- symbol: Gets the symbol associated with this module.
In order for a module to interact with others, it must be able to report the following information in its initial state (before binding):
data_names: list of type string indicating the names of the required input data.
output_names: list of type string indicating the names of the required outputs.
After binding, a module should be able to report the following richer information:
- state information
binded: bool, indicates whether the memory buffers needed for computation have been allocated.
for_training: whether the module is bound for training.
params_initialized: bool, indicates whether the parameters of this module have been initialized.
optimizer_initialized: bool, indicates whether an optimizer is defined and initialized.
inputs_need_grad: bool, indicates whether gradients with respect to the input data are needed. Might be useful when implementing composition of modules.
- input/output information
data_shapes: a list of (name, shape). In theory, since the memory is allocated, we could directly provide the data arrays. But in the case of data parallelism, the data arrays might not be of the same shape as viewed from the external world.
label_shapes: a list of (name, shape). This might be [] if the module does not need labels (e.g. it does not contain a loss function at the top), or if the module is not bound for training.
output_shapes: a list of (name, shape) for outputs of the module.
- parameters (for modules with parameters)
get_params(): returns a tuple (arg_params, aux_params), each of which is a dictionary mapping parameter names to NDArray values. Those NDArrays always live on the CPU, while the actual parameters used for computation might live on other devices (GPUs); this function retrieves (a copy of) the latest parameters.
set_params(arg_params, aux_params): assigns parameters to the devices doing the computation.
init_params(...): a more flexible interface to assign or initialize the parameters.
- setup
bind(): prepare environment for computation.
init_optimizer(): install optimizer for parameter updating.
prepare(): prepare the module based on the current data batch.
- computation
forward(data_batch): forward operation.
backward(out_grads=None): backward operation.
update(): update parameters according to installed optimizer.
get_outputs(): get outputs of the previous forward operation.
get_input_grads(): get the gradients with respect to the inputs computed in the previous backward operation.
update_metric(metric, labels, pre_sliced=False): update performance metric for the previous forward computed results.
- other properties (mostly for backward compatibility)
symbol: the underlying symbolic graph for this module (if any) This property is not necessarily constant. For example, for BucketingModule, this property is simply the current symbol being used. For other modules, this value might not be well defined.
When those intermediate-level APIs are implemented properly, the following high-level APIs become automatically available for a module:
fit: train the module parameters on a data set.
predict: run prediction on a data set and collect outputs.
score: run prediction on a data set and evaluate performance.
Examples

>>> # An example of creating an mxnet module.
>>> import mxnet as mx
>>> data = mx.symbol.Variable('data')
>>> fc1 = mx.symbol.FullyConnected(data, name='fc1', num_hidden=128)
>>> act1 = mx.symbol.Activation(fc1, name='relu1', act_type="relu")
>>> fc2 = mx.symbol.FullyConnected(act1, name='fc2', num_hidden=64)
>>> act2 = mx.symbol.Activation(fc2, name='relu2', act_type="relu")
>>> fc3 = mx.symbol.FullyConnected(act2, name='fc3', num_hidden=10)
>>> out = mx.symbol.SoftmaxOutput(fc3, name='softmax')
>>> mod = mx.mod.Module(out)
-
backward(out_grads=None)

Backward computation.

- Parameters

out_grads (NDArray or list of NDArray, optional) – Gradient on the outputs to be propagated back. This parameter is only needed when bind is called on outputs that are not a loss function.

Examples

>>> # An example of backward computation.
>>> mod.backward()
>>> print(mod.get_input_grads()[0].asnumpy())
[[[ 1.10182791e-05  5.12257748e-06  4.01927764e-06  8.32566820e-06
   -1.59775993e-06  7.24269375e-06  7.28067835e-06 -1.65902311e-05
    5.46342608e-06  8.44196393e-07]
   ...]]
-
bind(data_shapes, label_shapes=None, for_training=True, inputs_need_grad=False, force_rebind=False, shared_module=None, grad_req='write')

Binds the symbols to construct executors. This is necessary before one can perform computation with the module.
- Parameters

data_shapes (list of (str, tuple) or DataDesc objects) – Typically data_iter.provide_data. Can also be a list of (data name, data shape) pairs.

label_shapes (list of (str, tuple) or DataDesc objects) – Typically data_iter.provide_label. Can also be a list of (label name, label shape) pairs.

for_training (bool) – Default is True. Whether the executors should be bound for training.

inputs_need_grad (bool) – Default is False. Whether the gradients to the input data need to be computed. Typically this is not needed, but it might be when implementing composition of modules.

force_rebind (bool) – Default is False. This function does nothing if the executors are already bound, but when this is True, the executors will be forced to rebind.

shared_module (Module) – Default is None. This is used in bucketing. When not None, the shared module essentially corresponds to a different bucket – a module with a different symbol but with the same sets of parameters (e.g. unrolled RNNs with different lengths).

grad_req (str, list of str, or dict of str to str) – Requirement for gradient accumulation. Can be 'write', 'add', or 'null' (default is 'write'). Can be specified globally (str) or for each argument (list, dict).
Examples

>>> # An example of binding symbols.
>>> mod.bind(data_shapes=[('data', (1, 10, 10))])
>>> # Assume train_iter is already created.
>>> mod.bind(data_shapes=train_iter.provide_data, label_shapes=train_iter.provide_label)
-
property data_names

A list of names for data required by this module.

-
property data_shapes

A list of (name, shape) pairs specifying the data inputs to this module.
-
fit(train_data, eval_data=None, eval_metric='acc', epoch_end_callback=None, batch_end_callback=None, kvstore='local', optimizer='sgd', optimizer_params=(('learning_rate', 0.01),), eval_end_callback=None, eval_batch_end_callback=None, initializer=<mxnet.initializer.Uniform object>, arg_params=None, aux_params=None, allow_missing=False, force_rebind=False, force_init=False, begin_epoch=0, num_epoch=None, validation_metric=None, monitor=None, sparse_row_id_fn=None)

Trains the module parameters.

Check out the Module Tutorial to see an end-to-end use case.
- Parameters

train_data (DataIter) – Train DataIter.

eval_data (DataIter) – If not None, it will be used as the validation set, and the performance after each epoch will be evaluated.

eval_metric (str or EvalMetric) – Defaults to 'accuracy'. The performance measure used to display during training. Other possible predefined metrics are: 'ce' (CrossEntropy), 'f1', 'mae', 'mse', 'rmse', 'top_k_accuracy'.

epoch_end_callback (function or list of functions) – Each callback will be called with the current epoch, symbol, arg_params and aux_params.

batch_end_callback (function or list of functions) – Each callback will be called with a BatchEndParam.

kvstore (str or KVStore) – Defaults to 'local'.

optimizer (str or Optimizer) – Defaults to 'sgd'.

optimizer_params (dict) – Defaults to (('learning_rate', 0.01),). The parameters for the optimizer constructor. The default value is not a dict, just to avoid a pylint warning on dangerous default values.

eval_end_callback (function or list of functions) – These will be called at the end of each full evaluation, with the metrics over the entire evaluation set.

eval_batch_end_callback (function or list of functions) – These will be called at the end of each mini-batch during evaluation.

initializer (Initializer) – The initializer is called to initialize the module parameters when they are not already initialized.

arg_params (dict) – Defaults to None. If not None, it should contain existing parameters from a trained model or loaded from a checkpoint (previously saved model). In this case, the values here will be used to initialize the module parameters, unless they are already initialized by the user via a call to init_params or fit. arg_params has higher priority than initializer.

aux_params (dict) – Defaults to None. Similar to arg_params, except for auxiliary states.

allow_missing (bool) – Defaults to False. Indicates whether to allow missing parameters when arg_params and aux_params are not None. If this is True, the missing parameters will be initialized via the initializer.

force_rebind (bool) – Defaults to False. Whether to force rebinding the executors if already bound.

force_init (bool) – Defaults to False. Indicates whether to force initialization even if the parameters are already initialized.

begin_epoch (int) – Defaults to 0. Indicates the starting epoch. Usually, if resumed from a checkpoint saved at epoch N of a previous training phase, this value should be N+1.

num_epoch (int) – Number of epochs for training.

sparse_row_id_fn (a callback function) – The function takes data_batch as an input and returns a dict of str -> NDArray. The resulting dict is used for pulling row_sparse parameters from the kvstore, where the str key is the name of the param, and the value is the row id of the param to pull.
Examples

>>> # An example of using fit for training.
>>> # Assume training dataIter and validation dataIter are ready
>>> # Assume loading a previously checkpointed model
>>> sym, arg_params, aux_params = mx.model.load_checkpoint(model_prefix, 3)
>>> mod.fit(train_data=train_dataiter, eval_data=val_dataiter, optimizer='sgd',
...         optimizer_params={'learning_rate': 0.01, 'momentum': 0.9},
...         arg_params=arg_params, aux_params=aux_params,
...         eval_metric='acc', num_epoch=10, begin_epoch=3)
-
forward(data_batch, is_train=None)

Forward computation. It supports data batches with different shapes, such as different batch sizes or different image sizes. If reshaping the data batch entails modifying the symbol or module, such as changing image layout ordering or switching from training to predicting, module rebinding is required.
- Parameters

data_batch (DataBatch) – Could be anything with a similar API implemented.

is_train (bool) – Default is None, which means is_train takes the value of self.for_training.
Examples

>>> import mxnet as mx
>>> from collections import namedtuple
>>> Batch = namedtuple('Batch', ['data'])
>>> data = mx.sym.Variable('data')
>>> out = data * 2
>>> mod = mx.mod.Module(symbol=out, label_names=None)
>>> mod.bind(data_shapes=[('data', (1, 10))])
>>> mod.init_params()
>>> data1 = [mx.nd.ones((1, 10))]
>>> mod.forward(Batch(data1))
>>> print(mod.get_outputs()[0].asnumpy())
[[ 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]]
>>> # Forward with a data batch of different shape
>>> data2 = [mx.nd.ones((3, 5))]
>>> mod.forward(Batch(data2))
>>> print(mod.get_outputs()[0].asnumpy())
[[ 2. 2. 2. 2. 2.]
 [ 2. 2. 2. 2. 2.]
 [ 2. 2. 2. 2. 2.]]
-
get_input_grads(merge_multi_context=True)

Gets the gradients to the inputs, computed in the previous backward computation.

If merge_multi_context is True, the output is like [grad1, grad2]. Otherwise, it is like [[grad1_dev1, grad1_dev2], [grad2_dev1, grad2_dev2]]. All the output elements have type NDArray. When merge_multi_context is False, those NDArray instances might live on different devices.

- Parameters

merge_multi_context (bool) – Defaults to True. In the case when data-parallelism is used, the gradients will be collected from multiple devices. A True value indicates that we should merge the collected results so that they look like output from a single executor.

- Returns

Input gradients.

- Return type

list of NDArray or list of list of NDArray

Examples

>>> # An example of getting input gradients.
>>> print(mod.get_input_grads()[0].asnumpy())
[[[ 1.10182791e-05  5.12257748e-06  4.01927764e-06  8.32566820e-06
   -1.59775993e-06  7.24269375e-06  7.28067835e-06 -1.65902311e-05
    5.46342608e-06  8.44196393e-07]
   ...]]
-
get_outputs
(merge_multi_context=True)[source]¶ Gets outputs of the previous forward computation.
If merge_multi_context is True, it is like [out1, out2]. Otherwise, it returns output of the form [[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]]. All the output elements have type NDArray. When merge_multi_context is False, those NDArray instances might live on different devices.
- Parameters
merge_multi_context (bool) – Defaults to True. In the case when data-parallelism is used, the outputs will be collected from multiple devices. A True value indicates that the collected results should be merged so that they look as if they came from a single executor.
- Returns
Output
- Return type
list of NDArray or list of list of NDArray.
Examples
>>> # An example of getting forward output.
>>> print(mod.get_outputs()[0].asnumpy())
[[ 0.09999977  0.10000153  0.10000716  0.10000195  0.09999853  0.09999743
   0.10000272  0.10000113  0.09999088  0.09999888]]
-
get_params
()[source]¶ Gets parameters; these are potentially copies of the actual parameters used to do computation on the device.
- Returns
A pair of dictionaries each mapping parameter names to NDArray values.
- Return type
(arg_params, aux_params)
Examples
>>> # An example of getting module parameters.
>>> print(mod.get_params())
({'fc2_weight': <NDArray 64x128 @cpu(0)>, 'fc1_weight': <NDArray 128x100 @cpu(0)>,
  'fc3_bias': <NDArray 10 @cpu(0)>, 'fc3_weight': <NDArray 10x64 @cpu(0)>,
  'fc2_bias': <NDArray 64 @cpu(0)>, 'fc1_bias': <NDArray 128 @cpu(0)>}, {})
-
get_states
(merge_multi_context=True)[source]¶ Gets states from all devices.
If merge_multi_context is True, returns output of the form [out1, out2]. Otherwise, it returns output of the form [[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]]. All output elements are NDArray.
- Parameters
merge_multi_context (bool) – Defaults to True. In the case when data-parallelism is used, the states will be collected from multiple devices. A True value indicates that the collected results should be merged so that they look as if they came from a single executor.
- Returns
- Return type
A list of NDArray or a list of list of NDArray.
-
init_optimizer
(kvstore='local', optimizer='sgd', optimizer_params=(('learning_rate', 0.01), ), force_init=False)[source]¶ Installs and initializes optimizers, as well as initializes the kvstore for distributed training.
- Parameters
kvstore (str or KVStore) – Defaults to ‘local’.
optimizer (str or Optimizer) – Defaults to ‘sgd’.
optimizer_params (dict) – Defaults to (('learning_rate', 0.01),). The default value is not a dictionary, just to avoid a pylint warning about dangerous default values.
force_init (bool) – Defaults to False; indicates whether to force re-initializing an optimizer if it is already installed.
Examples
>>> # An example of initializing optimizer.
>>> mod.init_optimizer(optimizer='sgd', optimizer_params=(('learning_rate', 0.005),))
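As the parameter description notes, optimizer_params is a tuple of (name, value) pairs only to avoid a mutable default argument; it is equivalent to a dict of the same pairs. A minimal plain-Python sketch of that equivalence:

```python
# optimizer_params uses a tuple of pairs as its default purely to avoid
# a mutable default argument; converting it to a dict recovers the
# keyword arguments the optimizer actually receives.
default_params = (('learning_rate', 0.01),)
as_dict = dict(default_params)
print(as_dict)  # {'learning_rate': 0.01}
```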
-
init_params
(initializer=<mxnet.initializer.Uniform object>, arg_params=None, aux_params=None, allow_missing=False, force_init=False, allow_extra=False)[source]¶ Initializes the parameters and auxiliary states.
- Parameters
initializer (Initializer) – Called to initialize parameters if needed.
arg_params (dict) – If not None, should be a dictionary of existing arg_params. Initialization will be copied from it.
aux_params (dict) – If not None, should be a dictionary of existing aux_params. Initialization will be copied from it.
allow_missing (bool) – If True, params could contain missing values, and the initializer will be called to fill those missing params.
force_init (bool) – If True, forces re-initialization even if the parameters are already initialized.
allow_extra (boolean, optional) – Whether to allow extra parameters that are not needed by the symbol. If this is True, no error will be thrown when arg_params or aux_params contain extra parameters that are not needed by the executor.
Examples
>>> # An example of initializing module parameters.
>>> mod.init_params()
-
iter_predict
(eval_data, num_batch=None, reset=True, sparse_row_id_fn=None)[source]¶ Iterates over predictions.
Examples
>>> for pred, i_batch, batch in module.iter_predict(eval_data):
...     # pred is a list of outputs from the module
...     # i_batch is an integer
...     # batch is the data batch from the data iterator
- Parameters
eval_data (DataIter) – Evaluation data to run prediction on.
num_batch (int) – Default is None, indicating running all the batches in the data iterator.
reset (bool) – Default is True, indicating whether we should reset the data iterator before starting prediction.
sparse_row_id_fn (A callback function) – The function takes data_batch as an input and returns a dict of str -> NDArray. The resulting dict is used for pulling row_sparse parameters from the kvstore, where the str key is the name of the param, and the value is the row id of the param to pull.
-
property
label_shapes
¶ A list of (name, shape) pairs specifying the label inputs to this module. If this module does not accept labels (either it is a module without a loss function, or it is not bound for training), then this should return an empty list [].
-
load_params
(fname)[source]¶ Loads model parameters from file.
- Parameters
fname (str) – Path to input param file.
Examples
>>> # An example of loading module parameters.
>>> mod.load_params('myfile')
-
property
output_names
¶ A list of names for the outputs of this module.
-
property
output_shapes
¶ A list of (name, shape) pairs specifying the outputs of this module.
-
predict
(eval_data, num_batch=None, merge_batches=True, reset=True, always_output_list=False, sparse_row_id_fn=None)[source]¶ Runs prediction and collects the outputs.
When merge_batches is True (the default), the return value will be a list [out1, out2, out3], where each element is formed by concatenating the outputs for all the mini-batches. When always_output_list is False (the default), then in the case of a single output, out1 is returned instead of [out1].
When merge_batches is False, the return value will be a nested list like [[out1_batch1, out2_batch1], [out1_batch2], ...]. This mode is useful because in some cases (e.g. bucketing), the module does not necessarily produce the same number of outputs.
The objects in the results have type NDArray. If you need to work with a numpy array, just call .asnumpy() on each NDArray.
- Parameters
eval_data (DataIter or NDArray or numpy array) – Evaluation data to run prediction on.
num_batch (int) – Defaults to None, indicating running all the batches in the data iterator.
merge_batches (bool) – Defaults to True; see above for return values.
reset (bool) – Defaults to True, indicating whether we should reset the data iterator before doing prediction.
always_output_list (bool) – Defaults to False; see above for return values.
sparse_row_id_fn (A callback function) – The function takes data_batch as an input and returns a dict of str -> NDArray. The resulting dict is used for pulling row_sparse parameters from the kvstore, where the str key is the name of the param, and the value is the row id of the param to pull.
- Returns
Prediction results.
- Return type
list of NDArray or list of list of NDArray
Examples
>>> # An example of using `predict` for prediction.
>>> # Predict on the first 10 batches of val_dataiter
>>> mod.predict(eval_data=val_dataiter, num_batch=10)
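The merge_batches and always_output_list semantics described above can be sketched with plain lists in place of NDArrays (collect is a hypothetical helper, not MXNet API): per-batch outputs are concatenated output-by-output, and a lone output is unwrapped from its list unless always_output_list is set.

```python
# Sketch of predict's merge_batches/always_output_list handling.
def collect(batch_outputs, merge_batches=True, always_output_list=False):
    if not merge_batches:
        return batch_outputs            # [[out1_b1, ...], [out1_b2, ...], ...]
    num_outputs = len(batch_outputs[0])
    # concatenate the i-th output across all batches
    merged = [sum((b[i] for b in batch_outputs), []) for i in range(num_outputs)]
    if num_outputs == 1 and not always_output_list:
        return merged[0]                # single output: out1, not [out1]
    return merged

per_batch = [[[1, 2]], [[3, 4]]]        # one output per batch, two batches
print(collect(per_batch))               # [1, 2, 3, 4]
print(collect(per_batch, always_output_list=True))  # [[1, 2, 3, 4]]
```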
-
prepare
(data_batch, sparse_row_id_fn=None)[source]¶ Prepares the module for processing a data batch.
Usually involves switching bucket and reshaping. For modules that contain row_sparse parameters in KVStore, it prepares the row_sparse parameters based on the sparse_row_id_fn.
When KVStore is used to update parameters for multi-device or multi-machine training, a copy of the parameters is stored in KVStore. Note that for row_sparse parameters, update() updates the copy of the parameters in KVStore, but doesn’t broadcast the updated parameters to all devices / machines. The prepare function is used to broadcast row_sparse parameters with the next batch of data.
- Parameters
data_batch (DataBatch) – The current batch of data for forward computation.
sparse_row_id_fn (A callback function) – The function takes data_batch as an input and returns a dict of str -> NDArray. The resulting dict is used for pulling row_sparse parameters from the kvstore, where the str key is the name of the param, and the value is the row id of the param to pull.
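The sparse_row_id_fn contract is easy to sketch in plain Python (the parameter name 'embed_weight' and the batch format are hypothetical): given a batch, return, per row_sparse parameter, the row ids the batch will touch, so only those rows need to be pulled from the kvstore.

```python
# Hypothetical sparse_row_id_fn: for a batch of integer token ids,
# report the unique embedding rows that the forward pass will read,
# keyed by the (assumed) row_sparse parameter name.
def sparse_row_id_fn(data_batch):
    row_ids = sorted(set(data_batch))
    return {'embed_weight': row_ids}   # param name -> row ids to pull

print(sparse_row_id_fn([3, 1, 3, 7]))  # {'embed_weight': [1, 3, 7]}
```

In real use the returned values would be NDArrays of row ids rather than Python lists, but the shape of the contract (dict of str -> row ids) is the same.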
-
save_params
(fname)[source]¶ Saves model parameters to file.
- Parameters
fname (str) – Path to output param file.
Examples
>>> # An example of saving module parameters.
>>> mod.save_params('myfile')
-
score
(eval_data, eval_metric, num_batch=None, batch_end_callback=None, score_end_callback=None, reset=True, epoch=0, sparse_row_id_fn=None)[source]¶ Runs prediction on eval_data and evaluates the performance according to the given eval_metric.
Check out the Module Tutorial to see an end-to-end use case.
- Parameters
eval_data (DataIter) – Evaluation data to run prediction on.
eval_metric (EvalMetric or list of EvalMetrics) – Evaluation metric to use.
num_batch (int) – Number of batches to run. Defaults to None, indicating running until the DataIter finishes.
batch_end_callback (function) – Could also be a list of functions.
reset (bool) – Defaults to True. Indicates whether we should reset eval_data before starting evaluation.
epoch (int) – Defaults to 0. For compatibility, this will be passed to callbacks (if any). During training, this will correspond to the training epoch number.
sparse_row_id_fn (A callback function) – The function takes data_batch as an input and returns a dict of str -> NDArray. The resulting dict is used for pulling row_sparse parameters from the kvstore, where the str key is the name of the param, and the value is the row id of the param to pull.
Examples
>>> # An example of using score for prediction.
>>> # Evaluate accuracy on val_dataiter
>>> metric = mx.metric.Accuracy()
>>> mod.score(val_dataiter, metric)
>>> mod.score(val_dataiter, ['mse', 'acc'])
-
set_params
(arg_params, aux_params, allow_missing=False, force_init=True, allow_extra=False)[source]¶ Assigns parameter and aux state values.
- Parameters
arg_params (dict) – Dictionary of name to value (NDArray) mapping.
aux_params (dict) – Dictionary of name to value (NDArray) mapping.
allow_missing (bool) – If True, params could contain missing values, and the initializer will be called to fill those missing params.
force_init (bool) – If True, will force re-initialization even if already initialized.
allow_extra (boolean, optional) – Whether to allow extra parameters that are not needed by the symbol. If this is True, no error will be thrown when arg_params or aux_params contain extra parameters that are not needed by the executor.
Examples
>>> # An example of setting module parameters.
>>> sym, arg_params, aux_params = mx.model.load_checkpoint(model_prefix, n_epoch_load)
>>> mod.set_params(arg_params=arg_params, aux_params=aux_params)
-
set_states
(states=None, value=None)[source]¶ Sets value for states. Only one of states & value can be specified.
- Parameters
states (list of list of NDArray) – Source state arrays formatted like [[state1_dev1, state1_dev2], [state2_dev1, state2_dev2]].
value (number) – A single scalar value for all state arrays.
-
property
symbol
¶ Gets the symbol associated with this module.
Except for Module, for other types of modules (e.g. BucketingModule), this property might not be a constant throughout its life time. Some modules might not even be associated with any symbols.
-
update
()[source]¶ Updates parameters according to the installed optimizer and the gradients computed in the previous forward-backward batch.
When KVStore is used to update parameters for multi-device or multi-machine training, a copy of the parameters is stored in KVStore. Note that for row_sparse parameters, this function does update the copy of the parameters in KVStore, but doesn’t broadcast the updated parameters to all devices / machines. Please call prepare to broadcast row_sparse parameters with the next batch of data.
Examples
>>> # An example of updating module parameters.
>>> mod.init_optimizer(kvstore='local', optimizer='sgd',
...     optimizer_params=(('learning_rate', 0.01), ))
>>> mod.backward()
>>> mod.update()
>>> print(mod.get_params()[0]['fc3_weight'].asnumpy())
[[ 5.86930104e-03  5.28078526e-03 -8.88729654e-03 -1.08308345e-03
   6.13054074e-03  4.27560415e-03  1.53817423e-03  4.62131854e-03
   4.69872449e-03 -2.42400169e-03  9.94111411e-04  1.12386420e-03 ...]]
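For the default 'sgd' optimizer, the per-parameter effect of update() reduces to moving each weight against its gradient, scaled by the learning rate. A minimal plain-Python sketch of that rule (sgd_update is a hypothetical helper, not MXNet API; it ignores extras like weight decay and momentum that the real optimizer may apply):

```python
# The effect of update() under plain SGD, sketched on Python floats:
# w_new = w - lr * grad for every parameter element.
def sgd_update(params, grads, lr=0.01):
    return [w - lr * g for w, g in zip(params, grads)]

weights = [0.5, -0.2]
grads = [1.0, -2.0]
print(sgd_update(weights, grads, lr=0.1))
```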
-
update_metric
(eval_metric, labels, pre_sliced=False)[source]¶ Evaluates and accumulates evaluation metric on outputs of the last forward computation.
- Parameters
eval_metric (EvalMetric) – Evaluation metric to use.
labels (list of NDArray if pre_sliced is False, list of lists of NDArray otherwise) – Typically data_batch.label.
pre_sliced (bool) – Whether the labels are already sliced per device (default: False).
Examples
>>> # An example of updating evaluation metric.
>>> mod.forward(data_batch)
>>> mod.update_metric(metric, data_batch.label)
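What an EvalMetric accumulates across repeated update_metric calls can be sketched in plain Python (AccuracySketch is a hypothetical stand-in for mx.metric.Accuracy, not the real class): running (correct, total) counts that get() turns into a score.

```python
# Hypothetical sketch of metric accumulation across batches: each
# update() folds one batch into running counts; get() reports the
# (name, value) pair, mirroring the EvalMetric interface shape.
class AccuracySketch:
    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, labels, preds):
        self.correct += sum(int(l == p) for l, p in zip(labels, preds))
        self.total += len(labels)

    def get(self):
        return ('accuracy', self.correct / self.total)

metric = AccuracySketch()
metric.update([1, 0, 1], [1, 1, 1])   # batch 1: 2/3 correct
metric.update([0, 0], [0, 0])         # batch 2: 2/2 correct
print(metric.get())                   # ('accuracy', 0.8)
```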
-
class
mxnet.module.
BucketingModule
(sym_gen, default_bucket_key=None, logger=<module 'logging'>, context=cpu(0), work_load_list=None, fixed_param_names=None, state_names=None, group2ctxs=None, compression_params=None)[source]¶ Bases:
mxnet.module.base_module.BaseModule
This module helps to deal efficiently with varying-length inputs.
- Parameters
sym_gen (function) – A function that, when called with a bucket key, returns a triple (symbol, data_names, label_names).
.default_bucket_key (str (or any python object)) – The key for the default bucket.
logger (Logger) –
context (Context or list of Context) – Defaults to mx.cpu().
work_load_list (list of number) – Defaults to None, indicating uniform workload.
fixed_param_names (list of str) – Defaults to None, indicating no network parameters are fixed.
state_names (list of str) – States are similar to data and label, but not provided by the data iterator. Instead they are initialized to 0 and can be set by set_states().
group2ctxs (dict of str to context or list of context, or list of dict of str to context) – Default is None. Maps the ctx_group attribute to the context assignment.
compression_params (dict) – Specifies type of gradient compression and additional arguments depending on the type of compression being used. For example, 2bit compression requires a threshold. Arguments would then be {‘type’:’2bit’, ‘threshold’:0.5} See mxnet.KVStore.set_gradient_compression method for more details on gradient compression.
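The sym_gen contract can be sketched in plain Python (no MXNet; the string placeholder and the name rnn_unrolled are hypothetical, since this sketch does not build a real Symbol): keyed by a bucket key such as a sequence length, it returns the triple (symbol, data_names, label_names).

```python
# Hypothetical sym_gen for a BucketingModule: keyed by sequence length,
# returns (symbol, data_names, label_names). A string stands in for the
# unrolled Symbol that a real implementation would construct.
def sym_gen(bucket_key):
    symbol = 'rnn_unrolled_%d_steps' % bucket_key   # placeholder Symbol
    data_names = ['data']
    label_names = ['softmax_label']
    return symbol, data_names, label_names

print(sym_gen(20))  # ('rnn_unrolled_20_steps', ['data'], ['softmax_label'])
```

The module calls sym_gen(default_bucket_key) at bind time and sym_gen(bucket_key) again whenever a new bucket is encountered.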
Methods
backward([out_grads]) – Backward computation.
bind(data_shapes[, label_shapes, …]) – Binding for a BucketingModule means setting up the buckets and binding the executor for the default bucket key.
forward(data_batch[, is_train]) – Forward computation.
get_input_grads([merge_multi_context]) – Gets the gradients with respect to the inputs of the module.
get_outputs([merge_multi_context]) – Gets outputs from a previous forward computation.
get_params() – Gets current parameters.
get_states([merge_multi_context]) – Gets states from all devices.
init_optimizer([kvstore, optimizer, …]) – Installs and initializes optimizers.
init_params([initializer, arg_params, …]) – Initializes parameters.
install_monitor(mon) – Installs monitor on all executors.
load(prefix, epoch[, sym_gen, …]) – Creates a model from a previously saved checkpoint.
load_dict([sym_dict, sym_gen, …]) – Creates a model from a dict mapping bucket_key to symbols and shared arg_params and aux_params.
prepare(data_batch[, sparse_row_id_fn]) – Prepares the module for processing a data batch.
save_checkpoint(prefix, epoch[, remove_amp_cast]) – Saves current progress to checkpoint for all buckets in BucketingModule. Use mx.callback.module_checkpoint as epoch_end_callback to save during training.
set_params(arg_params, aux_params[, …]) – Assigns parameters and aux state values.
set_states([states, value]) – Sets value for states.
switch_bucket(bucket_key, data_shapes[, …]) – Switches to a different bucket.
update() – Updates parameters according to the installed optimizer and the gradient computed in the previous forward-backward cycle.
update_metric(eval_metric, labels[, pre_sliced]) – Evaluates and accumulates evaluation metric on outputs of the last forward computation.
Attributes
data_names – A list of names for data required by this module.
data_shapes – Get data shapes.
label_shapes – Get label shapes.
output_names – A list of names for the outputs of this module.
output_shapes – Gets output shapes.
symbol – The symbol of the current bucket being used.
-
bind
(data_shapes, label_shapes=None, for_training=True, inputs_need_grad=False, force_rebind=False, shared_module=None, grad_req='write')[source]¶ Binding for a BucketingModule means setting up the buckets and binding the executor for the default bucket key. Executors corresponding to other keys are bound afterwards with switch_bucket.
- Parameters
data_shapes (list of (str, tuple)) – This should correspond to the symbol for the default bucket.
label_shapes (list of (str, tuple)) – This should correspond to the symbol for the default bucket.
for_training (bool) – Default is True.
inputs_need_grad (bool) – Default is False.
force_rebind (bool) – Default is False.
shared_module (BucketingModule) – Default is None. This value is currently not used.
grad_req (str, list of str, dict of str to str) – Requirement for gradient accumulation. Can be ‘write’, ‘add’, or ‘null’ (defaults to ‘write’). Can be specified globally (str) or for each argument (list, dict).
bucket_key (str (or any python object)) – Bucket key for binding. By default, uses the default_bucket_key.
-
property
data_names
¶ A list of names for data required by this module.
-
property
data_shapes
¶ Get data shapes.
- Returns
- Return type
A list of (name, shape) pairs.
-
forward
(data_batch, is_train=None)[source]¶ Forward computation.
- Parameters
data_batch (DataBatch) –
is_train (bool) – Defaults to None, in which case is_train is taken as self.for_training.
-
get_input_grads
(merge_multi_context=True)[source]¶ Gets the gradients with respect to the inputs of the module.
- Parameters
merge_multi_context (bool) – Defaults to True. In the case when data-parallelism is used, the outputs will be collected from multiple devices. A True value indicates that the collected results should be merged so that they look as if they came from a single executor.
- Returns
If merge_multi_context is True, it is like [grad1, grad2]. Otherwise, it is like [[grad1_dev1, grad1_dev2], [grad2_dev1, grad2_dev2]]. All the output elements are NDArray.
- Return type
list of NDArrays or list of list of NDArrays
-
get_outputs
(merge_multi_context=True)[source]¶ Gets outputs from a previous forward computation.
- Parameters
merge_multi_context (bool) – Defaults to True. In the case when data-parallelism is used, the outputs will be collected from multiple devices. A True value indicates that the collected results should be merged so that they look as if they came from a single executor.
- Returns
If merge_multi_context is True, it is like [out1, out2]. Otherwise, it is like [[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]]. All the output elements are numpy arrays.
- Return type
list of numpy arrays or list of list of numpy arrays
-
get_params
()[source]¶ Gets current parameters.
- Returns
A pair of dictionaries each mapping parameter names to NDArray values.
- Return type
(arg_params, aux_params)
-
get_states
(merge_multi_context=True)[source]¶ Gets states from all devices.
- Parameters
merge_multi_context (bool) – Default is True. In the case when data-parallelism is used, the states will be collected from multiple devices. A True value indicates that the collected results should be merged so that they look as if they came from a single executor.
- Returns
If merge_multi_context is True, it is like [out1, out2]. Otherwise, it is like [[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]]. All the output elements are NDArray.
- Return type
list of NDArrays or list of list of NDArrays
-
init_optimizer
(kvstore='local', optimizer='sgd', optimizer_params=(('learning_rate', 0.01), ), force_init=False)[source]¶ Installs and initializes optimizers.
- Parameters
kvstore (str or KVStore) – Defaults to ‘local’.
optimizer (str or Optimizer) – Defaults to ‘sgd’
optimizer_params (dict) – Defaults to ((‘learning_rate’, 0.01),). The default value is not a dictionary, just to avoid pylint warning of dangerous default values.
force_init (bool) – Defaults to False, indicating whether we should force re-initializing the optimizer in the case an optimizer is already installed.
-
init_params
(initializer=<mxnet.initializer.Uniform object>, arg_params=None, aux_params=None, allow_missing=False, force_init=False, allow_extra=False)[source]¶ Initializes parameters.
- Parameters
initializer (Initializer) –
arg_params (dict) – Defaults to None. Existing parameters. These have higher priority than the initializer.
aux_params (dict) – Defaults to None. Existing auxiliary states. These have higher priority than the initializer.
allow_missing (bool) – Allow missing values in arg_params and aux_params (if not None). In this case, missing values will be filled with the initializer.
force_init (bool) – Defaults to False.
allow_extra (boolean, optional) – Whether to allow extra parameters that are not needed by the symbol. If this is True, no error will be thrown when arg_params or aux_params contain extra parameters that are not needed by the executor.
-
property
label_shapes
¶ Get label shapes.
- Returns
The return value could be None if the module does not need labels, or if the module is not bound for training (in this case, label information is not available).
- Return type
A list of (name, shape) pairs.
-
static
load
(prefix, epoch, sym_gen=None, default_bucket_key=None, **kwargs)[source]¶ Creates a model from previously saved checkpoint.
- Parameters
prefix (str) – path prefix of saved model files. You should have “prefix-symbol.json”, “prefix-xxxx.params”, and optionally “prefix-xxxx.states”, where xxxx is the epoch number.
epoch (int) – epoch to load.
sym_gen (function) – A function that, when called with a bucket key, returns a triple (symbol, data_names, label_names). Provide the sym_gen that was used when saving the bucketing module.
logger (Logger) – Default is logging.
context (Context or list of Context) – Default is cpu().
work_load_list (list of number) – Default is None, indicating uniform workload.
fixed_param_names (list of str) – Default is None, indicating no network parameters are fixed.
state_names (list of str) – States are similar to data and label, but not provided by the data iterator. Instead they are initialized to 0 and can be set by set_states().
group2ctxs (dict of str to context or list of context, or list of dict of str to context) – Default is None. Maps the ctx_group attribute to the context assignment.
compression_params (dict) – Specifies type of gradient compression and additional arguments depending on the type of compression being used. For example, 2bit compression requires a threshold. Arguments would then be {‘type’:’2bit’, ‘threshold’:0.5} See mxnet.KVStore.set_gradient_compression method for more details on gradient compression.
-
static
load_dict
(sym_dict=None, sym_gen=None, default_bucket_key=None, arg_params=None, aux_params=None, **kwargs)[source]¶ Creates a model from a dict mapping bucket_key to symbols and shared arg_params and aux_params.
- Parameters
sym_dict (dict mapping bucket_key to symbol) – Dict mapping bucket key to symbol
sym_gen (function) – A function that, when called with a bucket key, returns a triple (symbol, data_names, label_names). Provide the sym_gen that was used when saving the bucketing module.
default_bucket_key (str (or any python object)) – The key for the default bucket.
arg_params (dict) – Required for loading the BucketingModule. Dict of name to parameter ndarrays.
aux_params (dict) – Required for loading the BucketingModule. Dict of name to auxiliary state ndarrays.
logger (Logger) – Default is logging.
context (Context or list of Context) – Default is cpu().
work_load_list (list of number) – Default is None, indicating uniform workload.
fixed_param_names (list of str) – Default is None, indicating no network parameters are fixed.
state_names (list of str) – States are similar to data and label, but not provided by the data iterator. Instead they are initialized to 0 and can be set by set_states().
group2ctxs (dict of str to context or list of context, or list of dict of str to context) – Default is None. Maps the ctx_group attribute to the context assignment.
compression_params (dict) – Specifies type of gradient compression and additional arguments depending on the type of compression being used. For example, 2bit compression requires a threshold. Arguments would then be {‘type’:’2bit’, ‘threshold’:0.5} See mxnet.KVStore.set_gradient_compression method for more details on gradient compression.
-
property
output_names
¶ A list of names for the outputs of this module.
-
property
output_shapes
¶ Gets output shapes.
- Returns
- Return type
A list of (name, shape) pairs.
-
prepare
(data_batch, sparse_row_id_fn=None)[source]¶ Prepares the module for processing a data batch.
Usually involves switching bucket and reshaping. For modules that contain row_sparse parameters in KVStore, it prepares the row_sparse parameters based on the sparse_row_id_fn.
- Parameters
data_batch (DataBatch) – The current batch of data for forward computation.
sparse_row_id_fn (A callback function) – The function takes data_batch as an input and returns a dict of str -> NDArray. The resulting dict is used for pulling row_sparse parameters from the kvstore, where the str key is the name of the param, and the value is the row id of the param to pull.
-
save_checkpoint
(prefix, epoch, remove_amp_cast=False)[source]¶ Saves current progress to checkpoint for all buckets in BucketingModule Use mx.callback.module_checkpoint as epoch_end_callback to save during training.
- Parameters
prefix (str) – The file prefix to checkpoint to.
epoch (int) – The current epoch number.
-
set_params
(arg_params, aux_params, allow_missing=False, force_init=True, allow_extra=False)[source]¶ Assigns parameters and aux state values.
- Parameters
arg_params (dict) – Dictionary of name to value (NDArray) mapping.
aux_params (dict) – Dictionary of name to value (NDArray) mapping.
allow_missing (bool) – If true, params could contain missing values, and the initializer will be called to fill those missing params.
force_init (bool) – If true, will force re-initialize even if already initialized.
allow_extra (boolean, optional) – Whether to allow extra parameters that are not needed by the symbol. If this is True, no error will be thrown when arg_params or aux_params contain extra parameters that are not needed by the executor.
Examples
>>> # An example of setting module parameters.
>>> sym, arg_params, aux_params = mx.model.load_checkpoint(model_prefix, n_epoch_load)
>>> mod.set_params(arg_params=arg_params, aux_params=aux_params)
-
set_states
(states=None, value=None)[source]¶ Sets value for states. Only one of states and value can be specified.
- Parameters
states (list of list of NDArrays) – Source state arrays formatted like [[state1_dev1, state1_dev2], [state2_dev1, state2_dev2]].
value (number) – A single scalar value for all state arrays.
-
switch_bucket
(bucket_key, data_shapes, label_shapes=None)[source]¶ Switches to a different bucket. This will change self.curr_module.
- Parameters
bucket_key (str (or any python object)) – The key of the target bucket.
data_shapes (list of (str, tuple)) – Typically data_batch.provide_data.
label_shapes (list of (str, tuple)) – Typically data_batch.provide_label.
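The caching behavior behind switch_bucket can be sketched in plain Python (BucketCache and build_module are hypothetical stand-ins, not MXNet API): a module is built and bound once per bucket key, then reused on later switches, with curr_module tracking the active bucket.

```python
# Sketch of switch_bucket's caching: build a per-bucket module on first
# use, reuse it afterwards. build_module stands in for symbol
# generation plus binding against the shared parameters.
class BucketCache:
    def __init__(self, build_module):
        self._buckets = {}
        self.build_module = build_module
        self.curr_module = None

    def switch_bucket(self, bucket_key, data_shapes):
        if bucket_key not in self._buckets:   # first time: build and bind
            self._buckets[bucket_key] = self.build_module(bucket_key, data_shapes)
        self.curr_module = self._buckets[bucket_key]

builds = []
cache = BucketCache(lambda key, shapes: builds.append(key) or ('module', key))
cache.switch_bucket(10, [('data', (32, 10))])
cache.switch_bucket(20, [('data', (32, 20))])
cache.switch_bucket(10, [('data', (32, 10))])   # reused, no rebuild
print(builds)              # [10, 20]
print(cache.curr_module)   # ('module', 10)
```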
-
property
symbol
¶ The symbol of the current bucket being used.
-
update
()[source]¶ Updates parameters according to installed optimizer and the gradient computed in the previous forward-backward cycle.
When KVStore is used to update parameters for multi-device or multi-machine training, a copy of the parameters are stored in KVStore. Note that for row_sparse parameters, this function does update the copy of parameters in KVStore, but doesn’t broadcast the updated parameters to all devices / machines. Please call prepare to broadcast row_sparse parameters with the next batch of data.
-
update_metric
(eval_metric, labels, pre_sliced=False)[source]¶ Evaluates and accumulates evaluation metric on outputs of the last forward computation.
- Parameters
eval_metric (EvalMetric) –
labels (list of NDArray) – Typically data_batch.label.
-
class
mxnet.module.
Module
(symbol, data_names=('data', ), label_names=('softmax_label', ), logger=<module 'logging'>, context=cpu(0), work_load_list=None, fixed_param_names=None, state_names=None, group2ctxs=None, compression_params=None)[source]¶ Bases:
mxnet.module.base_module.BaseModule
Module is a basic module that wraps a Symbol. It is functionally the same as the FeedForward model, except under the module API.
- Parameters
symbol (Symbol) –
data_names (list of str) – Defaults to (‘data’) for a typical model used in image classification.
label_names (list of str) – Defaults to (‘softmax_label’) for a typical model used in image classification.
logger (Logger) – Defaults to logging.
context (Context or list of Context) – Defaults to mx.cpu().
work_load_list (list of number) – Default is None, indicating uniform workload.
fixed_param_names (list of str) – Default is None, indicating no network parameters are fixed.
state_names (list of str) – States are similar to data and label, but not provided by the data iterator. Instead they are initialized to 0 and can be set by set_states().
group2ctxs (dict of str to context or list of context, or list of dict of str to context) – Default is None. Maps the ctx_group attribute to the context assignment.
compression_params (dict) – Specifies type of gradient compression and additional arguments depending on the type of compression being used. For example, 2bit compression requires a threshold. Arguments would then be {‘type’:’2bit’, ‘threshold’:0.5} See mxnet.KVStore.set_gradient_compression method for more details on gradient compression.
Methods
backward([out_grads]) – Backward computation.
bind(data_shapes[, label_shapes, …]) – Binds the symbols to construct executors.
borrow_optimizer(shared_module) – Borrows optimizer from a shared module.
forward(data_batch[, is_train]) – Forward computation.
get_input_grads([merge_multi_context]) – Gets the gradients with respect to the inputs of the module.
get_outputs([merge_multi_context]) – Gets outputs of the previous forward computation.
get_params() – Gets current parameters.
get_states([merge_multi_context]) – Gets states from all devices.
init_optimizer([kvstore, optimizer, …]) – Installs and initializes optimizers.
init_params([initializer, arg_params, …]) – Initializes the parameters and auxiliary states.
install_monitor(mon) – Installs monitor on all executors.
load(prefix, epoch[, load_optimizer_states]) – Creates a model from a previously saved checkpoint.
load_optimizer_states(fname) – Loads optimizer (updater) state from a file.
prepare(data_batch[, sparse_row_id_fn]) – Prepares the module for processing a data batch.
reshape(data_shapes[, label_shapes]) – Reshapes the module for new input shapes.
save_checkpoint(prefix, epoch[, …]) – Saves current progress to checkpoint.
save_optimizer_states(fname) – Saves optimizer (updater) state to a file.
set_params(arg_params, aux_params[, …]) – Assigns parameter and aux state values.
set_states([states, value]) – Sets value for states.
update() – Updates parameters according to the installed optimizer and the gradients computed in the previous forward-backward batch.
update_metric(eval_metric, labels[, pre_sliced]) – Evaluates and accumulates evaluation metric on outputs of the last forward computation.
Attributes
data_names – A list of names for data required by this module.
data_shapes – Gets data shapes.
label_names – A list of names for labels required by this module.
label_shapes – Gets label shapes.
output_names – A list of names for the outputs of this module.
output_shapes – Gets output shapes.
-
backward
(out_grads=None)[source]¶ Backward computation.
See also
- Parameters
out_grads (NDArray or list of NDArray, optional) – Gradient on the outputs to be propagated back. This parameter is only needed when bind is called on outputs that are not a loss function.
-
bind
(data_shapes, label_shapes=None, for_training=True, inputs_need_grad=False, force_rebind=False, shared_module=None, grad_req='write')[source]¶ Binds the symbols to construct executors. This is necessary before one can perform computation with the module.
- Parameters
data_shapes (list of (str, tuple)) – Typically is data_iter.provide_data.
label_shapes (list of (str, tuple)) – Typically is data_iter.provide_label.
for_training (bool) – Default is True. Whether the executors should be bound for training.
inputs_need_grad (bool) – Default is False. Whether the gradients to the input data need to be computed. Typically this is not needed, but it might be needed when implementing composition of modules.
force_rebind (bool) – Default is False. This function does nothing if the executors are already bound, but with this True, the executors will be forced to rebind.
shared_module (Module) – Default is None. This is used in bucketing. When not None, the shared module essentially corresponds to a different bucket – a module with a different symbol but the same sets of parameters (e.g. unrolled RNNs with different lengths).
grad_req (str, list of str, dict of str to str) – Requirement for gradient accumulation. Can be ‘write’, ‘add’, or ‘null’ (default is ‘write’). Can be specified globally (str) or for each argument (list, dict).
-
borrow_optimizer
(shared_module)[source]¶ Borrows optimizer from a shared module. Used in bucketing, where exactly the same optimizer (esp. kvstore) is used.
- Parameters
shared_module (Module) –
-
property
data_names
¶ A list of names for data required by this module.
-
property
data_shapes
¶ Gets data shapes.
- Returns
- Return type
A list of (name, shape) pairs.
-
forward
(data_batch, is_train=None)[source]¶ Forward computation. It supports data batches with different shapes, such as different batch sizes or different image sizes. If reshaping of data batch relates to modification of symbol or module, such as changing image layout ordering or switching from training to predicting, module rebinding is required.
See also
- Parameters
data_batch (DataBatch) – Could be anything with similar API implemented.
is_train (bool) – Default is None, which means is_train takes the value of self.for_training.
-
get_input_grads
(merge_multi_context=True)[source]¶ Gets the gradients with respect to the inputs of the module.
If merge_multi_context is True, it is like [grad1, grad2]. Otherwise, it is like [[grad1_dev1, grad1_dev2], [grad2_dev1, grad2_dev2]]. All the output elements are NDArray.
- Parameters
merge_multi_context (bool) – Default is True. In the case when data-parallelism is used, the outputs will be collected from multiple devices. A True value indicates that the collected results should be merged so that they look like they came from a single executor.
- Returns
Input gradients.
- Return type
list of NDArray or list of list of NDArray
-
get_outputs
(merge_multi_context=True)[source]¶ Gets outputs of the previous forward computation.
If merge_multi_context is True, it is like [out1, out2]. Otherwise, it is like [[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]]. All the output elements are NDArray. When merge_multi_context is False, those NDArray might live on different devices.
- Parameters
merge_multi_context (bool) – Default is True. In the case when data-parallelism is used, the outputs will be collected from multiple devices. A True value indicates that the collected results should be merged so that they look like they came from a single executor.
- Returns
Output.
- Return type
list of NDArray or list of list of NDArray
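The two layouts can be sketched in plain Python. This is not MXNet's internal code; the strings are placeholders standing in for per-device NDArray slices:

```python
# Plain-Python sketch of the merge_multi_context layouts; the strings below
# are placeholders for per-device NDArray slices, not real MXNet objects.
per_device = [['out1_dev1', 'out1_dev2'],   # first output, one slice per device
              ['out2_dev1', 'out2_dev2']]   # second output

def merge(slices):
    # MXNet concatenates the per-device slices along the batch axis; joining
    # the placeholder strings plays the same structural role here.
    return '+'.join(slices)

merged = [merge(s) for s in per_device]      # merge_multi_context=True layout
unmerged = per_device                        # merge_multi_context=False layout
```

With merging, the caller sees one array per output, as if a single executor had produced it; without merging, the per-device structure is exposed.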
-
get_params
()[source]¶ Gets current parameters.
- Returns
A pair of dictionaries each mapping parameter names to NDArray values.
- Return type
(arg_params, aux_params)
-
get_states
(merge_multi_context=True)[source]¶ Gets states from all devices.
If merge_multi_context is True, it is like [out1, out2]. Otherwise, it is like [[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]]. All the output elements are NDArray.
- Parameters
merge_multi_context (bool) – Default is True. In the case when data-parallelism is used, the states will be collected from multiple devices. A True value indicates that the collected results should be merged so that they look like they came from a single executor.
- Returns
States.
- Return type
list of NDArray or list of list of NDArray
-
init_optimizer
(kvstore='local', optimizer='sgd', optimizer_params=(('learning_rate', 0.01), ), force_init=False)[source]¶ Installs and initializes optimizers.
- Parameters
kvstore (str or KVStore) – Default is ‘local’.
optimizer (str or Optimizer) – Default is ‘sgd’.
optimizer_params (dict) – Default is ((‘learning_rate’, 0.01),). The default value is not a dictionary, just to avoid a pylint warning about dangerous default values.
force_init (bool) – Default is False, indicating whether we should force re-initializing the optimizer when an optimizer is already installed.
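As the note above says, the optimizer_params default is a tuple of (key, value) pairs only to avoid a mutable default argument; it converts losslessly to the dict it represents. A tiny sketch:

```python
# The default optimizer_params is a tuple of (key, value) pairs rather than a
# dict, only to avoid a dangerous mutable default; dict() recovers it exactly.
defaults = (('learning_rate', 0.01),)
params = dict(defaults)
```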
-
init_params
(initializer=<mxnet.initializer.Uniform object>, arg_params=None, aux_params=None, allow_missing=False, force_init=False, allow_extra=False)[source]¶ Initializes the parameters and auxiliary states.
- Parameters
initializer (Initializer) – Called to initialize parameters if needed.
arg_params (dict) – If not None, should be a dictionary of existing arg_params. Initialization will be copied from it.
aux_params (dict) – If not None, should be a dictionary of existing aux_params. Initialization will be copied from it.
allow_missing (bool) – If True, params could contain missing values, and the initializer will be called to fill those missing params.
force_init (bool) – If True, will force re-initialization even if already initialized.
allow_extra (boolean, optional) – Whether to allow extra parameters that are not needed by the symbol. If this is True, no error will be thrown when arg_params or aux_params contain extra parameters that are not needed by the executor.
-
property
label_names
¶ A list of names for labels required by this module.
-
property
label_shapes
¶ Gets label shapes.
- Returns
The return value could be None if the module does not need labels, or if the module is not bound for training (in this case, label information is not available).
- Return type
A list of (name, shape) pairs.
-
static
load
(prefix, epoch, load_optimizer_states=False, **kwargs)[source]¶ Creates a model from previously saved checkpoint.
- Parameters
prefix (str) – path prefix of saved model files. You should have “prefix-symbol.json”, “prefix-xxxx.params”, and optionally “prefix-xxxx.states”, where xxxx is the epoch number.
epoch (int) – epoch to load.
load_optimizer_states (bool) – whether to load optimizer states. Checkpoint needs to have been made with save_optimizer_states=True.
data_names (list of str) – Default is (‘data’) for a typical model used in image classification.
label_names (list of str) – Default is (‘softmax_label’) for a typical model used in image classification.
logger (Logger) – Default is logging.
context (Context or list of Context) – Default is cpu().
work_load_list (list of number) – Default is None, indicating a uniform workload.
fixed_param_names (list of str) – Default is None, indicating no network parameters are fixed.
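The file-name convention described above can be spelled out as a small helper. This is a hypothetical function, not part of the MXNet API; it only encodes the “prefix-symbol.json” / “prefix-xxxx.params” / “prefix-xxxx.states” pattern, with the epoch zero-padded to four digits:

```python
def checkpoint_files(prefix, epoch):
    # Hypothetical helper spelling out the checkpoint naming convention that
    # load/save_checkpoint rely on; "xxxx" is the epoch, zero-padded to 4 digits.
    symbol_file = '%s-symbol.json' % prefix
    param_file = '%s-%04d.params' % (prefix, epoch)
    state_file = '%s-%04d.states' % (prefix, epoch)
    return symbol_file, param_file, state_file
```

For example, a model saved with prefix 'model' at epoch 3 is loaded from 'model-symbol.json' and 'model-0003.params'.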
-
load_optimizer_states
(fname)[source]¶ Loads optimizer (updater) state from a file.
- Parameters
fname (str) – Path to input states file.
-
property
output_names
¶ A list of names for the outputs of this module.
-
property
output_shapes
¶ Gets output shapes.
- Returns
- Return type
A list of (name, shape) pairs.
-
prepare
(data_batch, sparse_row_id_fn=None)[source]¶ Prepares the module for processing a data batch.
Usually involves switching bucket and reshaping. For modules that contain row_sparse parameters in KVStore, it prepares the row_sparse parameters based on the sparse_row_id_fn.
When KVStore is used to update parameters for multi-device or multi-machine training, a copy of the parameters is stored in KVStore. Note that for row_sparse parameters, update() updates the copy of the parameters in KVStore but doesn’t broadcast the updated parameters to all devices / machines. The prepare function is used to broadcast row_sparse parameters with the next batch of data.
- Parameters
data_batch (DataBatch) – The current batch of data for forward computation.
sparse_row_id_fn (A callback function) – The function takes data_batch as an input and returns a dict of str -> NDArray. The resulting dict is used for pulling row_sparse parameters from the kvstore, where the str key is the name of the param, and the value is the row id of the param to pull.
-
reshape
(data_shapes, label_shapes=None)[source]¶ Reshapes the module for new input shapes.
- Parameters
data_shapes (list of (str, tuple)) – Typically is data_iter.provide_data.
label_shapes (list of (str, tuple)) – Typically is data_iter.provide_label.
-
save_checkpoint
(prefix, epoch, save_optimizer_states=False, remove_amp_cast=True)[source]¶ Saves current progress to checkpoint. Use mx.callback.module_checkpoint as epoch_end_callback to save during training.
- Parameters
prefix (str) – The file prefix to checkpoint to.
epoch (int) – The current epoch number.
save_optimizer_states (bool) – Whether to save optimizer states to continue training.
-
save_optimizer_states
(fname)[source]¶ Saves optimizer (updater) state to a file.
- Parameters
fname (str) – Path to output states file.
-
set_params
(arg_params, aux_params, allow_missing=False, force_init=True, allow_extra=False)[source]¶ Assigns parameter and aux state values.
- Parameters
arg_params (dict) – Dictionary of name to NDArray.
aux_params (dict) – Dictionary of name to NDArray.
allow_missing (bool) – If True, params could contain missing values, and the initializer will be called to fill those missing params.
force_init (bool) – If True, will force re-initialization even if already initialized.
allow_extra (boolean, optional) – Whether to allow extra parameters that are not needed by the symbol. If this is True, no error will be thrown when arg_params or aux_params contain extra parameters that are not needed by the executor.
Examples
>>> # An example of setting module parameters.
>>> sym, arg_params, aux_params = mx.model.load_checkpoint(model_prefix, n_epoch_load)
>>> mod.set_params(arg_params=arg_params, aux_params=aux_params)
-
set_states
(states=None, value=None)[source]¶ Sets value for states. Only one of states and value can be specified.
- Parameters
states (list of list of NDArray) – Source states arrays formatted like [[state1_dev1, state1_dev2], [state2_dev1, state2_dev2]].
value (number) – A single scalar value for all state arrays.
-
update
()[source]¶ Updates parameters according to the installed optimizer and the gradients computed in the previous forward-backward batch.
When KVStore is used to update parameters for multi-device or multi-machine training, a copy of the parameters is stored in KVStore. Note that for row_sparse parameters, this function does update the copy of the parameters in KVStore, but doesn’t broadcast the updated parameters to all devices / machines. Please call prepare to broadcast row_sparse parameters with the next batch of data.
See also
-
update_metric
(eval_metric, labels, pre_sliced=False)[source]¶ Evaluates and accumulates evaluation metric on outputs of the last forward computation.
See also
- Parameters
eval_metric (EvalMetric) – Evaluation metric to use.
labels (list of NDArray if pre_sliced is False, list of lists of NDArray otherwise) – Typically data_batch.label.
pre_sliced (bool) – Whether the labels are already sliced per device (default: False).
-
class
mxnet.module.
PythonLossModule
(name='pyloss', data_names=('data', ), label_names=('softmax_label', ), logger=<module 'logging' from '/work/conda_env/lib/python3.8/logging/__init__.py'>, grad_func=None)[source]¶ Bases:
mxnet.module.python_module.PythonModule
A convenient module class that implements many of the module APIs as empty functions.
- Parameters
name (str) – Name of the module. The outputs will be named [name + ‘_output’].
data_names (list of str) – Defaults to [‘data’]. Names of the data expected by this module. Should be a list of only one name.
label_names (list of str) – Defaults to [‘softmax_label’]. Names of the labels expected by the module. Should be a list of only one name.
grad_func (function) – Optional. If not None, should be a function that takes scores and labels, both of type NDArray, and returns the gradients with respect to the scores according to this loss function. The return value could be a numpy array or an NDArray.
Methods
backward
([out_grads])Backward computation.
forward
(data_batch[, is_train])Forward computation.
get_input_grads
([merge_multi_context])Gets the gradients to the inputs, computed in the previous backward computation.
get_outputs
([merge_multi_context])Gets outputs of the previous forward computation.
install_monitor
(mon)Installs monitor on all executors.
-
backward
(out_grads=None)[source]¶ Backward computation.
- Parameters
out_grads (NDArray or list of NDArray, optional) – Gradient on the outputs to be propagated back. This parameter is only needed when bind is called on outputs that are not a loss function.
-
forward
(data_batch, is_train=None)[source]¶ Forward computation. Here we do nothing but keep a reference to the scores and the labels so that we can do the backward computation.
- Parameters
data_batch (DataBatch) – Could be anything with similar API implemented.
is_train (bool) – Default is None, which means is_train takes the value of self.for_training.
-
get_input_grads
(merge_multi_context=True)[source]¶ Gets the gradients to the inputs, computed in the previous backward computation.
- Parameters
merge_multi_context (bool) – Should always be True, because we do not use multiple contexts for computation.
-
get_outputs
(merge_multi_context=True)[source]¶ Gets outputs of the previous forward computation. As an output loss module, we treat the inputs to this module as scores and simply return them.
- Parameters
merge_multi_context (bool) – Should always be True, because we do not use multiple contexts for computing.
-
class
mxnet.module.
PythonModule
(data_names, label_names, output_names, logger=<module 'logging' from '/work/conda_env/lib/python3.8/logging/__init__.py'>)[source]¶ Bases:
mxnet.module.base_module.BaseModule
A convenient module class that implements many of the module APIs as empty functions.
- Parameters
data_names (list of str) – Names of the data expected by the module.
label_names (list of str) – Names of the labels expected by the module. Could be None if the module does not need labels.
output_names (list of str) – Names of the outputs.
Methods
bind
(data_shapes[, label_shapes, …])Binds the symbols to construct executors.
get_params
()Gets parameters, which are potentially copies of the actual parameters used to do computation on the device.
init_optimizer
([kvstore, optimizer, …])Installs and initializes optimizers.
init_params
([initializer, arg_params, …])Initializes the parameters and auxiliary states.
update
()Updates parameters according to the installed optimizer and the gradients computed in the previous forward-backward batch.
update_metric
(eval_metric, labels[, pre_sliced])Evaluates and accumulates evaluation metric on outputs of the last forward computation.
Attributes
data_names
A list of names for data required by this module.
data_shapes
A list of (name, shape) pairs specifying the data inputs to this module.
label_shapes
A list of (name, shape) pairs specifying the label inputs to this module.
output_names
A list of names for the outputs of this module.
output_shapes
A list of (name, shape) pairs specifying the outputs of this module.
-
bind
(data_shapes, label_shapes=None, for_training=True, inputs_need_grad=False, force_rebind=False, shared_module=None, grad_req='write')[source]¶ Binds the symbols to construct executors. This is necessary before one can perform computation with the module.
- Parameters
data_shapes (list of (str, tuple)) – Typically is data_iter.provide_data.
label_shapes (list of (str, tuple)) – Typically is data_iter.provide_label.
for_training (bool) – Default is True. Whether the executors should be bound for training.
inputs_need_grad (bool) – Default is False. Whether the gradients to the input data need to be computed. Typically this is not needed, but it might be needed when implementing composition of modules.
force_rebind (bool) – Default is False. This function does nothing if the executors are already bound, but with this True, the executors will be forced to rebind.
shared_module (Module) – Default is None. This is used in bucketing. When not None, the shared module essentially corresponds to a different bucket – a module with a different symbol but the same sets of parameters (e.g. unrolled RNNs with different lengths).
grad_req (str, list of str, dict of str to str) – Requirement for gradient accumulation. Can be ‘write’, ‘add’, or ‘null’ (default is ‘write’). Can be specified globally (str) or for each argument (list, dict).
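The grad_req options above can be sketched in plain Python. This is a conceptual sketch of the accumulation semantics, not MXNet's implementation:

```python
def apply_grad(grad_buf, new_grad, grad_req):
    # Conceptual sketch of gradient-buffer handling, not MXNet internals.
    if grad_req == 'write':   # overwrite the buffer on every backward pass
        return new_grad
    if grad_req == 'add':     # accumulate gradients across backward passes
        return grad_buf + new_grad
    if grad_req == 'null':    # no gradient is requested for this argument
        return grad_buf
    raise ValueError('unknown grad_req: %s' % grad_req)
```

‘add’ is useful for gradient accumulation over several mini-batches before an update; with ‘add’ the caller is responsible for zeroing the buffer between updates.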
-
property
data_names
¶ A list of names for data required by this module.
-
property
data_shapes
¶ A list of (name, shape) pairs specifying the data inputs to this module.
-
get_params
()[source]¶ Gets parameters, which are potentially copies of the actual parameters used to do computation on the device. Subclasses should override this method if they contain parameters.
- Returns
- Return type
({}, {}), a pair of empty dicts.
-
init_optimizer
(kvstore='local', optimizer='sgd', optimizer_params=(('learning_rate', 0.01), ), force_init=False)[source]¶ Installs and initializes optimizers. By default we do nothing. Subclass should override this method if needed.
- Parameters
kvstore (str or KVStore) – Default is ‘local’.
optimizer (str or Optimizer) – Default is ‘sgd’.
optimizer_params (dict) – Default is ((‘learning_rate’, 0.01),). The default value is not a dictionary, just to avoid a pylint warning about dangerous default values.
force_init (bool) – Default is False, indicating whether we should force re-initializing the optimizer when an optimizer is already installed.
-
init_params
(initializer=<mxnet.initializer.Uniform object>, arg_params=None, aux_params=None, allow_missing=False, force_init=False, allow_extra=False)[source]¶ Initializes the parameters and auxiliary states. By default this function does nothing. Subclasses should override this method if they contain parameters.
- Parameters
initializer (Initializer) – Called to initialize parameters if needed.
arg_params (dict) – If not None, should be a dictionary of existing arg_params. Initialization will be copied from it.
aux_params (dict) – If not None, should be a dictionary of existing aux_params. Initialization will be copied from it.
allow_missing (bool) – If True, params could contain missing values, and the initializer will be called to fill those missing params.
force_init (bool) – If True, will force re-initialization even if already initialized.
allow_extra (boolean, optional) – Whether to allow extra parameters that are not needed by the symbol. If this is True, no error will be thrown when arg_params or aux_params contain extra parameters that are not needed by the executor.
-
property
label_shapes
¶ A list of (name, shape) pairs specifying the label inputs to this module. If this module does not accept labels (either it is a module without a loss function, or it is not bound for training), then this should return an empty list [].
-
property
output_names
¶ A list of names for the outputs of this module.
-
property
output_shapes
¶ A list of (name, shape) pairs specifying the outputs of this module.
-
update
()[source]¶ Updates parameters according to the installed optimizer and the gradients computed in the previous forward-backward batch. Currently we do nothing here. Subclasses should override this method if they contain parameters.
-
update_metric
(eval_metric, labels, pre_sliced=False)[source]¶ Evaluates and accumulates evaluation metric on outputs of the last forward computation. Subclasses should override this method if needed.
- Parameters
eval_metric (EvalMetric) –
labels (list of NDArray) – Typically data_batch.label.
-
class
mxnet.module.
SequentialModule
(logger=<module 'logging' from '/work/conda_env/lib/python3.8/logging/__init__.py'>)[source]¶ Bases:
mxnet.module.base_module.BaseModule
A SequentialModule is a container module that can chain multiple modules together.
Note
Building a computation graph with this kind of imperative container is less flexible and less efficient than the symbolic graph, so this should only be used as a handy utility.
Methods
add
(module, **kwargs)Add a module to the chain.
backward
([out_grads])Backward computation.
bind
(data_shapes[, label_shapes, …])Binds the symbols to construct executors.
forward
(data_batch[, is_train])Forward computation.
get_input_grads
([merge_multi_context])Gets the gradients with respect to the inputs of the module.
get_outputs
([merge_multi_context])Gets outputs from a previous forward computation.
get_params
()Gets current parameters.
init_optimizer
([kvstore, optimizer, …])Installs and initializes optimizers.
init_params
([initializer, arg_params, …])Initializes parameters.
install_monitor
(mon)Installs monitor on all executors.
update
()Updates parameters according to the installed optimizer and the gradients computed in the previous forward-backward cycle.
update_metric
(eval_metric, labels[, pre_sliced])Evaluates and accumulates evaluation metric on outputs of the last forward computation.
Attributes
data_names
A list of names for data required by this module.
data_shapes
Gets data shapes.
label_shapes
Gets label shapes.
output_names
A list of names for the outputs of this module.
output_shapes
Gets output shapes.
-
add
(module, **kwargs)[source]¶ Add a module to the chain.
- Parameters
module (BaseModule) – The new module to add.
kwargs (**keywords) – All the keyword arguments are saved as meta information for the added module. The currently known meta includes:
- take_labels: indicates whether the module expects to take labels when doing computation. Note that any module in the chain can take labels (not necessarily only the topmost one), and they all take the same labels passed from the original data batch for the SequentialModule.
- Returns
This function returns self to allow us to easily chain a series of add calls.
- Return type
self
Examples
>>> # An example of adding two modules to a chain.
>>> seq_mod = mx.mod.SequentialModule()
>>> seq_mod.add(mod1)
>>> seq_mod.add(mod2)
-
bind
(data_shapes, label_shapes=None, for_training=True, inputs_need_grad=False, force_rebind=False, shared_module=None, grad_req='write')[source]¶ Binds the symbols to construct executors. This is necessary before one can perform computation with the module.
- Parameters
data_shapes (list of (str, tuple)) – Typically is data_iter.provide_data.
label_shapes (list of (str, tuple)) – Typically is data_iter.provide_label.
for_training (bool) – Default is True. Whether the executors should be bound for training.
inputs_need_grad (bool) – Default is False. Whether the gradients to the input data need to be computed. Typically this is not needed, but it might be needed when implementing composition of modules.
force_rebind (bool) – Default is False. This function does nothing if the executors are already bound, but with this True, the executors will be forced to rebind.
shared_module (Module) – Default is None. Currently shared module is not supported for SequentialModule.
grad_req (str, list of str, dict of str to str) – Requirement for gradient accumulation. Can be ‘write’, ‘add’, or ‘null’ (default is ‘write’). Can be specified globally (str) or for each argument (list, dict).
-
property
data_names
¶ A list of names for data required by this module.
-
property
data_shapes
¶ Gets data shapes.
- Returns
A list of (name, shape) pairs. The data shapes of the first module are the data shapes of the SequentialModule.
- Return type
list
-
forward
(data_batch, is_train=None)[source]¶ Forward computation.
- Parameters
data_batch (DataBatch) –
is_train (bool) – Default is None, in which case is_train is taken as self.for_training.
-
get_input_grads
(merge_multi_context=True)[source]¶ Gets the gradients with respect to the inputs of the module.
- Parameters
merge_multi_context (bool) – Default is True. In the case when data-parallelism is used, the outputs will be collected from multiple devices. A True value indicates that the collected results should be merged so that they look like they came from a single executor.
- Returns
If merge_multi_context is True, it is like [grad1, grad2]. Otherwise, it is like [[grad1_dev1, grad1_dev2], [grad2_dev1, grad2_dev2]]. All the output elements are NDArray.
- Return type
list of NDArray or list of list of NDArray
-
get_outputs
(merge_multi_context=True)[source]¶ Gets outputs from a previous forward computation.
- Parameters
merge_multi_context (bool) – Default is True. In the case when data-parallelism is used, the outputs will be collected from multiple devices. A True value indicates that the collected results should be merged so that they look like they came from a single executor.
- Returns
If merge_multi_context is True, it is like [out1, out2]. Otherwise, it is like [[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]]. All the output elements are NDArray.
- Return type
list of NDArray or list of list of NDArray
-
get_params
()[source]¶ Gets current parameters.
- Returns
A pair of dictionaries each mapping parameter names to NDArray values. This is a merged dictionary of all the parameters in the modules.
- Return type
(arg_params, aux_params)
-
init_optimizer
(kvstore='local', optimizer='sgd', optimizer_params=(('learning_rate', 0.01), ), force_init=False)[source]¶ Installs and initializes optimizers.
- Parameters
kvstore (str or KVStore) – Default is ‘local’.
optimizer (str or Optimizer) – Default is ‘sgd’.
optimizer_params (dict) – Default is ((‘learning_rate’, 0.01),). The default value is not a dictionary, just to avoid a pylint warning about dangerous default values.
force_init (bool) – Default is False, indicating whether we should force re-initializing the optimizer when an optimizer is already installed.
-
init_params
(initializer=<mxnet.initializer.Uniform object>, arg_params=None, aux_params=None, allow_missing=False, force_init=False, allow_extra=False)[source]¶ Initializes parameters.
- Parameters
initializer (Initializer) –
arg_params (dict) – Default is None. Existing parameters. This has higher priority than initializer.
aux_params (dict) – Default is None. Existing auxiliary states. This has higher priority than initializer.
allow_missing (bool) – Allow missing values in arg_params and aux_params (if not None). In this case, missing values will be filled with the initializer.
force_init (bool) – Default is False.
allow_extra (boolean, optional) – Whether to allow extra parameters that are not needed by the symbol. If this is True, no error will be thrown when arg_params or aux_params contain extra parameters that are not needed by the executor.
-
property
label_shapes
¶ Gets label shapes.
- Returns
A list of (name, shape) pairs. The return value could be None if the module does not need labels, or if the module is not bound for training (in this case, label information is not available).
- Return type
list
-
property
output_names
¶ A list of names for the outputs of this module.
-
property
output_shapes
¶ Gets output shapes.
- Returns
A list of (name, shape) pairs. The output shapes of the last module are the output shapes of the SequentialModule.
- Return type
list
-
update
()[source]¶ Updates parameters according to the installed optimizer and the gradients computed in the previous forward-backward cycle.
-
update_metric
(eval_metric, labels, pre_sliced=False)[source]¶ Evaluates and accumulates evaluation metric on outputs of the last forward computation.
- Parameters
eval_metric (EvalMetric) –
labels (list of NDArray) – Typically data_batch.label.
-