tf.compat.v1.train.linear_cosine_decay

Applies linear cosine decay to the learning rate.

tf.compat.v1.train.linear_cosine_decay(
    learning_rate, global_step, decay_steps, num_periods=0.5, alpha=0.0, beta=0.001,
    name=None
)

Note that linear cosine decay is more aggressive than cosine decay and larger initial learning rates can typically be used.

When training a model, it is often recommended to lower the learning rate as the training progresses. This function applies a linear cosine decay function to a provided initial learning rate. It requires a global_step value to compute the decayed learning rate. You can just pass a TensorFlow variable that you increment at each training step.

The function returns the decayed learning rate. It is computed as:

global_step = min(global_step, decay_steps)
linear_decay = (decay_steps - global_step) / decay_steps
cosine_decay = 0.5 * (
    1 + cos(pi * 2 * num_periods * global_step / decay_steps))
decayed = (alpha + linear_decay) * cosine_decay + beta
decayed_learning_rate = learning_rate * decayed
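
The computation above can be checked by hand with a small plain-Python sketch. It uses only the standard `math` module and mirrors the formula line by line; it is not TensorFlow's implementation, and the function name `linear_cosine_decay_value` is a hypothetical one chosen for illustration.

```python
import math


def linear_cosine_decay_value(learning_rate, global_step, decay_steps,
                              num_periods=0.5, alpha=0.0, beta=0.001):
    """Plain-Python mirror of the formula above (illustrative, not TensorFlow code)."""
    # Clamp the step so the result stays constant once decay_steps is reached.
    step = min(global_step, decay_steps)
    # Linear ramp: 1 at step 0, 0 at decay_steps.
    linear_decay = (decay_steps - step) / decay_steps
    # Cosine factor in [0, 1]; num_periods sets how many cosine cycles fit in decay_steps.
    cosine_decay = 0.5 * (
        1 + math.cos(math.pi * 2 * num_periods * step / decay_steps))
    decayed = (alpha + linear_decay) * cosine_decay + beta
    return learning_rate * decayed


# Print the decayed rate at a few steps, including one past decay_steps.
for step in (0, 250, 500, 1000, 2000):
    print(step, linear_cosine_decay_value(0.1, step, 1000))
```

With the default arguments, the value at step 0 is `learning_rate * (1 + beta)`, slightly above the initial learning rate, and any step at or beyond `decay_steps` yields `learning_rate * beta`.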

Example usage:

decay_steps = 1000
lr_decayed = tf.compat.v1.train.linear_cosine_decay(
    learning_rate, global_step, decay_steps)

Args
`learning_rate`	A scalar `float32` or `float64` `Tensor` or a Python number. The initial learning rate.
`global_step`	A scalar `int32` or `int64` `Tensor` or a Python number. Global step to use for the decay computation.
`decay_steps`	A scalar `int32` or `int64` `Tensor` or a Python number. Number of steps to decay over.
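`num_periods`	Number of periods in the cosine part of the decay. Defaults to 0.5, which is half a cosine cycle over `decay_steps`. See computation above.
`alpha`	Constant added to the linear decay term before it is multiplied by the cosine term. Defaults to 0.0. See computation above.
`beta`	Constant added to the product of the linear and cosine terms. Defaults to 0.001. See computation above.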
`name`	String. Optional name of the operation. Defaults to 'LinearCosineDecay'.
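
To make the roles of `num_periods`, `alpha`, and `beta` more concrete, the sketch below evaluates the formula at the end of the schedule, where `linear_decay` is 0 and the multiplier reduces to `alpha * cosine_decay + beta`. This is a plain-Python derivation from the computation above, not a TensorFlow API; the helper name `end_of_schedule_multiplier` is hypothetical.

```python
import math


def end_of_schedule_multiplier(num_periods=0.5, alpha=0.0, beta=0.001):
    """Multiplier applied to learning_rate once global_step >= decay_steps.

    At that point linear_decay is 0, so the formula reduces to
    alpha * cosine_decay + beta.
    """
    cosine_decay = 0.5 * (1 + math.cos(2 * math.pi * num_periods))
    return alpha * cosine_decay + beta


# Defaults: half a cosine period ends with cosine_decay == 0, leaving only beta.
print(end_of_schedule_multiplier())
# One full period ends with cosine_decay == 1, leaving alpha + beta.
print(end_of_schedule_multiplier(num_periods=1.0, alpha=0.1))
```

Under these assumptions, `beta` acts as a floor on the final learning rate, while `alpha` only affects the final value when the cosine term is nonzero at `decay_steps`.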

Returns
A scalar `Tensor` of the same type as `learning_rate`. The decayed learning rate.

Raises
`ValueError`	If `global_step` is not supplied.

References:

Neural Optimizer Search with Reinforcement Learning: Bello et al., 2017
Stochastic Gradient Descent with Warm Restarts: Loshchilov et al., 2017

Eager Compatibility

When eager execution is enabled, this function returns a function which in turn returns the decayed learning rate Tensor. This can be useful for changing the learning rate value across different invocations of optimizer functions.
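
The value of returning a callable can be illustrated without TensorFlow. In the plain-Python sketch below, a zero-argument function re-reads a mutable step counter on every call, so the same object yields a different rate as training advances. `StepCounter` and `make_lr_fn` are hypothetical names used only for this illustration and do not reflect how TensorFlow implements the behavior.

```python
import math


class StepCounter:
    """Hypothetical stand-in for a step variable that training code increments."""

    def __init__(self):
        self.value = 0


def make_lr_fn(learning_rate, counter, decay_steps,
               num_periods=0.5, alpha=0.0, beta=0.001):
    """Return a zero-argument callable that recomputes the decayed rate per call."""
    def lr_fn():
        # Read the counter at call time, not at construction time.
        step = min(counter.value, decay_steps)
        linear_decay = (decay_steps - step) / decay_steps
        cosine_decay = 0.5 * (
            1 + math.cos(math.pi * 2 * num_periods * step / decay_steps))
        return learning_rate * ((alpha + linear_decay) * cosine_decay + beta)
    return lr_fn


counter = StepCounter()
lr_fn = make_lr_fn(0.1, counter, decay_steps=1000)

first = lr_fn()        # rate at step 0
counter.value = 500    # simulate 500 training steps
later = lr_fn()        # same callable, new rate
print(first, later)
```

A fixed number computed once would stay at the step-0 value; the callable form lets each invocation pick up the current step.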
