tfmri.optimize.gradient_descent
- gradient_descent(value_and_gradients_function, initial_position, step_size, max_iterations=50, grad_tolerance=1e-08, x_tolerance=0, f_relative_tolerance=0, f_absolute_tolerance=0, name=None)
Applies gradient descent to minimize a differentiable function.
- Parameters
value_and_gradients_function –
A callable that accepts a point as a real/complex tf.Tensor and returns a tuple of tf.Tensor objects containing the value of the function (real dtype) and its gradient (real/complex dtype) at that point. This is the function to be minimized. The input should have shape [..., n], where n is the size of the domain of input points and all other dimensions are batching dimensions. The first component of the return value should be a real tf.Tensor of matching shape [...]. The second component (the gradient) should have shape [..., n], like the input to the function. Given a function that returns only the value to be minimized, the value-and-gradients function may be obtained using the tfmri.math.make_val_and_grad_fn decorator.
initial_position –
A tf.Tensor of shape [..., n]. The starting point, or points when using batch dimensions, of the search procedure.
step_size –
A scalar real tf.Tensor. The step size to use in the gradient descent update.
max_iterations –
A scalar integer tf.Tensor. The maximum number of gradient descent iterations.
grad_tolerance –
A scalar tf.Tensor of real dtype. Specifies the gradient tolerance for the procedure. If the supremum norm of the gradient vector is below this number, the algorithm is stopped.
x_tolerance –
A scalar tf.Tensor of real dtype. If the absolute change in the position between one iteration and the next is smaller than this number, the algorithm is stopped.
f_relative_tolerance –
A scalar tf.Tensor of real dtype. If the relative change in the objective value between one iteration and the next is smaller than this value, the algorithm is stopped.
f_absolute_tolerance –
A scalar tf.Tensor of real dtype. If the absolute change in the objective value between one iteration and the next is smaller than this value, the algorithm is stopped.
name – A str. The name of this operation.
- Returns
A namedtuple containing the following fields:
converged: A boolean tf.Tensor of shape [...] indicating whether the minimum was found within tolerance for each batch member.
num_iterations: A scalar integer tf.Tensor containing the number of iterations of the GD update.
objective_value: A tf.Tensor of shape [...] with the value of the objective function at the position. If the search converged, then this is the (local) minimum of the objective function.
objective_gradient: A tf.Tensor of shape [..., n] containing the gradient of the objective function at the position. If the search converged, the max-norm of this tensor should be below the tolerance.
position: A tf.Tensor of shape [..., n] containing the last argument value found during the search. If the search converged, then this value is the argmin of the objective function.
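The update rule and stopping criteria described above can be illustrated with a minimal, non-batched sketch in plain Python. This is not the tfmri implementation: the names gradient_descent_sketch and quadratic are hypothetical, the strict "smaller than" comparisons follow the wording of the parameter descriptions, and the way the relative change is measured (against the previous objective value) is an assumption. Batching over leading dimensions is omitted for brevity.

```python
def gradient_descent_sketch(value_and_grad, x0, step_size, max_iterations=50,
                            grad_tolerance=1e-8, x_tolerance=0.0,
                            f_relative_tolerance=0.0, f_absolute_tolerance=0.0):
    """Hypothetical single-point illustration of the documented behavior."""
    x = list(x0)
    f, g = value_and_grad(x)
    # Supremum norm of the gradient at the starting point.
    converged = max(abs(gi) for gi in g) < grad_tolerance
    num_iterations = 0
    while not converged and num_iterations < max_iterations:
        # Fixed-step gradient descent update.
        x_new = [xi - step_size * gi for xi, gi in zip(x, g)]
        f_new, g_new = value_and_grad(x_new)
        dx = max(abs(a - b) for a, b in zip(x_new, x))  # change in position
        df = abs(f_new - f)                             # change in objective
        # Assumption: any single criterion is enough to stop. A tolerance of
        # zero can never be satisfied by a strict comparison, so it is
        # effectively disabled.
        converged = (
            max(abs(gi) for gi in g_new) < grad_tolerance
            or dx < x_tolerance
            or df < f_absolute_tolerance
            or df < f_relative_tolerance * abs(f)
        )
        x, f, g = x_new, f_new, g_new
        num_iterations += 1
    # Keys mirror the fields of the documented namedtuple.
    return {
        "converged": converged,
        "num_iterations": num_iterations,
        "objective_value": f,
        "objective_gradient": g,
        "position": x,
    }


def quadratic(x, target=(1.0, -2.0)):
    """Returns (value, gradient) of sum((x - target)**2), as the callable must."""
    diff = [xi - ti for xi, ti in zip(x, target)]
    return sum(d * d for d in diff), [2.0 * d for d in diff]


result = gradient_descent_sketch(quadratic, [0.0, 0.0], step_size=0.25)
print(result["converged"], result["num_iterations"], result["position"])
```

With this quadratic and step size, each update halves the distance to the minimizer, so the gradient criterion is met well within the default 50 iterations. A step size that is too large for the objective would instead make the iterates diverge, in which case converged stays False.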