tfmri.optimize.gradient_descent

gradient_descent(value_and_gradients_function, initial_position, step_size, max_iterations=50, grad_tolerance=1e-08, x_tolerance=0, f_relative_tolerance=0, f_absolute_tolerance=0, name=None)

Applies gradient descent to minimize a differentiable function.
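For orientation, the textbook fixed-step gradient descent update is shown below. The page does not spell out the exact formula, so this is an assumption: \alpha stands for step_size, x_k for the position at iteration k, and \nabla f(x_k) for the gradient returned by value_and_gradients_function.

```latex
x_{k+1} = x_k - \alpha \, \nabla f(x_k), \qquad k = 0, 1, \dots, \text{max\_iterations} - 1
```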

Parameters

value_and_gradients_function –
The function to be minimized. A callable that accepts a point as a real or complex tf.Tensor and returns a tuple of two tf.Tensor objects: the value of the function (real dtype) and its gradient (real or complex dtype) at that point. The input should have shape [..., n], where n is the size of the domain of input points and all other dimensions are batch dimensions. The first component of the return value should be a real tf.Tensor of shape [...], matching the batch dimensions of the input. The second component (the gradient) should have shape [..., n], like the input. Given a function that returns only the value to be minimized, the value-and-gradients function may be obtained using the tfmri.math.make_val_and_grad_fn decorator.
initial_position –
A tf.Tensor of shape [..., n]. The starting point, or points when using batch dimensions, of the search procedure.
step_size –
A scalar real tf.Tensor. The step size to use in the gradient descent update.
max_iterations –
A scalar integer tf.Tensor. The maximum number of gradient descent iterations.
grad_tolerance –
A scalar tf.Tensor of real dtype. Specifies the gradient tolerance for the procedure. If the supremum norm of the gradient vector is below this number, the algorithm is stopped.
x_tolerance –
A scalar tf.Tensor of real dtype. If the absolute change in the position between one iteration and the next is smaller than this number, the algorithm is stopped.
f_relative_tolerance –
A scalar tf.Tensor of real dtype. If the relative change in the objective value between one iteration and the next is smaller than this value, the algorithm is stopped.
f_absolute_tolerance –
A scalar tf.Tensor of real dtype. If the absolute change in the objective value between one iteration and the next is smaller than this value, the algorithm is stopped.
name – A str. The name of this operation.
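The shape contract for value_and_gradients_function can be illustrated with a small sketch. It uses nested Python lists as a stand-in for tensors, with a single batch dimension, purely to show the shapes; a real callable must accept and return tf.Tensor objects. The function name and the quadratic objective are hypothetical.

```python
def quadratic_value_and_gradients(x):
    """Return (values, gradients) of f(x) = sum_i (x_i - 1)^2 for a batch of points.

    x is a nested list of shape [batch, n], standing in for a tensor of
    shape [..., n]. This is an illustration of shapes only, not TensorFlow code.
    """
    # One real value per batch member: shape [batch].
    values = [sum((xi - 1.0) ** 2 for xi in point) for point in x]
    # One gradient vector per batch member: shape [batch, n], like the input.
    gradients = [[2.0 * (xi - 1.0) for xi in point] for point in x]
    return values, gradients


# A batch of two points in a two-dimensional domain (n = 2).
batch = [[0.0, 0.0], [1.0, 3.0]]
values, gradients = quadratic_value_and_gradients(batch)
```

Here values has one entry per batch member, and gradients has the same shape as batch.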

Returns

A namedtuple containing the following fields:

converged: A boolean tf.Tensor of shape [...] indicating whether the minimum was found within tolerance for each batch member.
num_iterations: A scalar integer tf.Tensor containing the number of gradient descent iterations performed.
objective_value: A tf.Tensor of shape [...] with the value of the objective function at the position. If the search converged, then this is the (local) minimum of the objective function.
objective_gradient: A tf.Tensor of shape [..., n] containing the gradient of the objective function at the position. If the search converged, the max-norm of this tensor should be below the tolerance.
position: A tf.Tensor of shape [..., n] containing the last argument value found during the search. If the search converged, then this value is the (local) argmin of the objective function.
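The relationship between the tolerances and the returned fields can be sketched in plain Python. This is not the library's implementation: it handles a single point without batch dimensions, the names GdResult, gradient_descent_sketch and sphere are hypothetical, and the strict "<" comparisons and the form of the relative-change test are assumptions based on the wording of the parameter descriptions.

```python
from collections import namedtuple

# Field names follow the documented return value; the class name is hypothetical.
GdResult = namedtuple(
    "GdResult",
    ["converged", "num_iterations", "objective_value",
     "objective_gradient", "position"])


def gradient_descent_sketch(value_and_gradients_function, initial_position,
                            step_size, max_iterations=50, grad_tolerance=1e-8,
                            x_tolerance=0.0, f_relative_tolerance=0.0,
                            f_absolute_tolerance=0.0):
    """Fixed-step gradient descent on one point (a list of n numbers)."""
    position = list(initial_position)
    value, gradient = value_and_gradients_function(position)
    # Supremum (max) norm of the gradient compared with grad_tolerance.
    converged = max(abs(g) for g in gradient) < grad_tolerance
    num_iterations = 0
    while not converged and num_iterations < max_iterations:
        # Assumed update: step against the gradient, scaled by step_size.
        new_position = [x - step_size * g for x, g in zip(position, gradient)]
        new_value, new_gradient = value_and_gradients_function(new_position)
        num_iterations += 1
        x_change = max(abs(a - b) for a, b in zip(new_position, position))
        f_change = abs(new_value - value)
        # Any satisfied criterion stops the loop; a tolerance of 0 never
        # triggers because the comparisons are strict (an assumption).
        converged = (
            max(abs(g) for g in new_gradient) < grad_tolerance
            or x_change < x_tolerance
            or f_change < f_absolute_tolerance
            or f_change < f_relative_tolerance * abs(value))
        position, value, gradient = new_position, new_value, new_gradient
    return GdResult(converged, num_iterations, value, gradient, position)


def sphere(x):
    """f(x) = sum_i (x_i - 1)^2 and its gradient, for a single point."""
    return sum((xi - 1.0) ** 2 for xi in x), [2.0 * (xi - 1.0) for xi in x]


result = gradient_descent_sketch(sphere, [0.0, 0.0], step_size=0.25)
truncated = gradient_descent_sketch(sphere, [0.0, 0.0], step_size=0.25,
                                    max_iterations=3)
```

In this sketch, result stops on the gradient criterion well within the default iteration budget, while truncated runs out of iterations first, so its converged field is False and its position is only the last iterate.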