I am teaching myself some coding, and as my first "big" project I tried implementing a Steepest Descent algorithm to minimize the Rosenbrock function:
$$f(x, y) = 100 (y - x^2)^2 + (1 - x)^2$$
The algorithm goes like this: we start with an initial guess $x_0$ (a vector) and update it using the formula
$$x_{k+1} = x_k - \alpha_k \nabla f(x_k)$$
where the step length $\alpha_k$ is chosen at each iteration so that it satisfies the Armijo condition. We keep repeating until we reach a point where both components of the gradient are less than 0.1 in absolute value.
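To make the line search concrete, this is a standalone sketch of the backtracking rule I mean (`backtracking_alpha` is just an illustrative name; the constants are the same ones I use in the code below):

```python
import numpy as np

def backtracking_alpha(f, grad, x, alpha=1.0, ratio=0.8, c=0.01):
    """Shrink alpha until the Armijo sufficient-decrease condition holds."""
    g = grad(x)
    p = -g                                        # steepest-descent direction
    slope = float(np.dot(g.ravel(), p.ravel()))   # directional derivative (negative here)
    # Armijo condition: f(x + alpha*p) <= f(x) + c*alpha*(grad . p)
    while f(x + alpha * p) > f(x) + c * alpha * slope:
        alpha *= ratio
    return alpha
```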
Could you please tell me any ways in which I could improve my algorithm? In particular, I'm looking to increase its speed. From the current starting point, it takes about 30 seconds to run on my computer (16 GB RAM, i7 processor).
Remark: the reason I keep using `np.array([[1, 2, 3]])` for vectors is so that I can transpose and matrix-multiply them at will. I'm not sure if this is good practice.
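For example, with this shape convention:

```python
import numpy as np

v = np.array([[1, 2, 3]])   # shape (1, 3): a row vector
print(v.T.shape)            # (3, 1): transposing actually does something
print((v @ v.T).shape)      # (1, 1): inner product as a 1x1 matrix
print((v.T @ v).shape)      # (3, 3): outer product
```

By contrast, for a 1-D array like `np.array([1, 2, 3])`, `.T` is a no-op, which is why I went with the nested brackets.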
```python
# This program uses the Steepest Descent Method to
# minimize the Rosenbrock function
import numpy as np


# Define the Rosenbrock function
def f(x_k):
    x, y = x_k[0, 0], x_k[0, 1]
    return 100 * (y - x**2)**2 + (1 - x)**2


# Gradient of f
def gradient(x_k):
    x, y = x_k[0, 0], x_k[0, 1]
    return np.array([[-400 * x * (y - x**2) - 2 * (1 - x), 200 * (y - x**2)]])


def main():
    # Define the starting guess
    x_k = np.array([[10, 5]])
    # Define counter for number of steps
    numSteps = 0
    # Keep iterating until both components of the gradient are less than 0.1 in absolute value
    while abs(gradient(x_k)[0, 0]) > 0.1 or abs(gradient(x_k)[0, 1]) > 0.1:
        numSteps = numSteps + 1
        # Step direction: the negative gradient
        p_k = -gradient(x_k)
        gradTrans = -p_k.T
        # Now we use a backtracking algorithm to find a step length
        alpha = 1.0
        ratio = 0.8
        c = 0.01  # Armijo sufficient-decrease constant
        # This loop shrinks alpha until it satisfies the Armijo condition;
        # p_k @ gradTrans is the 1x1 matrix holding the directional derivative
        while f(x_k + alpha * p_k) > f(x_k) + (alpha * c * (p_k @ gradTrans))[0, 0]:
            alpha = ratio * alpha
        x_k = x_k + alpha * p_k

    print("The number of steps is: ", numSteps)
    print("The final step is:", x_k)
    print("The gradient is: ", gradient(x_k))


main()
```
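To measure the run time I quoted above and to sanity-check my analytic gradient, I append something like the following (a sketch: the test point, the step size `h`, and replacing the bare `main()` call with the timed one are my own additions):

```python
import time

import numpy as np

# Time the full optimization run (replaces the bare main() call above)
start = time.perf_counter()
main()
print(f"Elapsed: {time.perf_counter() - start:.2f} s")

# Sanity-check the analytic gradient with central differences
# at an arbitrary test point
x0 = np.array([[1.2, 1.0]])
h = 1e-6
fd = np.zeros_like(x0)
for i in range(x0.shape[1]):
    e = np.zeros_like(x0)
    e[0, i] = h
    fd[0, i] = (f(x0 + e) - f(x0 - e)) / (2 * h)
print("Analytic gradient: ", gradient(x0))
print("Finite differences:", fd)
```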