Parameter-Shift Rule for Rotation gates
In this tutorial we use the Parameter Shift Rule (PSR) [1, 2] to evaluate the gradients of a variational quantum circuit with respect to a variational parameter.
The parameter shift rule in a nutshell
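Let's consider a parametrized circuit $\mathcal{U}(\mu)$ that contains a unitary gate of the form

$$\mathcal{G}(\mu) = \exp(-i \mu G),$$

whose Hermitian generator $G$ has at most two distinct eigenvalues $\pm r$. Let's consider an observable $B$ and, finally, let $|q_f\rangle$ be the state we obtain by applying $\mathcal{U}(\mu)$ to $|0\rangle$. We are interested in evaluating the gradients of the following expression:

$$f(\mu) \equiv \langle 0 | \mathcal{U}^{\dagger}(\mu)\, B\, \mathcal{U}(\mu) | 0 \rangle = \langle q_f | B | q_f \rangle,$$

where we specify that $f$ depends directly on the parameter $\mu$. We are interested in this result because the expectation value of $B$ is typically involved in computing predictions in quantum machine learning problems. The PSR allows us to calculate the derivative of $f$ with respect to $\mu$ by evaluating $f$ twice more:

$$\partial_{\mu} f = r \left[ f(\mu^{+}) - f(\mu^{-}) \right],$$

where $\mu^{\pm}$ is obtained as $\mu \pm s$ and $s = \pi / (4r)$. Finally, if we pick $G$ from the rotation generators we can use $s = \pi/2$ and $r = 1/2$.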
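As a quick sanity check of the rule, the sketch below applies it to the simplest case: a single RY rotation acting on |0> with Pauli Z as the observable, for which the expectation value is cos(mu) and the exact derivative is -sin(mu). It uses plain Python only, and the function names are illustrative, not part of qibo. The shift s = pi / (4 r) and the eigenvalue r = 1/2 are the values quoted for rotation gates.

```python
import math


def expectation_z_after_ry(mu):
    """<0| RY(mu)^dagger Z RY(mu) |0>, computed from the two amplitudes."""
    a0 = math.cos(mu / 2)  # amplitude of |0> after RY(mu)
    a1 = math.sin(mu / 2)  # amplitude of |1> after RY(mu)
    return a0**2 - a1**2  # equals cos(mu)


def psr_derivative(f, mu, r=0.5):
    """Parameter-shift estimate r * [f(mu + s) - f(mu - s)], with s = pi / (4 r)."""
    s = math.pi / (4 * r)
    return r * (f(mu + s) - f(mu - s))


mu = 0.3
psr_value = psr_derivative(expectation_z_after_ry, mu)
exact_value = -math.sin(mu)  # analytic derivative of cos(mu)
print(psr_value, exact_value)
```

The two printed numbers should agree up to floating-point rounding, because for this class of gates the rule is exact and not a finite-difference approximation.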
Loading required features
To build the parameter_shift function we need a few qibo features that implement the mathematical components introduced above.
import qibo
import numpy as np
from qibo import hamiltonians, gates
from qibo.models import Circuit
from qibo.hamiltonians.abstract import AbstractHamiltonian
Now we can write a parameter_shift function that takes the circuit, a hamiltonian (which is our observable), the index that identifies the target variational parameter, the eigenvalue of the generator of the target gate and, optionally, the initial state of the circuit.
def parameter_shift(
    circuit, hamiltonian, parameter_index, generator_eigenval, initial_state=None
):
    # inheriting hamiltonian's backend
    backend = hamiltonian.backend

    # defining the shift according to the psr
    s = np.pi / (4 * generator_eigenval)

    # saving original parameters and making a copy
    original = np.asarray(circuit.get_parameters()).copy()
    shifted = original.copy()

    # forward shift and evaluation
    shifted[parameter_index] += s
    circuit.set_parameters(shifted)
    forward = hamiltonian.expectation(
        backend.execute_circuit(circuit=circuit, initial_state=initial_state).state()
    )

    # backward shift and evaluation
    shifted[parameter_index] -= 2 * s
    circuit.set_parameters(shifted)
    backward = hamiltonian.expectation(
        backend.execute_circuit(circuit=circuit, initial_state=initial_state).state()
    )

    # restoring the original circuit
    circuit.set_parameters(original)

    return generator_eigenval * (forward - backward)
Now that we have a parameter_shift function, we can use it to calculate the gradients of the expectation value of the observable on the final state with respect to the variational parameters. To check the results, we compare them with the same quantities evaluated using tensorflow's GradientTape. To do this, we need to load tensorflow and activate the corresponding qibo backend.
# in order to see the difference with tf gradients
import tensorflow as tf
qibo.set_backend('tensorflow')
Now we can define the hamiltonian (in this case we use a Pauli Z as observable) and a parametrized circuit.
# defining an observable
def hamiltonian(nqubits = 1):
    m0 = (1/nqubits)*hamiltonians.Z(nqubits).matrix
    ham = hamiltonians.Hamiltonian(nqubits, m0)
    return ham

# defining a dummy circuit
def circuit(nqubits = 1):
    # use the requested number of qubits instead of a hard-coded value
    c = Circuit(nqubits)
    c.add(gates.RY(q = 0, theta = 0))
    c.add(gates.RX(q = 0, theta = 0))
    c.add(gates.M(0))
    return c
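For this small circuit the expectation value can also be worked out by hand, which gives an independent reference for the gradients computed later. The sketch below is plain Python and assumes the standard conventions RY(theta) = exp(-i theta Y / 2) and RX(theta) = exp(-i theta X / 2). Under that assumption, applying RY and then RX to |0> gives <Z> = cos(theta_ry) * cos(theta_rx). The function names and the sample angles are illustrative only.

```python
import math


def expectation_z(theta_ry, theta_rx):
    """<Z> after applying RY(theta_ry) and then RX(theta_rx) to |0>."""
    c0, s0 = math.cos(theta_ry / 2), math.sin(theta_ry / 2)
    c1, s1 = math.cos(theta_rx / 2), math.sin(theta_rx / 2)
    # After RY the state is c0|0> + s0|1>; RX then mixes the two amplitudes.
    amp0 = complex(c1 * c0, -s1 * s0)
    amp1 = complex(c1 * s0, -s1 * c0)
    return abs(amp0) ** 2 - abs(amp1) ** 2


def analytic_gradient(theta_ry, theta_rx):
    """Gradient of cos(theta_ry) * cos(theta_rx) with respect to both angles."""
    return (
        -math.sin(theta_ry) * math.cos(theta_rx),
        -math.cos(theta_ry) * math.sin(theta_rx),
    )


theta = (0.4, -1.1)  # arbitrary sample angles
value = expectation_z(*theta)
closed_form = math.cos(theta[0]) * math.cos(theta[1])
gradient = analytic_gradient(*theta)
print(value, closed_form, gradient)
```

Evaluating analytic_gradient at the same parameters used in the circuit gives the values that both the parameter shift rule and automatic differentiation are expected to reproduce.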
Next we write a function that returns the gradients computed by tensorflow.
# using GradientTape to benchmark
def gradient_tape(params):
    params = tf.Variable(params)

    with tf.GradientTape() as tape:
        c = circuit(nqubits = 1)
        c.set_parameters(params)
        h = hamiltonian()
        expected_value = h.expectation(c.execute().state())

    grads = tape.gradient(expected_value, [params])
    return grads
To compare the two methods, we randomly generate some parameters and set them as the variational parameters of the circuit.
# some parameters
test_params = np.random.randn(2)
# building the circuit and setting its parameters
c = circuit(nqubits = 1)
c.set_parameters(test_params)
Here we are!
Now we can calculate the gradients using the two methods.
test_hamiltonian = hamiltonian()
# running the psr with respect to the two parameters
grad_0 = parameter_shift(circuit = c, hamiltonian = test_hamiltonian, parameter_index = 0, generator_eigenval = 0.5)
grad_1 = parameter_shift(circuit = c, hamiltonian = test_hamiltonian, parameter_index = 1, generator_eigenval = 0.5)
tf_grads = gradient_tape(test_params)
print('Test gradient with respect to params[0] with PSR: ', grad_0.numpy())
print('Test gradient with respect to params[0] with tf: ', tf_grads[0][0].numpy())
print('Test gradient with respect to params[1] with PSR: ', grad_1.numpy())
print('Test gradient with respect to params[1] with tf: ', tf_grads[0][1].numpy())
And the output should be similar to the following (the exact values depend on the randomly generated parameters):
Test gradient with respect to params[0] with PSR: 0.09416555057174314
Test gradient with respect to params[0] with tf: 0.09416555057174325
Test gradient with respect to params[1] with PSR: -0.033018344618441414
Test gradient with respect to params[1] with tf: -0.033018344618441484
As you can see, the two methods agree up to numerical precision: the values differ only around the sixteenth decimal place.
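Because the two results come from different floating-point computations, a tolerance-based comparison is a safer programmatic check than exact equality. The sketch below uses the sample values printed above, copied by hand. The variable names and the tolerances are illustrative assumptions and should be adapted to the precision of the backend in use.

```python
import math

# Sample values copied from the output shown above.
psr_grads = [0.09416555057174314, -0.033018344618441414]
tf_grads_sample = [0.09416555057174325, -0.033018344618441484]

# Compare each pair with a relative and an absolute tolerance.
all_close = all(
    math.isclose(a, b, rel_tol=1e-12, abs_tol=1e-15)
    for a, b in zip(psr_grads, tf_grads_sample)
)
print(all_close)
```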
References
[1] Kosuke Mitarai, Makoto Negoro, Masahiro Kitagawa, Keisuke Fujii, Quantum Circuit Learning, (2018), arXiv:1803.00745v3
[2] Maria Schuld, Ville Bergholm, Christian Gogolin, Josh Izaac, Nathan Killoran, Evaluating analytic gradients on quantum hardware, (2018), arXiv:1811.11184v1