I am running some examples and tests on TensorFlow Quantum (TFQ) and I am struggling to perform multi-class classification. I will use the MNIST classification example as a base (https://www.tensorflow.org/quantum/tutorials/mnist), since that is where I am starting from too.
For binary classification I experimented with different pairs of classes and different gate combinations. The classification result is obtained by measuring a single readout qubit (qR): if qR=0 we assign class 0, and if qR=1 we assign class 1.
I extended this to a multi-class problem with 4 classes (0,1,2,3). To do this I convert the class labels with tf.keras.utils.to_categorical(y_train), so that each label changes from a single value to a vector (0 -> (1,0,0,0); 1 -> (0,1,0,0); etc.), use tf.keras.losses.CategoricalHinge() as the model loss, and create 4 readout qubits, one for each class (M(qR0, qR1, qR2, qR3) = (0,0,1,0) -> class 2). This works.
However, this method massively increases the size of the circuit. What I want to do instead is pass only 2 readout qubits to TFQ and use their combined measurement for the 4-class classification (|00> = 0, |10> = 1, |01> = 2, |11> = 3). Ideally this would allow classification into 2^n classes, where n is the number of readout qubits. In Cirq I can achieve this output by performing cirq.measure(qR0, qR1, key='measure') on the two readout qubits. However, I am struggling to pass such a command to TFQ since, from what I understand, it only measures qubits that end with a single-qubit Pauli gate.
So, is there functionality in TFQ that I am missing that allows this kind of measurement during training?
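The two-readout encoding described in the question can be sketched in plain Python. This is only an illustration of the bit-to-class mapping (|00> = 0, |10> = 1, |01> = 2, |11> = 3, i.e. class = q0 + 2 * q1); the helper names class_to_bits and bits_to_class are hypothetical and are not part of Cirq or TFQ.

```python
# Sketch of the readout encoding from the question:
# |00> -> 0, |10> -> 1, |01> -> 2, |11> -> 3, i.e. class = q0 + 2 * q1.
# The helper names below are hypothetical, not part of Cirq or TFQ.

def class_to_bits(label, n_qubits=2):
    """Return the readout bits (q0, q1, ...) that encode `label`."""
    return tuple((label >> i) & 1 for i in range(n_qubits))


def bits_to_class(bits):
    """Inverse of class_to_bits: combine readout bits into a class index."""
    return sum(bit << i for i, bit in enumerate(bits))


# Mapping for the 4-class case with 2 readout qubits.
encoding = {label: class_to_bits(label) for label in range(4)}
print(encoding)
```

With n readout qubits the same mapping covers 2^n classes, which is the scaling the question is after.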
Starting with this snippet:
import cirq
import sympy
import tensorflow as tf
import tensorflow_quantum as tfq

bit = cirq.GridQubit(0, 0)
symbols = sympy.symbols('x, y, z')
# !This is important!
ops = [-1.0 * cirq.Z(bit), cirq.X(bit) + 2.0 * cirq.Z(bit)]
# !This is important!
circuit_list = [
    # Helper used in the TFQ docs example; it is not defined in this snippet.
    _gen_single_bit_rotation_problem(bit, symbols),
    cirq.Circuit(
        cirq.Z(bit) ** symbols[0],
        cirq.X(bit) ** symbols[1],
        cirq.Z(bit) ** symbols[2]
    ),
    cirq.Circuit(
        cirq.X(bit) ** symbols[0],
        cirq.Z(bit) ** symbols[1],
        cirq.X(bit) ** symbols[2]
    )
]
expectation_layer = tfq.layers.Expectation()
output = expectation_layer(
    circuit_list, symbol_names=symbols, operators=ops)
# Here output[i][j] corresponds to the expectation of ops[j]
# w.r.t. circuit_list[i], where Keras-managed variables are
# placed in the symbols 'x', 'y', 'z'.
tf.shape(output)
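I took this from here: https://www.tensorflow.org/quantum/api_docs/python/tfq/layers/Expectation
The shape of the output tensor is [3, 2], since I have 3 different circuits and took two expectation values over each circuit. The value at [1, 0] of output would be the expectation value of ops[0] with respect to circuit_list[1]:
<psi_1| (-1.0 * Z) |psi_1>
Then the value at [2, 1] of output would be the expectation value of ops[1] with respect to circuit_list[2]:
<psi_2| (X + 2.0 * Z) |psi_2>
where |psi_i> is the state prepared by circuit_list[i].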
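To make the indexing concrete, here is a small NumPy sketch (not TFQ itself) that builds a [3, 2] table of expectation values for the same two operators. The three state vectors are illustrative stand-ins chosen for this example, not the states the circuits above actually prepare.

```python
import numpy as np

# Single-qubit Pauli matrices.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def expectation(state, operator):
    """Return <state|operator|state> for a normalized state vector."""
    return float(np.real(np.vdot(state, operator @ state)))


# Matrix forms of the two entries of `ops` from the snippet.
op_matrices = [-1.0 * Z, X + 2.0 * Z]

# Illustrative stand-ins for the states prepared by three circuits.
states = [
    np.array([1, 0], dtype=complex),               # |0>
    np.array([0, 1], dtype=complex),               # |1>
    np.array([1, 1], dtype=complex) / np.sqrt(2),  # |+>
]

# output[i][j] is the expectation of operator j in state i.
output = np.array(
    [[expectation(s, op) for op in op_matrices] for s in states])
print(output.shape)  # (3, 2)
```

Each row corresponds to one state (one circuit) and each column to one operator, which is the same layout the Expectation layer returns.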
The shape and contents of output's values are partly dictated by the shape and contents of ops. If I wanted to make the output shape [3, 3], I could just add another valid cirq.PauliSum object to the ops list. In your case, if you want the probability of getting 00, 01, 10, 11 on two particular cirq.GridQubits q0 and q1, you can do something like this:
def zero_proj(qubit):
    return (1 + cirq.Z(qubit)) / 2

def one_proj(qubit):
    return (1 - cirq.Z(qubit)) / 2

# ! This is important
ops = [
    zero_proj(q0) * zero_proj(q1),
    zero_proj(q0) * one_proj(q1),
    one_proj(q0) * zero_proj(q1),
    one_proj(q0) * one_proj(q1)
]
# ! This is important
This makes the output shape of any layer that ingests ops [whatever_your_batch_size_is, 4]. Does this help clear things up?
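As a sanity check on why these projector products give probabilities, here is a NumPy sketch (again, not TFQ) that builds the same four operators as matrices and compares their expectation values with the squared amplitudes of a two-qubit state. The state vector is an arbitrary example chosen for illustration, and q0 is assumed to be the left factor of the Kronecker product.

```python
import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])

zero_proj = (I2 + Z) / 2  # |0><0|
one_proj = (I2 - Z) / 2   # |1><1|

# Two-qubit projectors in the order 00, 01, 10, 11 (q0 is the left factor).
projectors = [
    np.kron(p0, p1)
    for p0 in (zero_proj, one_proj)
    for p1 in (zero_proj, one_proj)
]

# Arbitrary example state, amplitudes ordered 00, 01, 10, 11.
state = np.array([0.5, 0.5j, -0.1, 0.7], dtype=complex)
state = state / np.linalg.norm(state)

# Expectation value of each projector in this state.
probs = np.array(
    [np.real(np.vdot(state, p @ state)) for p in projectors])

# Each expectation equals the probability of that bitstring.
assert np.allclose(probs, np.abs(state) ** 2)
assert np.isclose(probs.sum(), 1.0)

# The most likely bitstring gives the predicted column index.
predicted = int(np.argmax(probs))
print(predicted)
```

Note that the columns here follow the order 00, 01, 10, 11. With the encoding from the question (|10> = 1, |01> = 2), the two middle entries of ops would need to be swapped, or the labels permuted to match.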