python machine-learning euclidean-distance quantum-computing

Distance Estimation Using Quantum Computer

I did a small benchmarking to compare quantum version of algorithm to its classical version, and I found that quantum computing taking so much time in comparison to classical version.

I don't understand why its happening, it should be either similar to classical or better.

DataSet Explanation: 1 test data point and 3 training data points, dimension = 2. Goal: our goal is to classify the test data point in one of training data point category.

import matplotlib.pyplot as plt
import pandas as pd
from numpy import pi
from qiskit import Aer, execute
from qiskit import QuantumCircuit
from qiskit import QuantumRegister, ClassicalRegister
from qiskit import IBMQ
import os
import time

# IBMQ Configure
# IBMQ.save_account(os.environ.get('IBM'))
# IBMQ.load_account()
# provider = IBMQ.get_provider('ibm-q')
# qcomp = provider.get_backend('ibmq_16_melbourne')
##

fig, ax = plt.subplots()
ax.set(xlabel='Data Feature 1', ylabel='Data Feature 2')

# Get the data from the .csv file
data = pd.read_csv('data.csv',
                   usecols=['Feature 1', 'Feature 2', 'Class'])

# Create binary variables to filter data
isGreen = data['Class'] == 'Green'
isBlue = data['Class'] == 'Blue'
isBlack = data['Class'] == 'Black'

# Filter data
greenData = data[isGreen].drop(['Class'], axis=1)
blueData = data[isBlue].drop(['Class'], axis=1)
blackData = data[isBlack].drop(['Class'], axis=1)

# This is the point we need to classify
y_p = 0.141
x_p = -0.161

# Finding the x-coords of the centroids
xgc = sum(greenData['Feature 1']) / len(greenData['Feature 1'])
xbc = sum(blueData['Feature 1']) / len(blueData['Feature 1'])
xkc = sum(blackData['Feature 1']) / len(blackData['Feature 1'])

# Finding the y-coords of the centroids
ygc = sum(greenData['Feature 2']) / len(greenData['Feature 2'])
ybc = sum(blueData['Feature 2']) / len(blueData['Feature 2'])
ykc = sum(blackData['Feature 2']) / len(blackData['Feature 2'])

# Plotting the centroids
plt.plot(xgc, ygc, 'gx')
plt.plot(xbc, ybc, 'bx')
plt.plot(xkc, ykc, 'kx')

# Plotting the new data point
plt.plot(x_p, y_p, 'ro')

# Setting the axis ranges
plt.axis([-1, 1, -1, 1])

plt.show()

# Calculating theta and phi values
phi_list = [((x + 1) * pi / 2) for x in [x_p, xgc, xbc, xkc]]
theta_list = [((x + 1) * pi / 2) for x in [y_p, ygc, ybc, ykc]]

#----- quantum start time -------#
st = time.time()
# Create a 2 qubit QuantumRegister - two for the vectors, and
# one for the ancillary qubit
qreg = QuantumRegister(3)

# Create a one bit ClassicalRegister to hold the result
# of the measurements
creg = ClassicalRegister(1)

qc = QuantumCircuit(qreg, creg, name='qc')

# Get backend using the Aer provider
backend = Aer.get_backend('qasm_simulator')

# Create list to hold the results
results_list = []

# Estimating distances from the new point to the centroids
for i in range(1, 4):
    # Apply a Hadamard to the ancillary
    qc.h(qreg[2])

    # Encode new point and centroid
    qc.u(theta_list[0], phi_list[0], 0, qreg[0])
    qc.u(theta_list[i], phi_list[i], 0, qreg[1])

    # Perform controlled swap
    qc.cswap(qreg[2], qreg[0], qreg[1])
    # Apply second Hadamard to ancillary
    qc.h(qreg[2])

    # Measure ancillary
    qc.measure(qreg[2], creg[0])

    # run on quantum computer
    # job = execute(qc, backend=qcomp, shots=1024)
    # job_monitor(job)

    # Reset qubits
    qc.reset(qreg)

    # Register and execute job
    job = execute(qc, backend=backend, shots=1024)
    result = job.result().get_counts(qc)
    results_list.append(result['1'])

et = time.time()
# --------- end time ----------

print(results_list)
print('final circuit fig')
print(qc.draw())

# Create a list to hold the possible classes
class_list = ['Green', 'Blue', 'Black']

# Find out which class the new data point belongs to 
# according to our distance estimation algorithm
quantum_p_class = class_list[results_list.index(min(results_list))]

# Find out which class the new data point belongs to 
# according to classical euclidean distance calculation

# classical start time
cst = time.time()
distances_list = [((x_p - i[0]) ** 2 + (y_p - i[1]) ** 2) ** 0.5 for i in [(xgc, ygc), (xbc, ybc), (xkc, ykc)]]
cet = time.time()

classical_p_class = class_list[distances_list.index(min(distances_list))]


# Print time taken
print("classical time => ", cet-cst)
print("quantum time => ", et-st)

# Print results
print("""According to our distance algorithm, the new data point belongs to the""", quantum_p_class, 'class.\n')
print('Euclidean distances: ', distances_list, '\n')
print("""According to euclidean distance calculations, the new data point belongs to the""", classical_p_class,
      'class.')

Output:

classical time =>  **1.0967254638671875e-05**
quantum time =>  **0.2530648708343506**  // more time
According to our distance algorithm, the new data point belongs to the Blue class.

Euclidean distances:  [0.520285324797846, 0.4905204028376393, 0.7014755294377704] 

According to euclidean distance calculations, the new data point belongs to the Blue class.

I am not able to understand, why quantum computing taking so much time.

Solution

I am a physicist and programmer who has worked extensively on Qiskit. I have limited experience with things like machine learning but if I am not mistaken figure 13 on page 22 of this paper on Nearest-Neighbor methods is precisely the circuit you are creating.

You have a dramatic performance hit because you are simulating quantum hardware using classical algorithms. This is commented out:

# IBMQ Configure
# IBMQ.save_account(os.environ.get('IBM'))
# IBMQ.load_account()
# provider = IBMQ.get_provider('ibm-q')
# qcomp = provider.get_backend('ibmq_16_melbourne')

Where the "ibmq_16_melbourne" refers to a physical quantum computer with the ibm architecture which is partially documented here. That makes total sense because IBM restricts access for most accounts. That is why later on you have this:

# Get backend using the Aer provider
backend = Aer.get_backend('qasm_simulator')

"Aer" refers to the quantum computer simulation software which is running locally on your client-side computer. To my knowledge there is not yet something in qiskit which could simulate a specific physical quantum computer. That is what would presumably tell you what the simulated/theoretical speedup would be (despite the fact that it would take longer to simulate on a classical computer).

IMPORTANT: Many of the standards defined as part of the Qiskit ecosystem (like the OpenQASM format) are meant to be hardware agnostic. You can describe a circuit that has any two qubits interacting at any time. But the truth is a physical quantum computers of any size (in terms of 10+ qubits) will not have direct any-qubit-to-any-other-qubit connections. You have to swap things around in ways specific to that architecture (like the Melbourne 16-qubit architecture).