pythonnumpyperformancefizzbuzz# Accelerating FizzBuzz with numpy

**The FizzBuzz problem**: Given a natural number `N >= 0`

print a sequence of numbers `0 - N`

replacing every number that is divisible by `3`

with the word `fizz`

, every number that is divisible by `5`

with the word `buzz`

, and every number divisible by both `3`

and `5`

with the word `fizzbuzz`

.

Numpy isn't excellent for this (it doesn't offer much acceleration for strings), but I figured it shouldn't be too horrible and, perhaps, it can beat plain python if used wisely. To my surprise, however, the opposite is the case for a naive implementation. **Why is this, and how does one improve upon this?**

Here is the code that I used to generate the timings. It includes a pure-python reference implementation, a naive numpy implementation, and a `numba.jit`

variant, because I think it can act as a reasonable lower bound on performance.

```
import numpy as np
import matplotlib.pyplot as plt
import numba as nb
import timeit
def classic_fizzbuzz(N: int) -> str:
result = list()
for idx in range(N):
if idx % 3 == 0 and idx % 5 == 0:
result.append("fizzbuzz")
elif idx % 5 == 0:
result.append("buzz")
elif idx % 3 == 0:
result.append("fizz")
else:
result.append(str(idx))
return " ".join(result)
def numpy_fizzbuzz(N: int) -> str:
integers = np.arange(N)
result = np.arange(N).astype(str)
result = np.where(integers % 3 == 0, "fizz", result)
result = np.where(integers % 5 == 0, "buzz", result)
result = np.where(integers % 15 == 0, "fizzbuzz", result)
return " ".join(result)
@nb.jit(nopython=True)
def numba_fizzbuzz(N:int) -> str:
result = list()
for idx in range(N):
if idx % 3 == 0 and idx % 5 == 0:
result.append("fizzbuzz")
elif idx % 5 == 0:
result.append("buzz")
elif idx % 3 == 0:
result.append("fizz")
else:
result.append(str(idx))
return " ".join(result)
# do not measure initial compile time
numba_fizzbuzz(20)
def time_fizzbuzz(fn):
repeats = 100
times = list()
N_values = np.linspace(0, int(1e4), 100, dtype=int)
for idx in N_values:
N = int(idx)
times.append(timeit.timeit(lambda: fn(N), number=repeats) / repeats)
return N_values, times
fig, ax = plt.subplots(dpi=150) # web-quality
contendants = {
"classic": classic_fizzbuzz,
"numpy": numpy_fizzbuzz,
"numba": numba_fizzbuzz
}
for fn in contendants.values():
ax.plot(*time_fizzbuzz(fn))
ax.set_ylabel("Average Time (s)")
ax.set_label("Input")
fig.suptitle(
"Comparison of FizzBuzz implementations in Python.",
# fontsize="medium",
)
ax.set_title("Timings for various input N (averaged over 100 runs each)", fontsize="small")
ax.legend(
contendants.keys(),
loc="center left",
bbox_to_anchor=(1, 0.5),
title="Contestants",
)
fig.tight_layout()
fig.savefig("timings.png")
```

Solution

You can substantially improve the numpy version by using `np.select`

rather than create a parallel array of strings:

```
def numpy_select(N: int) -> str:
integers = np.arange(N)
return ' '.join(np.select([(integers%3 == 0) & (integers%5 == 0),
(integers%3 == 0), (integers%5 == 0)],
['fizzbuzz','fizz','buzz'], integers).tolist())
```

Your **Classic Python** version can be made a little faster like so:

```
def c2(N: int) -> str:
return ' '.join(['fizzbuzz' if x%15==0 else
'fizz' if x%3==0 else
'buzz' if x%5==0 else str(x) for x in range(N)])
```

Plugging those into your benchmark:

- Pydantic - apply validator on all fields of specific type
- Python program reacting to an event
- Convert date to datetime in Python
- python how to sum amount of a key based on for loop two lists
- Multidimensional indexing in Numpy with tuples as indices for certain axes
- Generating DKIM signatures VIA Python for Custom MTA
- pytorch split array by list of indices
- Creating new columns using for loop based on logical expression in Python
- Python print function stopped working after installing super-gradients in Google Colab
- How to extract the integer from the string '$61.52' to use in the arithmetic oprations in python?
- How can I select the proper openai.api_version?
- What's the use of /256 and %256 in it and how is the address divided into hi and lo bits?
- as_index=False groupBy doesn't work with count
- Python InfiniteTimer Test Code on counting Failed Loop fails
- PyCharm - no tests were found?
- Search results are not showing. And my URL Tab shows: "http://127.0.0.1:8000/search?" instead of this: "http://127.0.0.1:8000/search?q=name"
- Calculating Euler's number
- What does "RuntimeWarning: Enable tracemalloc to get the object allocation traceback" mean? I know how to fix, looking for the meaning
- Numpy, from a list of points, extract ones that have the same range of coordinates without for loop
- multiply and summing same named columns based on name in 2 different DF pandas python
- Reading files from a folder using python
- How to use match case with a class type
- Logging to python file doesn't overwrite file when using the mode='w' argument to FileHandler
- Extract a subset of key-value pairs from dictionary?
- Pandas Resample 2M and 3M for every month
- How to use multiprocessing queue in Python?
- Pandas dataframe and to_numeric: select column by index
- Cannot install TensorFlow 1.x
- How to get specific unique combinations of a dataframe using only dataframe operations?
- Scheduling a .py file on Task Scheduler in Windows 10