I am trialing the ascii_graph package for Python. If I assemble the histogram data using numpy.arange and zip, the plotting fails. If I assemble the data from primitive literals, it succeeds. Can anyone please explain what the difference is?
import numpy as np
BinMid = np.arange(20) + 1 # Bin mid-points
BinEdge = np.arange(21) + 0.5
# Bin edges, used only in generating histogram
# counts (not shown in this sample code)
nDist = np.array( # Bin counts
[ 7083, 73485, 659204, 3511238, 10859771, 22162510,
34511661, 45891902, 55651178, 59153091, 56242073,
48598282, 37947325, 27541907, 19356046, 13630601,
8810979, 4262462, 1227506, 216751], dtype=np.int64 )
# Histogram data
histData = list( zip( BinMid.astype(str) , nDist ) )
# [('1', 7083),
# ('2', 73485),
# ('3', 659204),
# ('4', 3511238),
# ('5', 10859771),
# ('6', 22162510),
# ('7', 34511661),
# ('8', 45891902),
# ('9', 55651178),
# ('10', 59153091),
# ('11', 56242073),
# ('12', 48598282),
# ('13', 37947325),
# ('14', 27541907),
# ('15', 19356046),
# ('16', 13630601),
# ('17', 8810979),
# ('18', 4262462),
# ('19', 1227506),
# ('20', 216751)]
# Create ASCII histograph plotter
from ascii_graph import Pyasciigraph
graph = Pyasciigraph()
# FAILS: Plot using zip expression assigned to histData
#------------------------------------------------------
for line in graph.graph( "Test" ,
                         list( zip( BinMid.astype(str) , nDist ) ) ):
    print(line)
for line in graph.graph( "Test" , histData ): print(line)
# Traceback (most recent call last):
# Cell In[139], line 1
# for line in graph.graph( "Test" , histData ): print(line)
# File ~\AppData\Local\anaconda3\envs\py39\lib\site-packages\ascii_graph\__init__.py:399 in graph
# san_data = self._sanitize_data(data)
# File ~\AppData\Local\anaconda3\envs\py39\lib\site-packages\ascii_graph\__init__.py:378 in _sanitize_data
# (self._sanitize_string(item[0]),
# File ~\AppData\Local\anaconda3\envs\py39\lib\site-packages\ascii_graph\__init__.py:351 in _sanitize_string
# return info
# UnboundLocalError: local variable 'info' referenced before assignment
# SUCCEEDS: Assign primitive literals to histData
#-----------------------------------------------
histData = [ ('1', 7083),
('2', 73485),
('3', 659204),
('4', 3511238),
('5', 10859771),
('6', 22162510),
('7', 34511661),
('8', 45891902),
('9', 55651178),
('10', 59153091),
('11', 56242073),
('12', 48598282),
('13', 37947325),
('14', 27541907),
('15', 19356046),
('16', 13630601),
('17', 8810979),
('18', 4262462),
('19', 1227506),
('20', 216751) ]
for line in graph.graph( "Test" , histData ): print(line)
# Test
# ###############################################################################
# 7083 1
# 73485 2
# 659204 3
# ███ 3511238 4
# ███████████ 10859771 5
# ████████████████████████ 22162510 6
# █████████████████████████████████████ 34511661 7
# ██████████████████████████████████████████████████ 45891902 8
# █████████████████████████████████████████████████████████████ 55651178 9
# █████████████████████████████████████████████████████████████████ 59153091 10
# █████████████████████████████████████████████████████████████ 56242073 11
# █████████████████████████████████████████████████████ 48598282 12
# █████████████████████████████████████████ 37947325 13
# ██████████████████████████████ 27541907 14
# █████████████████████ 19356046 15
# ██████████████ 13630601 16
# █████████ 8810979 17
# ████ 4262462 18
# █ 1227506 19
# 216751 20
Afternote
Based on Nick ODell's response, the following works:
import numpy as np
BinMidStr = [ str(i+1) for i in range(20) ] # Bin mid-point labels
nDist = np.array( # Bin counts
[ 7083, 73485, 659204, 3511238, 10859771, 22162510,
34511661, 45891902, 55651178, 59153091, 56242073,
48598282, 37947325, 27541907, 19356046, 13630601,
8810979, 4262462, 1227506, 216751], dtype=np.int64 )
# Histogram data
histData = list( zip( BinMidStr , nDist ) )
# Create ASCII histograph plotter
from ascii_graph import Pyasciigraph
graph = Pyasciigraph()
# Plot code pattern #1
for line in graph.graph( "Test" ,
                         list( zip( BinMidStr , nDist ) ) ):
    print(line)
# Plot code pattern #2
for line in graph.graph( "Test" , histData ): print(line)
# Plot code pattern #3 for when labels are in integer form
BinMid = [ i+1 for i in range(20) ] # Bin mid-points
BinMidStr = [ str(i) for i in BinMid ]
for line in graph.graph( "Test" ,
                         list( zip( BinMidStr , nDist ) ) ):
    print(line)
If you work a lot in NumPy and have your bin labels in the form of NumPy integers, be aware that the following looks almost as if it creates native (non-NumPy) Python string labels, but it actually creates a single string representing how the entire array should be displayed:
# Plot code pattern #4 (nonfunctional) for when labels are in
# NumPy integer form
BinMid = np.arange(20) + 1 # Bin mid-points
BinMidStr = np.array_str( BinMid )
# '[ 1 2 3 4 5 6 7 8 9 10 11
# 12 13 14 15 16 17 18 19 20]'
BinMidStr = np.array_str( BinMid.astype('str') )
# "['1' '2' '3' '4' '5' '6' '7' '8' '9' '10' '11' '12'
# '13' '14' '15' '16'\n '17' '18' '19' '20']"
I find it odd that Pyasciigraph.graph() accepts an array of NumPy datatype for the numerical bar sizes, but not NumPy strings for the bar labels. Another thing I am puzzled by is the lack of a function prototype for the Pyasciigraph.graph() method. While I still consider myself new to Python, most packages I've used provide Python-like documentation with function prototypes and explanations of the input and output arguments.
I wish there were standard, streamlined ways to convert between arrays of native Python and NumPy data types. Going from NumPy to native Python seems trickier, as there are probably fewer cases in which people want that. Afternote: Based on this Q&A, it seems that MyNParray.tolist() is the standard streamlined idiom for converting a NumPy array of NumPy data types to a native Python list of Python data types. It is even better than [ Element.item() for Element in MyNParray ]. The latter doesn't work on a NumPy array of NumPy strings.
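As a minimal sketch of that idiom (the variable names below are only for illustration), `.tolist()` can be compared with plain iteration over the array; the element types show the difference:

```python
import numpy as np

bin_mid = np.arange(20) + 1            # NumPy integers 1..20
np_labels = bin_mid.astype(str)        # NumPy array of NumPy strings

# Iterating over the array yields NumPy scalar types.
iter_types = {type(label) for label in np_labels}

# .tolist() returns a plain Python list of native Python objects.
native_labels = np_labels.tolist()
native_ints = bin_mid.tolist()

print(iter_types)                      # a set containing numpy.str_
print(type(native_labels[0]), type(native_ints[0]))
```

The native labels can then be zipped with the counts in the same way as the literal version.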
It looks like the type of those strings ends up being numpy.str_ rather than str.
>>> histData = list( zip( BinMid.astype(str) , nDist ) )
>>> print(type(histData[0][0]))
<class 'numpy.str_'>
In comparison to the literal definition:
>>> histData2 = [('1', 7083), ('2', 73485)]
>>> print(type(histData2[0][0]))
<class 'str'>
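That type difference could plausibly explain the `UnboundLocalError`. I have not confirmed this against the ascii_graph source, so treat the following as a hypothetical sketch: `sanitize_exact` is an invented stand-in, not the library's `_sanitize_string`. It shows how a function that assigns a local variable only on an exact type match reaches `return info` with nothing assigned when it is handed a `numpy.str_`, even though `numpy.str_` is a subclass of `str`.

```python
import numpy as np

def sanitize_exact(value):
    """Hypothetical stand-in: assigns `info` only for an exact type match."""
    if type(value) is str:      # exact match; subclasses of str do not pass
        info = value
    elif type(value) is int:
        info = str(value)
    # No else branch: any other type leaves `info` unassigned.
    return info

np_label = np.arange(3).astype(str)[0]   # a numpy.str_ element

print(isinstance(np_label, str))         # numpy.str_ subclasses str
print(sanitize_exact("1"))               # a plain str is handled

try:
    sanitize_exact(np_label)
    raised = False
except UnboundLocalError:
    raised = True
print(raised)
```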
I would suggest using an approach that gives you str objects.
histData = list(zip(map(str, BinMid), nDist))
With this, I am able to make the graph from the NumPy array.
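If this conversion is needed in more than one place, it could be wrapped in a small helper. This is only a sketch: `to_hist_data` is a name I made up, and converting the counts with `int()` is optional, since the NumPy counts were already accepted in the question.

```python
import numpy as np

def to_hist_data(labels, counts):
    """Pair labels and counts as (str, int) tuples of native Python types."""
    return [(str(label), int(count)) for label, count in zip(labels, counts)]

# Shortened sample data, taken from the first five bins in the question.
BinMid = np.arange(5) + 1
nDist = np.array([7083, 73485, 659204, 3511238, 10859771], dtype=np.int64)

histData = to_hist_data(BinMid, nDist)
print(histData[0], type(histData[0][0]), type(histData[0][1]))
```

The resulting list can be passed to graph.graph( "Test" , histData ) exactly as in the question.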