Tags: python, arrays, numpy, structured-array

Grabbing the name of a numpy structured array column when passed as a parameter to a function


When working with a structured numpy array, I know I can pull up a column with something like array["col"]. From my understanding, this works because of numpy.dtype.names and the nature of structured arrays. However, when I pass a column of that array into a function, numpy.dtype.name does not give me the column name; I get something like "strxxx" instead. If it helps, this particular array was created using numpy.genfromtxt() from a csv file. For example, the code below

import warnings

def empty_check(param):
    for ind in param:
        # Ignore future warning for comparison just for this instance
        with warnings.catch_warnings():
            warnings.simplefilter(action='ignore', category=FutureWarning)
            if ind == '':
                print("Please fill out required data in", param.dtype.name, "column")

results in:

Please fill out required data in str544 column

Would anyone have an idea as to why str544 comes up instead of the name of the column?

Setup: Python 3.7.0, numpy 1.15.4, IDE: PyCharm 2018.3


Solution

  • You're making a small mistake here. You're assuming that dtype.name and dtype.names return the same thing. They do not.

    From the docs:

    name: A bit-width name for this data-type.

    names: Ordered list of field names, or None if there are no fields.


    So what you're seeing is the bit-width name for the data type of your field. If you called dtype.names on it instead, you would get None, because the single field you passed in is a plain array with no fields of its own.


    As far as I know, there is not a way to infer the name of a field without access to the structured array that contains it. You will most likely have to pass the field name as a parameter to your empty_check function.
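
To make the difference concrete, here is a minimal sketch of the suggestion above. It assumes a small hand-built structured array in place of the genfromtxt() result; the field names "name" and "city", the sample rows, and the two-argument empty_check signature are all hypothetical and not taken from the question.

```python
import numpy as np

# Hypothetical structured array standing in for the genfromtxt() result.
data = np.array(
    [("Alice", "Paris"), ("Bob", "")],
    dtype=[("name", "U10"), ("city", "U10")],
)

# The field names live on the parent array's dtype.
print(data.dtype.names)

# A single extracted column is a plain string array with no fields,
# so names is None and name is only the bit-width name of the string type.
column = data["city"]
print(column.dtype.names)
print(column.dtype.name)


def empty_check(array, field):
    """Report empty entries in array[field]; return their row indices."""
    empty_rows = np.flatnonzero(array[field] == "")
    if empty_rows.size:
        print("Please fill out required data in", field, "column")
    return empty_rows


# Pass the field name along with the array so the message can use it.
for field in data.dtype.names:
    empty_check(data, field)
```

Because the function receives both the parent array and the field name, it never has to recover the name from the column itself. Looping over data.dtype.names is one way to check every column in turn.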