Given a StructType schema, I want to be able to define:

    def foo(row: schema):
        return row.field

and have PyCharm recognize the fields of row. However, PyCharm does not recognize 'schema' as a type, and inlining the schema in the annotation makes no difference. (I'm using Python 3.8.)
This isn't technically correct, because row is really a Row, but it works just fine thanks to duck typing. Define a dataclass whose fields mirror the schema and use it as the type hint:

    from dataclasses import dataclass

    @dataclass
    class HintedRow:
        x: int
        y: str

    def foo(row: HintedRow):
        return row.x

    df.rdd.map(foo)

You can now use HintedRow in unit tests as shown here. PySpark will not complain at runtime either, because the Row it passes in exposes the same attributes as HintedRow:

    test_row = HintedRow(x=1, y='bar')
    assert foo(test_row) == 1
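
To see why the duck typing holds up, here is a minimal sketch that runs without Spark. `FakeRow` is a hypothetical stand-in I'm using in place of `pyspark.sql.Row` (which is likewise a tuple with named fields); it is not part of PySpark. The point is only that `foo` touches nothing but the `x` attribute, so any object that has one will do:

```python
from collections import namedtuple
from dataclasses import dataclass


@dataclass
class HintedRow:
    x: int
    y: str


def foo(row: HintedRow) -> int:
    # Only attribute access is used, so any object with an `x` attribute works.
    return row.x


# Hypothetical stand-in for pyspark.sql.Row: a tuple with named fields.
FakeRow = namedtuple("FakeRow", ["x", "y"])

hinted_result = foo(HintedRow(x=1, y="bar"))
duck_result = foo(FakeRow(x=1, y="bar"))
```

Python does not enforce type hints at runtime, so both calls go through. A static checker would likely flag the `FakeRow` call, which is the "not technically correct" part of this approach.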