I hear this term used a lot, something about a build OOMing or running out of memory; what does that mean? I'm speaking from the context of running a dataset build in Transforms Python or Transforms SQL.
OOM == OutOfMemory
In a Transform, this happens when the JVM tries to allocate more heap memory than it has available or can free through GC (garbage collection). This can happen, for example, on your driver when materializing huge query plans, or on an executor when dealing with very large array columns or other data that can't fit into memory.
It can also happen when JVM and non-JVM processes together use more non-heap memory than is available from the combination of unused heap memory and memoryOverhead (this applies to both the driver and executors).
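To make the non-heap case concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch: the function names are made up for illustration, and the 10% factor and 384 MiB minimum are the open-source Spark defaults for memoryOverhead, which I am assuming here; your platform's profiles may set different values.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def default_memory_overhead(heap_bytes: int,
                            factor: float = 0.10,
                            minimum: int = 384 * MIB) -> int:
    """Assumed default overhead: a fraction of the heap, with a floor."""
    return max(int(heap_bytes * factor), minimum)


def container_limit(heap_bytes: int, overhead_bytes: int = None) -> int:
    """Total memory the driver/executor container may use: heap + overhead."""
    if overhead_bytes is None:
        overhead_bytes = default_memory_overhead(heap_bytes)
    return heap_bytes + overhead_bytes


def exceeds_non_heap_budget(heap_bytes: int,
                            heap_used: int,
                            non_heap_used: int,
                            overhead_bytes: int = None) -> bool:
    """True when non-heap usage (JVM off-heap + Python, etc.) is larger
    than the unused heap plus memoryOverhead, i.e. the second OOM case."""
    if overhead_bytes is None:
        overhead_bytes = default_memory_overhead(heap_bytes)
    available_non_heap = (heap_bytes - heap_used) + overhead_bytes
    return non_heap_used > available_non_heap


if __name__ == "__main__":
    # A 4 GiB heap with 3 GiB in use, while Python workers hold 2 GiB.
    print(container_limit(4 * GIB))
    print(exceeds_non_heap_budget(4 * GIB, 3 * GIB, 2 * GIB))
```

This is a simplification (a real JVM does not hand unused heap back that cleanly), but it shows why a job can OOM even when the heap itself is not full.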
Possible causes are not having enough main JVM memory, not having enough memoryOverhead, or using too much Python memory (e.g. on your driver when using .collect(), or on an executor when using UDFs).
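For the .collect() case specifically, a common way to keep driver memory down is to avoid pulling the whole dataset onto the driver at once. The helpers below are a minimal sketch: `preview_rows` and `iter_rows` are hypothetical names of my own, and `df` is assumed to be a PySpark DataFrame (they only rely on its `limit`, `collect` and `toLocalIterator` methods).

```python
from typing import Any, Iterator, List


def preview_rows(df: Any, n: int = 20) -> List[Any]:
    """Bring at most n rows to the driver instead of the full dataset.

    df.collect() on its own materializes every row in driver memory;
    limiting first keeps the result small.
    """
    return df.limit(n).collect()


def iter_rows(df: Any) -> Iterator[Any]:
    """Stream rows to the driver partition by partition.

    With toLocalIterator() the driver needs roughly as much memory as
    the largest partition rather than the whole dataset.
    """
    return df.toLocalIterator()
```

If you only need an aggregate or a filtered subset, doing that work in Spark before collecting is usually better still, since the driver then never sees the raw rows.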