scala, apache-spark, hadoop, persist

Where is my sparkDF.persist(DISK_ONLY) data stored?


I want to understand more about how Spark persists data to disk when it runs on a Hadoop cluster.

When I persist a DataFrame with the DISK_ONLY strategy, where is my data stored (path/folder...)? And where do I specify this location?
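
For context, here is a minimal sketch of the kind of persist call I mean (the DataFrame is just a placeholder; my real data is much larger):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.storage.StorageLevel

    val spark = SparkSession.builder().appName("persist-demo").getOrCreate()

    // Placeholder DataFrame standing in for my real, much larger data.
    val df = spark.range(0L, 1000000L).toDF("id")

    // Keep the data only on disk (no in-memory copy) and materialize it.
    // The question is which directories these on-disk blocks end up in.
    df.persist(StorageLevel.DISK_ONLY)
    df.count()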


Solution

  • To sum it up for my YARN environment:

    With the guidance of @stefanobaghino I was able to go one step further in the code, to the point where the YARN config is loaded:

    // in the Spark source: on YARN the executors' local dirs are read from the LOCAL_DIRS env var
    val localDirs = Option(conf.getenv("LOCAL_DIRS")).getOrElse("")
    

    which in turn is populated from the yarn.nodemanager.local-dirs option in yarn-default.xml.
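
    As a sanity check, something along these lines prints the directories the executors actually resolve (just a sketch, assuming a running SparkContext sc on YARN; the fallbacks cover other deployments):

    // Print the scratch directories each executor sees. On YARN these come
    // from the LOCAL_DIRS env var that the NodeManager sets per container;
    // elsewhere Spark falls back to SPARK_LOCAL_DIRS, spark.local.dir or
    // java.io.tmpdir.
    sc.parallelize(1 to 100, sc.defaultParallelism)
      .mapPartitions { _ =>
        Iterator(Option(System.getenv("LOCAL_DIRS"))
          .orElse(Option(System.getenv("SPARK_LOCAL_DIRS")))
          .getOrElse(System.getProperty("java.io.tmpdir")))
      }
      .collect()
      .distinct
      .foreach(println)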

    The background for my question is that my Spark job sometimes got killed by the error

    2018-01-23 16:57:35,229 WARN org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection: Directory /data/1/yarn/local error, used space above threshold of 98.5%, removing from list of valid directories
    

    and I'd like to understand whether this disk is also used for my persisted data while the job is running (which is actually a massive amount of data).

    So it turns out that this is exactly the folder the data goes to when it is persisted with a DISK strategy.
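
    For completeness: outside of YARN (e.g. local or standalone mode) the corresponding knob is spark.local.dir, which YARN ignores in favour of the LOCAL_DIRS directories above. A sketch with a hypothetical path:

    import org.apache.spark.sql.SparkSession

    // Non-YARN deployments: spark.local.dir controls where DISK_ONLY blocks
    // and other scratch files go. On YARN it is overridden by LOCAL_DIRS,
    // i.e. by yarn.nodemanager.local-dirs.
    val spark = SparkSession.builder()
      .appName("local-dir-demo")
      .config("spark.local.dir", "/mnt/big-disk/spark-scratch") // hypothetical path
      .getOrCreate()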

    Thanks a lot for all your helpful guidance on this problem!