xml, apache-spark, pyspark, databricks

Using spark.read.format("xml").option("recursiveFileLookup", "true") for xml files in subdirectories


I would like to recursively load all XML files into my DataFrame from a directory that contains additional subdirectories. With other file formats (txt, parquet, ...) the code below seems to work.

df = (
    spark.read
    .format("xml")
    .option("rowTag", "library")
    .option("wholetext", "true")
    .option("recursiveFileLookup","true")
    .option("pathGlobFilter", "*.xml")
    .load("path/to/dir")
)

I have tested this code with different file formats, but xml files are not found.
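
For illustration, a check like the one below (using the text source, which does find files) lists which files the recursive lookup actually picks up; the path is the same placeholder as above.

from pyspark.sql.functions import input_file_name

# Same lookup options, but with the built-in text source, just to see which
# files are discovered by the recursive lookup.
check_df = (
    spark.read
    .format("text")
    .option("recursiveFileLookup", "true")
    .option("pathGlobFilter", "*.xml")
    .load("path/to/dir")
)
check_df.select(input_file_name()).distinct().show(truncate=False)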


Solution

  • Looks like I found an answer right away, although it may not be entirely satisfactory. Basically, I have found two possibilities:

    1. Change the format from "xml" to "text".
      This allows recursive reading, but unfortunately the content of the xml files is not read in as nicely as before (see the sketch after this list).
    df = (
        spark.read
        .format("text")
        .option("rowTag", "library")
        .option("wholetext", "true")
        .option("recursiveFileLookup","true")
        .option("pathGlobFilter", "*.xml")
        .load("path/to/dir")
    )
    
    2. Append a glob pattern to the path in the load() call.
    df = (
        spark.read
        .format("xml")
        .option("rowTag", "library")
        .option("wholetext", "true")
        .load("path/to/dir/**/*.xml")
    )
    

    This makes the two options "recursiveFileLookup" and "pathGlobFilter" unnecessary:
    ** in the glob pattern searches recursively through all subdirectories, and
    *.xml matches files ending in .xml.
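
    A rough sketch of how the two results differ (paths are the placeholders from above; this is only for illustration): the text-based read (option 1) puts each file's raw content into a single "value" column that still has to be parsed afterwards, while the glob-based xml read (option 2) gives a schema inferred from the "library" row tag.

    # Option 1: text source + wholetext -> one row per file, a single "value"
    # column holding the raw XML string.
    raw_df = (
        spark.read
        .format("text")
        .option("wholetext", "true")
        .option("recursiveFileLookup", "true")
        .option("pathGlobFilter", "*.xml")
        .load("path/to/dir")
    )
    raw_df.printSchema()
    # root
    #  |-- value: string (nullable = true)

    # Option 2: xml source + glob path -> columns inferred from the children
    # of each <library> element.
    xml_df = (
        spark.read
        .format("xml")
        .option("rowTag", "library")
        .load("path/to/dir/**/*.xml")
    )
    xml_df.printSchema()   # schema derived from the XML structure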