Can I have a CMake script that processes multiple files across a nested directory structure with the same custom program?


I have a directory tree with files of a known type (say .in) placed in various folders within that tree. The task is to call a known custom program (say process_it) once for each file of that type, passing the file as a parameter:

process_it a/b/c/file.in

and it will produce a/b/c/file.out.

We are already using cmake for a normal C++ build of this project. Is it possible, and how, to extend CMakeLists.txt and get this kind of processing?

For now I have simply written a Python script that iterates over the folders and calls the processing program. This script can be added to CMakeLists.txt as a custom command, and it works well, producing files in the ${OUTPUT} folder when that folder is initially missing:

add_custom_command(OUTPUT ${OUTPUT}
   COMMAND python ARGS scan_current_dir_and_do_naive_processing.py
   WORKING_DIRECTORY ${CMAKE_SOURCE_DIR} )

However, the script takes a very long time to run, because it is single-threaded and does not reuse existing output when the input has not changed.


Solution

  • You can collect the list of files to process using file(GLOB_RECURSE):

    file(GLOB_RECURSE
        # Result variable
        list_in_files
        # Return paths relative to the current source directory
        RELATIVE ${CMAKE_CURRENT_SOURCE_DIR}
        # Collect all ".in" files under the current source directory
        "${CMAKE_CURRENT_SOURCE_DIR}/*.in"
    )
    

    then for every file found create a custom command with appropriate OUTPUT and DEPENDS:

    foreach(in_file ${list_in_files})
        # Obtain the relative path of the output file: replace the ".in" suffix with ".out".
        string(REGEX REPLACE "\\.in$" ".out" out_file "${in_file}")
        # Create a custom command which generates ${out_file} from ${in_file}
        add_custom_command(OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/${out_file}
            COMMAND <process-file> <...arguments...>
            DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/${in_file}
        )
    endforeach()
    

    Note that a custom command runs only if some target "consumes" its OUTPUT file. (Otherwise, why would you need these files at all?) If you just want the files to be generated, you can create the following custom target:

    add_custom_target(generate_all_files
        # build this target by default (on `make all` or `make`)
        ALL
        DEPENDS ${list_out_files}
    )
    

    This assumes that you add the following line to the foreach loop above:

     # add the absolute path of the output file to the list
     list(APPEND list_out_files "${CMAKE_CURRENT_BINARY_DIR}/${out_file}")
    
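
    Putting the pieces together, a minimal sketch could look like the fragment below. It rests on assumptions that are not confirmed above: process_it is found on PATH, it takes the input path as its only argument, and it writes file.out next to file.in, as described in the question. Because of that last assumption, the sketch declares each OUTPUT in the source tree rather than in ${CMAKE_CURRENT_BINARY_DIR}; if your tool can write elsewhere, adjust the paths accordingly.

    ```cmake
    # Collect every ".in" file below the current source directory,
    # as paths relative to that directory (e.g. "a/b/c/file.in").
    file(GLOB_RECURSE list_in_files
        RELATIVE "${CMAKE_CURRENT_SOURCE_DIR}"
        "${CMAKE_CURRENT_SOURCE_DIR}/*.in"
    )

    set(list_out_files "")
    foreach(in_file IN LISTS list_in_files)
        # "a/b/c/file.in" -> "a/b/c/file.out"
        string(REGEX REPLACE "\\.in$" ".out" out_file "${in_file}")
        # ASSUMPTION: the tool writes its output next to the input file.
        set(out_path "${CMAKE_CURRENT_SOURCE_DIR}/${out_file}")

        add_custom_command(
            OUTPUT "${out_path}"
            # ASSUMPTION: process_it is on PATH and takes one argument.
            COMMAND process_it "${in_file}"
            WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}"
            # Re-run only when this particular input file changes.
            DEPENDS "${CMAKE_CURRENT_SOURCE_DIR}/${in_file}"
            COMMENT "Processing ${in_file}"
            VERBATIM
        )
        list(APPEND list_out_files "${out_path}")
    endforeach()

    # Target that "consumes" all outputs, so they are built by default.
    add_custom_target(generate_all_files ALL
        DEPENDS ${list_out_files}
    )
    ```

    Since every file gets its own custom command, the build tool regenerates only the outputs whose inputs changed, and independent commands can run concurrently, for example with make -j8 or cmake --build <build-dir> --parallel (the latter requires CMake 3.12 or newer).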


    If you want CMake to automatically discover new .in files (and notice deleted ones), you can add the CONFIGURE_DEPENDS option to file(GLOB_RECURSE). It is supported since CMake 3.12.

    Note that this feature may not work reliably with all generators, and it costs time because the directory tree is rescanned on every build.
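
    As a sketch, the glob call with this option might look as follows; it reuses the list_in_files variable from above and assumes CMake 3.12 or newer:

    ```cmake
    # Re-check the glob at build time; if the set of matching files
    # changed, CMake re-runs the configure step before building.
    file(GLOB_RECURSE list_in_files
        RELATIVE "${CMAKE_CURRENT_SOURCE_DIR}"
        CONFIGURE_DEPENDS
        "${CMAKE_CURRENT_SOURCE_DIR}/*.in"
    )
    ```

    Without CONFIGURE_DEPENDS, the file list is fixed at configure time, so a newly added .in file is picked up only after CMake is re-run manually.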