Looping through folder to generate and save graph for each file


Below is one of my CSV files, generated with dataex as a reproducible example:

clear

input str32 eventname str10 scrapedate float(average thpercentile v5 v6)
"EventName" "2015-12-15"  136.9255     83.2 104.875    148.75
"EventName" "2015-12-16"  130.4555    78.55      99    138.22
"EventName" "2015-12-17" 123.66705     72.7   90.25     131.2
"EventName" "2015-12-18" 116.45757   64.855   78.55     119.5
"EventName" "2015-12-19" 108.63446 60.56333    72.7 119.07333
"EventName" "2015-12-20"  94.97125    55.15   69.77    112.48
end

Thanks to the answer to my previous question, I was able to adapt my code to loop through the directory "I:\Games CSVs\" and read in each csv file using:

insheet using "`file'", comma clear

I then create a new variable holding the date in the format I want and generate the line graph.

Here is my code:

local foodir "I:\Games CSVs\" 
local files : dir "`foodir'" files "*.csv"
cd "`foodir'"
local i = 0
foreach file of local files {
    local ++i
    insheet using "`file'", comma clear
    generate ScrapeDate = daily(scrapedate, "YMD")
    format ScrapeDate %tdYY-NN-DD 
    line average thpercentile v5 v6 ScrapeDate, name("graph`i'", replace) ///
    scale(*.7) ///
    local filename = substr("`file'", 1, strlen("`file'")-4) ///
    title(filename) ///
    ytitle("Price in US$") ///
    legend(size(small)) 
}

The problematic line is the following:

local filename = substr("`file'", 1, strlen("`file'")-4)
title(filename)

I also tried:

generate filename = substr("`file'", 1, strlen("`file'")-4)
title(filename)

I have the following problems:

  1. The graph is being titled filename.csv, and I want the suffix removed.
  2. I also cannot figure out how to save the graphs on disk.

All the graphs (I have 52 of them) flash on screen one after another. Ideally, I would save all of them in a folder (I:\Graphs), each named after its source file, so that filename.csv produces filename.png, filename.jpeg, or any other format that I will be able to open.

I have read the documentation. I believe graph save mygraph replaces the graph if it exists, and since I'm looping through the directory, each time, the graph is going to be replaced since I'm not changing the name of the graph.


Solution

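  • You need to define the local macro filename on its own line before the line command, not among its options, and then refer to its contents as `filename'; title(filename) simply uses the literal text filename as the title. You also need to use the saving() and nodraw options in the line command: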

    local foodir "I:\Games CSVs\"
    local foosavedir "I:\Graphs\"
    
    local files : dir "`foodir'" files "*.csv"
    cd "`foodir'"
    
    local i = 0
    foreach file of local files {
        local ++i
        insheet using "`file'", comma clear
        generate ScrapeDate = daily(scrapedate, "YMD")
        format ScrapeDate %tdYY-NN-DD 
        local filename = substr("`file'", 1, strrpos("`file'", ".")-1)
        line average thpercentile v5 v6 ScrapeDate, name("graph`i'", replace) ///
        saving("`foosavedir'`filename'.gph", replace) nodraw scale(*.7) title("`filename'") /// 
        ytitle("Price in US$") legend(size(small)) 
    }
    

    Note that this saves the files in Stata's native gph format, which is generally a good choice because you can reopen and edit the graphs later if necessary.

    If you also want them in a different graphics format such as png, then you need to export each of them after the line command:

    graph export "`foosavedir'`filename'.png", name("graph`i'")
    

    In this case, do not specify the nodraw option in the line command.

    An existing file is overwritten only if the replace option is specified and the name of the saved or exported file matches it; without replace, Stata stops with an error when the target file already exists. If the filenames are unique, no graph will overwrite another.
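
    As a quick check of the macro handling described above, here is a minimal sketch that can be run on its own, without any data in memory. The file name EventName.csv and the macro name filename2 are hypothetical stand-ins used only for illustration, and strrpos() assumes a reasonably recent Stata (version 14 or later, to my knowledge):

    * Hypothetical file name, used only to test the string handling
    local file "EventName.csv"

    * Keep everything before the last "." in the name
    local filename = substr("`file'", 1, strrpos("`file'", ".") - 1)

    * The strlen() form from the question gives the same result
    * whenever the suffix is exactly four characters, such as ".csv"
    local filename2 = substr("`file'", 1, strlen("`file'") - 4)

    * Show the contents of both macros
    display "`filename'"
    display "`filename2'"

    Both display commands should print EventName. The strrpos() form is arguably the safer of the two, since it does not depend on the length of the suffix.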