Tags: java, visual-studio-code, apache-flink, embedded-resource

Java: how to refer to a file in the project


I created a simple Java project in VS Code; here is the project structure. [Image: project structure]

I want to refer to wordcount.txt in my code, but it fails to find the file.

Here is my test code:

import org.apache.flink.api.common.functions.FilterFunction;
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.operators.DataSource;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public class BatchJob {

    public static void main(String[] args) throws Exception {
        // set up the batch execution environment
        final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        //URL url = BatchJob.class.getClassLoader().getResource("resources/wordcount.txt");
        // this is the call that fails to find the file
        DataSource<String> dataset = env.readTextFile("wordcount.txt");
        DataSet<Tuple2<String, Integer>> result = dataset.flatMap(new Tokenizer())
                .filter(new FilterFunction<Tuple2<String, Integer>>() {
                    @Override
                    public boolean filter(Tuple2<String, Integer> arg0) {
                        return arg0.f1 > 0;
                    }
                })
                .groupBy(0)
                .sum(1);
        result.print();
    }

    // splits each line on commas and emits (token, 1) for every non-empty token
    public static class Tokenizer implements FlatMapFunction<String, Tuple2<String, Integer>> {
        @Override
        public void flatMap(String value, Collector<Tuple2<String, Integer>> out) throws Exception {
            String[] tokens = value.toLowerCase().split(",");
            for (String token : tokens) {
                if (!token.isEmpty()) {
                    out.collect(new Tuple2<String, Integer>(token, 1));
                }
            }
        }
    }

}


Solution

  • Application resources will become embedded resources by the time of deployment, so it is wise to start accessing them as if they already were. An embedded resource must be accessed by URL rather than as a file. See the info page for the embedded-resource tag for how to form the URL.

    Thanks for your help; it works with getResource. Here is the working code:

    URL url = BatchJob.class.getClassLoader().getResource("wordcount.txt"); 
    DataSource<String> dataset = env.readTextFile(
        URLDecoder.decode(url.getFile(),"UTF-8") );
    

    Unfortunately, this fix goes wrong at url.getFile().

    Harking back to the key part of the original advice, "... must be accessed by URL rather than file": this is not a suggestion or merely good programming practice, it is a requirement. Once the app is built, the resource will be inside a Jar and will no longer be a File, so it will not be accessible as one. The code might work when run from the IDE (where the URL points to something that is a file), but it will fail for the built Jar.
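
One way to follow this advice is to skip file paths altogether and read the resource as a stream. The block below is a minimal sketch of that idea, not code from the original answer. The class name ResourceLines and both method names are hypothetical, and it assumes wordcount.txt sits at the root of the classpath and is UTF-8 encoded.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the name ResourceLines is not part of the original post.
class ResourceLines {

    // Reads every line from the stream as UTF-8 text, then closes the stream.
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // Looks the resource up on the classpath, so the same call works when the
    // resource is a plain file (IDE run) and when it is an entry inside a Jar.
    // Assumes 'owner' is an application class, so its class loader is not null.
    static List<String> readResourceLines(Class<?> owner, String name) throws IOException {
        InputStream in = owner.getClassLoader().getResourceAsStream(name);
        if (in == null) {
            throw new FileNotFoundException("Resource not found on classpath: " + name);
        }
        return readLines(in);
    }
}
```

In the job from the question, the resulting list could then be passed to env.fromCollection(ResourceLines.readResourceLines(BatchJob.class, "wordcount.txt")) instead of calling env.readTextFile. That assumes the file is small enough to hold in memory; for larger inputs, copying the stream to a temporary file and passing that path to readTextFile is another option.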