I created a simple Java project in VS Code, and here is the project structure.
I want to read wordcount.txt from my code, but the program fails to find the file.
Here is my test code:
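    import org.apache.flink.api.common.functions.FilterFunction;
    import org.apache.flink.api.common.functions.FlatMapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.operators.DataSource;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.util.Collector;

    public class BatchJob {
        public static void main(String[] args) throws Exception {
            // set up the batch execution environment
            final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            //URL url = BatchJob.class.getClassLoader().getResource("resources/wordcount.txt");
            // this relative path is the call that fails to find the file
            DataSource<String> dataset = env.readTextFile("wordcount.txt");
            DataSet<Tuple2<String, Integer>> result = dataset.flatMap(new Tokenizer())
                .filter(new FilterFunction<Tuple2<String, Integer>>() {
                    @Override
                    public boolean filter(Tuple2<String, Integer> arg0) {
                        return arg0.f1 > 0;
                    }
                })
                .groupBy(0) // group by the token
                .sum(1);    // sum the counts
            result.print();
        }

        // Splits each line on commas and emits a (token, 1) pair per non-empty token.
        public static class Tokenizer implements FlatMapFunction<String, Tuple2<String, Integer>> {
            @Override
            public void flatMap(String value, Collector<Tuple2<String, Integer>> out) throws Exception {
                String[] tokens = value.toLowerCase().split(",");
                for (String token : tokens) {
                    if (!token.isEmpty()) {
                        out.collect(new Tuple2<String, Integer>(token, 1));
                    }
                }
            }
        }
    }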
Application resources will become embedded resources by the time of deployment, so it is wise to start accessing them as if they already were. An embedded resource must be accessed by URL rather than as a file. See the info page for embedded resource for how to form the URL.
Thanks for your help, it works with getResource. Here is the working code:
    URL url = BatchJob.class.getClassLoader().getResource("wordcount.txt");
    DataSource<String> dataset = env.readTextFile(URLDecoder.decode(url.getFile(), "UTF-8"));
Unfortunately, this fix goes wrong at url.getFile().
Harking back to the bold part of the original advice, "... must be accessed by URL rather than file": this is not a suggestion or merely a good programming practice, it is a requirement. The thing is, once the app is built, the resource will be inside a Jar and will no longer be a File, so it will not be accessible as one. So while it might work when running from the IDE (when the URL points to something that is a file), it will fail for the built Jar.
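For illustration, here is a minimal sketch of reading the resource through its URL rather than through a File path. It assumes wordcount.txt sits at the root of the classpath; the class name ResourceLines and its two helper methods are hypothetical and not part of the original code. Only JDK calls are used.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class, for illustration only.
public class ResourceLines {

    // Reads every line from a stream as UTF-8 text and closes the stream.
    public static List<String> readLines(InputStream in) throws IOException {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.toList());
        }
    }

    // Locates the resource by URL and reads it via the URL's stream.
    // No File is involved, so the same code path applies whether the
    // resource lives in a directory (IDE run) or inside a Jar.
    public static List<String> readResource(String name) throws IOException {
        URL url = ResourceLines.class.getClassLoader().getResource(name);
        if (url == null) {
            throw new FileNotFoundException("Resource not on classpath: " + name);
        }
        return readLines(url.openStream());
    }

    public static void main(String[] args) throws IOException {
        // Assumed resource name; pass a different one as the first argument.
        String name = args.length > 0 ? args[0] : "wordcount.txt";
        try {
            readResource(name).forEach(System.out::println);
        } catch (FileNotFoundException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

In the Flink job, the resulting list could then be passed to env.fromCollection(lines) in place of env.readTextFile(...), assuming the file is non-empty and small enough to hold in memory.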