How to use Java JSch library to read remote file line by line?


I am trying to use Java to read a file line by line, which is very simple (there are multiple solutions for this on stackoverflow.com), but the caveat here is that the file is located on a remote server, and it is not possible to get a local copy (it is a massive collection of millions of Amazon reviews in a single .txt file).

JSch comes with two example classes that copy files to and from remote hosts, namely ScpTo and ScpFrom. I am interested in reading the file from the remote host line by line; ScpFrom would try to copy the whole thing into a local file, which would take ages.

Here is a link to ScpFrom: http://www.jcraft.com/jsch/examples/ScpFrom.java.html

I would try to cargo cult the code there and then modify it to read a remote file line by line rather than write to a local file, but most of the code is Greek to me once the author declares a byte array and starts reading bytes from the remote file. I'll admit this is something I have almost no understanding of; BufferedReader provides a much higher level interface. Essentially I want to do this: How to read a large text file line by line using Java?

except using a BufferedReader that can also read remote files line by line, given the host name and user credentials (password, etc.), i.e. a RemoteBufferedReader?

This is the test code I've written; how do I read the remote file line by line using JSch?

import com.jcraft.jsch.Channel;
import com.jcraft.jsch.ChannelExec;
import com.jcraft.jsch.JSch;
import com.jcraft.jsch.JSchException;
import com.jcraft.jsch.Session;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.OutputStream;

public class test2
 {
    static String user = "myusername";
    // host name only; the user name is passed separately to getSession
    static String host = "remotehost";
    static String password = "mypasswd";
    static String rfile = "/path/to/remote/file/on/remote/host";
    public static void main(String[] args) throws FileNotFoundException, IOException, JSchException
    {
        JSch jsch=new JSch();
        Session session=jsch.getSession(user, host, 22);
        session.setPassword(password);
        session.connect();
        // exec 'scp -f rfile' remotely
        String command="scp -f "+rfile;
        Channel channel=session.openChannel("exec");
        ((ChannelExec)channel).setCommand(command);

        // get I/O streams for remote scp
        OutputStream out=channel.getOutputStream();
        channel.connect();
        //no idea what to do next

    }
 }

Solution

  • To manipulate files over SSH, you're better off using SFTP than scp or a plain exec channel. JSch has built-in support for SFTP. Once you've connected a session, do this to open and connect an sftp channel:

    ChannelSftp sftp = (ChannelSftp) session.openChannel("sftp");
    sftp.connect();
    

    Once the sftp channel is connected, ChannelSftp.get(String) gives you the remote file's content as an InputStream, so the file is streamed rather than copied to local disk. You can wrap that in a Reader if you need to read line by line:

    InputStream stream = sftp.get("/some/file");
    try {
        BufferedReader br = new BufferedReader(new InputStreamReader(stream));
        // read from br
    } finally {
        stream.close();
    }
    

    Using try-with-resources syntax (Java 7 and later), your code might look more like this:

    try (InputStream is = sftp.get("/some/file");
         InputStreamReader isr = new InputStreamReader(is);
         BufferedReader br = new BufferedReader(isr)) {
        // read from br
    }
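
To fill in the "read from br" step, here is a minimal sketch of the line-by-line loop. The class name RemoteLineReader and the method forEachLine are hypothetical names chosen for illustration, and UTF-8 is an assumed encoding for the remote file. So that the sketch runs without a server, main feeds it an in-memory stream; with JSch you would pass the stream returned by sftp.get(...) instead.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical helper, not part of JSch.
class RemoteLineReader {

    /**
     * Reads the stream one line at a time, hands each line to the handler,
     * and returns the number of lines read. Only the current line (plus the
     * reader's buffer) is held in memory. The stream is closed on return.
     */
    static long forEachLine(InputStream in, Consumer<String> handler) throws IOException {
        long count = 0;
        // Assumption: the file is UTF-8; substitute the actual encoding if it differs.
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                handler.accept(line);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the remote stream. With a connected ChannelSftp named
        // sftp, this would be: InputStream in = sftp.get("/some/file");
        InputStream in = new ByteArrayInputStream(
                "first\nsecond\nthird\n".getBytes(StandardCharsets.UTF_8));
        long n = forEachLine(in, System.out::println);
        System.out.println(n + " lines read");
    }
}
```

Because the helper only depends on InputStream, the same loop works for a local file or any other stream. When you are done with the remote file, disconnect the sftp channel and the session.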