java character-encoding ftp mainframe apache-commons-net

FTP from Java application to mainframe dataset - issue with open/close brackets

SOLUTION: On top of setting the charset/codepage to cp037, as specified in Bruce Martin's answer, I also had to change a setting in my FTP logic (using Apache Commons Net): I had to set the file type to EBCDIC. Here's some sample code to show what I was doing.

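import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.net.PrintCommandListener;
import org.apache.commons.net.ftp.FTP;
import org.apache.commons.net.ftp.FTPClient;
import org.apache.commons.net.ftp.FTPReply;

public FTPClient openFTPConnection() {
    String server = [server];
    int port = [port];
    int reply;
    FTPClient ftpClient = new FTPClient();
    // Echo every FTP command and reply to stdout for debugging
    ftpClient.addProtocolCommandListener(new PrintCommandListener(new PrintWriter(System.out)));
    try {
        ftpClient.connect(server, port);
        reply = ftpClient.getReplyCode();
        System.out.println(reply);
        if (!FTPReply.isPositiveCompletion(reply)) {
            ftpClient.disconnect();
            throw new Exception("Exception in connecting to FTP Server");
        }
        ftpClient.login(user, pass);
        //Previously, this was set to FTP.ASCII_FILE_TYPE
        ftpClient.setFileType(FTP.EBCDIC_FILE_TYPE);
        ftpClient.enterLocalPassiveMode();
    } catch (Exception e) {
        System.out.println("Error: " + e.getMessage());
        e.printStackTrace();
    }
    return ftpClient;
}

public List ftpStoreAuthData(FTPClient ftpClient) {
    try {
        String mainframeDataSet = [dataset];
        InputStream stream = ftpClient.retrieveFileStream(mainframeDataSet);
        logger.trace("Retrieving mainframe data set...");
        //Previously, the charset was set to "utf-8"
        BufferedReader reader = new BufferedReader(new InputStreamReader(stream, "cp037"));
        logger.trace("Data set Retrieved!");
        String datasetText;
        while ((datasetText = reader.readLine()) != null) {
            //do things with dataset records
        }
        reader.close();
        // Commons Net expects this call once a stream-based transfer has finished
        ftpClient.completePendingCommand();
        return new ArrayList<>();
    } catch (IOException e) {
        // The original snippet was cut off here; minimal error handling added
        System.out.println("Error: " + e.getMessage());
        e.printStackTrace();
        return new ArrayList<>();
    }
}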
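
For the upload direction, the same idea should apply: with the file type set to EBCDIC, the records have to leave the application already encoded as cp037. Below is a minimal sketch of that encoding step, not my actual upload code. The names DatasetRecordEncoder and writeRecord are made up for illustration, and a ByteArrayOutputStream stands in for the stream that FTPClient.storeFileStream would return. How records are delimited on upload depends on the server settings and is not covered here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;

public class DatasetRecordEncoder {
    // Hypothetical helper: writes one record to the stream as cp037 (US EBCDIC) bytes.
    static void writeRecord(OutputStream out, String record) throws IOException {
        Writer writer = new OutputStreamWriter(out, "cp037");
        writer.write(record);
        // Flush rather than close, so the caller keeps control of the stream
        writer.flush();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: in real code this would be the OutputStream
        // returned by ftpClient.storeFileStream(...)
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeRecord(out, "A[1]");
        // Print the encoded bytes as hex to compare with HEX ON on the mainframe
        for (byte b : out.toByteArray()) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

With cp037 the brackets should come out as BA and BB, which are the values the dataset requires.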

I'm developing a REST application in Java. One of my REST endpoints makes an FTP request to a mainframe and retrieves a dataset that has '[' and ']' characters in it.

I've tried both Apache Commons Net and Spring's SftpSessionFactory. In both cases, the open/close bracket characters are returned to my application as �s. As I read each record from the retrieved dataset, an if condition checks whether that record contains '[' or ']' characters, and obviously it fails.

As a sort of hack, I changed the if condition so that it checks for �s instead. This works as a quick fix, but it becomes a problem when I FTP the dataset back to the mainframe. I overwrite the �s with '[' and ']', but when the file hits the mainframe the open bracket appears as 'Ý' and the close bracket appears as '¨'. I used the "HEX ON" command on the mainframe to see the difference between the required open/close brackets and the ones I'm sending.

The open bracket I'm sending (Ý) has hex value AD. The open bracket the dataset requires has hex value BA.

The closing bracket I'm sending (¨) has hex value BD. The closing bracket the dataset requires has hex value BB.

How can I write the brackets to match the hex value required by the dataset? Also how can I make it so the brackets don't appear as �s when I retrieve the dataset? I heard the issue has something to do with different EBCDIC codepage conversions which is great, but I'm not sure how to resolve that.

Let me know if you need to see my FTP code. I can post it if necessary.

Solution

You can use the appropriate EBCDIC character set. IBM037 / CP037 is US EBCDIC. There are a lot of others, e.g. cp273 is used in Germany / Austria.

You can do

    Reader r = new InputStreamReader(in, "cp037");
    String s = new String(bytes, "cp037");
       // or for 3 spaces (the EBCDIC space is hex 40, not decimal 40)
    String spaces = new String(new byte[] {0x40, 0x40, 0x40}, "cp037");

to read an EBCDIC stream or convert an array of bytes to text.
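
To check whether a code page matches the dataset, it can help to print the bytes Java produces for the brackets and compare them with what HEX ON shows. The following is a minimal sketch, assuming a JDK that ships the extended charsets (cp037 is not one of the charsets every Java runtime is required to support). The class name BracketBytes and the helper toHex are made up for illustration.

```java
import java.nio.charset.Charset;

public class BracketBytes {
    // Hypothetical helper: formats bytes as upper-case hex pairs separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset cp037 = Charset.forName("cp037");

        // Encode the brackets as US EBCDIC and show the resulting hex values
        System.out.println(toHex("[]".getBytes(cp037))); // prints "BA BB"

        // Decode the bytes from the question (AD and BD) under cp037
        // to see which characters they represent in that code page
        byte[] sent = {(byte) 0xAD, (byte) 0xBD};
        String decoded = new String(sent, cp037);
        for (char c : decoded.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c); // U+00DD (Ý), then U+00A8 (¨)
        }
    }
}
```

BA and BB are the values the dataset in the question requires, while AD and BD decode under cp037 to 'Ý' and '¨', which matches what showed up on the mainframe.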