Greek characters show as question marks or boxes when exporting from an old Access database


I am trying to export data from Access databases and save it as ASCII text files. I'm using the UCanAccess JDBC driver, and when I write Greek characters to the files, they show up as question marks or boxes.

[Two screenshots: boxes shown instead of Greek characters]

Here is the code I use to connect to the database:

Properties props = new Properties();
props.put("charSet","UTF-8");

conn = DriverManager.getConnection("jdbc:ucanaccess://" + path, props);

dbConnectionData = conn.getMetaData();
dbResultSet = dbConnectionData.getTables(null, null, "%", null);

s =  conn.createStatement();

int i = 0;

while(dbResultSet.next()){
    numberOfTables++; 
}

dbResultSet = dbConnectionData.getTables(null, null, "%", null);
fileNames = new String[numberOfTables];

while(dbResultSet.next()){
    fileNames[i] = dbResultSet.getString(3);
    i++;
}

And here is the code where I execute the queries to extract the data I need:

DatabaseTable dbTab;

File file;
File dir;

FileOutputStream out;
Writer writer;

ResultSet rsSet;
ResultSetMetaData metaData;

int space;
int numberOfDash = -1;
int dashInLine = 0;
int startOfFile = 0;
int endOfFile;

int colsMax[];

try{
    int i = 0;

    dir = new File(path + "\\" + "Ascii-" + db.name);

    if(!dir.exists()){
        dir.mkdir();
    }

    file = new File(path + "\\" + dir.getName() + "\\" + db.fileNames[fileNumber] + ".txt");

    out = new FileOutputStream(file);
    writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);

    dbTab = new DatabaseTable(db);

    for(i = 0; i < fileNumber; i++){
        rsSet = db.s.executeQuery("SELECT * FROM [" + db.fileNames[i] + "]");
        metaData = rsSet.getMetaData();
        startOfFile += metaData.getColumnCount();
    }

    rsSet = db.s.executeQuery("SELECT * FROM [" + db.fileNames[fileNumber] + "]");
    metaData = rsSet.getMetaData();

    endOfFile = startOfFile + metaData.getColumnCount();

Then I use a writer to write the data to the files.

When I import databases created with Access 2007 or higher, it works perfectly. I have this problem only with version 2003 and below. Does anyone have any ideas?


Solution

  • The troublesome files were apparently created using a very old version of Access that saved text fields using the current Windows code page. (Newer versions of Access save text fields as Unicode.) In your case the code page used was Windows-1253.

    To read the Greek text from those old files you can tell UCanAccess to decode text fields as Windows-1253 by creating a new class Windows1253Opener.java in your project ...

    package com.example.ucanaccessdemo;
    
    import java.io.File;
    import java.io.IOException;
    import java.nio.charset.Charset;
    
    import com.healthmarketscience.jackcess.Database;
    import com.healthmarketscience.jackcess.DatabaseBuilder;
    
    import net.ucanaccess.jdbc.JackcessOpenerInterface;
    
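    public class Windows1253Opener implements JackcessOpenerInterface {
        // UCanAccess calls this instead of its default opener.
        // Note: pwd is not used here, so this assumes an unprotected file.
        @Override
        public Database open(File fl, String pwd) throws IOException {
            DatabaseBuilder dbd = new DatabaseBuilder(fl);
            // do not force a flush to disk after every write
            dbd.setAutoSync(false);
            // decode text fields as Windows-1253 (Greek)
            dbd.setCharset(Charset.forName("cp1253"));
            return dbd.open();
        }
    }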
    

    ... and append ;jackcessOpener=com.example.ucanaccessdemo.Windows1253Opener to your connection URL, e.g.,

    String connStr = "jdbc:ucanaccess://" + dbFileSpec 
            + ";jackcessOpener=com.example.ucanaccessdemo.Windows1253Opener";
    Connection conn = DriverManager.getConnection(connStr);
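
To see why the charset choice matters, here is a minimal stdlib-only sketch (it does not touch UCanAccess or a real database). The class name `CodePageDemo` and the sample bytes are made up for illustration: the bytes are the word "Αθήνα" as a single-byte Windows-1253 encoding would store it. It assumes a JDK that ships the `windows-1253` charset, which a standard desktop JDK normally does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CodePageDemo {
    // Hypothetical sample data: "Αθήνα" encoded as Windows-1253 bytes.
    static final byte[] RAW = {
        (byte) 0xC1, (byte) 0xE8, (byte) 0xDE, (byte) 0xED, (byte) 0xE1
    };

    // Decode raw bytes using the named charset.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    // Encode text as US-ASCII; unmappable characters become '?'.
    static String toAscii(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file itself encoding-independent.
        String expected = "\u0391\u03B8\u03AE\u03BD\u03B1";

        // Correct code page: the Greek word comes back intact.
        String greek = decode(RAW, "windows-1253");
        System.out.println(greek.equals(expected));

        // Wrong code page: same bytes, but Latin-1 letters instead of Greek.
        String wrong = decode(RAW, "ISO-8859-1");
        System.out.println(wrong.equals(expected));

        // Writing Greek text as plain ASCII loses every character.
        System.out.println(toAscii(greek));
    }
}
```

The last line also suggests why the output file should be written as UTF-8 (as the question's `OutputStreamWriter` already does) rather than as plain ASCII: once the text is decoded correctly, the target encoding still has to be able to represent Greek.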