
OLEDB custom encoding


My goal

I need to create a .dbf file in a format specified by a client: dBase III .dbf with Kamenický encoding, using Integer, Character (of various lengths) and Double column types.

Problem

I have pretty much everything working, with one hurdle left: the encoding refuses to work, despite a conversion table I wrote that swaps the original characters for their Kamenický-compatible equivalents. The output file ends up with, for example, a byte of hex FF for a character that was hex A0 in the imported string.
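For illustration, a conversion table of the kind described above might be sketched as follows. This is not the table actually used here: the class name `KamenickySketch` and the method `Encode` are hypothetical, the map is deliberately partial, and the byte values are an assumption taken from the commonly published Kamenický layout, so they should be checked against the client's specification.

```csharp
using System;
using System.Collections.Generic;

static class KamenickySketch
{
    // Partial map from Unicode characters to Kamenický bytes.
    // Assumption: values follow the commonly published Kamenický table.
    private static readonly Dictionary<char, byte> Map = new Dictionary<char, byte>
    {
        { '\u00E9', 0x82 }, // é
        { '\u010D', 0x87 }, // č
        { '\u011B', 0x88 }, // ě
        { '\u017E', 0x91 }, // ž
        { '\u016F', 0x96 }, // ů
        { '\u00FD', 0x98 }, // ý
        { '\u00E1', 0xA0 }, // á
        { '\u00ED', 0xA1 }, // í
        { '\u00FA', 0xA3 }, // ú
        { '\u0161', 0xA8 }, // š
        { '\u0159', 0xA9 }  // ř
    };

    // ASCII passes through unchanged, mapped characters use the table,
    // and anything else is replaced with '?'.
    public static byte[] Encode(string text)
    {
        var bytes = new byte[text.Length];
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            byte mapped;
            if (c < 0x80)
                bytes[i] = (byte)c;
            else if (Map.TryGetValue(c, out mapped))
                bytes[i] = mapped;
            else
                bytes[i] = (byte)'?';
        }
        return bytes;
    }
}
```

Note that this yields raw bytes. A .NET string that merely carries the character with code A0 is still text as far as the provider is concerned, and may be translated again on its way into the file.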

If you're going to downvote (-1) the question, I would greatly appreciate a comment explaining why. Even a "You don't understand the issue sufficiently" would be of great help, as I would know where to continue my research (in that case, at the very basics).

I have kind of, sort of solved the problem (see comments), but the solution is flawed and doesn't actually answer the given question at all.

Question

How can I persuade the Jet.OLEDB provider to not mess with the encoding?

What I have tried

  • Using the FoxPro provider, which actually worked fine, except for the little detail that my client's software was unable to read the resulting .dbf file.

  • Inserting the data without using OleDbParameter (so the input wouldn't get properly escaped), to no avail.

  • Setting a couple of different encodings via CharacterSet = xxx, plus some other connection string modifications that I don't quite recall right now. Every time, an input of A0 came out as FF.

  • I have found an AutoTranslate property over here, but as far as I can tell it only works for SQL connections, as Jet.OLEDB keeps giving me an ISAM error.

  • I have tried toying around with globalization settings, which didn't help much.
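One possible reading of the A0 to FF symptom, offered as an assumption rather than anything confirmed in this thread: if the converted value reaches the provider as a .NET string containing the character U+00A0 (non-breaking space), the provider may translate it as text into an OEM code page, and in code pages 437 and 852 the non-breaking space sits at byte FF. The sketch below only demonstrates that mapping. The class `NbspCheck` and method `EncodeNbsp` are hypothetical names, and the snippet says nothing about what Jet does internally.

```csharp
using System;
using System.Text;

static class NbspCheck
{
    // Encodes U+00A0 in the given OEM code page and returns the resulting byte.
    public static byte EncodeNbsp(int oemCodePage)
    {
        // Required on .NET Core / .NET 5+ to make legacy code pages available.
        // On .NET Framework these encodings are built in and this line can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(oemCodePage).GetBytes("\u00A0")[0];
    }
}
```

If this is what is happening, the character codes placed in the string by a conversion table are being interpreted as Unicode code points rather than as final file bytes.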

Some code

Connection string:

"Provider=Microsoft.Jet.OLEDB.4.0; Data Source={0};Extended Properties=\"dBase III;\"";

Then data gets inserted using OleDbCommand, with the individual cells being filled with OleDbParameter class and constructed insert string. Might be quite useless, but here's the code:

private void insertRows(T[] data, OleDbConnection connection)
{
    using (OleDbCommand command = connection.CreateCommand())
    {
        for (int i = 0; i < data.Length; i++)
        {
            constructParams(data[i], i, command);
            command.CommandText = constructInsert(i, _fileName);

            command.ExecuteNonQuery();
        }
    }
}

private void constructParams(T data, int index, OleDbCommand command)
{
    command.Parameters.Clear();
    foreach (PropertyInfo prop in _props)
    {
        if(_cols.ContainsKey(prop.Name))
        {
            command.Parameters.Add(new OleDbParameter("@" + prop.Name + index, prop.GetValue(data)));
        }
    }
}

private string constructInsert(int dataNum, string tableName)
{
    string insert = "INSERT INTO [" + tableName + "] (";
    foreach(string key in _cols.Keys)
    {
        insert += "[" + key + "],";
    }
    insert = insert.Remove(insert.Length - 1);
    insert += ") VALUES";

    insert += " (";
    foreach (string key in _cols.Keys)
    {
        insert += "@" + key + dataNum + ",";
    }

    insert = insert.Remove(insert.Length - 1);
    insert += ");";

    return insert;
}
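A side note on the insert construction above: the .NET OLE DB provider binds parameters by position, not by name, so the order in which parameters are added has to match the order of the columns in the statement. In the code above the parameters follow `_props` while the column list follows `_cols.Keys`, so it may be worth checking that both enumerate in the same order. A minimal sketch of a builder that uses `?` placeholders, with the hypothetical names `InsertBuilder` and `Build`:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class InsertBuilder
{
    // Builds "INSERT INTO [table] ([a],[b]) VALUES (?,?);" for the given columns.
    // Parameters must then be added to the command in this same column order.
    public static string Build(string tableName, IReadOnlyList<string> columns)
    {
        string columnList = string.Join(",", columns.Select(c => "[" + c + "]"));
        string placeholders = string.Join(",", columns.Select(_ => "?"));
        return "INSERT INTO [" + tableName + "] (" + columnList + ") VALUES (" + placeholders + ");";
    }
}
```

With a single ordered column list driving both the statement and the parameter loop, the command text also stays the same for every row, so it only needs to be set once.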

Solution

  • Here is something quick I tried that appears to work with special Unicode characters and properly recognizes codepage 895, which you are trying to work with. It does use the VFP OleDb provider from Microsoft. However, I do it in 4 parts.

    1. Create the table with explicit codepage reference which typically results in VFP readable format.

    2. To retain the backward compatibility with dBASE that you mentioned you need, use the COPY TO command to convert the VFP table header to the older format that dBASE should recognize.

    3. Simple insert into the dBASE version of the table (also codepage 895)

    4. Retrieve all records and look at the Unicode results.

    // Connection to your data path, but explicitly referencing codepage 895 in connection string
    string connectionString = @"Provider=VFPOLEDB.1;Data Source=c:\YourDataPath\SomeSubFolder;CODEPAGE=895;";
    string ans = "";
    
    using (OleDbConnection connection = new OleDbConnection(connectionString))
    {
       // create table syntax for a free table (not part of a database) that is codepage 895.
       string cmd = "create table MyTest1 free codepage=895 ( oneColumn c(10) )";
       OleDbCommand command = new OleDbCommand(cmd, connection);
    
       connection.Open();
       command.ExecuteNonQuery();
    
       // Now, create a script to use the MyTest1 table and create MyTest2 which 
       // SHOULD BE recognized in dBASE format.
       string vfpScript = @"use MyTest1
                Copy to MyTest2 type foxplus";
    
    
       command.CommandType = CommandType.StoredProcedure;
       command.CommandText = "ExecScript";
       command.Parameters.Add("myScript", OleDbType.Char).Value = vfpScript;
       command.ExecuteNonQuery();
    
       // Simple insert into the 2nd instance of the table    
       command = new OleDbCommand("insert into Mytest2 ( oneColumn ) values ( ? )", connection);
       command.Parameters.AddWithValue("parmForColumn", "çšjír_Þ‰");
       command.ExecuteNonQuery();
    
       // Now, get the data back.
       command = new OleDbCommand("select * from Mytest2", connection);
       OleDbDataAdapter da = new OleDbDataAdapter(command);
       DataTable oTbl = new DataTable();
       da.Fill(oTbl);
    
       if (oTbl.Rows.Count != 0)
       {
          // we should have one row, so get the string from the column
          // and it SHOULD look like the Unicode sample I inserted above.
          ans = (string)oTbl.Rows[0]["oneColumn"];
       }
    }
    

    Obviously you have code to cycle through all columns and set applicable parameters, so I leave that up to you.