Search code examples
ms-wordfieldmailmerge

Is it safe to change the data source format to OLE DB database file when using a htm/html as the datasource to remove the limitation of 62 fields?


I have a mail merge datasource which is in htm/html format and it contains 70 fields. Since there is a limitation of 62 fields for such datasources(Reference). Is it safe to change the datasource type to OLE DB database file in the confirm datasource dialog when selecting the datasource?

enter image description here


Solution

  • When you choose the "All web pages" type (and this is the default type in the case of an HTML file), you are in essence choosing a Word internal file converter to retrieve your data. The reason you end up with the concatenated columns is because

    1. The internal converter is not primarily designed to "read data sources". It's there to convert a document in HTML format into a document in Word format.
    2. Your HTML file contains a table in HTML format, so naturally, the converter tries to convert that into a Word table
    3. However, Word tables can only have 63 columns, whereas HTML tables can have more, so the converter has to deal with that somehow. In this case, it concatenates the column data so column 63 ends up containing all the remaining data in the row.
    4. Once the document is converted, Word uses the converted document as the data source. It's really no different from the situation where it uses a Word document as the data source.

    If your HTML file actually contained (say) 1 paragraph of 70 comma-delimited values for each row of data, rather than an HTML table row with td cells, Word would end up treating the data as 70 separate columns (but it would also probably ask for the column delimiter every time you used the file, and you would have to ensure that commas in the data were correctly quoted.

    In general, when you choose the "OLE DB Database Files" option, Word either knows of an OLE DB Provider type that can read the specified type of file, or it won't be able to read the file. In this case, what it tries to do is read the file using the Jet OLE DB provider (or in recent versions of Word, the ACE OLE DB provider).

    The Jet/ACE providers are one of the mechanisms used to read Access .mdb/.accdb data, but these providers can read a number of formats such as Excel workbook data and plain text file data, using a number of what Jet/ACE calls "Installable ISAMs (IISAMs).

    Since there is an IISAM for HTML format data, Word will try to get the data using that IISAM.

    In that case, as long as the IISAM can actually read the HTML (it may not be able to read more modern versions of HTML very well) it works much more like the case where Word gets data from Excel. For example, if your HTML file contained two tables, you may get to choose which table to read, cf. an Excel workbook with multiple worksheets and perhaps named ranges.

    Jet/ACE IISAMs generally do not support more than 255 columns. 70 shhould be fine. However, you may need to verify what the HTML IISAM does about

    1. Columns with mixed data types (for example where some rows have numbers in them and others have text). When the Excel IISAM finds such data in the first 8 rows (by default) it tries to choose a format - somtimes that can mean that cells with text are read as if they contained "0". FWIW I do not think the HTML IISAM does that, but I would check anyway.
    2. Columns with large amount of text, particularly if there is more than one such column. The IISAM is quite likely to truncate such columns to 255 characters or even less.
    3. Columns with non-ANSI data (non-ANSI Unicode text e.g. Arabic, Hindi or Chinese text.

    Other than delimited text files which will let you go over theat 255 limit if they are read by the internal converter, the only data source I know that will let WOrd see thousands of columns is SQL Server. Other servers with OLE DB providers such as MySQL might allow that too. If you have to use a very large number of columns, be aware that you may not see all the available field names in the relevant dropdowns in WOrd, but you should be able to insert the MERGEFIELD codes in manually in the usual way.