
Converting legacy SGM to XML


I have a task at work that involves converting legacy SGM files into XML. The SGM files were created using five separate high-level tags; the new DTD has about 8-12 top-level tags that the old ones need to be mapped to. The two DTDs share some tags, but they differ enough that manually copying and pasting data from one to the other doesn't make sense.

In addition, there is linking information that needs to be translated from the legacy format into the newer one. I am currently leaning towards the following high-level approach.

  1. Convert SGM to well-formed XML
  2. Read in the XML files and create a mapping template from the existing file types to the new file type. Each file will have metadata fields, with defaults used for the majority of the values. This mapping file will drive the final conversion into the target XML. I want a tool here that is fairly bulletproof for data entry and uses drop-down lists for the metadata choices, so I am looking at creating a desktop application.
  3. Do a conversion of the XML using XSLT
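
To make step 2 concrete, here is a minimal sketch of a mapping-driven conversion in Python using only the standard library. Everything in it is an assumption for illustration: the tag names in `TAG_MAP`, the metadata fields in `METADATA_DEFAULTS`, and the `convert` function are hypothetical and not taken from either DTD. It shows the idea of defaults plus per-file overrides, not a full replacement for the XSLT step.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping template: legacy top-level tag -> new top-level tag.
TAG_MAP = {"procedure": "task", "descript": "concept"}

# Hypothetical metadata defaults; per-file overrides replace individual values.
METADATA_DEFAULTS = {"language": "en", "status": "draft"}


def convert(legacy_xml, overrides=None):
    """Rename the root element per TAG_MAP and prepend a <metadata> block."""
    root = ET.fromstring(legacy_xml)
    if root.tag not in TAG_MAP:
        raise ValueError(f"no mapping for top-level tag <{root.tag}>")
    root.tag = TAG_MAP[root.tag]

    # Merge defaults with the overrides supplied for this file.
    merged = {**METADATA_DEFAULTS, **(overrides or {})}
    metadata = ET.Element("metadata")
    for name, value in merged.items():
        ET.SubElement(metadata, name).text = value
    root.insert(0, metadata)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = "<procedure><title>Replace filter</title></procedure>"
    print(convert(sample, {"status": "approved"}))
```

In a real project the same mapping and metadata values could instead be written out to a small XML file by the data-entry tool and passed to the XSLT stylesheet in step 3, which keeps the transformation logic in one place.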

I am curious whether anyone else has experience with this type of conversion. Does this high-level approach seem viable? Are there other ways to view this problem? Because of my own time limitations, I am looking at hiring another developer to do the coding for this project. I have used XSLT, but I do not have recent experience with desktop application development, and I don't know which languages provide a good interface to XSLT and a good front-end experience for the end user.

Appreciate whatever help and comments people can provide. Will be glad to provide further clarification on what I am looking for.


Solution

  • That is precisely how I would do it. You are really doing three different things here: converting from SGML to XML, converting from XML to a different schema, and mixing in new data. Doing it in three separate steps is the right approach.