c# edi x12

Efficient EDI parsing into database in C#


Over three years ago we were asked to develop an EDI solution for a client as a matter of urgency.

They wanted full IP ownership and control of the solution, and did not want to use free open-source solutions, pay large sums of money for the likes of BizTalk, or pay recurring fees to a VAN.

We did some research at the time and didn't find much information on EDI formats, parsing and so on, so our two-man development team jumped straight in and developed a solution in C#/ASP.NET. Because of the low number of EDI message transactions that would be taking place (100 or so a day), we adopted a RegEx process for parsing, validation and inserting into the database. This was done via a separate C# app that was scheduled to run every few minutes, connect to the client's various providers over FTP, AS2 and EBMX comms, download inbound data, and upload any outbound EDI messages.

We then developed a web front-end that gave the client's staff full access to the data, with various revenue reports and the ability to control the data. It also allowed some of the client's agents to log in, interact with the data and initiate invoice transactions.

The client now wants more EDI work done for another avenue of their business. This time, however, the number of EDI message transactions would leap into the thousands. Our development team's concern is the use of RegEx. I read recently that using RegEx for EDI parsing has huge overheads and should be avoided.

The only reason we adopted it in the first place was inexperience: we did not know what was best to use. That said, RegEx has made managing EDI message templates a breeze, including validation within the templates. The client has added several more providers to their books, and we were able to add the new message templates (with custom alterations) in minutes.

After much more research recently we found that most solutions parse EDI files into XML. Is there a reason for this? Is this just to adopt a more common format and/or avoid database access? Is it quicker to just parse XML over the flat file EDI messages?

We want the data elements from the EDI file to be in the database. Would we just parse the XML file instead? Isn't this just another step of processing that could be avoided?

I apologise for the generic nature of my question but I am having a hard time locating the answers.

Many thanks for your time.

NOTE: Our development team only uses Microsoft products, so please take this into account when giving feedback.


Solution

  • I suspect most developers who chose to write their own solution wrote their own classes for EDI-to-XML conversion because their end-point integration supported XML (or they couldn't write to the DB directly, or wanted to use XSLT to show the end user the data nicely). I've written parsers that "translated" into CSV and flat-file formats, because that's what we needed to import. I've also written parsers to dump directly into a database. Parsing into XML is usually a necessary intermediate step for some, a "middleware" kind of approach. If you don't need the intermediary step, then why take it? If you can write it out to the DB, by all means do so. You also didn't mention which documents you are handling, and I'm assuming you've built the FA (functional acknowledgment) process into your application. RegEx should continue to work for you, and there are many ways to skin this cat.

    With that said, my usual disclaimer applies. You are reinventing the wheel here. By miles. I understand your client's wishes, and I'm glad you were able to meet the need. Frankly, I probably would have fired the client :) Since you only use Microsoft products, you've kind of hamstrung yourself. Looking around SO, BizTalk is discussed more than other packages. There's probably a reason for this, and as you found out, it's also very expensive. I'm a big fan of Liaison Delta - it runs on Windows, uses Microsoft Foundation Classes at its core and allows you to translate any-to-any at a fraction of BizTalk's cost. Seems to me maintaining drag/drop "maps" is easier than maintaining thousands of lines of code, but hey, policy is policy :) Hope this helps.
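To illustrate the "skip the XML step" point, here is a minimal sketch of splitting an X12 interchange on its delimiters and flattening it into one row per data element, which is a shape that can be written straight to a table. This is not the asker's RegEx approach and not any product's API: the `EdiElement` and `X12Splitter` names are hypothetical, and the sketch assumes X12 input that starts with a well-formed ISA segment. It reads the element separator from the fourth character and takes the segment terminator as the character following ISA16. It does no validation and ignores component and repetition separators.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row shape: one record per data element, ready for a database insert.
public sealed class EdiElement
{
    public int SegmentIndex { get; }   // position of the segment in the file
    public string SegmentId { get; }   // e.g. "ISA", "ST", "BIG"
    public int Position { get; }       // 1-based element position within the segment
    public string Value { get; }

    public EdiElement(int segmentIndex, string segmentId, int position, string value)
    {
        SegmentIndex = segmentIndex;
        SegmentId = segmentId;
        Position = position;
        Value = value;
    }
}

public static class X12Splitter
{
    public static List<EdiElement> Parse(string interchange)
    {
        if (interchange == null || interchange.Length < 5 ||
            !interchange.StartsWith("ISA", StringComparison.Ordinal))
        {
            throw new FormatException("Interchange must start with an ISA segment.");
        }

        // The character right after "ISA" is the element separator.
        char elementSep = interchange[3];

        // Walk to the 16th element separator: ISA16 (one character) follows it,
        // and the segment terminator comes immediately after ISA16.
        int pos = -1;
        for (int i = 0; i < 16; i++)
        {
            pos = interchange.IndexOf(elementSep, pos + 1);
            if (pos < 0)
                throw new FormatException("ISA segment has fewer than 16 elements.");
        }
        if (pos + 2 >= interchange.Length)
            throw new FormatException("ISA segment is truncated.");
        char segmentTerm = interchange[pos + 2];

        var rows = new List<EdiElement>();
        string[] segments = interchange.Split(
            new[] { segmentTerm }, StringSplitOptions.RemoveEmptyEntries);

        for (int s = 0; s < segments.Length; s++)
        {
            // Files often carry a line break after each terminator; drop it.
            string segment = segments[s].Trim('\r', '\n');
            if (segment.Length == 0) continue;

            string[] elements = segment.Split(elementSep);
            // elements[0] is the segment ID; the data elements start at index 1.
            for (int e = 1; e < elements.Length; e++)
                rows.Add(new EdiElement(s, elements[0], e, elements[e]));
        }
        return rows;
    }
}
```

Calling `X12Splitter.Parse(text)` on a downloaded file returns the flat list. From there the rows could be mapped to your own tables and written with parameterized commands or `SqlBulkCopy`, whichever suits the volume. Plain string splitting like this tends to be cheap per message, but whether it beats your existing RegEx templates at thousands of messages a day is something to measure rather than assume.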