algorithm search fault-tolerance fuzzy-comparison

What algorithms can I use to produce simple human-readable fault-tolerant strings?

Humans make mistakes when they are asked to provide a unique generated ID that identifies some entity. For example:

Order A has id ABC1234
Order B has id BCD1235

They can make typos, or they can provide a string such as A123, B123, 1 2 3, or "Order id B 12/3". For an automated system it is then a challenge to identify the original ID. My question is: are there any known algorithms or techniques to generate an ID that is

- unique and human-readable (not SHA or MD5)
- fault tolerant, so that the original ID can still be decoded from a subset of its characters
- case insensitive

A visual example of fault tolerance is the QR code: when part of a QR code is damaged, the message can still be read.

The goal is to avoid tools and algorithms such as Elasticsearch or Levenshtein distance, to increase the chance of decoding the original ID even when the customer makes a typo, and to reduce the chance that a mistyped ID is taken for some other valid ID.

Solution

Aside from error correction, the interesting part of this question is whether there are codes designed specifically for humans to read and transcribe.

RFC 3548 makes some allowance for easily confused characters in its base32 coding: the alphabet leaves out the digits 0 and 1 because they look like the letters O and L. Human-oriented base-32 encoding has some variations on that concept.
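As a rough illustration of the idea, here is a minimal Python sketch loosely modelled on Crockford-style base32. The function names (`normalize`, `encode`, `decode`) are made up for this example, and the five extra check symbols and the modulo-37 check are assumptions of the sketch, not something taken from the RFC. It only detects a typo; it does not correct one the way a QR code does.

```python
# 32 symbols with no I, L, O or U, so look-alike letters never appear in an ID.
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
# The check symbol needs 37 values (37 is prime), hence five extra symbols.
CHECK_ALPHABET = ALPHABET + "*~$=U"
# What a human most likely meant when typing a look-alike letter.
LOOKALIKES = str.maketrans("OIL", "011")


def normalize(raw: str) -> str:
    """Upper-case, map look-alikes to digits, drop spaces and punctuation."""
    cleaned = raw.upper().translate(LOOKALIKES)
    return "".join(ch for ch in cleaned if ch in CHECK_ALPHABET)


def encode(number: int) -> str:
    """Return the base32 digits of `number` followed by one check symbol."""
    if number < 0:
        raise ValueError("number must be non-negative")
    digits = ""
    n = number
    while True:
        n, remainder = divmod(n, 32)
        digits = ALPHABET[remainder] + digits
        if n == 0:
            break
    return digits + CHECK_ALPHABET[number % 37]


def decode(raw: str):
    """Return the number, or None when the check symbol does not match."""
    text = normalize(raw)
    if len(text) < 2:
        return None
    body, check = text[:-1], text[-1]
    if any(ch not in ALPHABET for ch in body):
        return None
    number = 0
    for ch in body:
        number = number * 32 + ALPHABET.index(ch)
    return number if CHECK_ALPHABET[number % 37] == check else None
```

With this sketch, `encode(1234)` gives `16JD`, and `decode("i6-jd")` still returns 1234 because the lower-case letters, the `i` typed for `1`, and the dash are all normalized away. A single wrong character (`16JE`) or two swapped neighbours (`61JD`) fail the check and return `None` instead of silently matching another order.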

For codes that are read aloud, the PGP Word List gives each byte value a distinct word. It helps to protect against errors by having two lists of 256 words, one used for bytes at even positions and the other for bytes at odd positions, so a missing word or two swapped words can be detected.
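The alternating-list trick can be sketched in a few lines of Python. The two four-word lists below are small stand-ins for illustration only (the real lists have 256 words each), so this toy version handles only byte values 0 to 3, and the function names are hypothetical.

```python
# Stand-in lists for illustration; the real PGP lists have 256 words each.
EVEN_WORDS = ["aardvark", "absurd", "accrue", "acme"]
ODD_WORDS = ["adroitness", "adviser", "aftermath", "aggregate"]


def to_words(data: bytes) -> list[str]:
    """Pick each word from the list that matches the byte's position parity."""
    words = []
    for position, value in enumerate(data):
        table = EVEN_WORDS if position % 2 == 0 else ODD_WORDS
        words.append(table[value])  # IndexError for values the toy lists lack
    return words


def from_words(words: list[str]) -> bytes:
    """Decode words, rejecting any word that sits at the wrong parity."""
    out = bytearray()
    for position, word in enumerate(words):
        table = EVEN_WORDS if position % 2 == 0 else ODD_WORDS
        if word not in table:
            raise ValueError(f"{word!r} is not valid at position {position}")
        out.append(table.index(word))
    return bytes(out)
```

If the listener drops a word or swaps two neighbours, a word from the even list lands at an odd position (or the other way round), and `from_words` raises instead of returning a wrong ID.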

There was a discussion here on SO about human-friendly, pronounceable IDs that might be interesting. Work on pronounceable passwords (like Diceware) is somewhat related.

Metafilter also had a discussion about codes that are easy for humans to copy, which provides a few more interesting references.