maththeoryreed-solomon# Can you please explain Reed Solomon encoding part's Identity matrix?

I am working on a object storage project where I need to understand Reed Solomon error correction algorithm,
I have gone through this Doc as a starter and also some thesis paper.

1. content.sakai.rutgers.edu

2. theseus.fi

but I can't seem to understand the lower part of the identity matrix (red box), where it is coming from. How this calculation is done?

Can anyone please explain this.

Solution

The encoding matrix is a 6 x 4 Vandermonde matrix using the evaluation points {0 1 2 3 4 5} modified so that the upper 4 x 4 portion of the matrix is the identity matrix. To create the matrix, a 6 x 4 Vandermonde matrix is generated (where matrix[r][c] = pow(r,c) ), then multiplied with the inverse of the upper 4 x 4 portion of the matrix to produce the encoding matrix. This is the equivalent of "systematic encoding" with Reed Solomon's "original view" as mentioned in the Wikipedia article you linked to above, which is different than Reed Solomon's "BCH view", which links 1. and 2. refer to. The Wikipedia's example systematic encoding matrix is a transposed version of the encoding matrix used in the question.

https://en.wikipedia.org/wiki/Vandermonde_matrix

The code to generate the encoding matrix is near the bottom of this github source file:

```
Vandermonde inverse upper encoding
matrix part of matrix matrix
01 00 00 00 01 00 00 00
01 01 01 01 01 00 00 00 00 01 00 00
01 02 04 08 x 7b 01 8e f4 = 00 00 01 00
01 03 05 0f 00 7a f4 8e 00 00 00 01
01 04 10 40 7a 7a 7a 7a 1b 1c 12 14
01 05 11 55 1c 1b 14 12
```

Decoding is performed in two steps. First, any missing rows of data are recovered, then any missing rows of parity are regenerated using the now recovered data.

For 2 missing rows of data, the 2 corresponding rows of the encoding matrix are removed, and the 4 x 4 sub-matrix inverted and used the multiply the 4 non-missing rows of data and parity to recover the 2 missing rows of data. If there is only 1 missing row of data, then a second data row is chosen as if it was missing, in order to generate the inverted matrix. The actual regeneration of data only requires the corresponding rows of the inverted matrix to be used for a matrix multiply.

Once the data is recovered, then any missing parity rows are regenerated from the now recovered data, using the corresponding rows of the encoding matrix.

Based on the data shown, the math is based on finite field GF(2^8) modulo 0x11D. For example, encoding using the last row of the encoding matrix with the last column of the data matrix is (0x1c·0x44)+(0x1b·0x48)+(0x14·0x4c)+(0x12·0x50) = 0x25 (using finite field math).

The question example doesn't make it clear that the 6 x 4 encode matrix can encode a 4 x n matrix, where n is the number of bytes per row. Example where n == 8:

```
encode data encoded data
01 00 00 00 31 32 33 34 35 36 37 38
00 01 00 00 31 32 33 34 35 36 37 38 41 42 43 44 45 46 47 48
00 00 01 00 x 41 42 43 44 45 46 47 48 = 49 4a 4b 4c 4d 4e 4f 50
00 00 00 01 49 4a 4b 4c 4d 4e 4f 50 51 52 53 54 55 56 57 58
1b 1c 12 14 51 52 53 54 55 56 57 58 e8 eb ea ed ec ef ee dc
1c 1b 14 12 f5 f6 f7 f0 f1 f2 f3 a1
assume rows 0 and 4 are erasures and deleted from the matrices:
00 01 00 00 41 42 43 44 45 46 47 48
00 00 01 00 49 4a 4b 4c 4d 4e 4f 50
00 00 00 01 51 52 53 54 55 56 57 58
1c 1b 14 12 f5 f6 f7 f0 f1 f2 f3 a1
invert encode sub-matrix:
inverse encode identity
46 68 8f a0 00 01 00 00 01 00 00 00
01 00 00 00 x 00 00 01 00 = 00 01 00 00
00 01 00 00 00 00 00 01 00 00 01 00
00 00 01 00 1c 1b 14 12 00 00 00 01
reconstruct data using sub-matrices:
inverse encoded data restored data
46 68 8f a0 41 42 43 44 45 46 47 48 31 32 33 34 35 36 37 38
01 00 00 00 x 49 4a 4b 4c 4d 4e 4f 50 = 41 42 43 44 45 46 47 48
00 01 00 00 51 52 53 54 55 56 57 58 49 4a 4b 4c 4d 4e 4f 50
00 00 01 00 f5 f6 f7 f0 f1 f2 f3 a1 51 52 53 54 55 56 57 58
The actual process only uses the rows of the matrices that correspond
to the erased rows that need to be reconstructed.
First data is reconstructed:
sub-inverse encoded data reconstructed data
41 42 43 44 45 46 47 48
46 68 8f a0 x 49 4a 4b 4c 4d 4e 4f 50 = 31 32 33 34 35 36 37 38
51 52 53 54 55 56 57 58
f5 f6 f7 f0 f1 f2 f3 a1
Once data is reconstructed, reconstruct parity
sub-encode data reconstruted parity
31 32 33 34 35 36 37 38
1b 1c 12 14 x 41 42 43 44 45 46 47 48 = e8 eb ea ed ec ef ee dc
49 4a 4b 4c 4d 4e 4f 50
51 52 53 54 55 56 57 58
```

One alternate to this approach is to use BCH view Reed Solomon. For an odd number of parities, such as RS(20,17) (3 parities), 2 matrix multiplies and one XOR is needed for encoding, and for a single erasure only XOR is needed. For e>1 erasures, a (e-1) by n matrix multiply is done, followed by an XOR. For an even number of parities, if an XOR is used as part of the encode, then a e by n matrix multiply is needed to correct, or the e x n matrix used for encode, allowing one XOR for the correction.

Another alternative is Raid 6, where "syndromes" are appended to the matrix of data, but do not form a proper code word. One of the syndrome rows, called P, is just XOR, the other row called Q, is successive powers of 2 in GF(2^8). For a 3 parity Raid 6, the third row is called R, is successive powers of 4 in GF(2^8). Unlike standard BCH view, if a Q or R row is lost, it has to be recalculated (as opposed to using XOR to correct it). By using a diagonal pattern, if 1 of n disks fail, then only 1/n of the Q's and R's need to be regenerated when the disk is replaced.

http://alamos.math.arizona.edu/RTG16/ECC/raid6.pdf

Note that this pdf file's alternate method uses the same finite field as the method above, GF(2^8) mod 0x11D, which may make it easier to compare the methods.

- Algorithm to locate local maxima
- unsigned long long int pow
- Simple statistics - Java packages for calculating mean, standard deviation, etc
- Arbitrary-precision arithmetic Explanation
- How do I calculate r-squared using Python and Numpy?
- Print all ways to sum n integers so that they total a given sum.
- Prove that (p → q) → ((r ∨ p) → (r ∨ q)) is a tautology without using truth table
- How to implement this Desmos roll function in Unity
- How to compute the centroid of a mesh with triangular faces?
- Different results for XIRR between Excel and ExcelFinancialFunctions 3.2.0
- Efficiently getting all divisors of a given number
- In JavaScript, why does zero divided by zero return NaN, but any other divided by zero return Infinity?
- Float number division with 3 decimal places
- Python fit line to high dimensional points and sample between them
- How can I compute a dual norm?
- How to pick between 2 numbers
- Fastest implementation of sine, cosine and square root in C++ (doesn't need to be much accurate)
- double and accuracy
- Is there a standard sign function (signum, sgn) in C/C++?
- I convert ASCII words into numbers but am stuck trying to decode them. How to convert 1=a, 2=b, 28=ab etc? (psudeocode okay)
- Is there such thing as math.substring in java?
- Count number of ways to pair integers 1-14 under constraint
- How to round up on a log10
- Bezier curve arc lengths
- Calculate cash flows given a target IRR
- Integration in Python Midpoint Calculation
- Generate P random N-dimensional points from list of ALL possible pairwise distances
- Grouping a list of integers with nearest values
- Branchless calculation for multiplying by the complement of a fraction with a flag
- Finding least number of moves