Search code examples
c#arrayscounter

C# encoding text RLE method


I'm trying to pack a given byte array by 'removing' repeated bytes, something like this:

  1. Entrance 255 1 1 4 4 4 4 4 200 15 10
  2. Output 1x255 2x1 5x4 1x200 1x15 1x10 => 255 1 1 5 4 200 15 10

If one byte is repeated more than 3 times I replace it with the counter.

I began with making a temporary byte list with no repeated values and list with numbers of appearances. I've got a problem with the counter though:

public static void compressBlock(List<byte> buffer)
    {
        byte marker = buffer.Last();

        int counter = 1;


        byte[] buffer_ar = new byte[buffer.Count];
        buffer_ar = buffer.ToArray();

        List<byte> temp = new List<byte>();
        List<int> tmp = new List<int>();


       int indeks = 0;
            while (true)
            {


                if (buffer_ar[indeks] == buffer_ar[indeks + 1])
                {
                    counter++;

                    if (buffer_ar[indeks] != buffer_ar[indeks + 1])
                    {
                        temp.Add(buffer_ar[indeks]);
                        tmp.Add(counter);
                        //counter = 1;
                    }


                }

                else
                {
                    //counter = 1;
                    temp.Add(buffer_ar[indeks]);
                    tmp.Add(counter);

                }

                indeks++;
                //counter = 1;

                if (buffer_ar.Length -1 <= indeks) { break; }

            }

As the output I have:

byte list: 255 1 4 200 15 10

int list: 1 2 6 6 6 6

I know I have to reset the counter at some point, but when I do that as the output of the int list I have: 1 1 1 1 1 1.

Could someone point me in the right direction to do that?


Solution

  • There are some issues with you implementation:

    1. Decode is impossible since different inputs like 1 1 1 1 and 4 1 produce the same output: 4 1

    2. What if the same items appears more than 255 (255 == Byte.MaxValue) times?

    3. Better use general IEnumerable<Byte> then concrete List<Byte>

    4. You don't need any buffer, just count the last item occurrence.

      public static IEnumerable<Byte> RleEncode(IEnumerable<Byte> source)
      {
          if (null == source)
              throw new ArgumentNullException("source");
      
          const int threshold = 3;
      
          Byte current = 0;
          int count = 0;
      
          foreach (var item in source) 
              if ((count == 0) || (current == item)) 
              {
                  current = item; 
                  count += 1;
              }
              else 
              { 
                  if (count <= threshold)
                      for (int i = 0; i < count; ++i)
                          yield return current;
                  else 
                  {
                      for (int i = 0; i < count / Byte.MaxValue; ++i) 
                      {
                          yield return Byte.MaxValue;
                          yield return current;
                      }
      
                      if (count % Byte.MaxValue != 0) 
                      {
                          yield return (Byte) (count % Byte.MaxValue);
                          yield return current;
                      }
                 }
      
                 current = item;
                 count = 1;
             }
      
             // Tail
             if (count <= threshold)
                 for (int i = 0; i < count; ++i)
                     yield return current;
             else 
             {
                 for (int i = 0; i < count / Byte.MaxValue; ++i) 
                 {
                     yield return Byte.MaxValue;
                     yield return current;
                 }
      
                 if (count % Byte.MaxValue != 0) 
                 {
                     yield return (Byte) (count % Byte.MaxValue);
                     yield return current;
                 }
             }
         }
      

    Test

      List<Byte> source = new List<Byte> {
        255, 1, 1, 4, 4, 4, 4, 4, 200, 15, 10
      };
    
      // 255 1 1 5 4 200 15 10 
      String test = String.Join(" ", RleEncode(source));