sorting assembly quicksort mergesort counting-sort

Should I use counting sort or any other alternative for sorting frequencies of symbols complexity wise

I will have an array of symbols(256 ascii symbols) and array of their frequncies(some of the symbols apper zero times).Would it be preferble complexity wise to use counting sort for sorting and what soultion will take more code lines (the code will be written in assembly, tasm).

Solution

Counting sort should be excellent if your inputs are long-ish (strings or buffers significantly longer than 256).

Would it be preferable complexity wise to use counting sort for sorting

It's certainly simple to implement, and has O(1) complexity. If large inputs are possible or common, counting sort is very good.

If small inputs are common, though, counting sort still has to spend time clearing the whole count array and scanning it again, and this cost doesn't scale down for smaller inputs.

Depending on the CPU, (e.g. fast memset to clear a count array), counting sort with 256 symbols might be good for inputs as small as 64. You mention TASM, so you're specifically talking about x86, and probably x86-16. Modern x86 has very fast memset, using either SSE stores or rep stosd. (256 or 512 bytes (for 16-bit counters) is large enough that using rep stos is not a terrible idea; the startup overhead is mostly amortized so it's close to the same speed as a vector loop.)

Below 64 elements, I'm not sure whether qsort or mergesort would do better. Below 16 or so elements (and as a base-case for qsort / merge-sort), you probably want InsertionSort for performance.

On modern x86 with SSSE3 (for pshufb), you can use SSE2 pminub / pmaxub as comparators in a sorting network with byte granularity (and yes, these instructions work in 16-bit mode). See Using SIMD Registers and Instructions to Enable Instruction-Level Parallelism in Sorting Algorithms for 32-bit elements, and also Fast in-register sort of bytes?.

Or use SIMD for a partial sort so there are fewer swaps for InsertionSort to do. Maybe just some load, pminub / pmaxub, and store, without much or any shuffling.

and what solution will take more code lines

In asm, lines of source code is the least useful measure. (Not every line assembles into an instruction; some are labels or directives).

Instruction count is sometimes relevant, but some instructions are slower than others, and it matters how you order them, and whether the input of one depends on the output of the other.

If you don't care about performance, but rather code-size, then you need to look at the byte count of the machine code. x86 instructions are variable length.

If you care only about code-size and not performance at all, you could consider bubble sort or jump-down sort. (Without the early-out check, just always loop the max times). See a hilariously-slow JumpDown sort in 19-bytes of x86-32 machine code. With only a few more bytes of code, it could swap without using xchg-with-mem (implicit lock prefix). A more "normal" bubble-sort implementation looks like this (TASM for 8-bit integers).

But you could also implement Insertion Sort with only a few more bytes of code, and it generally performs well (compared to other O(n^2) algorithms like bubble or selection)