I have to sort a large amount of data that can not fit in memory, and one thing could do this I know is "external sort". But I am wondering is that possible to mmap this large data file, and use 'qsort' as it is a 'normal data array'? If that's feasible, what's the differences with 'external sort'?
If the file will fit in a contiguous mapping in your address space, you can do this. If it won't, you can't.
As to the differences:
MAP_PRIVATE
, sorting will mutate the original file. A normal external sort won't (necessarily)MAP_PRIVATE
, you could crash at any time if the VM doesn't have room to duplicate the whole file. Again, a strictly external sort's memory requirements don't scale linearly with the data size.It is possible, it may fail unpredictably and unrecoverably, you almost certainly shouldn't do it.