Tags: bash, sorting, duplicates, large-files

Deduplicating lines in a large file fails with sort and uniq


I have a large file consisting of one JSON object per line, 1,563,888 lines in total. To deduplicate the lines in this file I have been using the shell one-liner sort myfile.json | uniq -u.

For smaller files this approach worked: sort myfile.json | uniq -u | wc -l reported a count greater than 0. At the current file size, sort myfile.json | uniq -u | wc -l produces 0, while head -n500000 myfile.json | sort | uniq -u | wc -l still gives a non-zero count.

Is there an easy way for bash to handle such large files? Or is there a clean way to chunk the file up? I initially used bash instead of Python because it seemed like an easier way to verify things quickly, though now I am thinking about moving this task back to Python.
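
For reference, here is the same pipeline run on a tiny made-up sample (sample.json is only an illustration, not my real data):

    # tiny made-up sample: one line duplicated, one line unique
    printf '{"a":1}\n{"a":1}\n{"b":2}\n' > sample.json

    # the pipeline I run on the real file, applied to the sample
    sort sample.json | uniq -u | wc -l    # prints 1 on this sample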


Solution

  • As per Kamil Cuk, let's try this solution:

    sort -u myfile.json 
    

    Is the file really JSON? Sorting a JSON file can lead to dubious results. You may also try splitting the file, as suggested by Mark Setchell: sort each split file, then sort the combined results, doing every sort with sort -u (a sketch follows below).

    Please provide a sample from myfile.json if it is indeed a JSON file. Let us hear about your results when you just use sort -u.
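
    A minimal sketch of that split-and-sort approach, assuming GNU coreutils; the chunk_ prefix, the 500000-line chunk size, and the myfile.dedup.json output name are arbitrary choices for illustration:

    # split the file into pieces of 500000 lines each (chunk_aa, chunk_ab, ...)
    split -l 500000 myfile.json chunk_

    # sort each piece, dropping duplicates within the piece
    for f in chunk_*; do
        sort -u "$f" -o "$f.sorted"
    done

    # merge the already-sorted pieces, again dropping duplicates across pieces
    sort -mu chunk_*.sorted > myfile.dedup.json

    # remove the intermediate chunk files
    rm chunk_*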