Duplicate lines should be printed the same number of times they occur in the input. Special care needs to be taken so that a file with a lot of duplicate lines does not use more memory than what is required for the number of unique lines.
I've tried all the collection interfaces but none seems to be working for this question :( Can someone please help me?? Thanks.
The code below is memory inefficient, as it stores duplicate lines in the PriorityQueue. Hope this helps
public static void doIt(BufferedReader r, PrintWriter w) throws IOException {
PriorityQueue<String> s=new PriorityQueue<String>();
String line;
int n=0;
while ((line = r.readLine()) != null) {
s.add(line);
n++;
while (n!=0) {
w.println(s.remove());
n--;
}
}
The ideal approach would be to use a sorted multiset, such as Guava's TreeMultiset.
If you're not allowed to use external libraries, you can replace s.add(line)
with s.add(line.intern())
. This tells the JVM to put a copy of each unique line
into the String
pool and share the same object among all the references.
Note that putting String
s into the pool may cause them to stick around for a long time, which can cause problems in long-running applications, so you don't want to do this casually in a production application, but for your homework problem it's fine. In the case of a production application, you'd want to put the String
s into a SortedMap
where the value is the count of times the line appeared, but this is more complicated to code correctly.