Comparing lines of strings in one text file against another text file, and displaying an error when a non-match occurs


I'm relatively new to C# and I'm trying to get my head around a problem that I believe should be pretty simple in concept, but I just can't get it.

I am trying to display a message on the console when the program is run from the command line with two arguments and a sequence ID from a query text file of sequence IDs does not exist in a text file full of sequence IDs and DNA sequences. For example, args[0] is a text file that contains 41534 lines of sequences, which means I cannot load the entire file into memory:

NR_118889.1 Amycolatopsis azurea strain NRRL 11412 16S ribosomal RNA, partial sequence GGTCTNATACCGGATATAACAACTCATGGCATGGTTGGTAGTGGAAAGCTCCGGCGT

NR_118899.1 Actinomyces bovis strain DSM 43014 16S ribosomal RNA, partial sequence GGGTGAGTAACACGTGAGTAACCTGCCCCNNACTTCTGGATAACCGCTTGAAAGGGTNGCTAATACGGGATATTTTGGCCTGCT

NR_074334.1 Archaeoglobus fulgidus DSM 4304 16S ribosomal RNA, complete sequence >NR_118873.1 Archaeoglobus fulgidus DSM 4304 strain VC-16 16S ribosomal RNA, complete sequence >NR_119237.1 Archaeoglobus fulgidus DSM 4304 strain VC-16 16S ribosomal RNA, complete sequence
ATTCTGGTTGATCCTGCCAGAGGCCGCTGCTATCCGGCTGGGACTAAGCCATGCGAGTCAAGGGGCTT

args[1] is a query text file with some sequence IDs:

NR_118889.1

NR_999999.1

NR_118899.1

NR_888888.1

So when the program is run, all I want displayed are the sequence IDs from args[1] that were not found in args[0]:

NR_999999.1 could not be found

NR_888888.1 could not be found

I know this is probably super simple, and I have spent far too long trying to figure this out by myself, to the point where I want to ask for help.

Thank you in advance for any assistance.


Solution

    using System;
    using System.Collections.Generic;
    using System.IO;

    class Program
    {
        static void Main(string[] args)
        {
            // args[0]: large file of sequence IDs and sequences.
            // args[1]: query file with one sequence ID per line.
            if (args.Length < 2)
            {
                Console.WriteLine("Usage: <program> <sequence file> <query file>");
                return;
            }

            // The query file is small, so keep its IDs in memory, in file order.
            var missingIds = new List<string>();
            foreach (string queryLine in File.ReadLines(args[1]))
            {
                string id = queryLine.Trim();
                if (id.Length > 0 && !missingIds.Contains(id))
                {
                    missingIds.Add(id);
                }
            }

            // Read the large file once, one line at a time, so it is never fully loaded.
            using (var reader = new StreamReader(args[0]))
            {
                string line;
                while (missingIds.Count > 0 && (line = reader.ReadLine()) != null)
                {
                    // Any ID that appears on this line has been found.
                    missingIds.RemoveAll(id => line.Contains(id));
                }
            }

            // Whatever is left was never seen in the sequence file.
            foreach (string id in missingIds)
            {
                Console.WriteLine("The sequence ID {0} does not exist", id);
            }
        }
    }
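
One caveat with matching by Contains is that an ID counts as found whenever it appears as a substring of a line, so a query such as NR_1188 would match NR_118889.1. A minimal sketch of an exact-match variant follows. It assumes IDs are separated from the rest of a header line by spaces, tabs, or ">" characters, as in the sample data; the class name SequenceIdChecker and the method FindMissingIds are hypothetical names chosen for illustration, and the output message follows the wording requested in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SequenceIdChecker
{
    // Hypothetical helper: returns the query IDs that never occur as a whole
    // token in any reference line, keeping the order of the query file.
    public static List<string> FindMissingIds(
        IEnumerable<string> referenceLines, IEnumerable<string> queryIds)
    {
        var ordered = new List<string>();       // query order, for output
        var remaining = new HashSet<string>();  // IDs not seen yet
        foreach (string raw in queryIds)
        {
            string id = raw.Trim();
            if (id.Length > 0 && remaining.Add(id))
            {
                ordered.Add(id);
            }
        }

        // Assumption: IDs are delimited by spaces, tabs, or '>' characters.
        char[] separators = { ' ', '\t', '>' };
        foreach (string line in referenceLines)
        {
            if (remaining.Count == 0)
            {
                break; // every query ID has been found; stop reading early
            }
            foreach (string token in line.Split(separators, StringSplitOptions.RemoveEmptyEntries))
            {
                remaining.Remove(token); // no-op when the token is not a query ID
            }
        }

        // Keep only the IDs that were never seen.
        ordered.RemoveAll(id => !remaining.Contains(id));
        return ordered;
    }

    public static void Main(string[] args)
    {
        if (args.Length < 2)
        {
            Console.WriteLine("Usage: <program> <sequence file> <query file>");
            return;
        }

        // File.ReadLines enumerates lazily, so the large file is streamed.
        foreach (string id in FindMissingIds(File.ReadLines(args[0]), File.ReadLines(args[1])))
        {
            Console.WriteLine("{0} could not be found", id);
        }
    }
}
```

Because each token is looked up in a HashSet, the work per line no longer grows with the number of query IDs, which may matter if the query file becomes large. If the real files delimit IDs differently, the separators array would need adjusting.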