How to read and write more than 25,000 records/lines to a text file at a time?

I am connecting my application to a live stock market data provider over a WebSocket. While the market is open and the socket is connected, it sends me nearly 45,000 lines per minute. I deserialize the data line by line, write each line to a text file, and also read the text file back and remove its first line. This slows down the other processing on the socket. How can I make this processing fast enough to handle roughly 25,000 lines per minute?

string filePath = @"D:\Aggregate_Minute_AAPL.txt";
var records = (from line in File.ReadLines(filePath).AsParallel()
               select line);
List<string> str = records.ToList();
str.ForEach(x =>
{
    string result = x;
    result = result.TrimStart('[').TrimEnd(']');
    var jsonString = Newtonsoft.Json.JsonConvert.DeserializeObject<List<LiveAMData>>(x);
    foreach (var item in jsonString)
    {
        string value = "";
        string dirPath = @"D:\COMB1\MinuteAggregates";
        string[] fileNames = null;
        fileNames = System.IO.Directory.GetFiles(dirPath, item.sym + "_*.txt", System.IO.SearchOption.AllDirectories);
        if (fileNames.Length > 0)
        {
            string _fileName = fileNames[0];
            var lineList = System.IO.File.ReadAllLines(_fileName).ToList();
            var _item = lineList[lineList.Count - 1];
            if (!_item.Contains(item.sym))
            {
                lineList.RemoveAt(lineList.Count - 1);
            }
            System.IO.File.WriteAllLines((_fileName), lineList.ToArray());
            value = $"{item.sym},{item.s},{item.o},{item.h},{item.c},{item.l},{item.v}{Environment.NewLine}";
            using (System.IO.StreamWriter sw = System.IO.File.AppendText(_fileName))
            {
                // The body of this block is missing from the post; presumably it appends the record.
                sw.Write(value);
            }
        }
    }
});

How can I make this faster? With this processing in place, the application handles only about 3,000 to 4,000 symbols; without any processing it gets through 25,000 lines per minute. How can I increase the throughput while keeping all of this logic?


  • First, you need to clean up your code to get a clearer view of it. I did a quick refactor, and this is what I got:

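    const string FilePath = @"D:\Aggregate_Minute_AAPL.txt";

    class SomeClass {
        public string Sym { get; set; }
        public string Other { get; set; }
    }

    private void Something() {
        // Only the TrimStart/TrimEnd line survives in the post; the rest of the chain is assumed.
        // The trimmed text is wrapped in brackets again so it still parses as a JSON array.
        File.ReadLines(FilePath).AsParallel()
            .Select(x => x.TrimStart('[').TrimEnd(']'))
            .Select(x => JsonConvert.DeserializeObject<List<SomeClass>>("[" + x + "]"))
            .ForAll(WriteRecord);
    }

    private const string DirPath = @"D:\COMB1\MinuteAggregates";
    private const string Separator = @",";

    private void WriteRecord(List<SomeClass> data) {
        foreach (var item in data) {
            var fileNames = Directory
                .GetFiles(DirPath, item.Sym + "_*.txt", SearchOption.AllDirectories);
            foreach (var fileName in fileNames) {
                var fileLines = File.ReadAllLines(fileName).ToList();
                var lastLine = fileLines.Last();
                if (!lastLine.Contains(item.Sym)) {
                    fileLines.RemoveAt(fileLines.Count - 1);
                }
                // Assumed: the StringBuilder call in the post built the new record line.
                fileLines.Add(new StringBuilder()
                    .Append(item.Sym).Append(Separator).Append(item.Other).ToString());
                File.WriteAllLines(fileName, fileLines);
            }
        }
    }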

    From here it should be easier to experiment with AsParallel and check with which parameters the code runs faster.


    • You are opening the output file twice (once with WriteAllLines and again with AppendText)
    • The removes are also somewhat expensive, more so at index 0 (although with only a few elements this may not make much difference)
    • if(fileNames.Length > 0) is unnecessary; use a loop instead, and if the list is empty the loop will simply be skipped
    • You can try StringBuilder instead of string interpolation
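
    To make the first two points concrete, here is a minimal sketch (not the code from the question) that folds the rewrite and the append into one in-memory update followed by a single write. `RecordFileUpdater`, `ApplyRecord`, and `UpdateFile` are hypothetical names chosen for illustration, and the rule for dropping the last line is copied from the question as posted.

    ```csharp
    using System.Collections.Generic;
    using System.IO;

    public static class RecordFileUpdater
    {
        // Pure, in-memory step: drop the last line when it does not mention the
        // symbol (the rule used in the question), then add the new record.
        public static List<string> ApplyRecord(IEnumerable<string> existingLines, string sym, string record)
        {
            var lines = new List<string>(existingLines);
            if (lines.Count > 0 && !lines[lines.Count - 1].Contains(sym))
            {
                lines.RemoveAt(lines.Count - 1); // removing the last element does not shift the others
            }
            lines.Add(record);
            return lines;
        }

        // One read and one write per file, instead of WriteAllLines followed by AppendText.
        public static void UpdateFile(string fileName, string sym, string record)
        {
            var updated = ApplyRecord(File.ReadAllLines(fileName), sym, record);
            File.WriteAllLines(fileName, updated);
        }
    }
    ```

    Keeping `ApplyRecord` free of file access also makes it easy to check the line handling on its own before measuring the I/O.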

    I hope these hints help you improve your time, and that I have not forgotten anything.


    We have nearly 10,000 files in our directory. When the process is running, it throws this error: The process cannot access the file because it is being used by another process

    Well, is it possible that the lines you are processing contain duplicate file names?

    If that is the case, you could try a simple approach: retry after a few milliseconds, something like this:

    private const int SleepMillis = 5;
    private const int MaxRetries = 3;
    public void WriteFile(string fileName, string[] fileLines, int retries = 0) {
        try {
            File.WriteAllLines(fileName, fileLines);
        } catch (Exception) { // Catch the specific exception type if you can
            if (retries >= MaxRetries) {
                Console.WriteLine("Too many tries with no success");
                throw; // rethrow exception
            }
            Thread.Sleep(SleepMillis); // wait a little before retrying
            WriteFile(fileName, fileLines, ++retries); // try again
        }
    }

    I tried to keep it simple, but there are a couple of notes:

    • If you can make your methods async, replacing the sleep with Task.Delay could be an improvement, but you need to know and understand well how async works
    • If the collisions happen a lot, then you should try another approach, something like a concurrent map with semaphores
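
    As a rough illustration of the "concurrent map with semaphores" idea, the sketch below keeps one `SemaphoreSlim` per file path in a `ConcurrentDictionary`, so threads inside the same process take turns on each file. `FileGate` and `Append` are hypothetical names, and the case-insensitive key comparison is an assumption based on the Windows paths in the question.

    ```csharp
    using System;
    using System.Collections.Concurrent;
    using System.IO;
    using System.Threading;

    public static class FileGate
    {
        // One semaphore per file path. Keys are compared case-insensitively
        // on the assumption that the paths are Windows paths.
        private static readonly ConcurrentDictionary<string, SemaphoreSlim> Gates =
            new ConcurrentDictionary<string, SemaphoreSlim>(StringComparer.OrdinalIgnoreCase);

        // Appends text while holding the gate for that file, so two threads
        // in this process never write to the same file at the same time.
        public static void Append(string fileName, string text)
        {
            var gate = Gates.GetOrAdd(fileName, _ => new SemaphoreSlim(1, 1));
            gate.Wait();
            try
            {
                File.AppendAllText(fileName, text);
            }
            finally
            {
                gate.Release();
            }
        }
    }
    ```

    This only coordinates writers within one process. If a different process holds the file open, the error can still occur, and the retry above remains useful.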

    Second edit

    In the real scenario I am connecting to a WebSocket and receiving 70,000 to 100,000 (1 lakh) records every minute. I then split those records from the live streaming data and store each one in its own file. That becomes slower when I apply this approach with 11,000 files

    It is a hard problem. From what I understand, you are talking about roughly 1,166 records per second at the low end (70,000 / 60); at this scale the little details can become big bottlenecks.

    At this stage I think it is better to consider other solutions. It could be too much I/O for the disk, too many threads or too few, the network...

    You should start by profiling the app to see where it spends most of its time, so you can focus on that area. How many resources is it using? How many do you have available? How are the memory, processor, garbage collector, and network behaving? Do you have an SSD?
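
    Before reaching for a full profiler, a coarse first pass can be done with `Stopwatch`. The sketch below is only a stand-in for real profiling: it accumulates elapsed time per named stage so you can compare, for example, the directory lookup against the file read and the file write. `StageTimer`, `Measure`, and `TotalMilliseconds` are hypothetical names.

    ```csharp
    using System;
    using System.Collections.Concurrent;
    using System.Diagnostics;

    public static class StageTimer
    {
        // Accumulated Stopwatch ticks per stage name; safe to update from several threads.
        private static readonly ConcurrentDictionary<string, long> Ticks =
            new ConcurrentDictionary<string, long>();

        // Runs the action and adds its elapsed time to the named stage.
        public static void Measure(string stage, Action action)
        {
            var sw = Stopwatch.StartNew();
            try
            {
                action();
            }
            finally
            {
                sw.Stop();
                Ticks.AddOrUpdate(stage, sw.ElapsedTicks, (_, total) => total + sw.ElapsedTicks);
            }
        }

        // Total time spent in a stage so far, in milliseconds (0 if never measured).
        public static double TotalMilliseconds(string stage)
        {
            long ticks;
            return Ticks.TryGetValue(stage, out ticks) ? ticks * 1000.0 / Stopwatch.Frequency : 0.0;
        }
    }
    ```

    For instance, wrapping the lookup as `StageTimer.Measure("lookup", () => fileNames = Directory.GetFiles(...))` and the write as a second stage, then printing both totals after a minute of traffic, shows which stage dominates.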

    You need a clear view of what is slowing you down so you can attack it directly. That depends on a lot of things, so it is hard to help with that part :(.

    There are plenty of tools for profiling C# apps, and many ways to attack this problem: spreading the load across several servers, using something like Redis to save data really quickly, or using an event store so you can work with events.