
How to download a csv file programmatically


I have a URL to a CSV file. The file is about 300 KB, with 2700 rows and 15 columns.

I have tried several approaches in Python and C#, but each one ends with the exception

Remote end closed connection without response

What I have tried:

Python:

import pandas as pd
import numpy as np
import os

# Download CSV with read_csv
df = pd.read_csv('https://nsearchives.nseindia.com/products/content/sec_bhavdata_full_17072024.csv', low_memory=False)

A second attempt in Python, reading the response in chunks:

import urllib.request

url = 'https://nsearchives.nseindia.com/products/content/sec_bhavdata_full_17072024.csv'
filename = 'large_file.csv'

def download_large_file(url, filename):
    with urllib.request.urlopen(url) as response, open(filename, 'wb') as out_file:
        while True:
            chunk = response.read(8192)  # Download in 8KB chunks
            if not chunk:
                break
            out_file.write(chunk)

download_large_file(url, filename)
print("File downloaded successfully!")

C#:

using System.Net;

WebClient webClient = new WebClient();
// DownloadFile takes the source address and a local file name
webClient.DownloadFile("URL", "large_file.csv");

Solution

  • The problem is that the server only serves the file if the request carries cookies set by an earlier visit to the site. Here is a complete C# program that first visits the main page to obtain those cookies and then downloads the CSV:

    using System.Diagnostics;
    using System.Net;
    using System.Net.Http.Headers;
     
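    HttpClientHandler handler = new HttpClientHandler()
    {
        AllowAutoRedirect = true,
        UseCookies = true,
        // Decompress gzip responses so the saved file is plain CSV
        AutomaticDecompression = DecompressionMethods.GZip,
    };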
    
    HttpClient client = new(handler);
    client.DefaultRequestHeaders.Accept.Clear();
    client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("text/html"));
    client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("text/csv"));
    client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("*/*"));
    client.DefaultRequestHeaders.AcceptEncoding.Add(new("gzip"));
    
    
    // First visit the main page to obtain the required cookies
    Console.Write("visiting main page to obtain cookies...");
    var mainPage = new Uri(@"https://www.nseindia.com");
    var mainPageRes = await client.GetAsync(mainPage);
    
    if (!mainPageRes.IsSuccessStatusCode)
    {
        Console.WriteLine("Failed!");
        Console.WriteLine("can't obtain cookies from the main page");
        Console.WriteLine("status code: " + mainPageRes.StatusCode);
        return;
    }
    Console.WriteLine("done.");
    
    // Second request: the handler sends the cookies stored by the first one
    Console.Write("downloading CSV file...");
    var csvUri = new Uri(@"https://nsearchives.nseindia.com/products/content/sec_bhavdata_full_17072024.csv");
    var response = await client.GetAsync(csvUri);
    
    if (!response.IsSuccessStatusCode)
    {
        Console.WriteLine("Failed!");
        Console.WriteLine("Can't download the file");
        Console.WriteLine("status code: " + response.StatusCode);
        Console.WriteLine(response.Headers);
        return;
    }
    
    Console.WriteLine("done.");
    
    var filename = "sec_bhavdata_full_17072024.csv";
    using var contentStream = await response.Content.ReadAsStreamAsync();
    using var stream = new FileStream(filename, FileMode.Create, FileAccess.Write);
    
    Console.Write("saving content to file...");
    await contentStream.CopyToAsync(stream);
    Console.WriteLine("done");
    try
    {
        if (OperatingSystem.IsWindows())
        {
            Process.Start("explorer.exe", ".");
        }
    }
    finally
    {
        Console.WriteLine($"file name is {filename}");
    }
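
Since the question also tried Python, the same two-step idea (visit the main page first, then request the file with the cookies it set) might be sketched with only the standard library as below. The function name `download_with_cookies` and its parameters are hypothetical, not part of the answer above. Whether this particular server also requires extra headers such as a browser-like User-Agent is not confirmed here, so the optional `headers` argument is an assumption you may or may not need.

```python
import shutil
import urllib.request
from http.cookiejar import CookieJar


def download_with_cookies(main_page_url, file_url, filename, headers=None, timeout=30):
    """Visit main_page_url to collect cookies, then save file_url to filename.

    Hypothetical helper: a sketch of the cookie approach, not a tested
    client for any specific site.
    """
    # A single opener shares one cookie jar across both requests.
    cookie_jar = CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookie_jar)
    )
    if headers:
        # Optional extra headers (assumption: the server may want some).
        opener.addheaders = list(headers.items())

    # First request: any Set-Cookie headers in the response land in cookie_jar.
    with opener.open(main_page_url, timeout=timeout) as main_page:
        main_page.read()

    # Second request: cookies whose domain and path match are sent automatically.
    with opener.open(file_url, timeout=timeout) as response, \
            open(filename, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return filename
```

With the URLs from the question, a call would look like `download_with_cookies('https://www.nseindia.com', 'https://nsearchives.nseindia.com/products/content/sec_bhavdata_full_17072024.csv', 'sec_bhavdata_full_17072024.csv')`. Note that the cookies are only forwarded to the second host if the server scopes them to a shared parent domain.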