Tags: c#, http-server, file-monitoring

Http-Server, Monitoring a path/folder


I am not sure where to start; I adapted the code below from a template. With it I can download all files from an HTTP server. It checks whether a file has already been downloaded and, if it has, it does not take it from the site again. However, I only want to download part of the files, and I am trying to think of an easy way to achieve one of the following points:

  1. Get the last modified or creation time of a file on the HTTP server. I understand how to do this for a local folder, but I don't want to download the file and then check it; I need to do the check on the server. On the local PC it would be FileInfo infoSource = new FileInfo(sourceDir); and then infoSource.CreationTime, where sourceDir is the file path. Is something similar possible over HTTP? (See the sketch after this list.)
  2. Get only the latest 10 files from the server site. Not just the latest one, but the latest 10.
  3. Monitor the server site so that once a file named MyFileName_Version is put on the site, it downloads the latest file matching that naming convention.
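
For point 1, a minimal sketch, assuming the server sends a Last-Modified header for its static files (the URL and file name are placeholders). An HTTP HEAD request returns only the headers, so the timestamp can be checked without downloading the file body:

using System;
using System.Net;

class LastModifiedCheck
{
    static void Main()
    {
        // HEAD asks for the response headers only; no file content is transferred.
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://localhost:10000/MyFileName_1.0.zip");
        request.Method = "HEAD";

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            // HttpWebResponse parses the Last-Modified header into a DateTime.
            Console.WriteLine(response.LastModified);
        }
    }
}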

Any of these approaches would work for me, but I am still a newbie at this, so I am struggling. Currently I have the following code:

using System;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

namespace AutomaticUpgrades
{
    class Program
    {
        static void Main(string[] args)
        {
            // Base URL of the HTTP site's directory listing
            // (e.g. the binary-release/ folder); placeholder host and port.
            string url = "http://localhost:10000";

            DownloadDataFromArtifactory(url);

        }

        private static void DownloadDataFromArtifactory(string url)
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                using (StreamReader reader = new StreamReader(response.GetResponseStream()))
                {
                    string html = reader.ReadToEnd();
                    Regex regex = new Regex(GetDirectoryListingRegexForUrl(url));
                    MatchCollection matches = regex.Matches(html);
                    if (matches.Count > 0)
                    {
                        // WebClient is IDisposable, so create it in a using block.
                        using (WebClient webClient = new WebClient())
                        {
                            foreach (Match match in matches)
                            {
                                if (match.Success)
                                {
                                    string name = match.Groups["name"].Value;
                                    Console.WriteLine(name);

                                    // Skip very short entries (e.g. the parent-directory link)
                                    // and any file already present in the local download folder.
                                    if (name.Length > 5
                                        && DupeFile(name, "C:\\Users\\RLEBEDEVS\\Desktop\\sourceFolder\\Http-server Download"))
                                    {
                                        webClient.DownloadFile(url + "/" + name,
                                            "C:\\Users\\RLEBEDEVS\\Desktop\\sourceFolder\\Http-server Download\\" + name);
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }

        public static string GetDirectoryListingRegexForUrl(string url)
        {
            if (url.Equals("http://localhost:10000"))
            {
                return "<a href=\".*\">(?<name>.*)</a>";
            }
            throw new NotSupportedException();
        }

        // Returns true when the server file is NOT already present in the
        // local folder, i.e. it still needs to be downloaded.
        private static bool DupeFile(string httpFile, string folderLocation)
        {
            string[] files = System.IO.Directory.GetFiles(folderLocation);
            foreach (string s in files)
            {
                if (System.IO.Path.GetFileName(s).ToString() == httpFile)
                {
                    return false;
                }
            }

            return true;
        }

        
    }
}

Solution

  • After some days of getting into 'HTTP-server mode', I arrived at a working solution to my question, so I am posting it here. Along the way I also came to understand how the API works; the question as I asked it was not entirely clear, but you learn as you go.

    public async Task GetPackages(string Feed, string Path, string filter, string retrievePath)
    {
        // Constrain the search to files put on the HTTP site within roughly the
        // last two months. The API storage path is built from {Feed} and {Path},
        // and retrievePath is where the latest zip file will be downloaded to.
        DateTime cutoff = DateTime.Now.AddMonths(-2);
        string uri = $"{_httpClient.BaseAddress}/api/storage/{Feed}/{Path}";
        string responseText;
        var artifacts = new List<Artifact>();
        var filteredList = new List<Artifact>();
        // Call the REST API to get the list of all files in the {Feed} directory.
        // (At one point this call raised an error even though the script executed correctly.)
        responseText = await _httpClient.GetStringAsync(uri);
        var response = JsonConvert.DeserializeObject<GetPackagesResponse>(responseText);
        if (response.Children.Count < 1)
            return;              
        // Loop through the Children array to find all zip files from the last two months.
        foreach (var item in response.Children)
        {
            if (item.Folder)
                continue;
    
            var package = item.Uri.TrimStart('/'); 
            var fullPath = $"{uri}{item.Uri}";
    
            // Build the URI used for downloading this particular .zip file, then
            // fetch its lastModified field and check it against our criteria.
            var downloadUri = $"{_httpClient.BaseAddress}/{Feed}/{Path}/";
            var lastModified = await GetLastModified(fullPath);

            // Keep only files modified within the last two months (see cutoff above).
            if (lastModified >= cutoff)
                artifacts.Add(new Artifact(package, downloadUri, lastModified));
        }
    
        
        // Filter the list down to only the files needed for the update.
        foreach (var artifact in artifacts)
        {
            if (artifact.Package.ToString().Contains(filter))
            {
                filteredList.Add(artifact);
            }
        }
    
        // Create a new list sorted by the LastModified field, newest first.
        List<Artifact> SortedList = filteredList.OrderByDescending(o => o.LastModified).ToList();
    
        // Download the file matching the criteria. The first element of the sorted
        // list is the latest one available, so only that single file is retrieved.
        if (SortedList.Count > 0)
            ArtifactRetrieve.DownloadDataArtifactory(SortedList[0].DownloadUri, SortedList[0].Package, retrievePath);
    }
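
    The method above assumes a few members that are not shown: _httpClient (an HttpClient), the GetLastModified helper, and the GetPackagesResponse/Artifact types. As a hedged sketch of the missing pieces, assuming the Artifactory storage API (which returns file metadata as JSON with a lastModified timestamp) and Newtonsoft.Json for deserialization; the type and property names here are hypothetical:

    private async Task<DateTime> GetLastModified(string fullPath)
    {
        // GET api/storage/{feed}/{path}/{file} returns JSON metadata for a
        // single item, including a "lastModified" ISO-8601 timestamp.
        var json = await _httpClient.GetStringAsync(fullPath);
        var details = JsonConvert.DeserializeObject<FileDetailsResponse>(json);
        return details.LastModified;
    }

    // Hypothetical DTOs matching the JSON shapes the code above relies on.
    class FileDetailsResponse
    {
        [JsonProperty("lastModified")]
        public DateTime LastModified { get; set; }
    }

    class GetPackagesResponse
    {
        [JsonProperty("children")]
        public List<Child> Children { get; set; }
    }

    class Child
    {
        [JsonProperty("uri")]
        public string Uri { get; set; }

        [JsonProperty("folder")]
        public bool Folder { get; set; }
    }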
    

    My Conclusion:

    1. There is no way to monitor a site other than putting a loop in your script; if I am correct, System.IO.FileSystemWatcher cannot be mimicked on an HTTP server (possibly a wrong assumption). A minimal polling sketch follows this list.
    2. I can check the dates and sort the retrieved data in a list. This gives more control over which data I am downloading, whether the latest or the earliest.
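
    On point 1, a minimal polling sketch, assuming the GetPackages method above; the feed, path, filter, and five-minute interval are placeholder values:

    // There is no push notification over plain HTTP, so the check simply
    // re-runs at a fixed interval.
    while (true)
    {
        await GetPackages("my-feed", "binary-release", "MyFileName", retrievePath);
        await Task.Delay(TimeSpan.FromMinutes(5));
    }

    On point 2, the same sorted list also answers the original "latest 10 files" question: SortedList.Take(10).ToList() yields the ten most recently modified artifacts.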