What's wrong with my OSDB hash algorithm?


I'm trying to write a C# algorithm that computes a hash from an online video file, so that I can use it to search for subtitles (see https://trac.opensubtitles.org/projects/opensubtitles/wiki/HashSourceCodes).

My idea is that the algorithm is fed a URL to a video file and returns the hash. Simple. The problem is that I'm not getting the right value back. According to the page I've linked to, this file should return 8e245d9679d31e12, but I'm getting 00c4fcb4aa6f763e. Here is my C#:

public static async Task<byte[]> ComputeMovieHash(string filename)
{
    long filesize = 0;

    //Get File Size
    HttpWebRequest req = (HttpWebRequest)WebRequest.Create(filename);
    req.Method = "HEAD";
    var resp = await req.GetResponseAsync();
    filesize = resp.ContentLength;
    long lhash = filesize;

    //Get first 64K bytes
    byte[] firstbytes = new byte[0];
    using (HttpClient client = new HttpClient())
    {
        client.DefaultRequestHeaders.Add("Range", "bytes=0-65536");
        using (HttpResponseMessage response = await client.GetAsync(filename))
        {
            Debug.WriteLine("getting first bytes (bytes=0-65536)");
            firstbytes = await response.Content.ReadAsByteArrayAsync();
        }
    }
    lhash += BitConverter.ToInt64(firstbytes, 0);

    //Get last 64K bytes
    byte[] lastbytes = new byte[0];
    using (HttpClient client = new HttpClient())
    {
        client.DefaultRequestHeaders.Add("Range", "bytes=" + (filesize - 65536) + "-" + filesize);
        using (HttpResponseMessage response = await client.GetAsync(filename))
        {
            Debug.WriteLine("getting last bytes (" + "bytes=" + (filesize - 65536) + "-" + filesize + ")");
            lastbytes = await response.Content.ReadAsByteArrayAsync();
        }
    }
    lhash += BitConverter.ToInt64(lastbytes, 0);

    //Return result
    byte[] result = BitConverter.GetBytes(lhash);
    Array.Reverse(result);
    Debug.WriteLine("RESULT=" + ToHexadecimal(result));
    return result;
}

What am I doing wrong? I've compared it to the code given by opensubtitles.org, and it seems like it should produce the same result.


Solution

  • Your code has two errors:

    1. The range bytes=0-65536 returns 65537 bytes, which is one byte too many, because both ends of an HTTP byte range are inclusive.

    2. You are not calculating a 64-bit checksum. BitConverter.ToInt64(firstbytes, 0) converts only the first 8 bytes to a number; the remaining 65536-8 bytes are ignored entirely. The same applies to lastbytes.
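To make the off-by-one in point 1 concrete, here is a minimal sketch of the range arithmetic. The class and method names (RangeMath, InclusiveRangeLength, TailRange) are hypothetical helpers made up for illustration; they are not part of the question's code or of any library.

```csharp
using System;

public static class RangeMath
{
    // HTTP byte ranges are inclusive at both ends, so "bytes=first-last"
    // covers last - first + 1 bytes.
    public static long InclusiveRangeLength(long first, long last)
    {
        return last - first + 1;
    }

    // Builds a Range value for the final chunkSize bytes of a file.
    // The start is clamped at 0 for files shorter than chunkSize.
    public static string TailRange(long fileSize, long chunkSize)
    {
        long first = Math.Max(fileSize - chunkSize, 0);
        long last = fileSize - 1; // index of the final byte in the file
        return "bytes=" + first + "-" + last;
    }
}
```

With these, InclusiveRangeLength(0, 65536) is 65537, while InclusiveRangeLength(0, 65535) is the intended 65536.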

    A fixed version should look something like this:

    public static async Task<byte[]> ComputeMovieHash(string filename) {
        long filesize = 0;
    
        //Get File Size
        HttpWebRequest req = (HttpWebRequest) WebRequest.Create(filename);
        req.Method = "HEAD";
        var resp = await req.GetResponseAsync();
        filesize = resp.ContentLength;
        long lhash = filesize;
    
        //Get first 64K bytes
        byte[] firstbytes;
        using (HttpClient client = new HttpClient()) {
            client.DefaultRequestHeaders.Add("Range", "bytes=0-65535");
            using (HttpResponseMessage response = await client.GetAsync(filename)) {
                Debug.WriteLine("getting first bytes (bytes=0-65535)");
                firstbytes = await response.Content.ReadAsByteArrayAsync();
            }
        }
        for (int i = 0; i < firstbytes.Length; i += sizeof (long)) {
            lhash += BitConverter.ToInt64(firstbytes, i);
        }
    
        //Get last 64K bytes
        byte[] lastbytes;
        using (HttpClient client = new HttpClient()) {
            // The range end is inclusive, so the last byte of the file is at index filesize - 1
            string range = "bytes=" + Math.Max(filesize - 65536, 0) + "-" + (filesize - 1);
            client.DefaultRequestHeaders.Add("Range", range);
            using (HttpResponseMessage response = await client.GetAsync(filename)) {
                Debug.WriteLine("getting last bytes (" + range + ")");
                lastbytes = await response.Content.ReadAsByteArrayAsync();
            }
        }
        for (int i = 0; i < lastbytes.Length; i += sizeof (long)) {
            lhash += BitConverter.ToInt64(lastbytes, i);
        }
    
        //Return result
        byte[] result = BitConverter.GetBytes(lhash);
        Array.Reverse(result);
        return result;
    }
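The summation step can also be pulled out into a small helper that works on an in-memory buffer, which makes it testable without any network access. This is only a sketch under a few assumptions: OsdbChecksum, SumWords and ToHex are hypothetical names, the words are read as little-endian regardless of the host, and trailing bytes that do not fill a whole 8-byte word are skipped (BitConverter.ToInt64 in the loops above would throw on such a buffer instead).

```csharp
using System;

public static class OsdbChecksum
{
    // Adds every complete 8-byte little-endian word in buffer to seed,
    // wrapping around on overflow. Leftover bytes at the end that do not
    // form a full word are skipped (an assumption of this sketch).
    public static long SumWords(byte[] buffer, long seed)
    {
        long sum = seed;
        int wholeWords = buffer.Length / sizeof(long);
        for (int i = 0; i < wholeWords; i++)
        {
            long word = 0;
            // Assemble the word by hand so the result does not depend on host endianness.
            for (int b = sizeof(long) - 1; b >= 0; b--)
            {
                word = (word << 8) | buffer[i * sizeof(long) + b];
            }
            sum = unchecked(sum + word);
        }
        return sum;
    }

    // Formats the hash as 16 lowercase hex digits, the form used in the question.
    public static string ToHex(long hash)
    {
        return hash.ToString("x16");
    }
}
```

Usage would be along the lines of OsdbChecksum.SumWords(lastbytes, OsdbChecksum.SumWords(firstbytes, filesize)), with the result passed to ToHex and compared against the expected value from the opensubtitles.org page.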