Opendiff and online file


I'd like to diff two files, one local and the other online, using for example:

opendiff http://www.tex.ac.uk/ctan/web/lua2dox/Doxyfile Doxyfile

But it throws the following error:

2014-02-12 15:23:43.579 opendiff[72650:1007] /Users/Dev/Joker/http:/www.tex.ac.uk/ctan/web/lua2dox/Doxyfile does not exist

So how can I use an online file the same way as a local one?


Solution

  • Since this is a programming Q&A site, we may as well write a program to do this for us :-)

    You can create a script called (for example) odw for OpenDiffWeb which will detect whether you're trying to access web-based files and first download them to a temporary location.

    Examine the following script. It's pretty rudimentary, but it shows the approach that can be taken.

    #!/bin/bash

    # Ensure two parameters.

    if [[ $# -ne 2 ]] ; then
        echo "Usage: $0 <file/url-1> <file/url-2>" >&2
        exit 1
    fi

    # Download first file if web-based (argument starts with http://).

    fspec1=$1
    if [[ $fspec1 =~ ^http:// ]] ; then
        wget --output-document="/tmp/odw.$$.1" "$fspec1"
        fspec1=/tmp/odw.$$.1
    fi

    # Download second file if web-based.

    fspec2=$2
    if [[ $fspec2 =~ ^http:// ]] ; then
        wget --output-document="/tmp/odw.$$.2" "$fspec2"
        fspec2=/tmp/odw.$$.2
    fi

    # Show difference of two files.

    diff "$fspec1" "$fspec2"

    # Delete them if they were web-based, i.e. if they are our own temp files.

    if [[ $fspec1 == "/tmp/odw.$$.1" ]] ; then
        rm -f "$fspec1"
    fi

    if [[ $fspec2 == "/tmp/odw.$$.2" ]] ; then
        rm -f "$fspec2"
    fi
    

    In this case, we detect a web-based file as one starting with http://. If an argument is web-based, we simply use wget to bring it down to a temporary location. Both arguments are checked this way.
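    The detection step can also be pulled out into a small function that accepts more than one scheme. The following is only a minimal sketch: the name is_url is my own invention, and the list of schemes (http, https, ftp) is an assumption about what you'd want wget to fetch.

```shell
#!/bin/bash

# Hypothetical helper: succeeds when its argument starts with a scheme
# that we want to download first (http, https or ftp).
is_url() {
    case "$1" in
        http://*|https://*|ftp://*) return 0 ;;
        *)                          return 1 ;;
    esac
}

# Example: classify a few arguments.
for arg in http://www.example.com https://www.example.com Doxyfile ; do
    if is_url "$arg" ; then
        echo "$arg: download first"
    else
        echo "$arg: use as a local file"
    fi
done
```

    Because the patterns are matched against the whole argument, a local path that merely contains http:// somewhere in the middle is still treated as a local file.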

    Once both files are on the local disk (either because they were brought down or because they were already there), you can run the diff. I've used the standard diff but you can substitute your own tool, such as opendiff.

    Then, the temporary files are cleaned up.
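    That cleanup only happens if the script reaches its final lines, so an interrupt during the download or the diff would leave the files behind. Here's a minimal sketch of one way around that, assuming mktemp is available on your system: keep the downloads in a private directory and remove it from an EXIT trap. The variable name tmpdir is my own choice, not something the script above uses.

```shell
#!/bin/bash

# Create a private scratch directory rather than fixed names under /tmp.
tmpdir=$(mktemp -d) || exit 1

# Remove the scratch directory, and everything in it, when the script exits.
trap 'rm -rf "$tmpdir"' EXIT

# Convert common signals into an ordinary exit, which in turn fires the
# EXIT trap above.
trap 'exit 1' HUP INT TERM

# Downloads would then be written to "$tmpdir/1" and "$tmpdir/2".
echo "scratch directory: $tmpdir"
```

    With this in place, the explicit rm commands at the end of the script would no longer be needed.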

    As a test, I downloaded the page http://www.example.com and made a very minor change to it then compared the page to my modified local copy:

    pax> odw http://www.example.com example.txt 
    --2014-09-25 16:40:02--  http://www.example.com/
    Resolving www.example.com (www.example.com)... 93.184.216.119,
        2606:2800:220:6d:26bf:1447:1097:aa7
    Connecting to www.example.com (www.example.com)|93.184.216.119|:80...
        connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 1270 (1.2K) [text/html]
    Saving to: `/tmp/odw.6569.1'
    
    100%[=================================>] 1,270       --.-K/s   in 0s      
    
    2014-09-25 16:40:02 (165 MB/s) - `/tmp/odw.6569.1' saved [1270/1270]
    
    4c4
    <     <title>Example Domain</title>
    ---
    >     <title>Example Domain (slightly modified)</title>
    

    Now there's all sorts of added stuff that could go into that script: the ability to pass flags to the diff and wget programs, the ability to handle other URL types, deletion of temporary files on signals, and so on.
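    As one example of such an extension, here's a rough sketch of a comparison step that passes extra flags through to the diff program. Both the function name compare and the environment variable ODW_DIFF are hypothetical names I've made up for illustration; the tool defaults to the standard diff.

```shell
#!/bin/bash

# Hypothetical wrapper around the comparison step.
# $1 and $2 are the two local files; any further arguments are passed to the
# comparison tool unchanged. ODW_DIFF is an assumed environment variable that
# names the tool and defaults to the standard diff.
compare() {
    if [ $# -lt 2 ] ; then
        echo "Usage: compare <file-1> <file-2> [tool options]" >&2
        return 2
    fi
    file1=$1
    file2=$2
    shift 2
    "${ODW_DIFF:-diff}" "$@" "$file1" "$file2"
}
```

    Calling compare old.txt new.txt -u would then run diff -u old.txt new.txt. If opendiff is substituted, bear in mind that, as I understand its man page, it returns as soon as FileMerge has been asked to do the comparison, so an immediate cleanup could remove a downloaded file before FileMerge has read it.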

    But it should hopefully be enough to get you started.