Almost 6 months ago, I asked a question on Stack Overflow: "Software to help in log analysis?"
Please look at that question before reading ahead.
It turned out that there is no good software currently available that can intermix log files based on timestamps and present them in a good UI.
I wanted to take the initiative, develop something, and open-source it once it's completed.
Earlier, I worked around this by writing a quick and dirty piece of code in C++ that would generate a tab-separated file (like CSV, but tab-separated) which I would later open in Excel.
I am not satisfied with my C++ code for the following reasons:
1. It totally depends on Excel to view the output file later.
2. Since there is no UI involved, it's not easy to write out its command line every time.
3. Because of the learning curve of the command line, it's not easily shareable with other team members (and the world).
For the above reasons (and a few more), I was thinking of developing it as a web solution. That way I can share a working instance with everyone.
What I have in mind is a web-based solution, something like this:
I am just a beginner in web-based technologies, so I need your help in determining whether this would be the best way to go about it.
I want a web solution, but that doesn't mean I want the user to upload their log files for backend processing. I want a web-based, client-only solution.
Thanks for your inputs.
EDIT: Based on the comment below by Raynos:
@bits You do realise that browsers were not meant to handle large pieces of data. There was stackoverflow.com/questions/4833480/… which shows that this can cause problems.
I feel that doing this in the browser isn't the best idea. Perhaps I should explore backend-based solutions instead. Any ideas or suggestions?
You're looking for an online diff tool which takes n files, each containing a list of timestamps in some order, along with extra data to be displayed in place but not parsed in the diffing.
The file upload would involve
<input id="upload" type="file">
Along with a snippet of JavaScript:
$("#upload").change(function(files) {
var files = this.files;
for (var i = 0; i < files.length; i++) {
(function() {
var file = files[i];
var reader = new FileReader;
reader.onload = function(e) {
var text = reader.result;
console.log(text);
};
reader.readAsText(file);
}());
}
});
See live example.
So you have all the text; you just need to work on a parser. I hope that helps a bit.
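To bridge this with the parsing code further down, here is a rough sketch of collecting every loaded file's text before parsing (the fileTexts array, the remaining counter, and the parseFiles callback are placeholder names of my own, not part of the original):

var fileTexts = [];

function loadFiles(files, parseFiles) {
    var remaining = files.length;
    for (var i = 0; i < files.length; i++) {
        (function (index) {
            var reader = new FileReader();
            reader.onload = function () {
                // Store texts in the same order as the selected files.
                fileTexts[index] = reader.result;
                remaining--;
                if (remaining === 0) {
                    // Every file has finished loading; safe to start parsing.
                    parseFiles(fileTexts);
                }
            };
            reader.readAsText(files[index]);
        }(i));
    }
}

Reading is asynchronous, so the countdown makes sure parsing only starts once every file has been loaded.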
As for the markup of the diff, I would suggest something like:
<table>
    <!-- one tr per unique timestamp -->
    <tr>
        <!-- one td/textarea per file -->
        <td> <textarea></textarea> </td>
        <td> <textarea></textarea> </td>
    </tr>
    ...
</table>
I would recommend making this a template and using a template engine to do some of the heavy lifting.
Let's say we want to use jquery-tmpl.
Here's an example to get you started. (I spent zero time on making it look good. That's your job.)
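As a rough sketch of what such a template could look like (the rowTemplate id, the #diffTable container, and the rows data shape are placeholders of my own, not fixed names from jquery-tmpl):

<table id="diffTable"></table>

<script id="rowTemplate" type="text/x-jquery-tmpl">
    <tr>
        <td>${timestamp}</td>
        {{each texts}}
            <td><textarea>${$value}</textarea></td>
        {{/each}}
    </tr>
</script>

<script>
    // rows is an array like:
    //   [{ timestamp: "...", texts: ["record from file 1", "record from file 2"] }, ...]
    // .tmpl() renders the template once per array element.
    $("#rowTemplate").tmpl(rows).appendTo("#diffTable");
</script>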
All that's left is generating JSON data to insert into the template.
So given your file input, you should have an array of fileTexts somewhere.
We want some kind of delimiter to split it up into individual timestamp records. For simplicity's sake, let's say that the newline character would work.
// fileTexts holds the loaded text of each file, one string per file,
// collected from the FileReader results above.
var fileTexts = [ /* text of each uploaded file */ ];
// Replace "timestampformat" with a pattern matching your log's timestamps,
// e.g. "\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}" for an ISO-style prefix.
var regex = new RegExp("(timestampformat)(.*)");
for (var i = 0; i < fileTexts.length; i++) {
    var text = fileTexts[i];
    var records = text.split("\n");
    for (var j = 0; j < records.length; j++) {
        var match = regex.exec(records[j]);
        if (match) {
            // match[1] is the timestamp, match[2] the rest of the record.
            addToTimestamps(match[1], match[2], i);
        }
    }
}
function addToTimestamps(timestamp, text, currFileCount) {
    console.log(arguments);
    // implement it.
}
As per example.
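As a rough sketch of what addToTimestamps might do (the timestamps map and the toRows helper are hypothetical names of my own): group each record under its timestamp, keyed by file index, then flatten the map into the row objects the template expects.

var timestamps = {};    // timestamp -> array of record texts, indexed by file

function addToTimestamps(timestamp, text, currFileCount) {
    if (!timestamps[timestamp]) {
        timestamps[timestamp] = [];
    }
    // In this simple sketch, a later record with the same timestamp
    // from the same file overwrites the earlier one.
    timestamps[timestamp][currFileCount] = text;
}

// Flatten the map into rows sorted by timestamp, with an empty cell
// for any file that has no record at a given timestamp.
function toRows(fileCount) {
    return Object.keys(timestamps).sort().map(function (key) {
        var texts = [];
        for (var i = 0; i < fileCount; i++) {
            texts.push(timestamps[key][i] || "");
        }
        return { timestamp: key, texts: texts };
    });
}

The result of toRows(fileTexts.length) would be the array the template sketch above renders.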
These are the basic building blocks: get the data from the File API, manipulate it into a normalised data format, then use some kind of rendering tool on that format.