Tags: python, python-3.x, django, django-rest-framework, django-views

In Django, when I compare two HTML texts, how can I ignore blank lines so that the comparison still succeeds?


I'm comparing two texts in Django. Specifically, I am comparing two HTML documents using JSON and the difflib module. The two files compared are user_input.html and source_compare.html; they are compared by the function_comparison function in views.py.


PROBLEM: The comparison works correctly if there are no blank lines. If there are one or more blank lines between the rows, the two HTML files are treated as different, so I get the preconfigured message "No, it is not the same!".

WHAT I WANT: I would like the comparison to succeed (so I get the message "Yes, it is the same!") even if there are one or more blank lines. In other words, I want to ignore the blank lines.

EXAMPLE: I would like the comparison to report the two files as the same if, for example, they contain the following HTML (note the blank lines between the rows in user_input.html):

In source_compare.html file:

<!DOCTYPE html>
<html>
     <head>
         <title>Page Title</title>
     </head>
     <body>
         <h1 class="heading">This is a Heading</h1>
         <p>This is a paragraph.</p>
     </body>
</html>

In the textbox on the home page (user_input.html file):

<!DOCTYPE html>
<html>
     <head>
         <title>Page Title</title>
     </head>

     <body>
         <h1 class="heading">This is a Heading</h1>
         <p>This is a paragraph.</p>

     </body>

</html>

I would like the comparison to succeed and report the files as the same (message "Yes, it is the same!") even though one HTML file has blank lines between the rows and the other does not. Currently the comparison treats the two files as different, so I get the message "No, it is not the same!"

Here is the code:

index.html

{% load static %}

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>University</title>

    <link rel="stylesheet" href ="{% static 'css/style.css' %}" type="text/css">

    <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.7.0/jquery.min.js"></script>

</head>

<body>
  
  <div class="test">

    <div>Input User</div> 

      <div class="editor">
        <pre class="editor-lines"></pre>
        <div class="editor-area">
          <pre class="editor-highlight"><code class="language-html"></code></pre>
          <textarea 
              class="editor-textarea" 
              id="userinput"
              data-lang="html" 
              spellcheck="false" 
              autocorrect="off" 
              autocapitalize="off">
        &lt;!DOCTYPE html>
        &lt;html>
        &lt;head>
        &lt;title>Page Title&lt;/title>
        &lt;/head>

        &lt;body>     
        &lt;h1 class="heading">This is a Heading&lt;/h1>   
        &lt;p>This is a paragraph.&lt;/p>       

        &lt;/body>

        &lt;/html>
          </textarea>
        </div>
      </div>

    </div> 
      
    <button type="submit" onclick="getFormData();">Button</button>
  
    <br><br>
    <div>Comparison Result</div>
    <div class="result row2 rowstyle2" id="result">
      {% comment %} Ajax innerHTML result {% endcomment %}
    </div>
    
  </div> 

{% comment %} script to disable "want to send form again" popup {% endcomment %}
<script>
  if ( window.history.replaceState ) {
      window.history.replaceState( null, null, window.location.href );
  }
</script>

<script>
  function getFormData() {
      $.ajax({
          type:"GET",
          url: "{% url 'function_comparison' %}",
          data:{
              // with a div: "formData": document.getElementById("userinput").innerText
            "formData": document.getElementById("userinput").value
          },
          success: function (response) {
              document.getElementById("result").innerHTML = response.message;
          },
          error: function (response) {
              console.log(response)
          }
      });
  }
</script>


<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/styles/vs2015.css">

</body>
</html>

views.py

from django.http import JsonResponse
from django.shortcuts import render
import difflib
from django.conf import settings
import re
   
def index(request):
    return render(request, "index.html")


def function_comparison(request):
    context = {}
    if request.method == "GET":
        
        user_form_data = request.GET.get("formData", None)

        with open('App1/templates/user_input.html', 'w') as outfile:
            outfile.write(user_form_data)

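        # Use context managers so both files are closed after reading
        with open('App1/templates/source_compare.html', 'r') as source_file:
            file1 = source_file.readlines()
        with open('App1/templates/user_input.html', 'r') as user_file:
            file2 = user_file.readlines()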

        file1_stripped = []
        file2_stripped = []

        # FIXED TEXT WHITE SPACE PROBLEM
        # re.sub removes every run of two or more whitespace characters from each line.
        # A line that holds only a single "\n" is left unchanged, so blank lines stay in the list.
        for file1_text in file1:
            file1_text = re.sub(r"\s\s+", "", file1_text)
            file1_stripped.append(file1_text)

        for file2_text in file2:
            file2_text = re.sub(r"\s\s+", "", file2_text)
            file2_stripped.append(file2_text)

        # If the last item in the user input's list is an empty string, remove it.
        # The extra check avoids an IndexError when the user input is empty.
        if file2_stripped and file2_stripped[-1] == "":
            file2_stripped.pop()
        ### End - Fixed text white space problem ###

        htmlDiffer = difflib.HtmlDiff(linejunk=difflib.IS_LINE_JUNK, charjunk=difflib.IS_CHARACTER_JUNK)
        htmldiffs = htmlDiffer.make_file(file1_stripped, file2_stripped, context=True)

        if "No Differences Found" in htmldiffs:
            context["message"] = "Yes, it is the same!"
        
        if settings.DEBUG:
            if "No Differences Found" not in htmldiffs:
                context["message"] = htmldiffs
        else:
            if "No Differences Found" not in htmldiffs:
                context["message"] = "No, it is not the same!"

    return JsonResponse(context, status=200)

user_input.html: starts out empty; the view overwrites it with the submitted text.

source_compare.html: the default reference file, which has no blank lines between rows:

<!DOCTYPE html>
<html>
    <head>
        <title>Page Title</title>
    </head>
    <body>               
        <h1 class="heading">This is a Heading</h1>   
        <p>This is a paragraph.</p>
    </body>
</html>

myapp/urls.py

from django.urls import path
from . import views

urlpatterns=[
  path('', views.index, name='index'), 
  path('function_comparison/', views.function_comparison,name="function_comparison"),
]

project/urls.py

from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', include('App1.urls')),
]

For completeness, here is the minimal CSS as well:

style.css

*,
*::after,
*::before {
    margin: 0;
    box-sizing: border-box;
}
  

.rowstyle1 {
  background-color: black;
  color: white;
}

.row2 {
  margin-top: 20px;
  width: 100%;
}


.rowstyle2 {
  background-color: #ededed;
  color: black;
}


/* Code editor for highlight.js */

/* Scrollbars */
::-webkit-scrollbar {
  width: 5px;
  height: 5px;
}

::-webkit-scrollbar-track {
  background: rgba(0, 0, 0, 0.1);
  border-radius: 0px;
}

::-webkit-scrollbar-thumb {
  background-color: rgba(255, 255, 255, 0.3);
  border-radius: 1rem;
}

.editor {
  --pad: 0.5rem;
  display: flex;
  overflow: auto;
  background: #1e1e1e;
  height: 100%;
  width: 100%;
  padding-left: 4px;
}

.editor-area {
  position: relative;
  padding: var(--pad);
  height: max-content;
  min-height: 100%;
  width: 100%;
  border-left: 1px solid hsl(0 100% 100% / 0.08);
}

.editor-highlight code,
.editor-textarea {
  padding: 0rem !important;
  top: 0;
  left: 0;
  right: 0;
  bottom: 0;
  background: transparent;
  outline: 0;
}

.editor-highlight code,
.editor-textarea,
.editor-lines {
  white-space: pre-wrap;
  font: normal normal 14px/1.4 monospace;
}

.editor-textarea {
  display: block;
  position: relative;
  overflow: hidden;
  resize: none;
  width: 100%;
  color: white;
  height: 250px;
  caret-color: hsl(50, 75%, 70%); /* But keep caret visible */
  border: 0;
  &:focus {
    outline: transparent;
  }
  &::selection {
    background: hsla(0, 100%, 75%, 0.2);
  }
}

.editor-highlight {
  position: absolute;
  left: var(--pad);
  right: var(--pad);
  user-select: none;
  margin-bottom: 0;
  min-width: 0;
}

.editor-lines {
  display: flex;
  flex-direction: column;
  text-align: right;
  height: max-content;
  min-height: 100%;
  color: hsl(0 100% 100% / 0.6);
  padding: var(--pad); /* use the same padding as .hilite */
  overflow: visible !important;
  background: hsl(0 100% 100% / 0.05);
  margin-bottom: 0;
  & span {
    counter-increment: linenumber;
    &::before {
      content: counter(linenumber);
    }
  }
}


/* highlight.js customizations: */

.hljs {
  background: none;
}

Solution

  • You need to change how the HTML content is processed before the two files are compared.

    I've added a helper function, process_file, inside your function_comparison view. It strips each line and drops the lines that end up empty:

    def function_comparison(request):
        context = {}
        if request.method == "GET":        
            user_form_data = request.GET.get("formData", None)
            with open('App1/templates/user_input.html', 'w') as outfile:
                outfile.write(user_form_data)
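            # Use context managers so both files are closed after reading
            with open('App1/templates/source_compare.html', 'r') as source_file:
                file1 = source_file.readlines()
            with open('App1/templates/user_input.html', 'r') as user_file:
                file2 = user_file.readlines()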
    
            def process_file(file_lines):
                processed_lines = []
                for line in file_lines:              
                    stripped_line = line.strip()  # Remove leading and trailing whitespace, including the newline
                    if stripped_line:  # Add the line only if it's not empty
                        processed_lines.append(stripped_line)
                return processed_lines
    
            file1_processed = process_file(file1)
            file2_processed = process_file(file2)
    
            htmlDiffer = difflib.HtmlDiff(linejunk=difflib.IS_LINE_JUNK, charjunk=difflib.IS_CHARACTER_JUNK)
            htmldiffs = htmlDiffer.make_file(file1_processed, file2_processed, context=True)
    
            if "No Differences Found" in htmldiffs:
                context["message"] = "Yes, it is the same!"
            else:
                context["message"] = "No, it is not the same!" if not settings.DEBUG else htmldiffs
    
        return JsonResponse(context, status=200)
    
    

    As you can see, the same stripping is applied to both files, so indentation and blank lines are removed from both before they are compared.
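
To see why the original regex was not enough, and to try the idea outside Django, here is a minimal standalone sketch. The helper names strip_blank_lines and same_ignoring_blank_lines and the two sample strings are made up for illustration; they are not part of the question's project. It also compares the cleaned line lists directly instead of searching the HtmlDiff output for "No Differences Found", which is an alternative to the approach above rather than a description of it.

```python
import re


def strip_blank_lines(text):
    """Return the non-empty lines of text with surrounding whitespace removed."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def same_ignoring_blank_lines(source_text, user_text):
    """True if both texts have the same non-blank lines, in the same order."""
    return strip_blank_lines(source_text) == strip_blank_lines(user_text)


# Hypothetical sample inputs: the second one has extra blank lines.
source = "<html>\n    <body>\n        <p>Hi</p>\n    </body>\n</html>\n"
user = "<html>\n\n    <body>\n        <p>Hi</p>\n\n    </body>\n\n</html>\n"

# The question's regex only removes runs of two or more whitespace characters,
# so a line holding a lone "\n" survives as its own list entry.
old_style = [re.sub(r"\s\s+", "", line) for line in user.splitlines(keepends=True)]
print("\n" in old_style)

# With blank lines dropped on both sides, the two inputs compare equal.
print(same_ignoring_blank_lines(source, user))
```

In the view, the same check could replace the string search on the generated diff, keeping HtmlDiff only for the DEBUG output. Note that stripping each line also ignores indentation differences, which may or may not be what you want.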