Search code examples
stringassemblydosx86-16hamming-distance

How to find the hamming distance for strings that are not necessarily equal length?


I have an assignment asking me to find the hamming distance of two user-input strings that are not necessarily equal in length.

So, I made the following algorithm:

  1. Read both strings
  2. check the length of each string
  3. compare the length of the strings
  4. if(str1 is shorter)
    set counter to be the length of str1
    END IF
  5. if(str1 is longer)
    set counter to be the length of str2
    END IF
  6. if(str1 == str2)
    set counter to be length of str1
    END IF
  7. loop through each digit of the strings
    if(str1[digitNo] XOR str2[digitNo] == 1)
        inc al
        END IF
    
  8. the final al value is the hamming distance of the strings, print it.

But I'm stuck at step 3 and I don't seem to get it working. any help?

I tried playing around with the registers to save the values in, but none of that worked, I still didn't get it working.

    ; THIS IS THE CODE I GOT
.model small
.data
str1 db 255 
db ?
db 255 dup(?)
msg1 db 13,10,"Enter first string:$"
str2 db 255
db ?
db 255 dup(?)
msg2 db 13,10,"Enter second string:$"
one db "1"
count db ?
.code
.startup
mov ax,@data
mov ds,ax

; printing first message
mov ah, 9
mov dx, offset msg1
int 21h
; reading first string
mov ah, 10
mov dx, offset str1
int 21h            
; printing second message
mov ah, 9
mov dx, offset msg2
int 21h
; reading second string
mov ah, 10
mov dx, offset str2
int 21h 

; setting the values of the registers to zero
mov si, 0
mov di, 0
mov cx, 0
mov bx, 0 


; checking the length of the first string
mov bl, str1+1
add bl, 30h
mov ah, 02h
mov dl, bl
int 21h


; checking the length of the second string
mov bl, str2+1
add bl, 30h
mov ah, 02h 
mov dh, bl
int 21h  


; comparing the length of the strings
cmp dl,dh 
je equal
jg str1Greater
jl str1NotGreater

; if the strings are equal we jump here
equal:      
mov cl, dl
call theLoop

; if the first string is greater than the second, we jump here and set counter of str1
str1Greater:


; if the second string is greater than the first, we jump here and set counter to length of str2
Str1NotGreater: 



; this is the loop that finds and prints the hamming distance
;we find it by looping over the strings and taking the xor for each 2, then incrementing counter of ones for each xor == 1
theLoop:



end

So, in the code I provided, it's supposed to print the length of each string (it prints the lengths next to each other), but it seems to always keep printing the length of the first string, twice. The register used to store the length of the first string is dl, and the register used to store the length of the second is dh, if I change it back to dl, it would then print the correct length, but I want to compare the lengths, and I think it won't be possible to do so if I save it in dl both times.


Solution

  • but it seems to always keep printing the length of the first string, twice.

    When outputting a character with the DOS function 02h you don't get to choose which register to use to supply the character! It's always DL.

    Since after printing both lengths you still want to work with these lengths it will be better to not destroy them in the first place. Put the 1st length in BL and the second length in BH. For outputting you copy these in turn to DL where you do the conversion to a character. This of course can only work for strings of at most 9 characters.


     ; checking the length of the first string
     mov  BL, str1+1
     mov  dl, BL
     add  dl, 30h
     mov  ah, 02h
     int  21h
    
     ; checking the length of the second string
     mov  BH, str2+1
     mov  dl, BH
     add  dl, 30h
     mov  ah, 02h 
     int  21h  
    
     ; comparing the length of the strings
     cmp  BL, BH 
     ja   str1LONGER
     jb   str1SHORTER
    
     ; if the strings are equal we ** FALL THROUGH ** here
    equal:      
     mov  cl, BL
     mov  ch, 0
     call theLoop
    
    !!!! You need some way out at this point. Don't fall through here !!!!
    
    ; if the first string is greater than the second, we set counter of str1
    str1LONGER:
    
    
    ; if the second string is greater than the first, we set counter to length of str2
    Str1SHORTER: 
    
    
    ; this is the loop that finds and prints the hamming distance
    ;we find it by looping over the strings and taking the xor for each 2, then incrementing counter of ones for each xor == 1
    theLoop:
    

    Additional notes

    • Lengths are unsigned numbers. Use the unsigned conditions above and below.
    • Talking about longer and shorter makes more sense for strings.
    • Don't use 3 jumps if a mere fall through in the code can do the job.
    • Your code in theLoop will probably use CX as a counter. Don't forget to zero CH. Either using 2 instructions like I did above or else use movzx cx, BL if you're allowed to use instructions that surpass the original 8086.

    Bonus

     mov  si, offset str1+2
     mov  di, offset str2+2
     mov  al, 0
    MORE:
     mov  dl, [si]
     cmp  dl, [di]
     je   EQ
     inc  al
    EQ:
     inc  si
     inc  di
     loop MORE