Tags: java, cryptography, sha256, hash-function

How to hash multiple times and concatenate a string in each round


I am writing a program that concatenates a word R to the end of a password and then calculates the SHA-256 hash. It then appends R again to the end of the hex result and calculates a new SHA-256 hash.

I want to repeat this 100 times and print the hash each time.

So something like this, in pseudo code:

hash = SHA256(...(SHA256(SHA256("password"||R)||R)||R)...)

I am currently testing my code by hashing 2 times:

   String R = "f@ghj!$g";
   hash = password.concat(R);

   MessageDigest md = MessageDigest.getInstance("SHA-256");
   digest = hash.getBytes(StandardCharsets.UTF_8);

   for (int i=0;i<2;i++) {

     md.update(digest);
     digest = md.digest();

     hash = String.format("%064x", new BigInteger(1,digest)).concat(R);
     System.out.println(hash);

     digest = hash.getBytes(StandardCharsets.UTF_8);
   }

Let's forget the concatenation for a second.

For example, I can't understand why the following two code fragments produce different results:

Code 1:

   for (int i=0;i<2;i++) {

     md.update(digest);
     digest = md.digest();

   }

 hash = String.format("%064x", new BigInteger(1,digest));   
 System.out.println(hash);

Code 2:

   for (int i=0;i<2;i++) {

     md.update(digest);
     digest = md.digest();
     //convert hash to string
     hash = String.format("%064x", new BigInteger(1,digest));
     //convert string again to bytes
     digest = hash.getBytes(StandardCharsets.UTF_8);
   }

 System.out.println(hash);

My question is: what is the right way to convert the hash (byte[]) into a hex String each round, concatenate the word R, and then encode the result back into bytes?


Solution

  • Code fragment 1 is correct, but you need to add the print statement to it to get your expected output. For this, however, you need a true hex encoder / decoder, which, unhelpfully, java.util did not provide until java.util.HexFormat arrived in Java 17.


    Here is a reworked example, without the concatenation, which I left out deliberately to leave you something to do.

    The code uses a relatively slow but easy to remember and read toHex function. The BigInteger approach from the question first has to construct a BigInteger, which is wasteful and probably even slower. Although that approach seems to work correctly for 32 byte hash values, I'd still consider it hard to maintain.

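    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    
    public static byte[] printHexadecimalHashIterations(byte[] input, int iterations)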
    {
        var digest = input.clone();
    
        MessageDigest md;
        try
        {
            md = MessageDigest.getInstance("SHA-256");
        }
        catch (NoSuchAlgorithmException e)
        {
            throw new IllegalStateException("SHA-256 hash should be available", e);
        }
    
        for (int i = 0; i < iterations; i++)
        {
            md.update(digest);
            digest = md.digest();
    
            printDigest("Intermediate hash", digest);
        }
    
        printDigest("Final hash", digest);
    
        return digest;
    }
    
    public static void printDigest(String hashType, byte[] digest)
    {
        var digestInHex = toHex(digest);
        System.out.printf("%s: %s%n", hashType, digestInHex);
    }
    
    public static String toHex(byte[] data)
    {
        var sb = new StringBuilder(data.length * 2);
        for (int i = 0; i < data.length; i++)
        {
            sb.append(String.format("%02X", data[i]));
        }
        return sb.toString();
    }
    
    public static void main(String[] args)
    {
        printHexadecimalHashIterations("password".getBytes(StandardCharsets.UTF_8), 2);
    }
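
    To illustrate why the BigInteger route is fragile, here is a minimal sketch (the class and method names are made up for this example). BigInteger interprets the bytes as a number, so leading zero bytes disappear unless you pad explicitly, as the "%064x" format in the question does. A per-byte encoder has no such dependency on the expected length.

```java
import java.math.BigInteger;

public class HexEncodingPitfall {
    // Per-byte hex encoder: always emits exactly two characters per byte.
    public static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }

    // BigInteger treats the bytes as one positive number, so leading zeros are lost.
    public static String viaBigInteger(byte[] data) {
        return new BigInteger(1, data).toString(16);
    }

    public static void main(String[] args) {
        byte[] data = {0x00, 0x0A, (byte) 0xFF};
        System.out.println(toHex(data));         // 000aff
        System.out.println(viaBigInteger(data)); // aff
    }
}
```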
    

    The main thing to take away from this is that data to (secure) hash functions consists of bytes (or octets if you prefer that name). The hexadecimal string is just a textual representation of these bytes. It is not identical to the data itself.

    You should be able to differentiate between binary data and hexadecimals, which are just a representation of binary data. Don't ever call binary data "hex" as you do in the question: that's a red flag that you don't get the difference.

    However, in your case you only need the hexadecimals to print them to the screen. You never need to convert the hexadecimals back into bytes: the digest byte array remains available, so you can simply continue with it.


    In case you need to convert this textual representation back to bytes then you would need to perform hex decoding. Obviously you would again need a good method that does not involve BigInteger for that. There are plenty of libraries (Guava, Apache Commons, Bouncy Castle) that provide good hex encoders / decoders and good questions / answers on this on SO. The statement hash.getBytes(StandardCharsets.UTF_8) in code fragment 2 doesn't perform hexadecimal decoding, it performs character encoding.
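
    The difference between hex decoding and character encoding can be sketched as follows. The fromHex helper is a hypothetical hand-written decoder, used here only to keep the example free of library dependencies; in real code one of the libraries mentioned above would do the same job.

```java
import java.nio.charset.StandardCharsets;

public class HexVersusUtf8 {
    // Hypothetical helper: decodes a hex string into the bytes it represents.
    public static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("Hex string must have an even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Not a hexadecimal character");
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        String hex = "0aff";
        byte[] decoded = fromHex(hex);                         // 2 bytes: 0x0A 0xFF
        byte[] encoded = hex.getBytes(StandardCharsets.UTF_8); // 4 bytes: '0' 'a' 'f' 'f'
        System.out.println(decoded.length); // 2
        System.out.println(encoded.length); // 4
    }
}
```

    For a SHA-256 value the same effect gives 32 bytes after hex decoding versus 64 bytes after character encoding, which is why code fragments 1 and 2 feed different input into the next round and produce different hashes.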


    As a final hint: the update methods allow streaming of data into the digest function. That means that you never actually have to concatenate anything to calculate the digest over the concatenation: you can just perform multiple calls to update instead.
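
    A small sketch of that hint, with class and method names invented for this example: hashing an explicitly concatenated buffer and streaming the two parts through update should give the same digest.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class StreamingUpdateDemo {
    private static MessageDigest sha256() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 hash should be available", e);
        }
    }

    // Hashes a || b by copying both arrays into one buffer first.
    public static byte[] hashConcatenated(byte[] a, byte[] b) {
        byte[] joined = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, joined, a.length, b.length);
        return sha256().digest(joined);
    }

    // Hashes a || b by feeding the parts to update() one after the other.
    public static byte[] hashStreamed(byte[] a, byte[] b) {
        MessageDigest md = sha256();
        md.update(a);
        md.update(b);
        return md.digest();
    }

    public static void main(String[] args) {
        byte[] password = "password".getBytes(StandardCharsets.UTF_8);
        byte[] r = "f@ghj!$g".getBytes(StandardCharsets.UTF_8);
        // Both approaches hash the same byte sequence.
        System.out.println(Arrays.equals(hashConcatenated(password, r), hashStreamed(password, r))); // true
    }
}
```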

    Happy programming.


    EDIT:

    To perform your task I would do something like this:

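    // "digest" is the MessageDigest instance; getInstance may throw NoSuchAlgorithmException
    final MessageDigest digest = MessageDigest.getInstance("SHA-256");
    final int iterations = 100;
    
    final byte[] passwordBytes = "password".getBytes(StandardCharsets.UTF_8);
    final byte[] rBytes = "f@ghj!$g".getBytes(StandardCharsets.UTF_8);
    
    // round 1: SHA-256(password || R), streamed through two update calls
    digest.update(passwordBytes);
    digest.update(rBytes);
    byte[] currentHash = digest.digest();
    
    // rounds 2..iterations: SHA-256(previous hash bytes || R)
    // note: this feeds the raw hash bytes into the next round, not their hex text
    for (int i = 1; i < iterations; i++)
    {
        digest.update(currentHash);
        digest.update(rBytes);
        currentHash = digest.digest();
    }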
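
    If the hex text itself has to be the input to the next round, which is how the question is worded, a variant could look like the sketch below. This rests on an assumption about the requirement, and the class and method names are invented for the example. Each round hashes the lowercase hex string of the previous hash followed by R, and prints the result.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class IteratedHexHash {
    // Computes SHA-256 over the given parts as if they were concatenated.
    public static byte[] sha256(byte[]... parts) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (byte[] part : parts) {
                md.update(part);
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 hash should be available", e);
        }
    }

    // Lowercase hex encoder, two characters per byte.
    public static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }

    // Round 1 hashes password || R; later rounds hash hex(previous hash) || R.
    public static String iterate(String password, String r, int iterations) {
        if (iterations < 1) {
            throw new IllegalArgumentException("At least one iteration is required");
        }
        byte[] rBytes = r.getBytes(StandardCharsets.UTF_8);
        byte[] input = password.getBytes(StandardCharsets.UTF_8);
        String hex = null;
        for (int i = 0; i < iterations; i++) {
            hex = toHex(sha256(input, rBytes));
            System.out.printf("Round %d: %s%n", i + 1, hex);
            // The hex text (character encoded) is the input to the next round.
            input = hex.getBytes(StandardCharsets.US_ASCII);
        }
        return hex;
    }

    public static void main(String[] args) {
        iterate("password", "f@ghj!$g", 100);
    }
}
```

    Note that this produces different values than the byte-based loop above after the first round, since the two variants hash different inputs; which one is "right" depends on what the other side of the protocol expects.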