In the Node.js crypto library there are many functions named update (example, example, example). The documentation is good, but every entry explains what update does with nothing more than 'Updates...'
For example:
hmac.update(data[, inputEncoding])
Updates the Hmac content with the given data, the encoding of which is given in inputEncoding and can be 'utf8', 'ascii' or 'latin1'. If encoding is not provided, and the data is a string, an encoding of 'utf8' is enforced. If data is a Buffer, TypedArray, or DataView, then inputEncoding is ignored.
My question is "Please explain what it means to update in this context"
The update method updates the internal state of the HMAC algorithm with the given input. A method called update is almost universally combined with some kind of final method (sometimes called doFinal or similar, to avoid a name conflict with a keyword named final). This final method performs the last update of the internal state and performs any final operations.
In the case of HMAC it completes the inner hash over the i_key_pad and the message, and then performs the outer hash over the o_key_pad and that inner hash value. The final method may also be named differently; for instance, for HMAC it is called digest, because it calculates the final digest, i.e. the output of the HMAC calculation.
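To make the role of the final step concrete, here is a minimal sketch of HMAC-SHA-256 built by hand from the plain hash functions, compared against the built-in Hmac object. The function name manualHmacSha256 and the key and message values are made up for illustration; real code should simply use crypto.createHmac.

```javascript
const crypto = require('node:crypto');

// Hand-rolled HMAC-SHA-256 (RFC 2104 construction), for illustration only.
// manualHmacSha256 is a hypothetical helper, not part of the crypto module.
function manualHmacSha256(key, message) {
  const blockSize = 64; // SHA-256 block size in bytes (512 bits)
  let k = Buffer.from(key);
  if (k.length > blockSize) {
    // Keys longer than one block are hashed first.
    k = crypto.createHash('sha256').update(k).digest();
  }
  // Pads start as the constant byte; XOR in the key (zero-padded to a block).
  const iKeyPad = Buffer.alloc(blockSize, 0x36);
  const oKeyPad = Buffer.alloc(blockSize, 0x5c);
  for (let i = 0; i < k.length; i++) {
    iKeyPad[i] ^= k[i];
    oKeyPad[i] ^= k[i];
  }
  // Inner hash: i_key_pad followed by the message.
  const inner = crypto.createHash('sha256').update(iKeyPad).update(message).digest();
  // Outer hash: o_key_pad followed by the inner hash. This is the "final" work.
  return crypto.createHash('sha256').update(oKeyPad).update(inner).digest('hex');
}

const manual = manualHmacSha256('example-key', 'Hello, world');
const builtIn = crypto.createHmac('sha256', 'example-key').update('Hello, world').digest('hex');
console.log(manual === builtIn); // true
```

The outer hash can only be computed once the whole message has been fed into the inner hash, which is why a separate final step (digest) is needed.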
The update method was created to allow streaming large messages using multiple updates. The final method is necessary so the algorithm knows the end of the message has been reached and the final operations can be performed.
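As a small sketch of the streaming behaviour, the following compares a single update call with the same message split over two calls. The key and message are placeholder values.

```javascript
const crypto = require('node:crypto');

const key = 'example-key'; // placeholder key for illustration

// Whole message in one update call.
const oneShot = crypto.createHmac('sha256', key)
  .update('Hello, world')
  .digest('hex');

// Same message fed in two pieces; digest() marks the end of the message.
const hmac = crypto.createHmac('sha256', key);
hmac.update('Hello, ');
hmac.update('world');
const streamed = hmac.digest('hex');

console.log(oneShot === streamed); // true
```

How the message is split over the update calls does not matter; only the concatenated bytes do.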
Signature generation and HMAC calculation have rather similar purposes; one uses a symmetric key and the other an asymmetric key pair. Generally, signature generation / verification works almost identically to HMAC generation / verification.
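The same update-then-finish pattern can be sketched with the Sign and Verify objects, where sign and verify play the role of the final method. The EC curve and the message text are arbitrary choices for this example.

```javascript
const crypto = require('node:crypto');

// Throwaway key pair generated just for this example.
const { privateKey, publicKey } = crypto.generateKeyPairSync('ec', {
  namedCurve: 'prime256v1',
});

// Signing: feed the message in pieces, then finish with sign().
const signer = crypto.createSign('SHA256');
signer.update('first part, ');
signer.update('second part');
const signature = signer.sign(privateKey);

// Verifying: the message may be split differently, then finish with verify().
const verifier = crypto.createVerify('SHA256');
verifier.update('first part, second part');
const valid = verifier.verify(publicKey, signature);

console.log(valid); // true
```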
If encryption is used then update may also return ciphertext or plaintext output. Whether it does, and what is returned and when, depends on the algorithm and its implementation. For instance, in CBC mode at least a full block of plaintext / ciphertext needs to be buffered before any block encryption / decryption can take place. Counter mode, in principle, could directly return the ciphertext for the specific bytes given; this is the online property of a stream cipher. The implementation may however still decide to buffer the plaintext / ciphertext until a full block is available.
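The difference in buffering can be observed by looking at how many bytes each update call returns. This is a sketch using AES-128 with 10-byte inputs chosen for illustration; the lengths in the comments are what I would expect from the OpenSSL-backed implementation in NodeJS, and another implementation could legitimately buffer differently.

```javascript
const crypto = require('node:crypto');

const key = crypto.randomBytes(16);
const iv = crypto.randomBytes(16);
const part = Buffer.alloc(10, 0x61); // 10 bytes, less than one 16-byte AES block

// CBC: output only appears once a full block has been collected.
const cbc = crypto.createCipheriv('aes-128-cbc', key, iv);
const cbcLengths = [
  cbc.update(part).length, // 10 bytes buffered, nothing returned yet
  cbc.update(part).length, // 20 bytes seen, one full block returned
  cbc.final().length,      // remaining 4 bytes plus padding
];

// CTR: behaves as a stream cipher, so each update returns its bytes directly.
const ctr = crypto.createCipheriv('aes-128-ctr', key, iv);
const ctrLengths = [
  ctr.update(part).length,
  ctr.update(part).length,
  ctr.final().length,
];

console.log(cbcLengths); // [ 0, 16, 16 ]
console.log(ctrLengths); // [ 10, 10, 0 ]
```

In both cases the pieces returned by update and final have to be concatenated to obtain the complete ciphertext.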
In case of NodeJS, there is also an implementation of CCM mode. This mode has special requirements for both update and final. Here update may be called just once, and final must be called precisely once. CCM mode works with a packet format which cannot be updated using a stream, so multiple updates would break CCM (which includes the size of the message at the front of the calculation of the authentication tag). Finally, final is required to be called to create / verify the authentication tag.
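A sketch of the CCM call sequence, closely following the pattern in the NodeJS crypto documentation: a single update per direction, one final, and the message length announced up front via the plaintextLength option. The key, nonce, associated data and message here are example values.

```javascript
const crypto = require('node:crypto');

const key = crypto.randomBytes(16);
const nonce = crypto.randomBytes(12); // CCM nonces are 7 to 13 bytes long
const aad = Buffer.from('header');    // example associated data
const plaintext = Buffer.from('Hello world');

// Encrypt: the length is declared first, then exactly one update and one final.
const cipher = crypto.createCipheriv('aes-128-ccm', key, nonce, { authTagLength: 16 });
cipher.setAAD(aad, { plaintextLength: plaintext.length });
const ciphertext = cipher.update(plaintext);
cipher.final();
const tag = cipher.getAuthTag();

// Decrypt: the tag is set before the single update; final() completes verification.
const decipher = crypto.createDecipheriv('aes-128-ccm', key, nonce, { authTagLength: 16 });
decipher.setAuthTag(tag);
decipher.setAAD(aad, { plaintextLength: ciphertext.length });
const recovered = decipher.update(ciphertext);
decipher.final(); // throws if authentication failed

console.log(recovered.toString('utf8')); // Hello world
```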
Notes:
Usually the initial, internal state is set either during creation or by using an explicit initialization method. NodeJS clearly has opted for initialization during creation of the objects.
Hash algorithms also require buffering of plaintext because they can only operate on blocks of plaintext (512 bits, i.e. 64 bytes, for SHA-256 and 1024 bits, i.e. 128 bytes, for SHA-512, to be precise). This is however transparent to the user of the hash functions as they generate no intermediate output. HMAC, being based solely on a hash function and some XOR'ing, has of course the same requirements and therefore needs buffering.
Sometimes the update functionality is also required because not all of the message is available at the same time or within the same array. TLS, for instance, authenticates all the sent / received handshake messages, so update is called whenever the next message becomes available.
Other algorithms that do not handle large messages generally do not include an update method. For instance, PBKDF2 doesn't use update because there is no reason to stream the password or salt; they are simply passed in as single values.
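For comparison, a PBKDF2 call in NodeJS is a single function call with everything passed in at once. The password, iteration count and output length below are example values only.

```javascript
const crypto = require('node:crypto');

const password = 'correct horse battery staple'; // example password
const salt = crypto.randomBytes(16);

// No update / final pair: password and salt go in as whole values.
const derivedKey = crypto.pbkdf2Sync(password, salt, 100000, 32, 'sha256');

console.log(derivedKey.length); // 32
```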