javascript performance big-o space-complexity

Space complexity of finding non-repeating character in string

Here is a simple algorithm exercise. The problem is to return the first non-repeating character. For example, I have this string: 'abbbcdd' and the answer is 'a' because 'a' appears before 'c'. In case it doesn't find any repeated characters, it will return '_'.

My solution works correctly, but my question is about the performance. The problem statement says: "Write a solution that only iterates over the string once and uses O(1) additional memory."

Here is my code:

console.log(solution('abbbcdd'))

function solution(str) {
  let chars = buildCharMap(str)
  for (let i in chars) {
    if (chars[i] === 1) {
      return i
    }
  }
  return '_'
}

function buildCharMap(str) {
  const charMap = {}
  for (let i = 0; i < str.length; i++) {
    !charMap[str[i]] ? charMap[str[i]] = 1 : charMap[str[i]]++
  }
  return charMap
}

Does my answer meet the requirement for space complexity?

Solution

The time complexity is straightforward: you have a loop over a string of length n, and another loop over an object with strictly at most n keys. The operations inside the loops take O(1) time, and the loops are consecutive (not nested), so the running time is O(n).

The space complexity is slightly more subtle. If the input were a list of numbers instead of a string, for example, then we could straightforwardly say that charMap takes O(n) space in the worst case, because all of the numbers in the list might be different. However, for problems on strings we have to be aware that there is a limited alphabet of characters which those strings could be formed of. If that alphabet has size a, then your charMap object can have at most a keys, so the space complexity is O(min(a, n)).

That alphabet is often explicit in the problem - for example, if the input is guaranteed to contain only lowercase letters, or only letters and digits. Otherwise, it may be implicit in the fact that strings are formed of Unicode characters (or in older languages, ASCII characters). In the former case, a = 26 or 62. In the latter case, a = 65,536 or 1,112,064 depending on if we're counting code units or code points, because Javascript strings are encoded as UTF-16. Either way, if a is a constant, then O(a) space is O(1) space - although it could be quite a large constant.

That means that in practice, your algorithm does use O(1) space. In theory, it uses O(1) space if the problem statement specifies a fixed alphabet, and O(min(a, n)) space otherwise; not O(n) space. Assuming the former, then your solution does meet the space-complexity requirement of the problem.

This raises the question of why, when analysing algorithms on lists of numbers, we don't likewise say that Javascript numbers have a finite "alphabet" defined by the IEEE 754 specification for floating point numbers. The answer is a bit philosophical; we analyse running time and auxiliary space using abstract models of computation which generally assume numbers, lists and other data structures don't have a fixed limit on their size. But even in those models, we assume strings are formed from some alphabet, and if the alphabet isn't fixed in the problem then we let the alphabet size be a variable a which we assume is independent of n. This is a sensible way to analyse algorithms on strings, because alphabet size and string length are independent in the problems we're usually interested in.