Search code examples
javascriptarraysmemoization

Javascript memoize find array


I'm trying to improve my knowledge regarding memoization in javascript. I have created a memoization function(I think..)

I've got an array of changes(a change log) made to items. Each item in the array contains a reference-id(employeeId) to whom made the edit. Looking something like this.

const changeLog = [
  {
    id: 1,
    employeeId: 1,
    field: 'someField',
    oldValue: '0',
    newValue: '100',
  },
  {
    id: 2,
    employeeId: 2,
    field: 'anotherField',
    oldValue: '20',
    newValue: '100',
  },
  ...
]

I've also got an array containing each employee, looking something like this:

const employees = [
    {
        name: 'Joel Abero',
        id: 1
    },
    {
        name: 'John Doe',
        id: 2
    },
    {
        name: 'Dear John',
        id: 3
    }
]

To find the employee who made the change I map over each item in the changeLog and find where employeeId equals id in the employees-array. Both of these arrays contains 500+ items, I've just pasted fragments. Below I pasted my memoize helper.

1) How can I perform a test to see which of these two run the fastest?
2) Is this a proper way to use memoization?
3) Is there a rule of thumb when to use memoization? Or should I use it as often as I can?

const employees = [
	{
		name: 'Joel Abero',
		id: 1
	},
	{
		name: 'John Doe',
		id: 2
	},
	{
		name: 'Dear John',
		id: 3
	}
]

const changeLog = [
  {
    id: 1,
    employeeId: 1,
    field: 'someField',
    oldValue: '0',
    newValue: '100',
  },
  {
    id: 2,
    employeeId: 2,
    field: 'anotherField',
    oldValue: '0',
    newValue: '100',
  },
  {
    id: 3,
    employeeId: 3,
    field: 'someField',
    oldValue: '0',
    newValue: '100',
  },
  {
    id: 4,
    employeeId: 3,
    field: 'someField',
    oldValue: '0',
    newValue: '100',
  },
  {
    id: 5,
    employeeId: 3,
    field: 'someField',
    oldValue: '0',
    newValue: '100',
  }
]

function findEditedByEmployee (employeeId) {
  return employees.find(({ id }) => id === employeeId)
}

function editedByWithMemoize () {
  let employeesSavedInMemory = {}
  return function(employeeId) {
    if(employeeId in employeesSavedInMemory) {
      console.log("from memory")
      return employeesSavedInMemory[employeeId]
    }
    console.log("not from memory")
    const findEditedBy = findEditedByEmployee(employeeId)
    employeesSavedInMemory[findEditedBy.id] = {name: findEditedBy.name }
    return findEditedBy
  }
}

const memoizedEmployee = editedByWithMemoize();

// with memoization
const changeLogWithEmployeesMemoized = changeLog.map( log => {
  const employeeName = memoizedEmployee(log.employeeId);
  return {
    ...log,
    employeeName: employeeName.name
  }
})

// without memoization
const changeLogWithEmployees = changeLog.map( log => {
  const editedBy = findEditedByEmployee(log.employeeId);
  return {
    ...log,
    employeeName: editedBy.name
  }
})

console.log('memoized', changeLogWithEmployeesMemoized)
console.log('not memoized', changeLogWithEmployees)


Solution

  • I'll try to answer each of your questions:

    1) How can I perform a test to see which of these two run the fastest?

    The best way is just a simple for loop. Take for example a fake API request:

    const fakeAPIRequest = id => new Promise(r => setTimeout(r, 100, {id}))
    

    This will take 100ms to complete on request. Using memoization, we can avoid making this 100ms request by checking if we've made this request before:

    const cache = {}
    const memoizedRequest = async (id) => {
      if (id in cache) return Promise.resolve(cache[id])
      return cache[id] = await fakeAPIRequest(id)
    }
    

    Here's a working example:

    const fakeAPIRequest = id => new Promise(r => setTimeout(r, 100, {id}))
    
    const cache = {}
    
    const memoizedRequest = async (id) => {
      if (id in cache) return Promise.resolve(cache[id])
      return cache[id] = await fakeAPIRequest(id)
    }
    
    const request = async (id) => await fakeAPIRequest(id)
    
    const test = async (name, cb) => {
      console.time(name)
      for (let i = 50; i--;) {
        await cb()
      }
      console.timeEnd(name)
      
    }
    
    test('memoized', async () => await memoizedRequest('test'))
    test('normal', async () => await request('test'))

    2) Is this a proper way to use memoization?

    I'm not entirely sure what you mean by this, but think of it as short-term caching. Should your memo call include an API request, it could be great for non-changing data, saving plenty of time, but on the other hand, if the data is subject to change within a short period of time, then memoization can be a bad idea, meaning it will shortly be outdated.

    If you are making many many calls to this function, it could eat up memory depending on how big the return data is, but this factor is down to implementation, not "a proper way".

    3) Is there a rule of thumb when to use memoization? Or should I use it as often as I can?

    More often than not, memoization is overkill - since computers are extremely fast, it can often boil down to just saving milliseconds - If you are only calling the function even just a few times, memoization provides little to no benefit. But I do keep emphasising API requests, which can take long periods of time. If you start using a memoized function, you should strive to use it everywhere where possible. Like mentioned before, though, it can eat up memory quickly depending on the return data.


    One last point about memoization is that if the data is already client side, using a map like Nina suggested is definitely a much better and more efficient approach. Instead of looping each time to find the object, it loops once to transform the array into an object (or map), allowing you to access the data in O(1) time. Take an example, using find this time instead of the fakeAPI function I made earlier:

    const data = [{"id":0},{"id":1},{"id":2},{"id":3},{"id":4},{"id":5},{"id":6},{"id":7},{"id":8},{"id":9},{"id":10},{"id":11},{"id":12},{"id":13},{"id":14},{"id":15},{"id":16},{"id":17},{"id":18},{"id":19},{"id":20},{"id":21},{"id":22},{"id":23},{"id":24},{"id":25},{"id":26},{"id":27},{"id":28},{"id":29},{"id":30},{"id":31},{"id":32},{"id":33},{"id":34},{"id":35},{"id":36},{"id":37},{"id":38},{"id":39},{"id":40},{"id":41},{"id":42},{"id":43},{"id":44},{"id":45},{"id":46},{"id":47},{"id":48},{"id":49},{"id":50},{"id":51},{"id":52},{"id":53},{"id":54},{"id":55},{"id":56},{"id":57},{"id":58},{"id":59},{"id":60},{"id":61},{"id":62},{"id":63},{"id":64},{"id":65},{"id":66},{"id":67},{"id":68},{"id":69},{"id":70},{"id":71},{"id":72},{"id":73},{"id":74},{"id":75},{"id":76},{"id":77},{"id":78},{"id":79},{"id":80},{"id":81},{"id":82},{"id":83},{"id":84},{"id":85},{"id":86},{"id":87},{"id":88},{"id":89},{"id":90},{"id":91},{"id":92},{"id":93},{"id":94},{"id":95},{"id":96},{"id":97},{"id":98},{"id":99}]
    const cache = {}
    
    const findObject = id => data.find(o => o.id === id)
    const memoizedFindObject = id => id in cache ? cache[id] : cache[id] = findObject(id)
    
    const map = new Map(data.map(o => [o.id, o]))
    const findObjectByMap = id => map.get(id)
    
    const list = Array(50000).fill(0).map(() => Math.floor(Math.random() * 100))
    const test = (name, cb) => {
      console.time(name)
      for (let i = 50000; i--;) {
        cb(list[i])
      }
      console.timeEnd(name)
    }
    
    test('memoized', memoizedFindObject)
    test('normal', findObject)
    test('map', findObjectByMap)

    All in all, memoization is a great tool, very similar to caching. It provides a big speed up on heavy calculations or long network requests, but can prove ineffective if used infrequently.