I can’t use mmap to share a Hash between processes


I am implementing a multi-process library that provides data structures backed by shared memory, and I have run into a problem: I modify the shared Hash object in the child process, but the parent process still does not see the changed value.

Sample code: https://play.crystal-lang.org/#/r/6n34

The Hash is modified through the same pointer in both processes, so why does the change have no effect?


Solution

  • When you fork a process, its memory is copied while retaining the same virtual memory addresses.

    You're just putting a pointer into your shared memory section, so your memory layout before the fork is:

     +--------------------+    +--------------------+
     |    Shared memory   |    |     Parent heap    |
     |                    |    |                    |
     |                    |    |                    |
     |  Virtual address   |    |  +---------+       |
     |        of  --------------> | Hash    |       |
     |                    |    |  +---------+       |
     |                    |    |                    |
     +--------------------+    +--------------------+
    

    After the fork, the same pointer value refers to each process' own private memory:

     +--------------------+    +--------------------+
     |    Shared memory   |    |     Parent heap    |
     |                    |    |                    |
     |                    |    |                    |
     |  Virtual address   |    |  +---------+       |
     |        of  --------------> | Hash    |       |
     |                 |  |    |  +---------+       |
     |                 |  |    |                    |
     +--------------------+    +--------------------+
                       |
                       |
                       |       +--------------------+
                       |       |     Child heap     |
                       |       |                    |
                       |       |                    |
                       |       |  +---------+       |
                       +--------> | Hash    |       |
                               |  +---------+       |
                               |                    |
                               +--------------------+
    

    So when you dereference the pointer in the child, you're touching the object in the child heap only.
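
    A minimal sketch of that failure mode, assuming the same `LibC.mmap` and `Process.fork` calls used elsewhere in this answer (the names `slot` and `hash` are made up for illustration). Only the reference to the Hash is stored in the shared mapping, so the child's write ends up in the child's private heap and the parent should still read 23:

    ```crystal
    # Illustrative sketch: the shared mapping holds only a reference (an address),
    # not the contents of the Hash.
    prot = LibC::PROT_READ | LibC::PROT_WRITE
    flags = LibC::MAP_ANONYMOUS | LibC::MAP_SHARED
    # sizeof of a reference type is the size of the reference itself.
    raw = LibC.mmap(nil, sizeof(Hash(String, Int32)), prot, flags, -1, 0)
    raise "mmap failed" if raw == LibC::MAP_FAILED
    slot = raw.as(Pointer(Hash(String, Int32)))

    hash = {"point" => 23}
    slot.value = hash # stores a virtual address in shared memory

    child = Process.fork
    if child
      child.wait
      # The parent follows the address into its own heap, which the child never touched.
      puts "Parent read: #{slot.value["point"]}"
    else
      # The child follows the same address into its private copy of the heap.
      slot.value["point"] = 42
      puts "Child changed to: #{slot.value["point"]}"
      exit
    end
    ```

    The `hash` local keeps the object alive for the garbage collector, which does not scan the mmap'd region.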

    What you have to do instead is put all the actual data into the shared memory. This is tricky to do for standard Crystal datatypes, since they rely on being able to request new memory and have it managed by a garbage collector. So you would need to implement a GC that can work on shared memory.

    However, if you only have a fixed amount of data, say a couple of numbers or a fixed-size string, you can use Crystal's value types to make this a little nicer:

    module SharedMemory
      def self.create(type : T.class, size : Int32) forall T
        protection = LibC::PROT_READ | LibC::PROT_WRITE
        visibility = LibC::MAP_ANONYMOUS | LibC::MAP_SHARED
        # Anonymous mappings have no backing file; -1 is the portable fd value.
        ptr = LibC.mmap(nil, size * sizeof(T), protection, visibility, -1, 0).as(T*)
        Slice(T).new(ptr, size)
      end
    end
    
    record Data, point : Int32 do
      setter point
    end
    
    shared_data = SharedMemory.create(Data, 1)
    shared_data[0] = Data.new 23
    
    child = Process.fork
    if child
      puts "Parent read: '#{shared_data[0].point}'"
      child.wait
      puts "Parent read: '#{shared_data[0].point}'"
    else
      puts "Child read: '#{shared_data[0].point}'"
      # Slice#[] returns the object rather than a pointer to it, 
      # so given Data is a value type, it's copied to the local 
      # stack and the update wouldn't be visible in the shared memory section.
      # Therefore we need to get the pointer using Slice#to_unsafe 
      # and dereference it manually with Pointer#value
      shared_data.to_unsafe.value.point = 42
      puts "Child changed to: '#{shared_data[0].point}'"
    end
    
    Parent read: '23'
    Child read: '23'
    Child changed to: '42'
    Parent read: '42'
    

    https://play.crystal-lang.org/#/r/6nfn

    The caveat here is that you cannot put any Reference types like String or Hash into the struct, because those are again just pointers into each process' private address space. Crystal provides types and APIs that make sharing, for example, a string a bit easier, namely Slice and String#to_slice. However, you have to copy the string into the shared memory each time you want to pass it and convert it back each time you read it, and you have to know your (maximum) string length in advance.
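
    As a rough sketch of that copy-in, copy-out approach for a string: the layout below (one length byte followed by the raw bytes) and the `MAX_LEN` constant are assumptions chosen for illustration, not something Crystal prescribes.

    ```crystal
    # Illustrative sketch: share a bounded-length string through an anonymous
    # shared mapping. MAX_LEN and the one-byte length prefix are assumptions.
    MAX_LEN = 64 # must fit in the single UInt8 length prefix (<= 255)

    prot = LibC::PROT_READ | LibC::PROT_WRITE
    flags = LibC::MAP_ANONYMOUS | LibC::MAP_SHARED
    raw = LibC.mmap(nil, MAX_LEN + 1, prot, flags, -1, 0)
    raise "mmap failed" if raw == LibC::MAP_FAILED
    shared = Bytes.new(raw.as(UInt8*), MAX_LEN + 1)

    child = Process.fork
    if child
      child.wait
      # Copy the bytes out of shared memory into a String on the parent's heap.
      len = shared[0]
      message = String.new(shared[1, len])
      puts "Parent read: '#{message}'"
    else
      # Copy the string's bytes into shared memory; the String object itself
      # stays in the child's private heap.
      bytes = "hello from child".to_slice
      raise "string too long" if bytes.size > MAX_LEN
      shared[0] = bytes.size.to_u8
      bytes.copy_to(shared[1, bytes.size])
      exit
    end
    ```

    The parent only reads after `child.wait`, so no further synchronization is needed here; concurrent readers and writers would need their own locking scheme.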