linux, mmap, copy-on-write

How to know whether a copy-on-write page is an actual copy?


When I create a copy-on-write mapping (MAP_PRIVATE) with mmap, some pages of the mapping are copied as soon as I write to addresses inside them. At a certain point in my program I would like to find out which pages have actually been copied. There is a call named 'mincore', but it only reports whether a page is resident in memory, which is not the same as whether the page has been copied.

Is there some way to figure out which pages have been copied?


Solution

  • Following the advice of MarkR, I gave the pagemap and kpageflags interfaces a shot. Below is a quick test that checks whether a page in memory is 'SWAPBACKED', as the kernel calls it. One problem remains: kpageflags is only accessible to root.

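    #define _GNU_SOURCE   /* exposes lseek64 and getpagesize */
    #include <assert.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char* argv[])
    {
      unsigned long long pagesize=getpagesize();
      assert(pagesize>0);
      int pagecount=4;
      int filesize=pagesize*pagecount;
      int fd=open("test.dat", O_RDWR);
      if (fd<0) // open returns -1 on failure; 0 is a valid descriptor
        {
          fd=open("test.dat", O_CREAT|O_RDWR,S_IRUSR|S_IWUSR);
          printf("Created test.dat testfile\n");
        }
      assert(fd>=0);
      int err=ftruncate(fd,filesize);
      assert(!err);
    
      char* M=(char*)mmap(NULL, filesize, PROT_READ|PROT_WRITE, MAP_PRIVATE,fd,0);
      assert(M!=MAP_FAILED); // mmap reports failure with MAP_FAILED, not NULL
      printf("Successfully created private mapping\n");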
    

    The test setup contains 4 pages. Pages 0 and 2 are written to, so they are dirty:

      strcpy(M,"I feel so dirty\n");
      strcpy(M+pagesize*2,"Christ on crutches\n");
    

    Page 3 is only read from:

      volatile char t=M[pagesize*3]; // volatile keeps the compiler from dropping the read
      (void)t;
    

    Page 1 will not be accessed.

    The pagemap file maps the process's virtual pages to physical page frames, whose flags can then be looked up in the global kpageflags file. See /usr/src/linux/Documentation/vm/pagemap.txt for the format.

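      int mapfd=open("/proc/self/pagemap",O_RDONLY);
      assert(mapfd>=0);
      assert(sizeof(long long)==8); // each pagemap entry is 64 bits wide
      unsigned long long target=((unsigned long)(void*)M)/pagesize;
      // keep the file offset in a 64-bit variable; it does not fit in an int
      long long pos=lseek64(mapfd, target*8, SEEK_SET);
      assert(pos==(long long)(target*8));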
    

    Here we read the page frame numbers for each of our virtual pages

      unsigned long long page2pfn[pagecount];
      err=read(mapfd,page2pfn,sizeof(long long)*pagecount);
      if (err<0)
        perror("Reading pagemap");
      if(err!=pagecount*8)
        printf("Could only read %d bytes\n",err);
    

    Now, for each virtual page that is present, we read the actual page flags of its frame

      int pageflags=open("/proc/kpageflags",O_RDONLY); // readable by root only
      assert(pageflags>=0);
      for(int i = 0 ; i < pagecount; i++)
        {
          unsigned long long v2a=page2pfn[i];
          printf("Page: %d, pagemap entry %llx\n",i,v2a);
    
          if(v2a&0x8000000000000000ULL) // bit 63: is the virtual page present?
            {
            unsigned long long pfn=v2a&0x7fffffffffffffULL; // bits 0-54 hold the PFN
            long long pos=lseek64(pageflags,pfn*8,SEEK_SET); // 64-bit offset, not int
            assert(pos==(long long)(pfn*8));
            unsigned long long pf;
            err=read(pageflags,&pf,8);
            assert(err==8);
            // bit 14 of a kpageflags entry is KPF_SWAPBACKED
            printf("pageflags are %llx with SWAPBACKED: %d\n",pf,(int)((pf>>14)&1));
            }
        }
    }
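
The flag word printed by the loop packs one flag per bit. As a rough sketch, the helper below pulls out the bits that seem most relevant to the question. The bit numbers (KPF_MMAP 11, KPF_ANON 12, KPF_SWAPBACKED 14) follow the kernel's pagemap documentation; the names `kpf_test` and `frame_looks_like_private_copy` are made up for illustration and are not part of the test program. Requiring KPF_ANON in addition to SWAPBACKED is my assumption: tmpfs/shmem file pages are also swap-backed, so SWAPBACKED alone may not single out copied pages.

```c
#include <stdint.h>

/* Bit positions in a /proc/kpageflags entry (kernel pagemap documentation). */
#define KPF_MMAP        11  /* frame is mapped into some address space */
#define KPF_ANON        12  /* frame holds anonymous memory, not page cache */
#define KPF_SWAPBACKED  14  /* frame is backed by swap or RAM, not by a file */

/* Hypothetical helper: return 1 if the given flag bit is set, else 0. */
static int kpf_test(uint64_t flags, int bit)
{
    return (int)((flags >> bit) & 1);
}

/* Hypothetical helper: a frame that is both anonymous and swap-backed is
 * what a copied page of a MAP_PRIVATE file mapping is expected to look like. */
static int frame_looks_like_private_copy(uint64_t flags)
{
    return kpf_test(flags, KPF_ANON) && kpf_test(flags, KPF_SWAPBACKED);
}
```

In the loop, `frame_looks_like_private_copy(pf)` could replace the bare `(pf>>14)&1` test.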
    

    All in all, I'm not particularly happy with this approach: it requires access to a file that we generally can't read, and it is bloody complicated (how about a simple kernel call to retrieve the page flags?).
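
A possible way around the root requirement, offered as a sketch and not as part of the approach above: according to the kernel's pagemap documentation, since Linux 3.5 the pagemap entry itself carries bit 61 ("page is file-page or shared-anon"). For a MAP_PRIVATE mapping of a regular file, an entry that is present with bit 61 clear should therefore be a private copy. As far as I know, these flag bits stay readable by an unprivileged process even on newer kernels that zero the PFN field for non-root readers. The helper names are hypothetical, and treating a swapped-out entry as a copy is an assumption.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Bit layout of a /proc/<pid>/pagemap entry (kernel pagemap documentation). */
#define PM_PRESENT      (1ULL << 63)  /* page is present in RAM */
#define PM_SWAPPED      (1ULL << 62)  /* page is in swap */
#define PM_FILE_SHARED  (1ULL << 61)  /* file page or shared-anon (Linux >= 3.5) */

/* Hypothetical helper: classify a raw pagemap entry.
 * Returns 1 if it looks like a private (copied) page, 0 otherwise.
 * Assumption: a swapped-out entry without the file bit is also a copy. */
static int entry_is_private_copy(uint64_t entry)
{
    if (entry & PM_FILE_SHARED)
        return 0;  /* still the shared page-cache page */
    return (entry & (PM_PRESENT | PM_SWAPPED)) != 0;
}

/* Hypothetical helper: look up the page containing addr in this process.
 * Returns 1 if copied, 0 if not copied or not mapped in, -1 on error. */
static int page_is_private_copy(const void *addr)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0)
        return -1;

    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;

    uint64_t entry = 0;
    /* One 64-bit entry per virtual page, indexed by virtual page number. */
    off_t offset = (off_t)((uintptr_t)addr / (uintptr_t)pagesize)
                   * (off_t)sizeof(entry);
    ssize_t n = pread(fd, &entry, sizeof(entry), offset);
    close(fd);

    if (n != (ssize_t)sizeof(entry))
        return -1;
    return entry_is_private_copy(entry);
}
```

Calling `page_is_private_copy(M + i*pagesize)` for each page of the test mapping would be expected to report 1 for pages 0 and 2 and 0 for page 3, without touching kpageflags at all.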