java, c, memory-leaks, native, jna

Intermittent memory corruption errors in JNA calls


We have a simple requirement: calling a couple of native functions from Java. We are using JNA to make those native calls.

Edit: We don't have any custom native code. We are calling C library (glibc) functions on Linux.

We are getting very strange memory corruption errors such as:

  • Error in `/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.191.b12-0.el7_5.x86_64/jre/bin/java': malloc(): memory corruption: 0x00007f9b7849fc40
  • Error in `/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.191.b12-0.el7_5.x86_64/jre/bin/java': corrupted size vs. prev_size: 0x00007f253c4470f0
  • SIGSEGV (0xb)

The program sometimes even hangs. These errors are intermittent.

Some standard examples/docs around using structures in JNA calls would be helpful.


This is our library wrapper that declares the native functions:

https://github.com/tmtsoftware/csw/blob/master/csw-time-client/src/main/scala/csw/time/client/internal/TimeLibrary.java

These are the native models which map to structures in C:

https://github.com/tmtsoftware/csw/tree/master/csw-time-client/src/main/scala/csw/time/client/internal/native_models

And this is how we are accessing the library functions:

val timeVal = new NTPTimeVal()
TimeLibrary.ntp_gettimex(timeVal)
println(timeVal.tai)

You can refer to TimeServiceImpl.scala for more context:

https://github.com/tmtsoftware/csw/blob/master/csw-time-client/src/main/scala/csw/time/client/internal/TimeServiceImpl.scala

Could someone tell us what exactly we are doing wrong?


Solution

  • There are some reserved fields in ntptimeval and related structures:

    struct ntptimeval
    {
      struct timeval time;  /* current time (ro) */
      long int maxerror;    /* maximum error (us) (ro) */
      long int esterror;    /* estimated error (us) (ro) */
      long int tai;     /* TAI offset (ro) */
    
      long int __glibc_reserved1;
      long int __glibc_reserved2;
      long int __glibc_reserved3;
      long int __glibc_reserved4;
    };
    

    You don't have these in your code:

    public class NTPTimeVal extends Structure {
        public TimeVal time;        /* Current time */
        public Long maxerror;       /* Maximum error */
        public Long esterror;
        public int tai;
    }
    

    If those reserved fields happen to be used in your glibc version, that can explain the heap corruption.

    I would also carefully examine the data you get back. If some fields contain strange values, that may indicate a field size or alignment problem, which can also be a sign that the structure is shorter than it needs to be.
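
To make the size mismatch concrete, here is a rough tally of both layouts in plain Java. It is only a sketch under stated assumptions: x86_64 Linux (LP64), where `long int` is 8 bytes, and a `TimeVal` mapping that covers the full 16 bytes of `struct timeval` (the question does not show `TimeVal`, so that part is a guess). The class and method names are made up for illustration.

```java
// Back-of-the-envelope size check for the two layouts quoted above.
// Assumption: x86_64 Linux (LP64), so C `long int` and both members of
// `struct timeval` are 8 bytes each. Other platforms will differ.
final class NtpTimeValSizes {
    static final int C_LONG = 8;            // sizeof(long int) on LP64
    static final int TIMEVAL = 2 * C_LONG;  // tv_sec + tv_usec (assumed mapping)

    /** Bytes that struct ntptimeval occupies on the native side. */
    static int nativeSize() {
        int named = 3 * C_LONG;     // maxerror, esterror, tai
        int reserved = 4 * C_LONG;  // __glibc_reserved1 .. __glibc_reserved4
        return TIMEVAL + named + reserved;
    }

    /** Bytes covered by the Java mapping: TimeVal, Long, Long, int. */
    static int mappedSize() {
        int unpadded = TIMEVAL + Long.BYTES + Long.BYTES + Integer.BYTES;
        int alignment = Long.BYTES; // structure is padded to its widest member
        return roundUp(unpadded, alignment);
    }

    /** Rounds size up to the next multiple of alignment. */
    static int roundUp(int size, int alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    public static void main(String[] args) {
        System.out.println("native struct: " + nativeSize() + " bytes");
        System.out.println("Java mapping:  " + mappedSize() + " bytes");
        System.out.println("possible overrun: " + (nativeSize() - mappedSize()) + " bytes");
    }
}
```

Under these assumptions the native structure is 72 bytes while the mapping covers only 40, so any write to the reserved fields would land up to 32 bytes past the buffer that JNA allocated. In JNA terms, the fix this points to is declaring the `long int` members (including `tai`) as `NativeLong`, which JNA sizes to the platform's C `long`, and adding the four reserved fields to the mapping.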