I use Cgo to call a C/C++ library from Go code, and I found crash logs like the following:
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x90 pc=0x7ff0fbdc23ff]
....
STACK ...
I can confirm that the crash originates in the C/C++ library, but it brings down my whole Go program even though I have recover code in place. (PS: it seems that a fatal error cannot be recovered.)
My scenario:
In this process, the Go program might receive a bad message (e.g. one with an invalid format). A bad message might crash the C library, and the Go program has no way of detecting this. I cannot do anything once the C library has crashed, not even skip the bad message when the Go program restarts.
Is there any way to catch the exception from C/C++ library?
Or, in general, what's the best practice for error handling in Cgo?
I want to stress what @Not_a_Golfer said: a process is sent the SIGSEGV
signal when the OS detects that it tried to access memory it was never allowed to access.
The problem is that the cause of such an error might indeed be "harmless" (see below) or it might not.
A harmless case is trying to read memory at an address which is invalid for the process. The most common scenario is dereferencing a so-called NULL pointer.
In such a scenario the process has likely not overwritten any memory, and if you're lucky, aborting the operation might allow the process to chug along¹.
It's not unicorns and rainbows, though: if the process had allocated some memory before the operation started, you'd most likely end up with a memory leak.
Severe cases result from writing into a memory region the code was never meant to write to.
The problem with them is that by the time the process hits an invalid memory region, it might already have overwritten its own live data structures.
In such cases all bets are really off.
No matter which class the problem behind an invalid memory access belongs to, please note that it indicates the program contains at least one logic error, and the code path exercising that error was executed. This means the process is now in a somewhat undefined state, because such errors easily propagate: they may cause a cascading effect in which otherwise unrelated parts of the program start to misbehave because the invariants their logic is based on were inadvertently changed.
In your case the code seems to access memory at address 0x90
which looks like classic pointer arithmetic involving a NULL pointer, for instance reading a field at offset 0x90 through a NULL struct pointer (just a guess, but still).
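To illustrate why a fault address such as 0x90 hints at a NULL pointer, here is a minimal sketch in pure Go. The `record` type and its fields are hypothetical and only chosen so that one field lands at offset 0x90; the real library's struct layout is unknown. Accessing that field through a NULL base pointer would touch address 0 + 0x90.

```go
package main

import (
	"fmt"
	"unsafe"
)

// record is a hypothetical stand-in for a struct inside the C library.
// The 0x90 bytes of header push the length field to offset 0x90.
type record struct {
	header [0x90]byte
	length uint32
}

// lengthOffset returns the byte offset of the length field within record.
// Reading this field through a NULL *record would access address 0x90.
func lengthOffset() uintptr {
	var r record
	return unsafe.Offsetof(r.length)
}

func main() {
	fmt.Printf("offset of length: %#x\n", lengthOffset())
}
```

If the library ships with debug symbols, comparing the fault address with the field offsets of its structs may narrow down which pointer was NULL.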
What I would do in this case is this:
And by all means, please try to fix the root cause, if at all possible.
¹ Correctly restoring execution after the OS has trapped access to an invalid memory region is tough business by itself; see this for instance.
Basically you have to implement a custom signal handler which sets things up in such a way that the OS resumes executing the code of your process not from the CPU instruction which actually accessed that memory block and blew up, but from a known-good location (which supposedly should be some place near the exit from the library's entry point function which executed the faulty code somewhere down its call path).
And you'd need to properly restore the stack pointer and maybe something else.
Really, it's not something you routinely do.
It may even be less resource-consuming to binary-patch the library image to prevent faulty code paths from being executed, or to divert them to fixed counterparts added to the image, much like the bugfixes done via binary patching for TTD, for example.