How to locate the origin of "Invalid floating point" exception without the debugger?

My system runs for hours without issues.
Suddenly it throws the invalid floating point exception.
The exception does not occur:
- when running under the debugger
- on every computer (only some machines are affected)

How can I determine where the exception is thrown, without the debugger?
I use Delphi 6.

Solution

Use a combination of logging and exception tracing. This means your system has to be deployed with debug information. That does not seem to be a problem for you, but it sometimes is for boxed software.

There are many tools for this, though not all of them may still be compatible with Delphi 6.

So you would have to replace the default exception handler (in TApplication, ExceptProc, or wherever; the sources of those tools show how to do it) and log floating point exceptions (you are hardly interested in ALL possible exceptions right now).
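As a rough illustration of the handler replacement described above, here is a sketch that uses only the standard `Application.OnException` hook rather than any of the tracing tools. The unit name `FpExceptionLog`, the class `TFpExceptionLogger`, the helper `AppendLine` and the `.fplog` file extension are all hypothetical. It assumes a VCL forms application, and it records only the raise address from `ExceptAddr`, not a full stack trace, which is what the dedicated tools add.

```pascal
unit FpExceptionLog;

interface

uses
  SysUtils, Forms;

type
  // Hypothetical helper; OnException needs a method of an object.
  TFpExceptionLogger = class
  public
    procedure HandleException(Sender: TObject; E: Exception);
  end;

// Call once at startup, before Application.Run.
procedure InstallFpExceptionLogger;

implementation

var
  Logger: TFpExceptionLogger = nil;

// Appends one line to a text file, creating the file if needed.
procedure AppendLine(const FileName, Line: string);
var
  F: TextFile;
begin
  try
    AssignFile(F, FileName);
    if FileExists(FileName) then
      Append(F)
    else
      Rewrite(F);
    try
      Writeln(F, Line);
    finally
      CloseFile(F);
    end;
  except
    // A failing log write must not mask the original exception.
  end;
end;

procedure TFpExceptionLogger.HandleException(Sender: TObject; E: Exception);
begin
  // EMathError is the common ancestor of EInvalidOp, EZeroDivide,
  // EOverflow and EUnderflow, so all other exceptions are not logged.
  if E is EMathError then
    AppendLine(ChangeFileExt(ParamStr(0), '.fplog'),
      Format('%s %s: %s at $%p', [FormatDateTime('yyyy-mm-dd hh:nn:ss', Now),
        E.ClassName, E.Message, ExceptAddr]));
  // Keep the default behaviour of showing the message to the user.
  Application.ShowException(E);
end;

procedure InstallFpExceptionLogger;
begin
  if Logger = nil then
    Logger := TFpExceptionLogger.Create;
  Application.OnException := Logger.HandleException;
end;

initialization

finalization
  if Logger <> nil then
  begin
    Application.OnException := nil;
    Logger.Free;
  end;
end.
```

Calling `InstallFpExceptionLogger` in the project file before `Application.Run` would be enough to activate it. The logged address can later be matched against a detailed map file. In a service or a multithreaded program the `ShowException` call and the unsynchronized file access would need to be reconsidered.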

This raises another question: a logging framework. Some of the tools above already include one; others need an extra library. You need logging now, and you will need it even more later.

Now you run your service for a while and it keeps saving all FP-related exceptions with their stack traces. If you compiled it with some optimizations turned off (for example, with stack frames always generated), the trace will probably also show some local variables and parameters along the way.
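For reference, the build settings mentioned above correspond to the following standard Delphi compiler directives (they can also be set in the project's compiler options). This is only a sketch of one plausible debug-friendly configuration. Whether locals and parameters actually appear in the trace depends on the tracing tool, which may additionally need a detailed map file or its own debug data.

```pascal
{$D+}  // generate debug information
{$L+}  // generate local symbol information
{$W+}  // always generate stack frames
{$O-}  // turn optimization off
```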

If you are lucky, that will be enough to understand how the error happens. Most probably, though, you will see the immediate error condition but not how it developed from failed initial assumptions.

In that case you at least have the stack trace (execution path) to the error (or a few paths to a few similar errors that you now think are one). At that point you shift your main effort to logging. Knowing the execution path, you can log the parameters and local variables of all the interesting functions along it and see how those variables take on abnormal values before the logged exception (and how the values stay normal when no exception happens).
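A minimal sketch of that kind of instrumentation, written as a small console program so that it is self-contained. The logger `LogFmt`, the routine `ScaleReading` and the `.trace.log` file name are hypothetical stand-ins; in a real system you would use your logging framework and instrument the routines that actually appear in the recorded stack trace.

```pascal
program FpTraceDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical minimal logger: appends one timestamped line per call.
procedure LogFmt(const Fmt: string; const Args: array of const);
var
  F: TextFile;
  FileName: string;
begin
  FileName := ChangeFileExt(ParamStr(0), '.trace.log');
  AssignFile(F, FileName);
  if FileExists(FileName) then
    Append(F)
  else
    Rewrite(F);
  try
    Writeln(F, FormatDateTime('hh:nn:ss.zzz', Now), ' ', Format(Fmt, Args));
  finally
    CloseFile(F);
  end;
end;

// Hypothetical routine standing in for one found on the stack trace.
function ScaleReading(Raw, Gain, Offset: Double): Double;
begin
  // Log the inputs on entry so abnormal values are visible beforehand.
  LogFmt('ScaleReading enter Raw=%g Gain=%g Offset=%g', [Raw, Gain, Offset]);
  try
    Result := (Raw - Offset) / Gain;
  except
    on E: EMathError do
    begin
      LogFmt('ScaleReading failed with %s: %s', [E.ClassName, E.Message]);
      raise;  // re-raise so the normal handling still takes place
    end;
  end;
  LogFmt('ScaleReading exit Result=%g', [Result]);
end;

begin
  try
    // Normal case: the log shows ordinary inputs and a result.
    Writeln(ScaleReading(10.0, 2.0, 1.0):0:3);
    // Deliberately bad inputs (0/0) to provoke a floating point exception.
    Writeln(ScaleReading(0.0, 0.0, 0.0):0:3);
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
```

Comparing the "enter" lines of calls that succeed with those that precede a failure is what reveals which inputs were already abnormal, and therefore which caller to instrument next.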

You will have to do several iterations, first expanding your search down the call stack and adding more parameters and variables to the log, and then perhaps also logging some side routines that are not directly in the call stack but were called before the exception and affected local variable values before the error.