Search code examples
mathstatic-analysisformal-verification

How can abstract intepretation in static analysis be used without formal specification


Recently I have read a lot about formal verification and I am fascinated by the topic. Yet I cannot figure out the following:
Formal verification requires a formal specification so how can be abstract interpretation used on any source code in compilers, when there is no formal specification of the program?

Translation from a text in a foreign language that seems correct (and again does not seem to require a formal spec.)

If a program is represented by its control flow chart, then each branch represents a program state (can be more than one - e.g. in a loop a branch is traversed multiple times) and abstract intepretation creates static semantics that approximates the set of this states.

Here an article about abstract interpretation as a formal verification technique: http://www.di.ens.fr/~cousot/publications.www/Cousot-NFM2012.pdf


Solution

  • One can do "concrete interpretation" by building an interpreter for a language, that computes results according to the language specification manual. (The fact that the language manuals are informal leaves the correctness of such an interpreter in permanent doubt).

    Abstract interpretation only requires that one have a good understanding of the program semantics, and an idea about what is worthwhile abstracting. You can get this by taking the interpreter above, and replacing actual computation with an abstraction of the computation, e.g., you can decide to represent all integer values as "positive", "zero", "negative", or "unknown". You can still compute with this, now producing qualitative results. More importantly, you can compute with this without necessarily having actual program inputs (perhaps just abstracted values). I note that the abstract interpreter is also completely untrustworthy in a formal sense, because you are still computing using the informal reference manual as a guide as to what the language will do.

    Now, by running such an abstract program, you may discover that it makes mistakes (e.g., dereferences a null value) with the variables controlling that null value not being "undefined". In that case, you can suggest there is a bug in the program; you may not be right, but this might produce useful results.

    Nowhere does abstract interpretation in practice tell you what the program computes formally because you are still using an informal reference manual. If the manual were to become a formal document, and your abstract interpreter is derived from by provably correct steps, then we might agree that an abstract interpretation of the program is a kind of specification modulo the abstraction for what the program actually does.

    Nowhere does abstract interpretation provide with access to a formal specification of the intention for the program. So you cannot use it by itself to prove the program "correct" with respect to the specification.