modeless 6 years ago

Here's an example of how it was used to detect undefined behavior in libwebp: https://trust-in-soft.com/out-of-bounds-pointers-a-common-pa...

And musl libc: https://trust-in-soft.com/variadic-functions-in-tis-interpre...

  • unwind 6 years ago

    Thanks, very interesting.

    The second one about musl was very long-winded; it seemed to be written for people who have never even heard of varargs, which made me wonder what the intended audience is, heh. I mean, if you know enough about undefined behavior to use a tool to track it down, wouldn't it make sense to also know about C's varargs mechanism?

    Also, it made varargs sound like some super-archaic, "best avoided" scary-monster thing, which it really isn't in my opinion. It's a well-defined and useful part of the language, and it really helps in making better APIs.

MaxBarraclough 6 years ago

I'm not convinced by their justification for doing arithmetic on uintptr_t and then casting back to a pointer type.

They say

> It should work fine for all the ordinary platforms we are used to, because of footnote 67 in the C11 standard.

which admits things aren't assured to work. Aren't we in the business of producing good quality code, here? Following their link to the footnote, it is indeed far from a rock-solid assurance of expected behaviour.

These two StackOverflow comments make a good case against it - https://stackoverflow.com/a/19476805/ , https://stackoverflow.com/a/43419534/

but our author isn't the only person who thinks it might be ok - https://wiki.sei.cmu.edu/confluence/display/c/INT36-C.+Conve...

  • tom_mellior 6 years ago

    Out of interest, where is that quoted sentence from?

    My understanding is that while this tool aims to catch undefined behavior, it also tries to keep going when it encounters implementation-defined behavior. I think you can choose a concrete target platform (CPU/OS/compiler) to model, at least you can do that in Frama-C, which tis-interpreter is based on.

    So if you specified, say, x86-64 Linux with GCC as your platform, and you happen to be mostly sure that x86-64 GCC on Linux generates code as they expect, then that's fine.

    It makes no sense to interpret C code independent of a model of a target platform. There is just too much implementation-defined stuff.

    • MaxBarraclough 6 years ago

      > Out of interest, where is that quoted sentence from?

      Oops, I should've replied to modeless's comment - it's from this blog post: https://trust-in-soft.com/out-of-bounds-pointers-a-common-pa...

      > you happen to be mostly sure that x86-64 GCC on Linux generates code as they expect, then that's fine.

      Being 'mostly sure' isn't good enough; that's the point of rigorous tools like this. If a new GCC optimisation breaks your code because you failed to follow the letter of the law and had UB in your code, that's your fault, not GCC's.

      > There is just too much implementation-defined stuff.

      Take that thinking to an extreme and you're barely able to reason rigorously about C code at all. Any number of dangerous constructs might happen to work on a given CPU+compiler. Good C code should be portable. Detecting dangerous, poor quality code is the whole point of tools like tis-interpreter.

      Things like casting int* to int and back to int* might happen to work on your CPU+compiler, but they should cause warnings.

      • tom_mellior 6 years ago

        > Take that thinking to an extreme and you're barely able to reason rigorously about C code at all.

        No reason to take things to the extreme. Is the code

            int i = 32767;
            int n = i + i;
        
        well-defined? It is implementation-defined whether it is undefined [0], since that depends on the implementation's size for int.

        You are barely able to reason rigorously about C code without considering some concrete real-world implementation's behavior. (I agree that the uintptr_t thing would merit an optional warning; maybe there is one.)

        [0] Language lawyers might disagree with this statement and just call it undefined. But if anything, that's even less useful.

        • MaxBarraclough 6 years ago

          That's a good example. I see two ways to handle that one:

          1. Don't use int, use a type that gives stronger range assurances, like int_fast32_t (though I'm not sure at a glance what the minimum legal value of INT_FAST32_MAX is)

          2. Put a compile-time assert to ensure that your range assumptions are met by the target platform's 'int'

          A pity that Ada's 'range types' approach to integer types never caught on. It solves this platform-variance question, expresses the programmer's intent better, and enables useful runtime checks. It's possible with C++ templates (libraries exist for the purpose), but no-one does it.

          As for the question of what our static/dynamic analysis tools should do: they should be paranoid, and warn that the code might invoke UB, unless specifically configured to make strong platform assumptions. By default, we should aim to write good portable C.

          • dhekir 6 years ago

            Even CompCert, the formally verified C compiler, had to make some concessions concerning the standard: it does not verify C per se, but CompCert C (http://compcert.inria.fr/man/manual005.html), which is very close to C99 but with fixed-size integers (32-bit), two's complement signed integers (so no UB in signed overflows), etc.

            Otherwise, it is practically impossible to find actually useful C code that is fully portable and compliant.

            But I agree that analysis tools should be as configurable as possible, and report UB whenever they can. However, if the default settings lead to way too many false positives, the users might end up giving up before learning how to turn some of them off. It's a delicate balance between paranoia and usability. Even -Wall -Wextra -pedantic do not activate all warnings about detectable UBs, and that's about as paranoid as you can get in GCC/Clang.

            • MaxBarraclough 6 years ago

              As tom_mellior says, CompCert is just a compiler, and it's not deviating from the C standard.

              The C standard says that sizeof(int) is implementation defined. CompCert has to choose a size, and they go with the unsurprising choice of 32-bit.

              As for no UB in signed overflows: where the C standard says 'undefined behavior', that doesn't prohibit an implementation (i.e. a compiler) from picking a well-defined behavior. It just means that portable code mustn't rely on that behavior - build such code on a different standards-compliant compiler and you might end up with demons flying out of your nose, etc.

              > it is practically impossible to find some actually useful C code that is fully portable and compliant.

              Disagree. The C/C++ code I've written has been portable and standards-compliant. The hardest part was that CMake is a trainwreck.

              > if the default settings lead to way too many false positives, the users might end up giving up before learning how to turn some of them off

              That's more an issue with legacy code than when writing new code.

            • tom_mellior 6 years ago

              CompCert doesn't verify C, it compiles it. And like any compiler, it uses the type sizes defined by the ABI it targets.

      • mcguire 6 years ago

        One of the major difficulties of doing static analysis for systems programming languages like C is that the semantics are defined by the whole system, not just the language. Writing a value to an obscure location not otherwise mentioned in your program is how you turn on that LED. No, it's not portable. Yes, it's necessary. And that's ok.

        Better tools are needed.(TM)

        Check out "Some were meant for C" for an interesting read. http://www.cl.cam.ac.uk/~srk31/#onward17

        • MaxBarraclough 6 years ago

          > No, it's not portable. Yes, it's necessary. And that's ok.

          Indeed, and it may be necessary to have an analysis tool skip over that code, but it applies to only a tiny minority of C code.

          Interesting skim-read, that paper. I agree with the author's lamentation that we so often pretend there's nothing we can do about C's safety issues. Even configuring GCC to check all dereferences for NULL strikes many people as unusually rigorous.

          Dev+debug builds in particular should have as many runtime checks as possible, but for no obvious good reason, that's just not how we do things.

krasin 6 years ago

Thank you for posting this. Could you compare tis-interpreter with UBSan? In which cases would using tis-interpreter be better?

Speaking as someone who has had rather good experience fighting this kind of bug in Chromium using UBSan.

  • dhekir 6 years ago

    One case in which I couldn't get UBSan (nor MSan) to report issues was the use of uninitialized values, which may constitute undefined behavior.

    UBSan does not deal with uninitialized reads, but Memory Sanitizer does ("MemorySanitizer is a detector of uninitialized reads"). However, unless it changed, it only detects them when they are used in conditional branches. So in the following code, for instance, I couldn't get UBSan/MemSan to tell me the printf will read some uninitialized value.

      #include <stdio.h>
      int main() {
        int uninit;
        printf("%d", uninit);
        return 0;
      }
    
    tis-interpreter (and Frama-C/EVA, which is related) will report the uninitialized read.

    Note that both tools are very different in nature and usage, since tis-interpreter is a sound static analyzer (for all entry values), while UBSan is a runtime monitoring tool (Frama-C also has E-ACSL, its own runtime monitoring tool). The former provides stronger guarantees, especially when the set of inputs can be large, while the latter will surely be more efficient when doing bug detection work, since there won't be any false positives.

    Overall, I'd say, use UBSan/MemSan/etc. for the majority of the code, and focus on sound analysis for critical components or libraries, where mistakes can be expensive.

    • MaxBarraclough 6 years ago

      > tis-interpreter is a sound static analyzer (for all entry values), while UBSan is a runtime monitoring tool

      Isn't it a dynamic analysis tool, i.e. a C interpreter that keeps an eye out for UB? From the GitHub README.md:

      > tis-interpreter works by interpreting C programs statement by statement from beginning to end, verifying at each statement whether the program can invoke undefined behavior. This makes it comparable to Valgrind and C compiler sanitizers (UBSan, ASan, …)

      • dhekir 6 years ago

        > Isn't it a dynamic analysis tool, i.e. a C interpreter that keeps an eye out for UB?

        Indeed, I'm sorry, I had tis-analyzer in mind. tis-interpreter is indeed closer to Valgrind. It interprets the C code according to its semantics, without going down to the assembly level or executing compiled code.

    • krasin 6 years ago

      This (printf of an undefined value) is a good example. Thank you!

      Most of the time, MemorySanitizer is used with an instrumented libc++ and sometimes even libc, which allows it to catch most of the cases (including printf), as the data ends up in conditional branches.

      Still, your point is very valid.

nickpsecurity 6 years ago

Check out KCC, too, since the foundation is open-source, offers extra benefits for languages modeled in it, and has a commercial tool for C with few or no false positives, depending on how well it lives up to its claims. No affiliation. I just find the K framework plus an executable, formal semantics for C cool as heck. :)

https://github.com/kframework/c-semantics

https://runtimeverification.com/match/

They also have an article comparing it to Tis-interpreter here:

https://runtimeverification.com/blog/?p=307

Their K has been used for languages from Scheme to Prolog to C. More recently, they also did a semantics for a VM for smart contracts. They're doing a lot of interesting stuff. Builds on Maude and rewriting logic if anyone is curious.

  • dhekir 6 years ago

    tis-interpreter also has an open-source foundation, which is Frama-C (http://www.frama-c.com). Both share some common code.

    Their purpose is not exactly the same as RV-Match's; they are more concerned with soundness and absence of run-time errors (as well as other properties and analyses), while RV-Match is a bug detection tool, focusing on not having false positives. It's important to realize the difference when comparing such kinds of tools.

    Depending on the situation, one is more appropriate than the other, but in some cases the results may be complementary (as in RV-Match's comparison, where the reported bugs were different for each of them), so if there's already some effort to try one of them (plus the *San's from LLVM), it may be worth trying the others as well. Every bit of help is welcome when dealing with a language such as C.

    Disclaimer: I work with Frama-C.