Detecting untrusted data with Static Analysis

In this next instalment of my blog series, I’ve decided to look at a class of flaws that has attracted plenty of media headlines: those where unexpected data from the outside world triggers undesirable, perhaps even dangerous (and almost certainly costly) effects because sanitisation is missing or ineffective.

MITRE’s Common Weakness Enumeration documents many forms of this vulnerability, one large subset being those whose titles mention “Improper Neutralization” or “Injection”. Correspondingly, static analysis tools such as CodeSonar provide a range of checks for detecting many of them in C/C++, Java and C#, though we focus only on C/C++ examples here.
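To make the pattern concrete, here is a minimal sketch in C of a value read with getchar() being used to index a buffer called buf. The buffer size, BUF_SIZE and both function names are assumptions made for illustration, and the vetted counterpart is an addition to show what sufficient checking could look like.

```c
#include <stdio.h>

#define BUF_SIZE 16

static char buf[BUF_SIZE];

/* Flawed pattern: the value from getchar() (the source) reaches the
 * index operation (the sink) with no vetting. getchar() can return EOF
 * (negative) or any value up to UCHAR_MAX, both outside buf. */
char read_slot_unchecked(void)
{
    int c = getchar();
    return buf[c];
}

/* Vetted counterpart: reject any index outside [0, BUF_SIZE).
 * Returns 0 and stores the byte in *out, or -1 if the index is rejected. */
int read_slot_checked(int index, char *out)
{
    if (index < 0 || index >= BUF_SIZE) {
        return -1;
    }
    *out = buf[index];
    return 0;
}
```

The checked function takes the index as a parameter so the range test can be exercised without feeding anything through standard input.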

Irrespective of the source language, the above forms a fairly standard template for how this sort of problem occurs. In essence, data is injected into the program, as per the getchar() call here (technically called a source), and between that point and the point where the data is used to index ‘buf’ (called the sink) there is insufficient vetting of the data, which in this case leads to a possible tainted buffer overrun. I distinguish between a ‘tainted’ buffer overrun and a normal buffer overrun because, in this instance, the offending index operation is based on a variable that has been tainted by contact with the outside world. By the way, CodeSonar is clever enough to detect tainted data as it passes transitively through the program, as per this variation of the above:
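A minimal sketch of such a variation, assuming the tainted character is first pushed through some arithmetic in a helper before it is used as an index. The names table, TABLE_SIZE, scale_offset and the two lookup functions are hypothetical; the point is only that the derived value is still attacker-controlled when it reaches the sink.

```c
#include <stdio.h>

#define TABLE_SIZE 32

static int table[TABLE_SIZE];

/* Taint survives arithmetic and assignment: whoever controls raw
 * also controls the value returned here. */
int scale_offset(int raw)
{
    int scaled = raw * 2;
    int offset = scaled + 1;
    return offset;
}

/* Flawed pattern: the index is derived from getchar() via a helper,
 * so the source and the sink are no longer adjacent in the code. */
int lookup_unchecked(void)
{
    int c = getchar();             /* source */
    int offset = scale_offset(c);  /* still tainted */
    return table[offset];          /* sink */
}

/* Vetted counterpart: check the derived value, not the raw one.
 * Returns 0 and stores the entry in *out, or -1 if out of range. */
int lookup_checked(int raw, int *out)
{
    int offset = scale_offset(raw);
    if (offset < 0 || offset >= TABLE_SIZE) {
        return -1;
    }
    *out = table[offset];
    return 0;
}
```

Note that the range check is applied after the arithmetic; checking only the raw input would leave the derived index unvetted.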

As with all my examples, these are kept simple, not because they are at the limits of CodeSonar’s understanding but so that the intent can be easily appreciated.

CodeSonar can detect numerous variations, of course. Here’s an SQL-based example:
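A minimal sketch of what such an SQL injection flow can look like in C, assuming a line read from standard input is pasted into a query string. The table and column names are made up, and db_execute() is a hypothetical stand-in for a real database call (sqlite3_exec() would be one example); it only prints the statement it is given.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a real database call; it only prints
 * the statement it would run. */
int db_execute(const char *sql)
{
    return printf("executing: %s\n", sql) < 0 ? -1 : 0;
}

/* Flawed pattern: untrusted text is pasted straight into the SQL
 * statement. Returns 0 on success, -1 if the statement did not fit. */
int build_user_query(char *out, size_t out_size, const char *name)
{
    int n = snprintf(out, out_size,
                     "SELECT * FROM users WHERE name = '%s'", name);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Source to sink: a line from stdin becomes part of the executed SQL. */
int lookup_user_from_stdin(void)
{
    char name[64];
    char sql[160];

    if (fgets(name, sizeof name, stdin) == NULL) {  /* source */
        return -1;
    }
    name[strcspn(name, "\r\n")] = '\0';             /* drop the line ending */
    if (build_user_query(sql, sizeof sql, name) != 0) {
        return -1;
    }
    return db_execute(sql);                         /* sink */
}
```

Feeding in a name such as x' OR '1'='1 changes the structure of the statement rather than just its data. With a real database library, bound parameters (for instance sqlite3_bind_text() in SQLite) are the usual way to keep the input as data only.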

In this next example, there are two ways the data could cause trouble. Firstly, it could be catastrophic if the program had sufficient privileges and the data received was something like “;rm -rf /”. Secondly, there is the potential for a tainted buffer overrun: running off the end of the buffer and possibly trying to “system” whatever, accidentally or otherwise, lies in memory.
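A minimal sketch of code with both problems, assuming the data is read from standard input into a fixed-size buffer and handed to system(). CMD_SIZE and the function names are hypothetical, and the allow-list check is an addition showing one conservative way such input might be vetted.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

#define CMD_SIZE 64

/* Flawed pattern, two problems in one:
 * 1. whatever arrives is run by the shell (command injection);
 * 2. fread() does not add a terminating '\0', so system() may read
 *    past the end of cmd (tainted buffer overrun). */
int run_tainted_command(void)
{
    char cmd[CMD_SIZE];
    size_t n = fread(cmd, 1, sizeof cmd, stdin);  /* source */

    if (n == 0) {
        return -1;
    }
    return system(cmd);                           /* sink */
}

/* Conservative allow-list for a single argument: non-empty, no leading
 * '-', and only letters, digits, '.', '_' and '-'. Returns 1 if allowed. */
int is_allowed_argument(const char *s)
{
    if (s == NULL || *s == '\0' || *s == '-') {
        return 0;
    }
    for (; *s != '\0'; ++s) {
        unsigned char ch = (unsigned char)*s;
        if (!isalnum(ch) && ch != '.' && ch != '_' && ch != '-') {
            return 0;
        }
    }
    return 1;
}
```

An allow-list like this rejects shell metacharacters outright instead of trying to escape them; whether it is strict enough depends on the command the argument is destined for.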

There are several basic forms by which a program interfaces with the outside world, network and environment being two more (source) examples. For all source types, CodeSonar understands a large set of corresponding API calls; for network taint, for example, these include read() and recv() calls on sockets. Similarly, for sink APIs, CodeSonar is sensitive to calls such as the exec() or CreateProcess() families. However, the out-of-the-box source and sink APIs cannot be exhaustive so, rather handily, CodeSonar provides mechanisms for additional APIs of both types to be added.
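As an illustration of an environment source, here is a minimal sketch in which the value returned by getenv() is copied into a fixed-size buffer. The helper names are hypothetical and this is not CodeSonar configuration; it only shows the kind of source-to-sink flow a taint analysis is meant to track. The flawed form would be a bare strcpy(out, getenv(name)), which trusts both the presence and the length of the variable.

```c
#include <stdlib.h>
#include <string.h>

/* Copy src into out only if it fits, terminator included.
 * Returns 0 on success, -1 if src is missing or too long. */
int bounded_copy(const char *src, char *out, size_t out_size)
{
    size_t len;

    if (src == NULL || out == NULL || out_size == 0) {
        return -1;
    }
    len = strlen(src);
    if (len >= out_size) {
        return -1;
    }
    memcpy(out, src, len + 1);
    return 0;
}

/* getenv() is the source: the variable's value is set by whoever
 * launches the program, so its length cannot be assumed. */
int copy_env_value(const char *name, char *out, size_t out_size)
{
    return bounded_copy(getenv(name), out, out_size);
}
```

The same shape applies to network sources: the length check sits between the call that brings data in and the copy that consumes it.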

Much as CWE formally documents programming flaws generally, there are several industry standards available that either encompass or specifically target security flaws. Examples of the former would be MISRA C:2012 or MISRA C++:2008, whilst ISO/IEC TS 17961 and the DISA Application Security and Development STIG rules are examples of the latter. In each of these cases, and others, CodeSonar is capable of performing an analysis targeting just those rulesets (the rules that are statically detectable, of course) and producing a report highlighting how clean your codebase happens to be.
