
Program Transformations to Fix C Integers


Zack Coker and Munawar Hafiz


* Our paper got accepted to appear at ICSE 2013. A link to a copy of the paper will be available later.

* Zack Coker won first prize in the ACM Student Research Competition at SPLASH 2012 in the undergraduate category.

* You can try the program transformations on the OpenRefactory/C project's web demo page. The Eclipse Plugin will be released soon.


C makes it easy to misuse integer types; even mature programs harbor a lot of badly written integer code. Traditional approaches at best detect these problems; they cannot guide developers to write correct code.

We describe three program transformations that fix integer problems: one explicitly introduces casts to disambiguate type mismatches, another adds runtime checks to arithmetic operations, and the third changes the type of a wrongly declared integer.

1. Add Integer Cast (AIC)

You have a program in which integer operations have operands with different types; the end result may contain an unexpected value.

Add explicit casts for all type mismatches so that they are visible and are properly handled.
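
As a rough illustration of the idea (hand-written, not output of the AIC tool), consider the expression `a < b` where `a` is an `int` and `b` is an `unsigned int`. C silently converts `a` to `unsigned int`, so a negative `a` compares as a very large value. The function names below are hypothetical, and the sketch assumes `long long` is wider than `unsigned int`, which holds on common platforms.

```c
/* Original expression, shown only as a comment:  a < b
 * with 'int a' and 'unsigned int b'. C converts 'a' to unsigned int
 * before comparing, so a = -1 behaves like UINT_MAX. */

/* Cast that only makes the implicit conversion visible.
 * Behavior is identical to the original expression. */
int less_visible(int a, unsigned int b)
{
    return (unsigned int)a < b;
}

/* Casts that handle the mismatch: both operands are widened to a
 * signed type that can hold every int and every unsigned int value
 * (assumption: long long is wider than unsigned int), so the
 * comparison is done on the mathematical values. */
int less_fixed(int a, unsigned int b)
{
    return (long long)a < (long long)b;
}
```

With `a = -1` and `b = 1`, `less_visible` returns 0 while `less_fixed` returns 1, which is the result most readers of `a < b` would expect.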

   AIC Implementation in Eclipse

AIC Video (13.0 MB)

2. Replace Arithmetic Operator (RAO)

You have a C program that has a potential integer overflow (or underflow) problem originating from an arithmetic operation.

Replace arithmetic operations with a safe function call that detects an overflow (or underflow) and explicitly handles it.
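
A minimal sketch of what such a replacement might look like for signed addition. The helper `checked_add_int` and the caller `grow_length` are hypothetical names chosen for this illustration; they are not the safe-function API used by the tool, and the error policy (returning `false`) is an assumption.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical safe addition. Stores a + b in *result and returns true
 * when the sum fits in an int; returns false and leaves *result
 * untouched when the addition would overflow or underflow. */
bool checked_add_int(int a, int b, int *result)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *result = a + b;
    return true;
}

/* Before (shown only as a comment):  *length = *length + extra;
 * After: the raw '+' is replaced by a call to the safe function, and
 * the failure case is handled explicitly. */
bool grow_length(int *length, int extra)
{
    int sum;
    if (!checked_add_int(*length, extra, &sum)) {
        return false;   /* assumed policy: report failure to the caller */
    }
    *length = sum;
    return true;
}
```

The range check is done before the addition, because signed overflow in C is undefined behavior and cannot be reliably detected after the fact.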

   RAO Implementation in Eclipse

RAO Video (11.1 MB)

3. Change Integer Type (CIT)

You have a program that has signedness and widthness problems from using variable types in incorrect contexts. These errors derive from incorrectly declared variables.

Change the declared type of variables so that the uses of the variable are not conflicting with the declaration.
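
A small hand-written illustration of the kind of change involved (not output of the CIT tool). The function `count_spaces` is a hypothetical example: a length obtained from `strlen` was originally stored in an `int`, which conflicts with the `size_t` that `strlen` returns; changing the declarations removes the signedness and width mismatch.

```c
#include <stddef.h>
#include <string.h>

/* Before (shown only as comments):
 *     int n = strlen(s);              // size_t narrowed into int
 *     int count = 0;
 *     for (int i = 0; i < n; i++) ...
 *
 * After: the declared types match how the variables are used. */
size_t count_spaces(const char *s)
{
    size_t n = strlen(s);     /* was: int n */
    size_t count = 0;         /* was: int count */
    for (size_t i = 0; i < n; i++) {   /* was: int i */
        if (s[i] == ' ') {
            count++;
        }
    }
    return count;
}
```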

   CIT Implementation in Eclipse

CIT Video (10.2 MB)


Are the transformations effective in securing systems?

Fixing Problems in SAMATE Benchmark Programs

We demonstrated that the three program transformations are sufficient to fix all possible C integer problems by automatically applying them to remove integer problems from all 7,147 benchmark programs of NIST's SAMATE reference dataset.

SAMATE is a joint project of the National Institute of Standards and Technology (NIST) and the Department of Homeland Security (DHS). It provides a suite of benchmark programs in C, C++, and Java that demonstrate common security problems.

SAMATE is the most comprehensive benchmark available for integer vulnerabilities in C and C++. The adjacent table lists 7 CWEs that describe integer vulnerabilities in the benchmark programs. In total, there were 7,147 C programs with 967 KLOC.

The program transformations preprocess each program before running on it; in total, they ran on more than 15 million lines of preprocessed code.

The SAMATE programs have a function showing normal behavior and another function showing problem behavior. In all cases, our program transformations preserved normal behavior, and modified behavior resulting from integer problems.

Does a transformation-based technique work? Does it break the original programs?
Test Programs

Our program transformations modify program behavior to fix a problem, but should not break normal behavior. We automatically applied the transformations to all appropriate targets in 5 open source programs. Each transformation was applied to more than 700,000 lines of preprocessed code containing 4,493 functions in 222 files.

We chose these 5 programs because each had a recently reported integer overflow vulnerability. Our program transformations fixed the vulnerabilities. At the same time, we automatically fixed hundreds of integer problems in these programs without breaking program behavior.

AIC Numbers

AIC was applied to all local variables, parameters, array access expressions, and structure element access expressions: 1,847 in total in libpng.

Of these, 1,262 were considered unsafe: 490 local variables, 162 parameters, 79 array access expressions, and 531 structure element access expressions.

When the transformation was applied, 6,978 references to the 1,262 variables were analyzed. A total of 358 tokens had changes; 751 references were modified by adding, removing, or updating a cast.

The raw data from our AIC results are available here: AIC Data (470.4 KB)

RAO Results

RAO was applied to all arithmetic expressions: 11,849 in total in libpng. These included binary expressions (+, -, etc.), prefix expressions (++, --), postfix expressions (++, --), and arithmetic assignment expressions (+=, -=, etc.). Of all arithmetic expressions in libpng, 12.25% (1,452 in total) were considered unsafe.

In total, 1,452 arithmetic expressions were replaced with calls to safe functions. We used the IntegerLib library from CERT.

The raw data from our RAO results are available here: RAO Data (1.3 MB)

CIT Results

CIT was applied to all local variables. For libpng, CIT was applied to 531 local integer variables; 453 passed the preconditions. A total of 2,404 references to these unsafe tokens were checked. In the end, CIT modified the declaration of 152 local variables in libpng, i.e., 28.63% (152/531) of the variables that were checked.

The raw data from our CIT results are available here: CIT Data (578.2 KB)

People (Past and Present)
Zack Coker (undergrad, Auburn University), Munawar Hafiz (Assistant Professor, Auburn University).


Program Transformations to Fix C Integers
          Zack Coker and Munawar Hafiz
          To Appear at International Conference on Software Engineering, ICSE 2013

          May 2013

          Security-oriented Program Transformations to Cure Integer Overflow Vulnerabilities
          Zack Coker
          ACM Student Research Competition
          In Companion of the 27th Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2012
          Oct 2012


Last modified: Dec 21, 2012

Conceived and Maintained by: Munawar Hafiz