Project CSC - 714

Checker Mode Execution to Support Timing Analysis in High-End Embedded Microprocessors


1 Problem Description

A very important requirement for real time embedded systems is to be able to predict the timing behavior of software components accurately, more precisely the WCET. Currently, various testing

methods are employed in order to determine the WCET. However, testing alone can not provide a safe upper bound on WCET. Moreover, testing for all the possible combinations of inputs in generally infeasible.

Alternatives are:

1. Static Timing Analysis: The main idea is that it simulates execution along the control flow paths within the program regardless of the program input but still considering some architectural details like pipelining and caching. It is safe and efficient and yields verifiable bounds on the WCET [7].

However, it cannot keep pace with architectural innovations such as out-of-order execution, speculation and dynamic branch prediction.

2. Hardware Execution for WCET Analysis: New approach to overcome the gap between the capabilities of static timing analysis and the advances in hardware. Discussed below in detail.

2 Approach

Instead of simulating and finding the execution time, we propose to actually execute it in hardware and assess the WCET. For this we use a tool set called Simple Scalar that is used to build modeling applications for program performance analysis, detailed micro architectural modeling and hardware- software co-verification. Using the Simple Scalar tools, we can build modeling applications that simulate real programs running on a range of modern processors and systems. Infact, we can also specify our own processor architecture and work on it.

This is exactly what we are going to do. The main aim of this project is to design an embedded CPU (microprocessor) that is capable of executing a software in two different modes. One as we all know is the Deployment mode, in which a software is executed along a path depending on the input data. But, as we discussed earlier, getting the WCET is tedious by this method as only one path is followed on a branch.

If we could somehow go to both the paths on a branch and find which one has a larger execution time, we can then get a better WCET. We will call this mode the Checker Mode.

3 Checker Mode

This new checker mode is not dependant on the input but some of the branch statements may depend on the value of the input. We tackle this uncertainty by assigning the input-dependant register values as unknown, are internally represented as NaN (not-a-number) values.

While timing, a branch condition based on a known value is evaluated as always and input-invariant branches will result in timing of only the taken execution path (like a for loop, for e.g.). But a branch condition based on an unknown value indicates that we need to consider alternate paths and then conclude upon which one is the worst-case.

One very important consideration during branching is maintaining the processor contexts. Thus, our hardware should be able to support access to the unit level context of hardware resources and it should be able to guarantee safe restoration of saved contexts.

These were some of the important features of the work that I will implement during my project. I have divided my work and wish to achieve the following milestones.

4 Milestones

One of the major components is to learn the Simple Scalar and to be able to make changes to the design of the processor. For that I need the modified version of simple scalar (modified by Dr. Rotenberg).

1. Study Simple Scalar and make modifications – 10/27/2005

2. Implement Safe Restoration of  processor contexts and see if the execution goes a different direction (for a single branch). – 11/4/2005

3. Extend for Multiple branches – 11/11/2005

4. Be able to measure the cycle count between any two points of code – 11/18/2005

5. Integrate the timing analysis and the branching – 11/25/2005

One important thing, to be kept in mind, are the representation of unknown values in registers.

5 Related Work

Methods to obtain upper bounds on execution time range from dynamic (but unsafe) observation to static (safe but not always tight) [8]. Recently, hybrid methods have been proposed [3, 4] as well as hardware-related methods [5, 1]. Yet none of these approaches capture advanced hardware features transparently while providing tight bounds. This project will try to fill up this gap.

6 Progress Report - 1

Till now i have achieved something in between Milestone 1 and 2. I have done some background reading and studied Simplescalar. Also done some experiments. The details can be viewed here.

7 Progress Report - 2

Finished most of Milestone 2. I did very detailed study of Simplescalar and made certain changes to the code in order to help me debug and analyse its working. Also done some experiments. The details can be viewed here.

8 Final Project Report

The final Project Report can be viewed here.

8 References

[1] A. Anantaraman, K. Seth, K. Patil, E. Rotenberg, and F. Mueller. Enforcing safety of real-time schedules on contemporary processors using a virtual simple architecture (visa). In IEEE Real-Time Systems Symposium, pages 114–125, Dec. 2004.

[2] R. Arnold, F. Mueller, D. B. Whalley, and M. Harmon. Bounding worst-case instruction cache performance. In IEEE Real-Time Systems Symposium, pages 172–181, Dec. 1994.

[3] G. Bernat, A. Colin, and S. Petters. Wcet analysis of probabilistic hard real-time systems. In IEEE Real-Time Systems Symposium, Dec. 2002.

[4] A. Hamann, M. Jersak, K. Richter, and R. Ernst. Design space exploration and system optimization with symta/s - symbolic timing analysis for systems. In IEEE Real-Time Systems Symposium, Dec. 2004.

[5] T. Lundqvist and P. Stenstr¨om. An integrated path and timing analysis method based on cycle-level symbolic execution. Real-Time Systems, 17(2/3):183–208, Nov. 1999.

[6] F. Mueller. Timing analysis for instruction caches. Real-Time Systems, 18(2/3):209–239, May 2000.

[7] J. Regehr, A. Reid, and K. Webb. Eliminating stack overflow by abstract interpretation. In Proc. of the 3rd Intl. Conf. on Embedded Software (EMSOFT), pages 306–322, Oct. 2003.

[8] J. Wegener and F. Mueller. A comparison of static analysis and evolutionary testing for the verification of timing constraints. Real-Time Systems, 21(3):241–268, Nov. 2001.

[9] http://www.simplescalar.com/

[10] http://moss.csc.ncsu.edu/~mueller/secret/muri05.pdf  


Neha Kumar

Department of Computer Engineering

North Carolina State University

Raleigh - NC - 27606