229 Lab 6: Dead Code Elimination

CMPUT 229 - Computer Organization and Architecture I

Lab 6: Dead Code Elimination

Creator: Ayokunle Amodu

Introduction

This is the second of two labs focused on the fundamentals of flow analysis used in compiler design. The solution for this lab implements Dead Code Elimination (DCE) on a RISC-V assembly function. DCE is a foundational optimization technique that removes code which is either unreachable or redundant, and builds directly on the Control-Flow Graph (CFG) constructed in Lab 5.

What is Dead Code?

Code is considered Dead Code if it satisfies either of the following conditions:

There is no possible path of execution that reaches it (it is unreachable)
It is executed, but its result is never used and it has no effect on the program's observable behavior (it is redundant)

Note: Dead code can appear not only at the instruction level but also at the block level. For example, in a control flow graph, a block that has no incoming edges from any other block is unreachable and thus can be removed without affecting the program's behavior.

RISC-V Function with Dead Code — The figure above shows a function written in RISC-V assembly that contains several instances of dead code.

Let us examine two instances of redudant dead code in our figure:

The instruction li t2, 20 is dead code because, the result (t2) is only used for other dead instructions like addi t3, t2, 5.
Within the loop, the instruction slli t6, t4, 1 is also dead code, as the value computed in t6 is never referenced later.

Eliminating such instructions helps reduce program size and improve performance, especially in performance-critical or resource-constrained environments.

How to perform DCE?

To successfully perform dead code elimination, we must first understand the concept of Liveness Analysis. A variable is said to be live at a specific point in the program if the value it holds at that point may be used in the future i.e. it is not yet overwritten or discarded.

Thus to perform DCE, we need to first perform liveness analysis. This is a two step process:

Block-Level Liveness Analysis
Instruction-Level Liveness Analysis

Block-Level Liveness Analysis

Block-level liveness analysis uses a fixed-point algorithm to figure out which variables are still needed at the end of each basic block in a program's control flow.

The algorithm iteratively computes the sets of variables that are live at the entry and exit of each basic block in the CFG. It continues until a fixed point is reached, where further iterations do not change the sets of live variables.

Key Sets Computed for Each Block

During this analysis, the following sets are calculated for each basic block in the CFG:

gen: Set of variables that are generated or defined within a block before any of them are killed by another instruction in the same block. These are the variables that might be needed by subsequent blocks, making them candidates for propagation to successor blocks.
kill: Set of variables that are killed or overwritten in a block. Once a variable is killed, its previous value is no longer needed, meaning it won't be considered live after the block's execution unless it's redefined.
live in: Set of variables that are live at the entry point of a block. A variable is live-in if it is used in the block or in any of the block's successors and is not redefined before its use.
live out: Set of variables that are live at the exit of a block. A variable is live-out if it is used in any successor blocks. The live-out set is critical for understanding what information must be carried over to subsequent blocks for correct program execution.

Fixed-Point Algorithm Overview

The core idea of the algorithm is to iteratively refine the live in and live out sets until they stabilize. Initially, all sets are empty. Then:

Start by adding the exit block to the worklist.
For each block, calculate its new live out set by combining the live in sets of its successors.
Compute the new live in set using the formula:
live_in = gen ∪ (live_out - kill)
If either set changes, update them and re-add the block's predecessors to the worklist.

This continues until no further changes occur in any set, signifying that a fixed point has been reached.

The final live in and live out sets represent precise liveness information and are essential for us to perform DCE.

Understanding and correctly implementing this algorithm is essential to producing optimized code. This ensures that only necessary variables are preserved across blocks, making the compiled program both smaller and faster.

Instruction-Level Liveness Analysis

After reaching fixed-point convergence at the block level, we can perform a finer, instruction-level liveness analysis to catch more precise instances of dead code. Block-level analysis may miss definitions that are never used before being redefined within the same block, these are effectively dead but still appear live at the block level.

Instruction-level analysis resolves this by examining each instruction in reverse (from last to first) within a block. This reverse pass catches redefinitions that hide unused values, allowing us to more accurately identify dead code.

Key Sets at the Instruction Level

gen: Registers used by the instruction (e.g., source register for a load word instruction).
kill: Registers defined by the instruction (typically rd register for an instruction).
live in:Registers that hold values that are still needed by subsequent instructions.
Computed as gen ∪ (live_out - kill);
live out: Registers that are live after the instruction executes, meaning these values are still required later in the program.

Dead Code Detection Algorithm

The algorithm initializes current_live_out with the block's live_out set. It then walks through each instruction in reverse. For each instruction:

If it's already marked dead, skip it.
Compute live_in = gen ∪ (live_out - kill).
If it defines a register that isn't live out (and isn't zero), mark it as dead.
Update current_live_out to reflect the new liveness state.

This process continues for every instruction, ensuring we only keep instructions that contribute to the program's final state.

Note: This analysis doesn't handle unreachable code or branches that always behave the same way. It assumes all program paths are possible, so some dead instructions may still escape detection.

Assignment

The goal of this lab is to perform dead code elimination (DCE) based on liveness information for a function written in RISC-V assembly. The task involves implementing liveness analysis to accurately identify and eliminate dead code in the function, ensuring that only necessary instructions are retained.

Register Bit Vectors

In RISC-V, registers from x0 to x31 can be represented as bit vectors. In this representation, each bit corresponds to a specific register, where the bit position matches the register number. For instance, if the register x1 is part of a particular set, the bit in the first position of the bit vector is set to 1, while the remaining bits are set to 0.

Data Structure Descriptions

Pointers to CFG Data Structures

For the liveness analysis explained in previous sections, we rely on CFG information, which you should have experience with from the previous lab. Specifically, you'll work with custom data structures such as leadersArray, edgesList, blocksArray, blockEndsArray, firstSuccessorsArray, secondSuccessorsArray, and predecessorsArray.

Pointers to these data structures are not passed directly to the primary function in deadCodeElimination. Instead, they are accessed through global pointers that are defined for you at the top of your solution file. The common.s file initializes these pointers with the addresses of their respective arrays.

For the successful completion of this lab, you do not require a working solution to the previous lab. you will instead be provided with a binary file that contains the data necessary to initialize the data structures mentioned above. This file will be passed as an argument to the RARS simulator when you run your solution.

DCE Data Structures

Pay close attention to the custom data structures outlined below. Properly constructing them is crucial for the lab. Memory for these structures has been pre-allocated in the common.s file, and pointers to them are passed to the deadCodeElimination function in your solution file deadCodeElimination.s. Your task is to populate these structures with the necessary liveness and dead code information.

instructionsArray

  Description:
    An array of words representing the intput function's instructions.
    Ends with a sentinel value of -1 (0xFFFFFFFF).

In this lab, input functions are represented as an array of RISC-V instructions. Every RISC-V instruction can be represented as a word in memory (e.g., addi t1, t1, 1 = 0x00130313). Thus, a function can be represented as an array of words.

The diagram below illustrates the instructionsArray for the _basicblocks function:

genArray

    Description:
      An array of words representing the gen sets for each basic block in the input function's CFG. 
      Each word is a bit vector where each bit corresponds to a register that is generated (defined) in that block. 
      For example, if a register is defined in the first basic block, the corresponding bit in the first word will be set.    
      This array helps in determining which values are necessary for subsequent blocks in the flow of the program.
      Ends with a sentinel value of -1 (0xFFFFFFFF).

killArray

    Description:
      An array of words representing the kill sets for each basic block in the input function's CFG. 
      Each word is a bit vector where each bit corresponds to a register that is killed (overwritten or redefined) in that block. 
      For example, if a register is overwritten in the first basic block, the corresponding bit in the first word will be set.    
      This array is essential for understanding which values are no longer needed and can be safely ignored in subsequent blocks.
      Ends with a sentinel value of -1 (0xFFFFFFFF).

liveInArray

    Description:
      An array of words representing the live-in sets for each basic block in the input function's CFG. 
      Each word is a bit vector where each bit corresponds to a register that is in the live-in of a given block. 
      Ends with a sentinel value of -1 (0xFFFFFFFF).

liveOutArray

    Description:
      An array of words representing the live-out sets for each basic block in the input function's CFG. 
      Each word is a bit vector where each bit corresponds to a register that is in the live-out of a given block. 
      Ends with a sentinel value of -1 (0xFFFFFFFF).

deadCodeArray

    Description:
      An array of bytes representing the dead code status of each instruction in the input function. 
      Each byte corresponds directly to an instruction.
      For example, the first byte corresponds to the first instruction, the second byte to the second instruction, and so on.    
      If the instruction is identified as dead code, its corresponding byte will be 1; if it is not dead code, the byte will be 0.
      Ends with a sentinel value of -1 (0xFF).

refinedInstructionsArray

    Description:
      An array of words representing the refined instructions of the input function after dead code elimination. 
      Each word corresponds to an instruction in the refined function. 
      Ends with a sentinel value of -1 (0xFFFFFFFF).

As a reminder, the blocksArray index is used to access related data in other arrays. For more details on how blocksArray is utilized, please refer to the previous lab.

Worlist Data Structures

In the context of block-level liveness analysis, the worklist is a crucial data structure used to manage the blocks that need to be processed. The worklist helps ensure that each basic block in the control flow graph (CFG) is revisited until no further changes are detected in its liveness information. Below, you can explore the specific data structures that make up the worklist and their roles in this process.

workList

    Description:
      The workList is a circular queue used to manage a list of items to be processed. It operates with a fixed size and maintains two indices:
      - workListHeadIndex: Points to the position in the list where the next item will be removed.
      - workListTailIndex: Points to the position in the list where the next item will be added.
      
      The workList uses an array where:
      - An entry of -1 (0xFF) indicates an empty slot.
      - When the list is full, adding a new item will result in an error if there is no space.
      - When the list is empty, removing an item will return an error if there are no items to remove.
      
      The workList is implemented as a circular queue, meaning that when the tail index reaches the end of the array, it wraps around to the beginning, and similarly for the head index.

inWorkListArray

      Description:
        An array of bytes corresponding to each block in the input function's CFG. 
        Each byte represents whether the block is currently in the worklist. 
        If the block is in the worklist, the byte will be set to 1; otherwise, it will be set to 0. 
        Ends with a sentinel value of -1 (0xFF).

Writing a Solution

The directive .include "common.s" in the provided solution file (deadCodeElimination.s) causes RARS to execute the code in the file common.s before the solution file. The code in common.s reads a binary file provided in the program arguments, initializes the arguments, calls getControlFlowGraph, parses the custom data structures populated by your solution, and prints the CFG information extracted from the parsed data.

Function Signatures

Write all the following RISC-V functions in the file deadCodeElimination.s:

deadCodeElimination

    Description:
        Performs an iterative analysis to remove dead code from the input function.
     
    Arguments:
        a0: Pointer to the instructionsArray.
        a1: Pointer to the genArray.
        a2: Pointer to the killArray.
        a3: Pointer to the liveInArray.
        a4: Pointer to the liveOutArray.
        a5: Input function's live-out set.
        a6: Pointer to the deadCodeArray.
        a7: Pointer to the refinedInstructionsArray.
  
    Returns:
        None.

getGenKillSets

    Description:
        Computes the gen and kill sets for each block in the control flow graph.
     
    Arguments:
        a0: Pointer to the instructionsArray.
        a1: Pointer to the deadCodeArray.
        a2: Pointer to the genArray.
        a3: Pointer to the killArray.
  
    Returns:
        None.

getLiveSets

    Description:
        Computes the live-in and live-out sets for each block in the 
        input function's control-flow graph.
     
    Arguments:
        a0: Pointer to the genArray.
        a1: Pointer to the killArray.
        a2: Pointer to the liveInArray.
        a3: Pointer to the liveOutArray.
        a4: Function's live-out set.
  
    Returns:
        None.

markDeadCode

    Description:
        Marks instructions as dead within each block of the input 
        function's control-flow graph.
     
    Arguments:
        a0: Pointer to the instructionsArray.
        a1: Pointer to the liveOutArray.
        a2: Pointer to the deadCodeArray.
  
    Returns:
        a0: Dead code status (1 if dead code found, 0 otherwise).

fixTargets

    Description:
        Adjusts branch and jump instruction targets based on the presence of 
        dead code.
     
    Arguments:
        a0: Pointer to the instructionsArray.
        a1: Pointer to the deadCodeArray.
  
    Returns:
        None.

removeDeadCode

      Description:
          Removes dead code from the input function.
       
      Arguments:
          a0: Pointer to the instructionsArray.
          a1: Pointer to the deadCodeArray.
          a2: Pointer to the refinedInstructionsArray.
    
      Returns:
          None.

decodeInstruction

      Description:
          Decodes a given instruction to determine its defined and used registers.
       
      Arguments:
          a0: An instruction word.
    
      Returns:
          a0: Bit vector of the defined register.
          a1: Bit vector of the used registers.

Notes:

For this lab, only the following opcodes are relevant:

I-type (load): 0000011
I-type (arithmetic with immediate): 0010011
U-type (load upper immediate): 0110111
S-type (store): 0100011
SB-type (branch): 1100011
R-type (arithmetic): 0110011

Jump instructions are not considered to define or use any registers in this lab.
For further details, refer to the RISC-V Green Sheet: RISC-V Green Sheet.

Worklist Functions

The worklist functions have already been implemented for you at the bottom of the solution file, deadCodeElimination.s. These functions are essential for managing the processing of blocks during liveness analysis and dead code elimination. Feel free to review them to understand how they interact with the data structures discussed above.

initializeWorkList

    Description:
        Initializes the head and tail indices of the workList to zero. 
        Sets up the workList to be empty and ready for new entries.
     
    Arguments:
        None.
    
    Returns:
        None.

addToWorkList

    Description:
        Adds an item to the workList. Returns an error if the list is full.
     
    Arguments:
        a0: Item to add to workList.
    
    Returns:
        a0: 0 on success, -1 (0xFF) if workList is full.

popFromWorkList

    Description:
        Pops an item from the workList.
    
    Arguments:
        None.
    
    Returns:
        a0: Item from workList or -1 (0xFF) if workList is empty.

In RISC-V assembly, every function must have a return statement. The following are examples of valid return statements: ret, jr ra, and jalr zero, 0(ra). The deadCodeElimination.s file already includes the function labels above and each ends with a return statement. Do not change the names of these functions, as they are called for unit testing in the common.s file.

In this lab, there are two helper functions related to handling immediate values in branch and jump instructions:

adjustBranchImm (optional implementation): Provides a new immediate value to a branch instruction, modifying the instruction word accordingly. This function is not implemented by default and needs to be completed and tested by you.
adjustJalImm: Provides a new immediate value to a jump instruction, modifying the instruction word accordingly.

Recommended Program Flow

Call getControlFlowGraph to populate the CFG data structures.
Initialize the deadCodeArray with all instructions set as unmarked.
Perform the iterative analysis as follows:

Call genKillSets to compute the generation and killing sets.
Call getLiveSets to determine the live-in and live-out sets for each basic block.
Call markDeadCode to identify and mark the dead code based on the live sets.
If markDeadCode indicates that new dead code was found, return to step 3.

Once the loop exits (when no new dead code is found), call fixTargets to adjust any targets affected by the dead code removal.
Finally, call removeDeadCode to eliminate the marked dead code from the program.

Assumptions and Notes

The input function provided to deadCodeElimination.s will consist of at most 254 instructions.
There will be no calls to other functions within the input function.
All unconditional jumps within the input function will be jal, not jalr (which is used for returns). Therefore, do not handle jalr as a control-flow instruction; instead, treat it as a normal statement.

Testing your Lab

For this lab, two program arguments are required, which should be specified in the program arguments section under the Execute tab in RARS. They include:

The binary file of the input function (e.g., nested_loop.bin).
The function's live-out set in hexadecimal format, provided without the 0x prefix (e.g., 1234ABCD instead of 0x1234ABCD). The hexadecimal characters can be in either uppercase or lowercase.

When specifying these arguments, ensure that the file path to the binary file is fully provided, with no quotation marks or spaces in the path or file names. For example, a path like Users/JohnDoe/CMPUT-229/Lab 6/Code/Tests/nested_loop.bin may cause issues due to the space in the folder name Lab 6. It is recommended to rename it to Lab-6 or another name without spaces to avoid problems. Additionally, ensure that the function's live-out argument is separated by a space from the file path for proper execution.

The common.s file contains some unit tests that check if your solution correctly populates the intended arrays with the appropriate data, which is then compared against the expected values. These tests are hardcoded and do not run on the file specified in the program arguments. The results of the unit tests can be viewed in the Run I/O panel of RARS.

The common.s file assists in parsing the data structures populated by your solution, displaying both the liveness information and the refined function with dead code removed. This output can be viewed in the Run I/O panel of RARS following the unit test results. Several input functions along with their corresponding expected outputs are provided in the Tests folder within the Code directory. While passing these tests is important, it does not guarantee full marks, so you are encouraged to create your own corner-case tests to thoroughly validate your code.

Creating Test Cases

To create a test case, write a RISC-V function (e.g., INPUT_FUNCTION.s) in an assembly file to begin with. Then, type in the following command in the terminal: rars a dump .text Binary <INPUT_FUNCTION.bin> <INPUT_FUNCTION.s>. The binary file generated by this command can serve as a program argument to the CFG_simulator tool included with this lab. The CFG_simulator tool will parse the file and output another binary file which can be used as an argument to the deadCodeElimination.s file.

Instruction Disassembly

For the provided tests, you are given the expected refined functions with its instructions in hexadecimal format. The common.s file processes the refinedInstructionsArray to display the refined function with dead code removed. While you can compare the hexadecimal representations to identify discrepancies, using a disassembler allows you to compare the actual instructions, offering a clearer view of any differences.

This lab has an additional folder in the Code folder called Disassembly. It contains files that performs instruction disassembly which is the process of translating hexadecimal instructions to their corresponding RISC-V instructions. This code is from an Open Source RISC-V Disassembler. Please contact a TA if you run into any issues using the code.

To utilize this disassembler, follow these steps:

First, compile the file print-instructions.c using a C compiler. For example, from the terminal, execute the command: gcc print-instructions.c -o disassembler or clang print-instructions.c -o disassembler.
Create a file containing the hexadecimal instructions to disassemble by copying and pasting your RARS output into a text file.
Run the executable with the hexadecimal text file as a program argument. For example: ./disassembler RARS-instructions.txt.

See the README.md in the Disassembly folder for more information.

Resources

Slides used for the introduction of the lab (.pptx) (.pdf)
Marksheet used to mark the lab (.txt)

Marking Guide

Assignments too short to be adequately judged for code quality will be given a zero for that portion of the evaluation.

5% for deadCodeElimination
15% for getGenKillSets
25% for getLiveSets
15% for markDeadCode
10% for fixTargets
5% for removeDeadCode
5% for decodeInstruction
20% for code cleanliness, readability, and comments

Submission

There is a single file to be submitted for this lab (deadCodeElimination.s ) and it should contain the code for your solution.

Do not add a main label to this file.
Do not modify the function names.
Do not remove the CMPUT 229 Student Submission License at the top of the file containing your solution and remember include your name in the appropriate section of this text.
Do not modify the line .include "common.s".
Remember to include your controlFlowGraph.s solution.
Keep the files in the Code folder of the git repository.
Push your repository to GitHub before the deadline. Just committing will not upload your code. Check online to ensure your solution is submitted.