ricardo angeli my blog about hardware and software

    6502 on FPGA: Part I - The ALU Awakens

    What’s an ALU?

    ALU stands for Arithmetic Logic Unit. It is a circuit that takes one or two numerical values, performs an arithmetic or logic operation on them, and outputs the result. Operations range from addition to bitwise functions such as OR, XOR, and AND. The ALU is the computational heart of a processor. Modern processor architectures may include multiple ALUs, but the 6502 keeps it simple with just one.

    The 6502 ALU

    The implementation of the ALU in the 6502 is simple and contains only what’s absolutely necessary, which is great for our efforts in modeling its behavior. It takes two input values stored in registers A and B, performs one of five operations (addition, bitwise OR, XOR, AND, or right shift), and stores the result in the output register.

    image

    The processor decides which two registers are fed to the ALU. All five operations are then executed in parallel, and the processor chooses which result gets written to the output register by setting the control bits. In essence, the control bits select which operation is applied to the inputs.

    There’s also an overflow status bit, which is produced when adding two values. The 6502 works on 8-bit values, so when they are interpreted as signed (two’s complement) numbers the result of an addition must lie between -128 and 127. If the result is greater than 127, we have what’s called an overflow. If it is less than -128, that is an underflow. For example, 70 + 60 = 130 does not fit and wraps around to -126. Whenever either one happens, the 6502 sets the overflow bit so the processor knows that the result was out of bounds.

    In the original implementation, the values from the five operations are buffered, but in my implementation they are not and are simply selected using a multiplexer.

    Verilog Implementation

    Implementing this ALU as a Verilog module is very simple. First off we have our inputs and outputs:

    
    module alu_6502(
    	input wire [7:0] regA,
    	input wire [7:0] regB,
    	input wire [4:0] control,
    	output reg [7:0] regOut,
    	output reg overflow
    	);
    

    The registers should be self-explanatory: regA and regB are the 8-bit inputs and regOut is the 8-bit result. We also have a 5-bit control input which selects, through a multiplexer, which operation’s result is output. Lastly, there’s our overflow status bit, which will be elaborated upon soon.

    The bitwise functions are implemented as simple combinational assignments, as shown below:

    
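    // Intermediate results, declared so that each net is 8 bits wide
    wire [7:0] orOut, xorOut, andOut, shiftOut;

    // Combinational functions
    assign orOut = regA | regB;
    assign xorOut = regA ^ regB;
    assign andOut = regA & regB;
    assign shiftOut = regA >> regB; // logical shift right by regB places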
    

    Note that in this implementation, we can shift right multiple times as defined by regB. This may or may not result in an additional opcode in the future.

    Really the only challenge in the Verilog ALU is figuring out how to calculate the overflow bit. There is a combinational formula I learned back in Digital Systems: V = carryBit[6] XOR carryBit[7], that is, the carry into the sign bit XORed with the carry out of it. However, since the addition here uses Verilog’s + operator and never exposes the individual carry bits, using that formula would complicate the code. This constraint led to my current algorithm, which sign-extends both inputs by one bit, adds them as 9-bit values, and uses the extra result bit to determine whether an overflow or underflow occurred.

    
    // Adder outputs, assigned inside the always block below
    reg [7:0] sumOut;
    reg extraBit;

    // Detect overflow/underflow and perform addition
    always @(*) begin

      // Perform addition with an extra bit (a copy of each input's MSB,
      // i.e. sign extension to 9 bits)
      // Store the result in sumOut and store the additional bit in extraBit
      {extraBit, sumOut} = {regA[7], regA} + {regB[7], regB};

      // If {extraBit, sumOut[7]} is 2'b01, overflow
      // If {extraBit, sumOut[7]} is 2'b10, underflow
      // Otherwise the overflow bit is set low
      overflow = ({extraBit, sumOut[7]} == 2'b01) || ({extraBit, sumOut[7]} == 2'b10);
    end
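
    The carry-based formula and the extra-bit method should agree for every input pair. Below is a minimal self-checking sketch, not part of the original project, that compares the two over all 65,536 combinations. The module name overflow_equiv_tb and the signals low, high, vExt, and vCarry are made up for this illustration.

    ```verilog
    // Standalone simulation sketch (hypothetical names, not from the project).
    // Compares the extra-bit overflow flag against
    // V = (carry into bit 7) XOR (carry out of bit 7) for all 8-bit input pairs.
    module overflow_equiv_tb;
      reg [7:0] a, b, sumOut;
      reg       extraBit, vExt, vCarry;
      reg [7:0] low;   // low[7] is the carry out of bit 6, i.e. into bit 7
      reg [1:0] high;  // high[1] is the carry out of bit 7
      integer   i, j, mismatches;

      initial begin
        mismatches = 0;
        for (i = 0; i < 256; i = i + 1) begin
          for (j = 0; j < 256; j = j + 1) begin
            a = i;  // truncated to the low 8 bits
            b = j;

            // Extra-bit method: the flag is set when the two top bits differ,
            // which is the same test as the 2'b01 / 2'b10 comparisons
            {extraBit, sumOut} = {a[7], a} + {b[7], b};
            vExt = extraBit ^ sumOut[7];

            // Carry-based method: add the low seven bits, then the sign bits
            low    = {1'b0, a[6:0]} + {1'b0, b[6:0]};
            high   = {1'b0, a[7]} + {1'b0, b[7]} + {1'b0, low[7]};
            vCarry = low[7] ^ high[1];

            if (vExt !== vCarry) mismatches = mismatches + 1;
          end
        end
        $display("mismatches: %0d", mismatches);
        $finish;
      end
    endmodule
    ```

    With Icarus Verilog, for instance, this could be compiled with iverilog -o overflow_equiv overflow_equiv_tb.v and run with vvp overflow_equiv. A mismatch count of zero means the two methods agree.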
    

    The curly braces are Verilog’s concatenation operator. They combine bits from different signals into a single vector. Remember that we are describing hardware with an HDL, so these are actually wires. To a wire it makes no difference where it is fed from or to, as long as there are two endpoints.

    Finally, the results are multiplexed using a case structure.

    
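    // Multiplex the result according to the control
    always @(*) begin
      case(control)
        `SUMS: regOut = sumOut;
        `ORS:  regOut = orOut;
        `XORS: regOut = xorOut;
        `ANDS: regOut = andOut;
        `SRS:  regOut = shiftOut;
        // Assumed fallback: drive zero for any other control value so that
        // no latch is inferred for regOut
        default: regOut = 8'h00;
      endcase
    end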
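
    The `SUMS, `ORS, `XORS, `ANDS, and `SRS names used in the case statement are preprocessor macros whose definitions are not shown in this post. One plausible set of definitions, assuming a one-hot encoding of the 5-bit control input (the real values live in the project source and may differ), would be:

    ```verilog
    // Hypothetical one-hot encodings for the 5-bit control input.
    // These values are an assumption made for illustration only.
    `define SUMS 5'b00001  // addition
    `define ORS  5'b00010  // bitwise OR
    `define XORS 5'b00100  // bitwise XOR
    `define ANDS 5'b01000  // bitwise AND
    `define SRS  5'b10000  // shift right
    ```

    One-hot codes mirror the idea of one control line per operation, but any five distinct 5-bit values would work with the case statement.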
    

    Simulation and Testing

    Once I was happy with the ALU module, it was time to develop a test bench. The test bench feeds input values to the ALU module so we can see the outputs on the simulator. The file name is alu_6502_tb.v. Let’s begin with the tests!

    Addition

    This is the most important of all the tests. Here we’ll be testing addition, subtraction, addition with overflow, and subtraction with underflow. Since numbers are represented in two’s complement, subtraction is simply addition with a negative number. Taking the two’s complement of a positive number (inverting every bit and adding one) gives the corresponding negative number, and vice versa. Below are the test cases for addition:

    
    // Regular addition: 100 + 11 = 111
    	a <= 8'b01100100; // 100
    	b <= 8'b00001011; // 11
    	ctrl <= `SUMS;
    	#10
    // Regular subtraction: 99 - 88 = 11
    	a <= 8'b01100011; // 99
    	b <= 8'b10101000; // -88
    	ctrl <= `SUMS;
    	#10
    // Overflow: 70 + 60 = -126
    	a <= 8'b01000110; // 70
    	b <= 8'b00111100; // 60
    	ctrl <= `SUMS;
    	#10
    // Underflow: -60 - 70 = 126
    	a <= 8'b11000100; // -60
    	b <= 8'b10111010; // -70
    	ctrl <= `SUMS;
    	#10
    

    Note that for the overflow cases, the results written in the comments are the wrapped values we expect the ALU to output, along with the overflow flag being set, not the mathematically correct sums.

    We run this through our simulator and we find our answers below:

    image

    The image you are looking at is a screenshot of the simulator’s output. It shows five signals, which correspond to the inputs and outputs of the ALU module. The green signals are the inputs and the cyan signals are the outputs. Since several wires are being monitored, they are combined into buses and displayed as hexadecimal numbers.

    Each of the four groups of signal values corresponds to one of the four tests in the test bench. All the outputs match the expected values, so the test passes. Note that the overflow signal v is HIGH on the last two tests, where an overflow/underflow is triggered. Looks good!
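
    Reading the waveform by eye works for four vectors, but the same comparison can be automated. The sketch below is a standalone illustration rather than the project’s alu_6502_tb.v: it copies the extra-bit addition into a small test module and checks the four addition tests against their expected sums and overflow flags. The module name add_selfcheck_tb, the check task, and the failures counter are all hypothetical.

    ```verilog
    // Standalone self-checking sketch (hypothetical names, not from the project).
    module add_selfcheck_tb;
      reg  [7:0] a, b;
      reg  [7:0] sumOut;
      reg        extraBit, overflow;
      integer    failures;

      // Same extra-bit addition as the ALU; the XOR is equivalent to the
      // 2'b01 / 2'b10 comparisons
      always @(*) begin
        {extraBit, sumOut} = {a[7], a} + {b[7], b};
        overflow = extraBit ^ sumOut[7];
      end

      // Apply one vector, wait, and compare against the expected outputs
      task check;
        input [7:0] inA, inB, expSum;
        input       expV;
        begin
          a = inA;
          b = inB;
          #10;
          if (sumOut !== expSum || overflow !== expV) begin
            failures = failures + 1;
            $display("FAIL: %0d + %0d gave %0d (v=%b)",
                     $signed(inA), $signed(inB), $signed(sumOut), overflow);
          end
        end
      endtask

      initial begin
        failures = 0;
        check(8'd100, 8'd11, 8'd111, 1'b0);                 // 100 + 11 = 111
        check(8'd99, ~8'd88 + 8'd1, 8'd11, 1'b0);           // 99 - 88 = 11
        check(8'd70, 8'd60, 8'b10000010, 1'b1);             // overflow, wraps to -126
        check(~8'd60 + 8'd1, ~8'd70 + 8'd1, 8'd126, 1'b1);  // underflow, wraps to 126
        $display("failures: %0d", failures);
        $finish;
      end
    endmodule
    ```

    The negative operands are written as ~x + 1 to show the two’s complement step explicitly; they produce the same bit patterns as the binary literals in the test bench.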

    Bitwise OR

    The remaining tests are very straightforward. This one is on the bitwise OR function. Below is our testbench:

    
    // BITWISE OR
    // 0xAA | 0x55 = 0xFF
    	a <= 8'b10101010; // 0xAA
    	b <= 8'b01010101; // 0x55
    	ctrl <= `ORS;
    	#10
    // 0x75 | 0xC2 = 0xF7
    	a <= 8'b01110101; // 0x75
    	b <= 8'b11000010; // 0xC2
    	ctrl <= `ORS;
    	#10
    

    The OR function is very simple: if either register has a ONE bit at a given position, the result has a ONE at that position. Otherwise, it’s zero. Here are the results from the simulation:

    image

    A visual check confirms that the OR function works as expected in both tests.

    Bitwise XOR

    Ah, the XOR operator! It acts just like an OR, except that when both input bits are set, the output bit goes back to zero. We have two tests for this function:

    
    // BITWISE XOR
    // 0xAA ^ 0x55 = 0xFF
    	a <= 8'b10101010; // 0xAA
    	b <= 8'b01010101; // 0x55
    	ctrl <= `XORS;
    	#10
    // 0x75 ^ 0xC2 = 0xB7
    	a <= 8'b01110101; // 0x75
    	b <= 8'b11000010; // 0xC2
    	ctrl <= `XORS;
    	#10
    

    As expected, the tests match the output perfectly!

    image

    Bitwise AND

    The last of our bitwise functions is AND. If both inputs are HIGH at a certain bit, the output is also HIGH; otherwise it’s LOW. Below are the tests:

    
    // BITWISE AND
    // 0xAA & 0x55 = 0x00
    	a <= 8'b10101010; // 0xAA
    	b <= 8'b01010101; // 0x55
    	ctrl <= `ANDS;
    	#10
    // 0x75 & 0xE4 = 0x64
    	a <= 8'b01110101; // 0x75
    	b <= 8'b11100100; // 0xE4
    	ctrl <= `ANDS;
    	#10
    

    And they match quite well!

    image
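
    The three bitwise checks can be automated in the same spirit. This is a minimal standalone sketch using the vectors from the tests above. The module name bitwise_selfcheck_tb, the check task, and the failures counter are hypothetical and not taken from the project.

    ```verilog
    // Standalone self-checking sketch (hypothetical names, not from the project).
    module bitwise_selfcheck_tb;
      reg  [7:0] a, b;
      wire [7:0] orOut  = a | b;
      wire [7:0] xorOut = a ^ b;
      wire [7:0] andOut = a & b;
      integer    failures;

      // Compare one result against its expected value
      task check;
        input [7:0] actual, expected;
        begin
          if (actual !== expected) begin
            failures = failures + 1;
            $display("FAIL: got %h, expected %h", actual, expected);
          end
        end
      endtask

      initial begin
        failures = 0;

        a = 8'hAA; b = 8'h55; #10;
        check(orOut,  8'hFF);
        check(xorOut, 8'hFF);
        check(andOut, 8'h00);

        a = 8'h75; b = 8'hC2; #10;
        check(orOut,  8'hF7);
        check(xorOut, 8'hB7);

        a = 8'h75; b = 8'hE4; #10;
        check(andOut, 8'h64);

        $display("failures: %0d", failures);
        $finish;
      end
    endmodule
    ```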

    Shift Right

    This final operation shifts the value in regA to the right by the number of places given in regB. This behavior is different from the actual opcode on the 6502: the original instruction can only shift by one place. I decided to extend that so we may create an extra opcode later. Anyway, the test bench is below:

    
    // SHIFT RIGHT
    // 0x3A >> 0x01 = 0x1D
    	a <= 8'b00111010; // 0x3A
    	b <= 8'b00000001; // 0x01
    	ctrl <= `SRS;
    	#10
    // 0x75 >> 0x04 = 0x07
    	a <= 8'b01110101; // 0x75
    	b <= 8'b00000100; // 0x04
    	ctrl <= `SRS;
    	#10
    

    Shifting by zero would be pointless as the result would be the same as the input and shifting by eight or more would also be pointless as the result would always just be zero. Our two tests show some basic shifting and the results from the simulation validate those tests.

    image
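
    The edge cases mentioned above (a shift of zero, and a shift of eight or more) are easy to confirm in simulation. The following is a small standalone sketch that reuses the same regA >> regB expression. The module name shift_edge_tb and the failures counter are hypothetical.

    ```verilog
    // Standalone sketch (hypothetical names, not from the project).
    module shift_edge_tb;
      reg  [7:0] a, b;
      wire [7:0] shiftOut = a >> b;  // same expression as the ALU's shifter
      integer    failures;

      initial begin
        failures = 0;
        a = 8'h3A;

        // Shifting by zero leaves the input unchanged
        b = 8'd0;   #10; if (shiftOut !== 8'h3A) failures = failures + 1;
        // Shifting by one place matches the first test above
        b = 8'd1;   #10; if (shiftOut !== 8'h1D) failures = failures + 1;
        // Shifting by eight or more clears every bit
        b = 8'd8;   #10; if (shiftOut !== 8'h00) failures = failures + 1;
        b = 8'd255; #10; if (shiftOut !== 8'h00) failures = failures + 1;

        $display("failures: %0d", failures);
        $finish;
      end
    endmodule
    ```

    Because >> is a logical shift, zeros are always shifted in from the left, so no sign extension happens even when bit 7 of the input is set.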

    Conclusion

    The ALU was pretty straightforward to implement, but it will serve as the core for the rest of the project. It was a great place to start, as it served as a warmup exercise both in Verilog and in creating test benches in the simulator. Really, documenting this took much more time than building it. If you want the source code and project files, check out the link below. Feel free to ask any questions or leave a comment if you found this helpful. In the meantime, stay tuned for the next part as we get our hands dirty with the processor architecture!

    Check it out on GitHub (rangeli/alu_6502)
