Design of a VHDL LUT Module


Description: I am trying to write a VHDL module for a LUT (look-up table) with 4 inputs and 3 outputs. I want the 3-bit output to be a binary number equal to the number of 1's in the input.

My Truth Table:

ABCD|XYZ
0000|000
0001|001
0010|001
0011|010
0100|001
0101|010
0110|010
0111|011
1000|001
1001|010
1010|010
1011|011
1100|010
1101|011
1110|011
1111|100

My VHDL code:

library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity lut is
Port (
a : in STD_LOGIC; 
b : in STD_LOGIC; 
c : in STD_LOGIC; 
d : in STD_LOGIC; 
x : out STD_LOGIC; 
y : out STD_LOGIC; 
z : out STD_LOGIC);  

end lut;   

architecture Behavioral of lut is  
signal s0: STD_LOGIC;
signal s1: STD_LOGIC;
signal s2: STD_LOGIC;  
signal s3: STD_LOGIC;
signal s4: STD_LOGIC;
signal s5: STD_LOGIC;
signal s6: STD_LOGIC;
signal s7: STD_LOGIC; 
signal s8: STD_LOGIC;
signal s9: STD_LOGIC;
signal s10: STD_LOGIC;
signal s11: STD_LOGIC;
signal s12: STD_LOGIC;
signal s13: STD_LOGIC;

begin 
----------MUX1----------- 
process(a,b) 
begin
if a='0' 
then s0<=a;
else
s0<=b;
end if; 
end process; 

--------MUX2---------- 
process(a,b) 
begin
if a='0' 
then s1<=a;
else
s1<=b; 
end if;
end process;

---------MUX3-----------
process(a,b) 
begin
if a='0' 
then s2<=a;
else
s2<=b;
end if; 
end process;
---------MUX4-----------
process(a,b) 
begin
if a='0' 
then s3<=a;
else
s3<=b;
end if; 
end process;
---------MUX5-----------
process(c,d,a) 
begin
if a='0' 
then s4<=c;
else
s4<=d;
end if; 
end process;
---------MUX6-----------
process(c,d,a) 
begin
if a='0' 
then s5<=c;
else
s5<=d;
end if; 
end process;
---------MUX7-----------
process(c,d,a) 
begin
if a='0' 
then s6<=c;
else
s6<=d;
end if; 
end process;
---------MUX8-----------
process(c,d,a) 
begin
if a='0' 
then s7<=c;
else
s7<=d;
end if; 
end process;
---------MUX9-----------
process(s0,s1,b) 
begin
if b='0' 
then s8<=s0;
else
s8<=s1;
end if; 
end process;
---------MUX10-----------
process(s2,s3,b) 
begin
if b='0' 
then s9<=s2;
else
s9<=s3;
end if; 
end process;
---------MUX11-----------
process(s4,s5,b) 
begin
if b='0' 
then s10<=s4;
else
s10<=s5;
end if; 
end process;
---------MUX12-----------
process(s6,s7,b) 
begin
if b='0' 
then s11<=s6;
else
s11<=s7;
end if; 
end process;
---------MUX13-----------
process(s8,s9,c) 
begin
if c='0' 
then s12<=s8;
x<= s8;
else
s12<=s9;
x<= s9;
end if; 
end process;
---------MUX14-----------
process(s10,s11,c) 
begin
if c='0' 
then s13<=s10;
z<=s10;
else
s13<=s11; 
z<=s11;
end if; 
end process; 
---------MUX15-----------
process(s12,s13,d) 
begin
if d='0' 
then y<=s12;
else
y<=s13;
end if; 
end process;
end Behavioral;

Assumptions: I need a total of 15 multiplexers to model this. They will be cascaded down to one output, so I will have a total of 15 processes, as shown above.

Questions:
1.) What are the select inputs for my muxes: A, B, C, D?
2.) Am I modeling this the correct way? Will I achieve what I want from the info given?
3.) If there is a better way, or you have a different idea, could you please provide an example?
4.) I am not getting my XYZ output. It's close, but what am I doing wrong?

I have tried to provide as much research as possible. If you have any questions I will respond immediately.


Solution

  • I'm going to go out on a limb here and tell you to let your synthesizer optimize it. Other than that you can use a minimizer (e.g. espresso) on your table then code the result in VHDL.

    I'm guessing this is what you should do when targeting an FPGA:

    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;
    
    entity bit_count is
        port (
            a,b,c,d:   in  std_logic;
            x,y,z:     out std_logic    
        );
    end entity;
    
    architecture lut of bit_count is
        subtype lutin is std_logic_vector (3 downto 0);
        subtype lutout is std_logic_vector (2 downto 0);
        type lut is array (natural range 0 to 15) of lutout;
        constant bitcount:   lut := (
            "000", "001", "001", "010", 
            "001", "010", "010", "011", 
            "001", "010", "010", "011",
            "010", "011", "011", "100"
            );
    
        signal temp:    std_logic_vector (2 downto 0);
    
    begin
    
        temp <= bitcount( TO_INTEGER ( unsigned (lutin'(a&b&c&d) ) ) );
    
        (x, y, z) <= lutout'(temp(2), temp(1), temp(0));
    
    end architecture;
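
    As a rough sketch of the "let your synthesizer optimize it" route, the count can also be described behaviourally, with no table at all, leaving the minimisation to the tool. The entity name `bit_count_sum` and the architecture name `behave` are made up for this illustration; the ports mirror `bit_count`. Like the other architectures here, it has not been simulated.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;

    -- Hypothetical entity for illustration; same ports as bit_count.
    entity bit_count_sum is
        port (
            a,b,c,d:   in  std_logic;
            x,y,z:     out std_logic
        );
    end entity;

    architecture behave of bit_count_sum is
    begin
        process (a, b, c, d)
            variable inputs: std_logic_vector (3 downto 0);
            variable count:  unsigned (2 downto 0);
        begin
            inputs := a & b & c & d;
            count  := (others => '0');
            -- Add one for every input bit that is '1'.
            for i in inputs'range loop
                if inputs(i) = '1' then
                    count := count + 1;
                end if;
            end loop;
            -- x is the most significant bit of the count, z the least.
            x <= count(2);
            y <= count(1);
            z <= count(0);
        end process;
    end architecture;
    ```

    The loop bounds are static, so a synthesizer can unroll it into combinational logic; whether the result is as small as a table lookup depends on the tool.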
    

    And failing that I think hand optimizing it as a ROM is likely to be close in terms of gate count:

    --  0000   0001   0010   0011
    --  "000", "001", "001", "010", 
    --  0100   0101   0110   0111
    --  "001", "010", "010", "011", 
    --  1000   1001   1010   1011
    --  "001", "010", "010", "011",
    --  1100   1101   1110   1111
    --  "010", "011", "011", "100"
    
    -- output         Input
    -----------------------
    -- bit 0  is true 0001 0010 0100 0111 1000 1011 1101 1110
    -- bit 1          0011 0101 0110 0111 1001 1010 1011 1100 1101 1110
    -- bit 2          1111
    
    architecture rom of bit_count is
    
        signal t0,t1,t2:    std_logic;
        signal t4,t7,t8:    std_logic;
        signal t11,t13,t14: std_logic;
        signal t15:         std_logic;
    
    begin
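    -- terms (tN is the minterm for index N with a as the least significant bit;
    -- the bit count is symmetric in its inputs, so the bit order does not matter)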
        t0  <= not a and not b and not c and not d;
        t1  <=     a and not b and not c and not d;
        t2  <= not a and     b and not c and not d;
    --  t3  <=     a and     b and not c and not d;
        t4  <= not a and not b and     c and not d;
    --  t5  <=     a and not b and     c and not d;
    --  t6  <= not a and     b and     c and not d;
        t7  <=     a and     b and     c and not d;
        t8  <= not a and not b and not c and     d;
    --  t9  <=     a and not b and not c and     d;
    --  t10 <= not a and     b and not c and     d;
        t11 <=     a and     b and not c and     d;
    --  t12 <= not a and not b and     c and     d;
        t13 <=     a and not b and     c and     d;
        t14 <= not a and     b and     c and     d;
        t15 <=     a and     b and     c and     d;
    
    -- outputs
    
        x <= t15;
    
        y <= not ( t0 or t1 or t2 or t4 or t8 or t15 );
    
        z <= t1 or t2 or t4 or t7 or t8 or t11 or t13 or t14;
    
    end architecture;
    

    It should be fewer gates than your chained multiplexers and a bit flatter (faster).

    The two architectures have been analyzed but not simulated. It's easy to get errors when doing hand gate level coding.
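
    Since neither architecture was simulated, a small self-checking testbench is one way to catch hand-coding slips. The sketch below assumes the `bit_count` entity and both architectures (`lut` and `rom`) have been compiled into library `work`; the names `bit_count_tb` and `ones` and the 10 ns step are arbitrary choices made for this example. It applies all 16 input combinations and reports any output that differs from the number of '1' bits in the input.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;

    entity bit_count_tb is
    end entity;

    architecture sim of bit_count_tb is
        signal inputs:  std_logic_vector (3 downto 0) := (others => '0');
        signal lut_out: std_logic_vector (2 downto 0);
        signal rom_out: std_logic_vector (2 downto 0);

        -- Reference model: number of '1' bits in v.
        function ones (v: std_logic_vector) return natural is
            variable n: natural := 0;
        begin
            for i in v'range loop
                if v(i) = '1' then
                    n := n + 1;
                end if;
            end loop;
            return n;
        end function;
    begin
        -- Assumes bit_count(lut) and bit_count(rom) exist in library work.
        dut_lut: entity work.bit_count(lut)
            port map (a => inputs(3), b => inputs(2),
                      c => inputs(1), d => inputs(0),
                      x => lut_out(2), y => lut_out(1), z => lut_out(0));

        dut_rom: entity work.bit_count(rom)
            port map (a => inputs(3), b => inputs(2),
                      c => inputs(1), d => inputs(0),
                      x => rom_out(2), y => rom_out(1), z => rom_out(0));

        stimulus: process
        begin
            for i in 0 to 15 loop
                inputs <= std_logic_vector(to_unsigned(i, 4));
                wait for 10 ns;  -- let the combinational outputs settle
                assert to_integer(unsigned(lut_out)) = ones(inputs)
                    report "lut mismatch for input " & integer'image(i)
                    severity error;
                assert to_integer(unsigned(rom_out)) = ones(inputs)
                    report "rom mismatch for input " & integer'image(i)
                    severity error;
            end loop;
            report "bit_count_tb finished";
            wait;  -- stop the stimulus process
        end process;
    end architecture;
    ```

    Checking against a computed count rather than a second copy of the table means a typo in the table itself also shows up as a mismatch.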