How to reuse BRAM once it's no longer needed by a module?

I'm working on a (seemingly) simple project as a learning exercise: connecting an SSD1331-based 96x64 PMOD display via an iCEstick (Lattice iCE40HX-1k FPGA) to a PC so I can send an RGB565-encoded image over USB to be shown on said display.

Thing is, the SSD1331 display requires an initialization procedure just to get to the "clear black screen" state. There are about 20 commands to be shifted into the display controller; their lengths vary between 1 and 5 bytes, for a total of 44 bytes.

So far I've written a Verilog pwr_on module with an FSM that shifts the commands into the PMOD in the right sequence; the command values are defined as localparams. Everything works fine, but there's always a but. I figured all those command constants get stored in LUTs (I'm not inferring any RAM blocks, so where else would they go, right?). With only 1,280 LUTs available in the iCE40HX-1k, spending a hundred or so of them on an init procedure that takes about 150 ms and is never needed again until the next reset seems to be a waste.

Now, I can see the following ways to deal with this problem:

1. Don't implement the init sequence in the FPGA at all; instead, send those commands via USB.
   Simple but not that interesting; after all, I'm trying to learn FPGA programming, not Linux drivers.
2. Take advantage of SB_WARMBOOT and multi-configuration.
   The iCE40HX can have up to 4 configurations stored in EEPROM; the SB_WARMBOOT primitive basically lets you jump between them at will. I could put the init procedure in configuration 0 and, once it's done, jump to configuration 1 with USB support, thus starting from a clean slate. However, I need to hold at least 3 display PMOD pins (pmod_enable, vcc_enable and pmod_rstn) high during the transition between configurations. I cannot find any means to do that; if anybody knows how, please point me in the right direction.
3. Store the command data in BRAM.
   The HX1K has 16 RAM4K blocks (each storing 4096 bits), so even one of them should provide plenty of room for 44 bytes of command data without spending valuable LUTs.
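
For illustration, option 3 could look roughly like the sketch below: a small ROM written so that a synthesizer can map it onto a block RAM instead of LUTs. The module name, port names, the 512x8 geometry and the file init_cmds.hex are all assumptions made for this example, not part of the original design. Whether yosys actually places it in a RAM4K block depends on its memory-mapping heuristics, so the synthesis report is worth checking.

```verilog
// Hypothetical init-command ROM; all names and sizes are assumptions.
module init_rom (
    input  wire       clk,
    input  wire [8:0] addr,   // 512 x 8 bits = 4096 bits, one RAM4K block
    output reg  [7:0] data
);
    reg [7:0] mem [0:511];

    // init_cmds.hex is a hypothetical file holding the 44 command bytes,
    // one two-digit hex value per line. Unlisted locations stay undefined.
    initial $readmemh("init_cmds.hex", mem);

    // Registered (synchronous) read. The iCE40 block RAM has no
    // asynchronous read port, so a combinational read could not be
    // mapped onto it.
    always @(posedge clk)
        data <= mem[addr];
endmodule
```

Because the read is registered, the data for a given address appears one clock after the address is presented, so the FSM in pwr_on would need to account for that extra cycle.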

Option 3 looks simple enough. However, being a scrooge about my resources, I'd love to have that RAM4K block available for other tasks once init is done. Now, it seems to me that the Verilog synthesizer (I'm using yosys) is completely oblivious to the fact that once the pwr_on module pulls its done wire high, the BRAM cell it was attached to could be reused when inferring other logic.

One solution that comes to mind is to allocate that BRAM block in a separate module, fill it with the data I need for init, wire it to the pwr_on module, and then rewire it to other modules as needed. Yet this approach looks ugly for a few reasons, thus the question: is there a trick I'm missing? How could I use one BRAM block in, let's say, the SB_RAM512x8 configuration for one module and then reuse it as SB_RAM256x16 for another?

Solution

Multiplex the read address to the EBR used for PMOD configuration data

As far as I know, the EBRs of the iCE40 cannot change WRITE_MODE and READ_MODE while running (please correct me if I'm wrong). Hence, I would suggest instantiating your EBR in the configuration you want to use after the PMOD has been initialized. The contents of the EBR must include the configuration data for the PMOD, specified the usual way via INIT_0 through INIT_F.

The read address to the EBR needs to be a mux between the address from the FSM controlling the PMOD initialization and the address to use after initialization. This will only cost around 8 LUTs.
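
A minimal sketch of that address mux, assuming the 256x16 mode (READ_MODE and WRITE_MODE set to 0) and a single clock, might look like this. The wrapper module and its signal names (shared_ebr, init_done, init_raddr, user_*) are hypothetical, and the INIT_0 value is only a placeholder for the real command bytes.

```verilog
// Sketch only: everything except the SB_RAM40_4K ports and parameters
// is a hypothetical name chosen for this example.
module shared_ebr (
    input  wire        clk,
    input  wire        init_done,   // high once the PMOD init FSM has finished
    input  wire [7:0]  init_raddr,  // read address from the init FSM
    input  wire [7:0]  user_raddr,  // read address from the post-init logic
    input  wire [7:0]  user_waddr,
    input  wire [15:0] user_wdata,
    input  wire        user_we,
    output wire [15:0] rdata
);
    // The address mux: one 2:1 mux per address bit.
    wire [7:0] raddr = init_done ? user_raddr : init_raddr;

    SB_RAM40_4K #(
        .READ_MODE (0),      // 0 selects the 256 x 16 organization
        .WRITE_MODE(0),
        // Placeholder contents: the real init command bytes would go
        // here (and in INIT_1.. if needed), two bytes per 16-bit word.
        .INIT_0(256'h0)
    ) ebr (
        .RCLK (clk),
        .RCLKE(1'b1),
        .RE   (1'b1),
        .RADDR({3'b000, raddr}),
        .RDATA(rdata),
        .WCLK (clk),
        .WCLKE(1'b1),
        .WE   (user_we & init_done), // block writes until init has finished
        .WADDR({3'b000, user_waddr}),
        .WDATA(user_wdata),
        .MASK (16'h0000)             // a 0 bit means "write this bit"
    );
endmodule
```

One consequence of this design: the INIT_x contents are loaded only when the FPGA is configured, so once the post-init logic overwrites the command bytes, a plain logic reset can no longer replay the init sequence from this block. Either reserve the words holding the commands or reconfigure the device when the display needs to be re-initialized.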