Data handling for a flash chip

I am copying an excerpt from here

Flash erase cycles are long - really long - it can take several seconds to erase a Flash sector. Also as the number of guaranteed erase / re-write cycles is usually limited (typically around 10,000 or up to 100,000), we cannot afford to erase an entire sector just because one variable changed. The approach is to "sacrifice" an entire sector for variable storage. In this sector, variables are stored in a table. If a variable changes, it does not get overwritten, instead the old value is discarded and a new entry into the table gets generated.

I don't understand the above statement.

Why don't we have to erase the sector when we add the modified data as a new item instead of modifying the existing data "in-place"? Is it because the free area in the sector where we are going to store the new data is pre-erased to 0x00 or 0xFF?

If the answer to the above question is "Yes", is it possible to avoid the erase cycle in the case described below?

I am writing system logs to the flash. Once the flash area is completely filled with system logs, I need to erase the earliest log entry and replace it with the newest one. In this case I can't think of a way to avoid the erase. I have never worked with a flash driver before. Any help will be greatly appreciated. I am not a native English speaker and hope the question is not vague. Thank you.

Solution

This kind of problem pops up when the MCU has no proper data flash (or EEPROM), so you need to use program flash for storing data. Program flash works the same way as data flash, but since it isn't supposed to get erased often, the manufacturer can use physically smaller circuits for it. And typically, the smaller the physical circuit, the larger the erase sector.

The problem is that erase of a large flash sector takes a lot of time. But once the whole sector has been erased (typically all cells are set to 1), you can write to any erased memory location once. Basically you can always turn a 1 into a 0, but not a 0 to a 1 without erasing. So indeed you can do the write because the area is pre-erased. Such a write does not take nearly as long as an erase.
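To make the "1 to 0 only" rule concrete, here is a minimal sketch that simulates one sector in RAM. Everything in it (sim_erase, sim_program, the 16-byte sector size, the 0xFF erased value) is a hypothetical stand-in, not a real driver API. Real parts may impose additional rules, for example on how often a word may be programmed between erases, so check the datasheet.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 16u    /* tiny on purpose; real sectors are far larger */
#define ERASED      0xFFu  /* assumed erased state: all bits set to 1 */

/* RAM array standing in for one flash sector (simulation only). */
static uint8_t sector[SECTOR_SIZE];

/* Erase: the slow operation on real hardware. Sets every bit back to 1. */
static void sim_erase(void)
{
  memset(sector, ERASED, sizeof sector);
}

/* Program: can only clear bits (1 -> 0), never set them.
   The stored result is therefore the old content AND the new data. */
static void sim_program(uint32_t addr, uint8_t data)
{
  assert(addr < SECTOR_SIZE);
  sector[addr] = (uint8_t)(sector[addr] & data);
}
```

Programming 0xAA into a freshly erased byte stores 0xAA as expected. Programming 0xCC on top of it without an erase leaves 0x88 (0xAA & 0xCC), which is why an in-place update does not work and a fresh, still-erased location is needed instead.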

Therefore there exist various more or less confused algorithms to take advantage of this. It is not really a solution I would recommend, but I can explain it, since it is unfortunately a somewhat common one:

Suppose you have two variables inside the flash sector that you need to update now and then. Each of them has 1 byte of data. You would then also give each variable a unique search key (which cannot be the value of an erased flash cell) and store them like this:

Address  Key   Value
0x0000   0x01  0xAA
0x0002   0x02  0xBB

You'll have some program structure like

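#include <stdint.h>

typedef struct
{
  uint8_t key;
  uint8_t val;
} flash_var;

/* Each pointer refers to the table entry currently holding that variable. */
const flash_var* x = (const flash_var*)0x0000;
const flash_var* y = (const flash_var*)0x0002;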

Next you want to change the value of x to 0xCC. You'd call upon your flash programming driver and it will write a copy of the new variable in the next available flash location. Your flash would now look like this:

Address  Key   Value
0x0000   0x01  0xAA
0x0002   0x02  0xBB
0x0004   0x01  0xCC

So you have two copies of the variable x, but the program will update the pointer to only point at the latest occurrence of it. The previous one just sits in the flash as "dead space". You can always find out which is the most recent one, by searching from the end of the flash block, backwards towards the beginning, looking for the first occurrence of the search key 0x01.

This means that upon power-on, finding the variables will not be random access, but rather a pretty slow, linear search.
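The append-and-search scheme described above could be sketched roughly as follows. This is a simulation in RAM under stated assumptions, not a real flash driver: sector_erase, var_write, var_find, the 8-entry capacity and the erased value 0xFF are all hypothetical names and numbers chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ERASED_KEY  0xFFu  /* assumed content of an erased cell; not a valid key */
#define NUM_ENTRIES 8u     /* hypothetical sector capacity, in entries */

typedef struct
{
  uint8_t key;
  uint8_t val;
} flash_var;

/* RAM array standing in for the flash sector (simulation only). */
static flash_var sector[NUM_ENTRIES];

static void sector_erase(void)
{
  memset(sector, 0xFF, sizeof sector);
}

/* Append a new copy of the variable in the first still-erased slot.
   Returns 0 on success, -1 if the sector is full and must be erased. */
static int var_write(uint8_t key, uint8_t val)
{
  assert(key != ERASED_KEY);  /* precondition: key differs from erased value */
  for (size_t i = 0; i < NUM_ENTRIES; i++)
  {
    if (sector[i].key == ERASED_KEY)
    {
      sector[i].val = val;
      sector[i].key = key;
      return 0;
    }
  }
  return -1;
}

/* Linear search from the end of the sector towards the beginning:
   the first match is the most recent copy. Returns NULL if not found. */
static const flash_var* var_find(uint8_t key)
{
  assert(key != ERASED_KEY);
  for (size_t i = NUM_ENTRIES; i-- > 0; )
  {
    if (sector[i].key == key)
    {
      return &sector[i];
    }
  }
  return NULL;
}
```

After writing (0x01, 0xAA), (0x02, 0xBB) and then (0x01, 0xCC), var_find(0x01) returns the third entry, and the first one remains as dead space. Both functions walk the table entry by entry, which is the linear search cost mentioned above.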

There are several problems with this algorithm:

Not random access, but very slow search.
Lots of additional complexity to implement, which increases the chance for bugs and takes up resources.
Duplicates of the same data exist. This is unacceptable for mission-critical systems. Suppose a search key gets corrupted - instead of finding nothing and reporting an error, the program will grab old data instead.
Depending on the nature of the flash, you might want to include a checksum in the flash. With the above algorithm, there has to be an individual checksum for each variable, which is a massive waste of space.
When the flash sector gets full, you have to erase it no matter what. You'll then have to come up with a way to store variables in RAM temporarily. It turns complex.
But most importantly, the sector could get filled at any given time and it is likely hard to predict when. Your program must be able to handle this special case and cope with the long erase time when it happens. This is the worst case, and real-time embedded systems must always be designed for the worst-case scenario.
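To illustrate the "sector full" case from the list above, here is a rough sketch of the clean-up step: buffer the newest copy of each variable in RAM, erase, and write the survivors back. It operates on a RAM array standing in for the sector. The names collect_live and sector_compact and the MAX_KEYS limit are hypothetical, and the memset stands in for the real, slow sector erase. Note that in this simple form the buffered values are lost if power fails between the erase and the write-back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ERASED_KEY 0xFFu  /* assumed content of an erased cell */
#define MAX_KEYS   4u     /* hypothetical upper bound on distinct variables */

typedef struct
{
  uint8_t key;
  uint8_t val;
} flash_var;

/* Copy the newest entry of every key into ram[]. Walking backwards means
   the first hit per key is the latest one. Returns the number kept. */
static size_t collect_live(const flash_var* sec, size_t n,
                           flash_var* ram, size_t ram_cap)
{
  size_t kept = 0;
  for (size_t i = n; i-- > 0; )
  {
    int seen = 0;
    if (sec[i].key == ERASED_KEY)
    {
      continue;  /* unused slot */
    }
    for (size_t j = 0; j < kept; j++)
    {
      if (ram[j].key == sec[i].key) { seen = 1; break; }
    }
    if (!seen)
    {
      assert(kept < ram_cap);
      ram[kept++] = sec[i];
    }
  }
  return kept;
}

/* Buffer live values in RAM, erase the sector, write the values back.
   Returns the number of entries in use afterwards. */
static size_t sector_compact(flash_var* sec, size_t n)
{
  flash_var ram[MAX_KEYS];
  size_t kept = collect_live(sec, n, ram, MAX_KEYS);

  memset(sec, 0xFF, n * sizeof *sec);  /* stands in for the slow erase */
  for (size_t i = 0; i < kept; i++)
  {
    sec[i] = ram[i];
  }
  return kept;
}
```

With a full four-entry sector holding (0x01, 0xAA), (0x02, 0xBB), (0x01, 0xCC), (0x02, 0xDD), compaction leaves two live entries and two erased slots. The erase still happens, and the extra bookkeeping around it is added on top, which is the point made in the next paragraph.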

And here is the major logic flaw: if your program can handle the special case when the sector gets filled and you have to erase, then why can't it handle the very same case each and every time? That is, since your program must be able to handle this anyway, you could just as well erase the whole sector every time.

So it turns out that the algorithm only saves time in the best-case scenario, which is quite useless, since it must be written to function in the worst-case scenario anyhow. And in the worst-case scenario, which is what you must design for, it doesn't save any time at all. In fact, the extra complexity that the algorithm introduced made the worst-case scenario take more time.

That is why this kind of algorithm is somewhat confused by design. In a properly designed real-time system, it only saves flash write cycles, nothing else.

So to sum it up, I would advise against using this kind of algorithm. Instead, pick an MCU with proper data flash. It will have smaller sectors, fast erase times and more write cycles.