I am looking into implementing a FIFO that can hopefully be reused in multiple devices.
The FIFO template shall expose these methods push
pop
and len
. I want the size of the FIFO to be defined as a parameter and I want the type the FIFO holds to be any type of standard integer types (uint8, uint16, uint32, uint64, int8, int16, int32 and int64). Pushing a 64-bit integer into a 8-bit FIFO shall cause truncation. The FIFO shall also support checkpointing.
I started with this code but it does not compile:
template fifo {
param fifo_size;
saved uint64 buf[fifo_size];
}
If there are only a very small, restricted set of types you need to support, then creating a separate FIFO template for each type may be advisable (using #if
to ensure the buf
declaration works). Such an implementation would be the easiest to understand, while still providing the greatest number of benefits for users of the templates -- its only cost is the excessive boiler-plate needed to define the templates.
However, there is an alternate approach leveraging pseudo-generics which allows you to write a template that scales up to any number of supported types, and has almost all of the same benefits as creating duplicate templates for each type.
Here is how such a fifo
template would look using that approach:
template fifo {
// Maximum size of the FIFO
param fifo_size;
// Any expression, whose type dictates the type of the elements of the FIFO
param fifo_type_carrier;
// Internal
saved uint32 fifo_buf_start;
// Partially internal: should not be modified by the user
saved uint32 len;
// This saved declaration relies on instance-specific information;
// #if (true) is thus needed to prevent DMLC from trying to make it
// a member of the the template type
#if (true) {
saved typeof(fifo_type_carrier) fifo_buf[fifo_size];
}
// Since their signatures are instance-specific, 'push' and 'pop' can't be
// members of `fifo`'s template type.
// Throws if the FIFO is full
method push(typeof(fifo_type_carrier) elem) throws {
if (len >= fifo_size) throw;
fifo_buf[(fifo_buf_start + len++) % fifo_size] = elem;
}
// Throws if the FIFO is empty
method pop() -> (typeof(fifo_type_carrier)) throws {
if (len == 0) throw;
local typeof(fifo_type_carrier) elem = fifo_buf[fifo_buf_start];
fifo_buf_start = (fifo_buf_start + 1) % fifo_size;
--len;
return elem;
}
}
Example usage:
group a_uint64_fifo is fifo {
param fifo_type_carrier = *cast(NULL, uint64 *);
param fifo_size = 32;
}
The greatest downside of fifo
compared to duplicated, type-specific templates is that push
and pop
are not members of the template type. However, this actually addressable -- you can create a child template of fifo
for a specific type, and declare push
and pop
to be shared
in that template -- making them members of the child template's template type.
For example:
template fifo_uint64 is fifo {
param fifo_type_carrier = *cast(NULL, uint64 *);
shared method push(uint64 elem) throws;
shared method pop() -> (uint64) throws;
}
Example usage:
group a_uint64_fifo is fifo_uint64 {
param fifo_size = 32;
}
method m() -> (uint64) throws {
local fifo_uint64 x = cast(a_uint64_fifo, fifo_uint64);
// allowed
x.push(4);
return x.pop();
}
As strange as it may seem to have non-shared
methods defined by a parent template be declared shared
by a child template, it is actually perfectly allowed by DMLC. You can do the same thing with parameters, too -- an untyped param
declared by a parent template can be made a typed param
of a child template. This pattern is called "late shared", in want of a better name.
The one benefit the above lacks compared to FIFO templates with duplicated implementations is that the underlying implementations of pop
and push
aren't shared
(even if the methods themselves are declared shared
by a child template). What this means is that the C functions for them are generated for every object that instantiates the template -- which could inflate compilation times if a particular DML model has a lot of such objects. In contrast, with templates defined by duplicating code, you could make the implementations of push
and pop
shared (although not without a small wrapper method to gain access to the fifo_buf
, whose implementation can't be shared). In theory, such a template would scale better with growing number of instances than one based on the type-generic fifo
.