First of all, I'm not a very experienced programmer. I'm using Delphi 2009 and have been working with sets, which seem to behave very strangely and even inconsistently to me. I guess it might be me, but the following looks like there's clearly something wrong:
unit test;
interface
uses
Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms,
Dialogs, StdCtrls;
type
TForm1 = class(TForm)
Button1: TButton;
Edit1: TEdit;
procedure Button1Click(Sender: TObject);
private
test: set of 1..2;
end;
var Form1: TForm1;
implementation
{$R *.dfm}
procedure TForm1.Button1Click(Sender: TObject);
begin
test := [3];
if 3 in test then
Edit1.Text := '3';
end;
end.
If you run the program and click the button, then, sure enough, it will display the string "3" in the text field. However, if you try the same thing with a number like 100, nothing will be displayed (as it should, in my opinion). Am I missing something or is this some kind of bug? Advice would be appreciated!
EDIT: So far, it seems that I'm not alone with my observation. If someone has some inside knowledge of this, I'd be very glad to hear about it. Also, if there are people with Delphi 2010 (or even Delphi XE), I would appreciate it if you could do some tests on this or even general set behavior (such as "test: set of 256..257") as it would be interesting to see if anything has changed in newer versions.
I was curious enough to take a look at the compiled code that gets produced, and I figured out the following about how sets work in Delphi 2010. It explains why you can do test := [8]
when test: set of 1..2
, and why Assert(8 in test)
fails immediately after.
An set of byte
has one bit for every possible byte value, 256 bits in all, 32 bytes. An set of 1..2
requires 1 byte but surprisingly set of 100..101
also requires one byte, so Delphi's compiler is pretty smart about memory allocation. On the othter hand an set of 7..8
requires 2 bytes, and set based on a enumeration that only includes the values 0
and 101
requires (gasp) 13 bytes!
Test code:
TTestEnumeration = (te0=0, te101=101);
TTestEnumeration2 = (tex58=58, tex101=101);
procedure Test;
var A: set of 1..2;
B: set of 7..8;
C: set of 100..101;
D: set of TTestEnumeration;
E: set of TTestEnumeration2;
begin
ShowMessage(IntToStr(SizeOf(A))); // => 1
ShowMessage(IntToStr(SizeOf(B))); // => 2
ShowMessage(IntToStr(SizeOf(C))); // => 1
ShowMessage(IntToStr(SizeOf(D))); // => 13
ShowMessage(IntToStr(SizeOf(E))); // => 6
end;
Conclusions:
set of byte
, with 256 possible bits, 32 bytes.set of 1..2
it probably only uses the first byte, so SizeOf()
returns 1. For the set of 100.101
it probably only uses the 13th byte, so SizeOf()
returns 1. For the set of 7..8
it's probably using the first two bytes, so we get SizeOf()=2
. This is an especially interesting case, because it shows us that bits are not shifted left or right to optimize storage. The other interesting case is the set of TTestEnumeration2
: it uses 6 bytes, even those there are lots of unusable bits around there.Test 1, two sets, both using the "first byte".
procedure Test;
var A: set of 1..2;
B: set of 2..3;
begin
A := [1];
B := [1];
end;
For those understand Assembler, have a look at the generated code yourself. For those that don't understand assembler, the generated code is equivalent to:
begin
A := CompilerGeneratedArray[1];
B := CompilerGeneratedArray[1];
end;
And that's not a typo, the compiler uses the same pre-compiled value for both assignments. CompiledGeneratedArray[1] = 2
.
Here's an other test:
procedure Test2;
var A: set of 1..2;
B: set of 100..101;
begin
A := [1];
B := [1];
end;
Again, in pseudo-code, the compiled code looks like this:
begin
A := CompilerGeneratedArray1[1];
B := CompilerGeneratedArray2[1];
end;
Again, no typo: This time the compiler uses different pre-compiled values for the two assignments. CompilerGeneratedArray1[1]=2
while CompilerGeneratedArray2[1]=0
; The compiler generated code is smart enough not to overwrite the bits in "B" with invalid values (because B holds information about bits 96..103), yet it uses very similar code for both assignments.
set of 1..2
, test with 1
and 2
. For the set of 7..8
only test with 7
and 8
. I don't consider the set
to be broken. It serves it's purpose very well all over the VCL (and it has a place in my own code as well).set of 1..2
behave the same as set of 0..7
is the side-effect of the previous lack of optimization in the compiler.var test: set of 1..2; test := [7]
) the compiler should generate an error. I would not classify this as a bug because I don't think the compiler's behavior is supposed to be defined in terms of "what to do on bad code by the programmer" but in terms of "what to do with good code by the programmer"; None the less the compiler should generate the Constant expression violates subrange bounds
, as it does if you try this code:(code sample)
procedure Test;
var t: 1..2;
begin
t := 3;
end;
{$R+}
, the bad assignment should raise an error, as it does if you try this code:(code sample)
procedure Test;
var t: 1..2;
i: Integer;
begin
{$R+}
for i:=1 to 3 do
t := i;
{$R-}
end;