I am working on a toy language written using the llvm c++ api and I am trying to implement arrays. I have tried several different things none of which have worked very well.
Here is what I am going for:
[8 x [8 x [8 x [...]
)Ideally, they would resemble arrays in swift.
Currently I am using the basic array type. This checks all of the boxes except that while a one dimensional array can be passed to and returned from a function, multidimensional (nested) arrays must be passed by pointer. This means that I have to not only bitcast the returned pointer to the proper type but then loop through every element and store it in the appropriate place. Quickly this becomes not only very messy but also slow.
As far as I can tell, you cannot have nested vectors. eg. this does not work:
<8 x <8 x i32>>
Lastly, I tried using malloc to allocate the appropriate space then just use pointers for everything. This had many of the same problems as my current solution.
Take the following example:
%1 = alloca [5 x i32], align 16
%2 = alloca i32*, align 8
%3 = getelementptr inbounds [5 x i32], [5 x i32]* %1, i32 0, i32 0
%4 = call i32* @_Z8getArrayPi(i32* %3)
store i32* %4, i32** %2, align 8
There is no way to store the pointer %4
into %1
so a new allocation must be created. Otherwise (with nested arrays) each element would need to be copied and there would be the same problem as in my current solution. Also, ideally the compiler would deal with all the pointers, so it would just look like an array to the user of the language.
In case it was not clear - my question is how do I implement arrays so that I can do all of the things listed above (without having to copy every element).
Preamble
I think alloca
is the correct approach if they're going to be fixed size but you'll have to write quite a bit of boilerplate to implement guaranteed (C++ style) elision to optimize function returns.
Now the problem is in ownership semantics, if you want to do it perfectly you will need to determine their lifetime and establish whether they need heap allocations or can get away with stack allocations only. But let's ignore optimization for now. The Swift/ARC implementation is quite messy behind the scenes and uses a lot of optimizations so you don't necessarily want to model it either, let alone the gaps between Swift/ObjC ARC code an "unmanaged" code.
Simple solution "like Swift"
If you don't want to deal with that, assuming you know all the possible exit paths from a function, you can allocate your arrays using some form of a runtime on the heap and provide intrusive reference counting semantics. This is the model Swift uses (with a ton of optimizations on top) - you heap allocate every array and implement a control block to track the reference count and size of your array (if you want dynamic boundary checking for example). In this case, you would be going with simple pass-by-reference semantics, as such:
Value
to track those) is passed in, you bump up the refcount.That's pretty much the basics of it an is a very very dumbed down version of how ARC works. Now here's things to think about:
I would suggest starting off with a refcounted object that's going to be a base of everything managed in your language including your array types and carry some sort of type information with it. (like a far less complex version ofNSObject
in libobjc4
).