smalltalk, pharo, squeak

Is there an upper limit to the number of objects in a Smalltalk image?


I'm putting together an NLP experiment in which concepts are agents in a system designed to engender emergent properties consisting of new concepts (here's a link for those who don't know what emergence is). Smalltalk (specifically the Pharo dialect) appears to be ideal for this kind of application because of the ease with which I can create fully encapsulated concept objects that relate to one another as independent agents, and because Smalltalk lets me inspect the state of the system while it's running.

My concern is whether the system will start to choke if too many objects are present and all sending messages to one another. In theory, my implementation could engender millions of concept objects, and I don't want to devote the time to working this out in Smalltalk if the system can't handle something that large.

  1. Are there limiting factors (software factors, not hardware) regarding the quantity of active objects in a Smalltalk image?

  2. Can the system handle the message traffic that would be present in a system with millions of chatty objects?

Thank you in advance for your help!


Solution

  • I believe the internal working size of object pointers within Pharo is still 32 bits. There's been chatter about 64-bit versions, but it's one thing to have a 32-bit VM running on a 64-bit machine and another to have a VM that is 64-bit through and through.

    So there's an implicit limit right there, but still room for "millions" of objects. Start reaching into the "hundreds of millions" and you may well bump into some limits.

    Having millions of objects isn't really the issue in the end. The question moves to threads of control, and Pharo doesn't do much threading in that case. So it really comes down to how many distinct contexts you will have, not necessarily how many objects.

    Having a chain of millions of objects talking to each other isn't really a big deal. You'll simply run into whatever message-passing overhead there is in the underlying VM, and that will limit raw performance. Pharo is pretty fast, but it's not Java fast. Whether it's fast enough for you is for you to answer.

    I also can't speak to how well the Pharo GC handles millions of live objects. I can only point out that it's 2013, Squeak (upon which Pharo is based) has been around since the mid-90s, and GC tech is pretty much mature now, so I don't suspect that Pharo's GC is spectacularly awful in this regard.

    I would simply run some micro-benchmarks and see for yourself.
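
    As a starting point, here is a minimal sketch of such a micro-benchmark, meant to be pasted into a Pharo Playground. It times three things the answer touches on: allocating a million live objects, sending one message to each of them, and forcing a full garbage collection while they are all still reachable. The object count, the use of plain `Object` instances, and `hash` as the test message are assumptions chosen for illustration; real concept objects will be larger and will do more work per send, so treat the numbers as a rough lower bound rather than a prediction.

    ```smalltalk
    "Hypothetical micro-benchmark. The count, the plain Object instances and
     the #hash message are stand-ins, not the asker's actual design."
    | count objects allocMs sendMs gcMs |
    count := 1000000.

    "1. Allocate one million objects and keep them all reachable."
    allocMs := Time millisecondsToRun: [
        objects := Array new: count.
        1 to: count do: [ :i | objects at: i put: Object new ] ].

    "2. Send one cheap message to every object to gauge send overhead."
    sendMs := Time millisecondsToRun: [
        objects do: [ :each | each hash ] ].

    "3. Force a full collection while every object is still live."
    gcMs := Time millisecondsToRun: [ Smalltalk garbageCollect ].

    Transcript
        show: 'allocate: ', allocMs printString, ' ms'; cr;
        show: 'send:     ', sendMs printString, ' ms'; cr;
        show: 'full GC:  ', gcMs printString, ' ms'; cr.
    ```

    Raising `count` by a factor of ten at a time, and swapping in a class shaped like your own concept objects, should show fairly quickly where the image starts to struggle on your machine.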