I'm working on an app where users can simulate tests and answer them offline. I have software that takes the data from my database (the questions, alternatives, question type, etc.) and turns it into an array.
I don't know which approach is the most efficient (memory-wise): creating one object with a big array holding all the questions, creating separate objects (one per subject, for example) each with its own array, or creating multiple arrays in the same object. Is it OK to create an array with about 1000 arrays inside, or is it better to split it into, say, 10 arrays with 100 arrays inside each?
P.S.: During the test I will only use 30 items from the array, so I'll take the entries from the big array (or from the multiple arrays) and add them to a small 30-entry array that will be created according to the user's inputs.
What I would like to use
I would like a big array, because it would be easier for me to sort and create random tests. Some people are saying 1000 entries isn't too much, so I think I'll stick with one big array. What would be too big? 10k? 100k?
There are three kinds of efficiency you need to consider here:

- Memory efficiency; i.e. minimizing RAM usage
- CPU efficiency
- Programmer efficiency; i.e. minimizing the amount of your valuable time spent
Note that the above criteria work against each other.
Memory efficiency

The memory size in bytes of an array of N references in Java is given by

    N * reference_size + array_header_size + padding

where:

- `reference_size` is the size of a reference in bytes (typically 4 or 8)
- `array_header_size` is typically 12 bytes
- `padding` is greater than or equal to zero, and less than the heap node size granularity

The array itself also has a unique reference which must be held in memory somewhere.
So, if you split a large array into M smaller arrays, you will be using at least `(M - 1) * 16` extra bytes of RAM, and possibly more. On the other hand, we are talking about bytes here, not kilobytes or megabytes, so this is hardly significant.
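Plugging numbers into that formula makes the point concrete. The constants below (4-byte references, a 12-byte header, 8-byte allocation granularity) are typical for a 64-bit JVM with compressed oops, but they vary between JVMs, so treat this as an estimate rather than a measurement:

```java
public class ArrayMemoryEstimate {
    static final int REF_SIZE = 4;  // bytes per reference (typically 4 or 8)
    static final int HEADER = 12;   // typical array header size in bytes
    static final int ALIGN = 8;     // assumed heap allocation granularity

    // N * reference_size + array_header_size + padding,
    // where padding rounds the total up to the next alignment boundary.
    static long arrayBytes(long n) {
        long raw = n * REF_SIZE + HEADER;
        return (raw + ALIGN - 1) / ALIGN * ALIGN;
    }

    public static void main(String[] args) {
        long one = arrayBytes(1000);        // one 1000-entry array: 4016 bytes
        long split = 10 * arrayBytes(100);  // ten 100-entry arrays: 4160 bytes
        System.out.println(split - one);    // 144 bytes of extra overhead
    }
}
```

With M = 10, the 144-byte difference is exactly the `(M - 1) * 16` lower bound: trivial next to the questions themselves.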
CPU efficiency

This is harder to predict. The CPU utilization effects will depend largely on what you do with the arrays, and how you do it.
If you are simply subscripting (indexing) an array, that operation doesn't depend on the array size. But if you have multiple arrays (e.g. an array of arrays) then there will be additional overhead in determining which array to subscript.
If you are searching for something in an array, then the larger the array you have to search, the longer it will take (on average). But if you split a large array into smaller arrays, that doesn't necessarily help ... unless you know beforehand which of the smaller arrays to search.
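To illustrate the search point, here is a sketch that counts comparisons. The 10 × 100 "subject bucket" layout is hypothetical; the numbers show that splitting only pays off when you can tell up front which bucket to search (here, `id / 100`):

```java
public class SearchCost {
    // Linear search over one array. Returns the number of comparisons made,
    // negated if the target was not found.
    static int scanCost(int[] arr, int target) {
        for (int i = 0; i < arr.length; i++) {
            if (arr[i] == target) return i + 1;
        }
        return -arr.length;
    }

    // Cost of searching bucket after bucket when we don't know where
    // the target lives.
    static int blindCost(int[][] buckets, int target) {
        int total = 0;
        for (int[] b : buckets) {
            int c = scanCost(b, target);
            total += Math.abs(c);
            if (c > 0) break;  // found it
        }
        return total;
    }

    // 10 hypothetical "subject" buckets of 100 question ids each: 0..999.
    static int[][] makeBuckets() {
        int[][] buckets = new int[10][100];
        for (int id = 0; id < 1000; id++) buckets[id / 100][id % 100] = id;
        return buckets;
    }

    public static void main(String[] args) {
        int[][] buckets = makeBuckets();
        // Not knowing the bucket: same worst-case cost as one big array.
        System.out.println(blindCost(buckets, 999));            // 1000
        // Knowing the bucket up front (id / 100): only one array to scan.
        System.out.println(scanCost(buckets[999 / 100], 999));  // 100
    }
}
```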
Programmer efficiency

It will probably make your code more complicated if you use multiple arrays rather than one. More complicated code means more programmer effort in all phases of the application's development and maintenance lifecycle. It is hard to quantify how much extra effort is involved. However, programmer effort means cost (paying for salaries) and time (deadlines, time to market, etc.), and this is likely to outweigh any small savings in memory and CPU.
You said:

> Some people are saying 1000 entries aren't too much, so I think I'll stick to a big array. What would be too big? 10k, 100k?
Once again, it depends on the context. In reality, the memory used for an array of 100K instances of X depends largely on the average size of X. You will most likely run out of memory representing the X instances themselves, not the array that holds them.
So, if you want your application to scale up indefinitely, you should probably change the architecture so that it fetches the questions / answers from the database on demand rather than loading them all into memory on start up.
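Here is one possible shape for that on-demand architecture; the `QuestionSource` and `InMemorySource` names are my own invention, and a real implementation would run a query per test (e.g. SQLite's `ORDER BY RANDOM() LIMIT ?`) instead of the in-memory stand-in. This also covers the "pick 30 entries for a test" requirement from the question:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class OnDemand {
    // Hypothetical interface: the app asks only for the questions it needs,
    // instead of loading the whole question bank at start-up.
    interface QuestionSource {
        // A database-backed implementation could run something like
        // "SELECT ... ORDER BY RANDOM() LIMIT ?" (SQLite syntax) per test.
        List<String> fetchRandom(int count);
    }

    // In-memory stand-in so the sketch runs without a real database.
    static class InMemorySource implements QuestionSource {
        private final List<String> all;

        InMemorySource(List<String> all) { this.all = all; }

        public List<String> fetchRandom(int count) {
            List<String> copy = new ArrayList<>(all);
            Collections.shuffle(copy);  // random order, then take the front
            return copy.subList(0, Math.min(count, copy.size()));
        }
    }

    public static void main(String[] args) {
        List<String> bank = new ArrayList<>();
        for (int i = 0; i < 1000; i++) bank.add("Q" + i);
        QuestionSource source = new InMemorySource(bank);
        System.out.println(source.fetchRandom(30).size());  // 30
    }
}
```

The point of the interface is that the rest of the app doesn't care whether the 30 questions come from one big in-memory array or from a database query, so you can start simple and swap the implementation later.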
Donald Knuth is often (mis-)quoted¹ as saying:

> "Premature optimization is the root of all evil."
What he is pointing out is that programmers are inclined to optimize things that don't really need optimizing, or spend their effort optimizing the wrong areas of their code based on incorrect intuitions.
My advice on this is the following:
- Don't do fine-grained optimization too early. (This doesn't mean that you should ignore efficiency concerns in the design and coding stages, but my advice would be to consider only the major issues; e.g. complexity of algorithms, granularity of APIs and database queries, and so on. Especially things that would be a lot of effort to fix later.)
- If and when you do your optimization, do it scientifically: measure with benchmarks and a profiler rather than relying on intuition.
- Set some realistic goals (or time limits) for your optimization and stop when you reach them.
¹ The full quotation is more nuanced. Look it up. And in fact, Knuth is himself quoting Tony Hoare. For a deeper exploration of this, see https://ubiquity.acm.org/article.cfm?id=1513451