Why were square waves used in old hardware instead of sine waves?

This answer here postulates that to actually generate a square wave (or any other abstract wave-shape) you have to layer multiple sine waves on top of each other. Yet old hardware (Commodore, NES, etc) lacked sine wave channels and instead relied heavily on square pulse-waves, triangle waves, noise and sawtooth waves. I always assumed this was done because those waves are easier to generate than a simple sine wave. So,would genereating these wave shapes not be computationally more expensive? Why was it done anyway?

Solution

This answer here postulates that to actually generate a square wave […] you have to layer multiple sine waves on top of each other.

Not really, it just describes how a square wave can be analyzed to prove certain facts about its sound - how much energy is in each frequency band and such. This is somewhat similar to how every integer can be factored into one or more smaller prime factors (15=3×5) which is useful when analyzing algorithms but still doesn't change how we came up with the original number (maybe counting 15 sheep).

Separating a "complex" wave into sinusoidal components are very useful mathematically, but does not tell us the mechanism behind its original creation.

I always assumed this was done because those waves are easier to generate than a simple sine wave.

Your assumption here is correct. Starting with a digital circuit, the square wave is the easiest and cheapest waveform to create¹. Just turn a voltage on and off using a single transistor. It is also cheaper in a mass-market manufacturing context because a sine wave generator (and even a saw-tooth) made from analog electronics will require a lot of extra components in order to not drift with temperature, age, and humidity.

It is also arguably more useful in a synthesizer context than one single sine wave because it has a lot of harmonics you can modify with a filter like in the SID.

The next step on the complexity ladder is any ramp-shape, like the triangle or saw-tooth. While you can make these using analog electronics, even back in the early eighties they were typically implemented by a simple DAC driven by a digital counter. The rate of the counter determined how fast the waveform goes from 0 to MAX and thus determined the pitch.

Once you have your DAC in your computer you could use it to generate a sine wave but it requires either impossibly expensive real-time calculations or a large table of pre-calculated sine values, so it was rarely (never?) done. When computers got some useful amount of RAM and bandwidth, they quickly switched to plain arbitrary samples and never looked back.

^{1) In fact, anything else is so much more complicated that today we just do everything using simple digital pulses and just filter the result in various ways (PDM, PWM, Delta-sigma)}