Tags: jenkins, incredibuild

Understanding Jenkins: do Jenkins agent nodes require the same environment as the controller node? How exactly does an agent work?


(master / slave terminology is now controller / agent)

I am learning about Jenkins build automation and I have trouble understanding how the controller/agent structure works.

In my setup, there is a Jenkins controller node on the same computer as the source to be built. The source consists of gigabytes of data (including the source code, the compiler, and some resource data). To be precise, the "compiler" is the Unity3D editor (which itself consists of several gigabytes of data) and the "source code" is the scripts and assets of a Unity project.

The Jenkins job is supposed to run the "compiler" on the "source code" and generate some outputs. In a controller/agent configuration, the process is supposed to be parallelized over multiple agent nodes (Jenkins agents).

Now here comes the part I struggle to understand: if I were to set up multiple agent nodes (Jenkins agents) on multiple machines, do I need all those agent machines to have the exact same "compiler" and "source code" on them for Jenkins to work correctly?

On one hand, based on my reading it seems there is no need to do so: the agent machine only needs a "Jenkins agent" to be set up for everything to work properly. The agent just "magically" knows about the "compiler" and the "source code" on the remote controller machine, without the need to either install the "compiler" or synchronize the "source code".

On the other hand, I just can't imagine how this could be done with gigabytes of data involved, to the extent that I am confused about whether my understanding of the reading is correct. Even if the "source code" can be transferred in pieces, the "compiler" must be fully executable and cannot be transferred piecewise, right?

Any explanation would be appreciated.


Solution

  • I have also been learning Jenkins for game development and, as far as I understand, the first use of agents is in separating scheduling from building. The idea is that you have one controller, which can run anywhere, and one or more agents that run on different hardware. This way, when an agent is working, the controller doesn't lose the resources it needs to do its own job. You might have just one agent, or a Windows machine and a Mac that each build for their respective platforms, or any number of agents; the point is not bogging down the controller (see the first Jenkinsfile sketch after this answer).

    AFAIK, Jenkins pipelines added parallelism, but it is optional and there isn't any magic. For Unreal, there is a product called Incredibuild that does distributed compilation, but Jenkins wouldn't handle that. A cursory search suggests that Unity does not offer a way to support distributed compilation.

    Note: I haven't done any of this, but I think parallel agents could be useful if you had some preprocessing step like down-sampling all your textures. You could have parallel actions process different folders across multiple agents, and when they are all done, go back to a non-parallel build (the second sketch after this answer illustrates the shape of such a pipeline). Again, this is not a great example, but the point is that I think it's up to you to figure out whether there are pieces of your process that could be made parallel.
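
To make the label idea concrete, here is a minimal declarative Jenkinsfile sketch, assuming agent labels named `windows` and `mac` and illustrative Unity command lines; substitute whatever labels, paths, and build flags your own setup uses. Note that each stage runs `checkout scm`, so the agent checks the project out of version control into its own workspace rather than copying it from the controller.

```groovy
// Minimal sketch: the controller only schedules, labelled agents do the work.
// The labels 'windows' and 'mac' and the Unity command lines are assumptions;
// adapt them to your own nodes and project.
pipeline {
    agent none                            // run nothing heavy on the controller
    stages {
        stage('Windows build') {
            agent { label 'windows' }     // picked up by any agent with this label
            steps {
                checkout scm              // the agent checks the project out itself
                bat 'Unity.exe -batchmode -quit -projectPath . -buildWindows64Player Build\\win\\Game.exe'
            }
        }
        stage('macOS build') {
            agent { label 'mac' }
            steps {
                checkout scm
                sh '/Applications/Unity/Unity.app/Contents/MacOS/Unity -batchmode -quit -projectPath . -buildOSXUniversalPlayer Build/mac/Game.app'
            }
        }
    }
}
```

In practice many Unity teams call a custom static build method with `-executeMethod` instead of the build-player flags, but the controller/agent structure stays the same.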
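
And here is a sketch of the down-sampling idea from the note above, assuming a hypothetical `worker` label, made-up folder names, and made-up `downsample.sh`/`build.sh` scripts; only the parallel-then-serial structure is the point.

```groovy
// Sketch only: the 'worker' label, folder names and scripts are hypothetical.
pipeline {
    agent none
    stages {
        stage('Down-sample textures') {
            parallel {
                stage('Textures A') {
                    agent { label 'worker' }
                    steps {
                        checkout scm
                        sh './tools/downsample.sh Assets/Textures/A'
                    }
                }
                stage('Textures B') {
                    agent { label 'worker' }
                    steps {
                        checkout scm
                        sh './tools/downsample.sh Assets/Textures/B'
                    }
                }
            }
        }
        stage('Build') {
            agent { label 'worker' }      // back to a single, non-parallel stage
            steps {
                checkout scm
                sh './tools/build.sh'
            }
        }
    }
}
```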