I am looking into how to set up a liferay project with version control and automated deployment. I have a working local development environment in eclipse, but as far as I understand it, setting up a portal in liferay is in part the liferay portal instance running on tomcat and then my custom module projects for customization. I basically want all of that in one git repository which can then be
1: cloned by any developer to set up their local dev environment
2: built and deployed by eg. jenkins into eg. AWS
I have looked at the liferay documentation regarding creating a docker container for the portal, but I don't fully understand how things like portal content would be handled.
I would be very grateful if someone could lead me in the right direction on how a environment like this would be set up.
Code and content are different beasts. Set up a local Liferay instance for every single developer. Share/version the code through whatever version control (you mention git).
This way, every developer can work on their own project, set breakpoints, and create content that doesn't interfere with other developers.
Set up a separate integration test environment, that gets its code exclusively through your CI server, never gets touched manually.
Your production (or preproduction) database will likely have completely different content: Where a developer is quick to create a few "Lorem Ipsum" posts and pages, you don't want them to escape into production. Thus there's no movement of content from development to production. Only code moves that way.
In case you want your developers to work on a production-like environment, you can restore the production content (database) to development machines. Note that this is risky though: The database also contains user accounts, and you might trigger update notification mails from your development machines - something that you want to avoid at all costs. Plus, this way you give developers access to login data (even though it's hashed) which can be abused. And it might even be explicitly forbidden by industry regulations to use production data in development environments.
In general: Every system has its own database (at least their own schema), document store and indexing server. Every developer has their own portal JVM running. The other environments (integration test, load test, authoring, production) are also separate environments. And no, you don't need all of them all the time.
I can't attribute this quote (Milen can - see his comment), but it holds here:
Everybody has a testing environment. Some are lucky to run a completely different production environment.
Be the lucky one. If everyone has their own fully separated environment, nobody is stepping on each other's shoes. And you'll need the integration tests (with the CI output) anyway.