When using an on-premises (self-hosted) API gateway like Kong, should it run on a node within the main Kubernetes cluster, or should it run as a separate Kubernetes cluster?
Unless you have an amazing reason to do otherwise: run Kong within the cluster. Pretty much the last thing you'd want is for all API requests to bomb because of a severed connection between cluster-A and cluster-B, not to mention the horrible latency as requests hop from one layer of abstraction to another.
Taking a page from the nginx Ingress controller, you also have the opportunity to use the Endpoint API to bypass the iptables-based Service machinery, saving even more latency and system resources, a trick that would be almost impossible with a multi-cluster configuration.
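To make the Endpoint trick concrete, here is a minimal sketch in Python of turning an Endpoints object into a list of pod addresses that a gateway could proxy to directly instead of going through the Service's cluster IP. The helper name upstream_targets and the SAMPLE data are made up for illustration; in a real cluster the object would come from the API server (GET /api/v1/namespaces/{namespace}/endpoints/{name}), and this is not necessarily how Kong or the nginx controller implement it internally.

```python
from typing import Dict, List


def upstream_targets(endpoints: Dict, port_name: str = "") -> List[str]:
    """Return 'ip:port' strings for every ready pod listed in an Endpoints object.

    Only 'addresses' are used; 'notReadyAddresses' are skipped on purpose so
    traffic is never sent to pods that are failing their readiness checks.
    """
    targets = []
    for subset in endpoints.get("subsets") or []:
        # Optionally keep only the port with the requested name.
        ports = [
            p["port"]
            for p in subset.get("ports") or []
            if not port_name or p.get("name") == port_name
        ]
        for addr in subset.get("addresses") or []:
            for port in ports:
                targets.append(f"{addr['ip']}:{port}")
    return targets


# Hypothetical Endpoints object, shaped like the Kubernetes API response.
SAMPLE = {
    "kind": "Endpoints",
    "metadata": {"name": "my-api", "namespace": "default"},
    "subsets": [
        {
            "addresses": [{"ip": "10.0.1.5"}, {"ip": "10.0.2.7"}],
            "notReadyAddresses": [{"ip": "10.0.3.9"}],
            "ports": [{"name": "http", "port": 8080, "protocol": "TCP"}],
        }
    ],
}

if __name__ == "__main__":
    for target in upstream_targets(SAMPLE, port_name="http"):
        print(target)
```

A gateway running inside the cluster can reach those pod IPs directly, which is why this only really works when Kong shares the cluster (and its pod network) with the workloads.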
If I recall correctly, there are even Kong-based Ingress controllers, which could save you even more heartache if their feature set and your needs align.