I have started reading Martin Fowler's well-known book, Patterns of Enterprise Application Architecture.
I should mention that I am reading a translation into my native language, which might be the reason for my misunderstanding.
I found these definitions (translated back into English):
Response time - the amount of time it takes to process an external request.
Latency - the minimum amount of time before getting any response.
To me they look the same. Could you please explain the difference?
One way of looking at this is to say that transport latency + processing time = response time.
Transport latency is the time it takes for a request to reach the processing component and for the response to travel back. On top of that you add the time it takes to actually process the request.
As an example, say that 5 people try to print a single sheet of paper at the same time, and the printer takes 10 seconds to process (print) each sheet. Here the latency is the time a request spends waiting before the printer starts working on it.
The person whose print request is processed first sees a latency of 0 seconds and a processing time of 10 seconds - so a response time of 10 seconds.
The person whose print request is processed last sees a latency of 40 seconds (4 sheets ahead of theirs, at 10 seconds each) and a processing time of 10 seconds - so a response time of 50 seconds.
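If it helps, the printer example can be written out as a small Python sketch. This is only an illustration of the arithmetic above: the function name queue_timings and its parameters are made up for this answer, and it assumes the printer handles one sheet at a time in arrival order, with every sheet taking the same time.

```python
def queue_timings(num_jobs, processing_time):
    """Return a (latency, response_time) pair for each job in a single queue.

    Assumption for illustration: jobs are served one at a time, in order,
    and each job takes the same processing_time (in seconds).
    """
    timings = []
    for position in range(num_jobs):
        # Latency here is the time spent waiting for the jobs ahead to finish.
        latency = position * processing_time
        # Response time = latency + processing time.
        response_time = latency + processing_time
        timings.append((latency, response_time))
    return timings


timings = queue_timings(num_jobs=5, processing_time=10)
for person, (latency, response_time) in enumerate(timings, start=1):
    print(f"person {person}: latency={latency}s, response time={response_time}s")
```

The first entry in timings corresponds to the person served first (0 seconds latency, 10 seconds response time) and the last entry to the person served last (40 seconds latency, 50 seconds response time). The processing time is identical for everyone; only the latency changes.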