I am reading documents regarding ICE and feel puzzled in one place.
Step 1. Caller gathers transport candidates (i.e., host, STUN and TURN).
Step 2. Caller sends a SIP INVITE to callee.
Could someone help present a bigger picture? Thanks a lot.
The bigger picture is that call setup travels over a separate signaling channel, which goes through a server. The SIP INVITE
would typically go through a SIP proxy or, in web-based setups, a web server.
ICE is used to set up direct connectivity between the two clients so that the bulk of the data does not need to go through that server.
This P2P channel is typically used either for real-time data, which is latency sensitive, or for bulk data that would be expensive to pass through the server.
So you are right: the NAT problem is already solved for the signaling path, and data could be sent through the server. ICE, however, tries to set up a direct P2P connection, which can be cheaper and have lower latency; if no direct path works, it falls back to a TURN relay.
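
To illustrate why ICE ends up preferring a direct path, here is a minimal sketch of the candidate priority formula from RFC 8445, using the type preference values that the RFC recommends. The function name and the default local preference of 65535 are assumptions made for this example. Real ICE agents rank candidate pairs rather than single candidates, so treat this only as a simplified view of the ordering.

```python
# Recommended type preferences from RFC 8445: a higher value is tried first.
TYPE_PREFERENCE = {
    "host": 126,   # address taken directly from a local interface
    "prflx": 110,  # peer-reflexive, discovered during connectivity checks
    "srflx": 100,  # server-reflexive, learned from a STUN server
    "relay": 0,    # relayed, allocated on a TURN server
}


def candidate_priority(cand_type, component_id=1, local_preference=65535):
    """Return the ICE priority of a single candidate.

    priority = 2^24 * type preference
             + 2^8  * local preference
             + (256 - component ID)

    The name and defaults are illustrative assumptions, not a library API.
    """
    type_pref = TYPE_PREFERENCE[cand_type]
    return (type_pref << 24) + (local_preference << 8) + (256 - component_id)


# Candidates as the caller might gather them in Step 1, in arbitrary order.
gathered = ["relay", "host", "srflx"]

# Sort from highest to lowest priority: direct candidates come before the relay.
ranked = sorted(gathered, key=candidate_priority, reverse=True)
print(ranked)  # ['host', 'srflx', 'relay']
```

The relayed candidate is still kept as a last resort, which is why a call can succeed even when no direct path exists.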