I've read many VoIP echo topics, such as "What is echo cancellation?" and "Causes of Echo".
Here is what I understand. Suppose A and B are on a call, and A hears his own voice (echo).
Is this right? Please correct me if I'm wrong.
You're right about the rationale, and right that you typically need AEC at both ends of a telephony link, but your point (2) is somewhat wrong about how AEC is actually implemented. I suggest starting with the Wikipedia entries on: