Isabelle - exI and refl behavior explanation needed

I am trying to understand the lemma below.

Why is the ?y2 schematic variable introduced in exI?
And why it is not considered in refl (so: x = x)?

lemma "∀x. ∃y. x = y"   
  apply(rule allI)     (*  ⋀x.      ∃y. x = y *)
  thm exI              (*  ?P ?x ⟹ ∃x. ?P x  *)
  apply(rule exI)      (*  ⋀x. x = ?y2 x      *)
  thm refl             (*  ?t = ?t            *)   
  apply(rule refl)   
done

UPDATE (because I can't format code in comments):

This is the same lemma with a different proof, using simp.

lemma "∀x. ∃y. x = y"
  using [[simp_trace, simp_trace_depth_limit = 20]]
  apply (rule allI)  (*So that we start from the same problem state. *)   
  apply (simp only:exI) 
done

The trace shows:

[0]Adding rewrite rule "HOL.exI":
?P1 ?x1 ⟹ ∃x. ?P1 x ≡ True 
[1]SIMPLIFIER INVOKED ON THE FOLLOWING TERM:
⋀x. ∃y. x = y 
[1]Applying instance of rewrite rule "HOL.exI":
?P1 ?x1 ⟹ ∃x. ?P1 x ≡ True    
[1]Trying to rewrite:
x = ?x1 ⟹ ∃xa. x = xa ≡ True   <-- NOTE: not ?y2 xa or similar!
[2]SIMPLIFIER INVOKED ON THE FOLLOWING TERM:
x = ?x1 
[1]SUCCEEDED
∃xa. x = xa ≡ True

So apparently simp and rule handles exI differently. And the remaining question is: what is the mechanical (programmatical) reasoning behind rule's behavior.

Solution

When you use rule thm for some fact thm, Isabelle performs higher-order unification of the conclusion of thm with the current goal. If there is a unifier, it is used to instantiate both the goal and the conclusion of the theorem, and then resolution is performed (i.e. the goal is replaced with the assumptions of thm).

This means that:

Schematic variables in the goal can be instantiated by rule through unification
Variables that appear only in the assumptions of thm will not be instantiated by the unification and will therefore remain schematic. That way, you end up with schematic variables in your new goals. Such variables can be seen as existential in some sense, because the conclusion of thm holds if you can prove the assumptions for just one arbitrary value.

In the case of exI, you have ?P ?x ⟹ ∃x. ?P x. When you apply rule exI, the variable ?P is instantiated to λy. x = y, but the variable ?x appears only in the assumptions of exI, so it remains schematic. This means that you can pick any value you want for ?x later on in your proof.

To be more precise, you end up with ⋀x. x = ?y2 x as your goal. You might ask ‘Why not just ⋀x. x = ?y2?’ That would mean that you have to show that x equals some fixed value y2 for all possible values of x. That is obviously not true in general. ⋀x. x = ?y2 x means you have to show that every x equals some y2 that may depend on x – or, equivalently, that there is a function y2 that, when given x, outputs x.

Of course, there is such a function and it is simply the identity function λx. x. That is precisely what ?y2 gets instantiated to when you apply rule refl: the goal x = ?y2 x is unified with the conclusion of refl ?t = ?t and you end up with ?t = x and ?y2 = λx. x, and since refl has no assumptions, this resolution finishes the proof.

I am not entirely sure what you mean with ‘And why it is not considered in refl?’, but I hope that I have answered your questions.