How should the excluded middle be used in the proof of the pigeonhole principle?

Lemma remove {A} (x : A) xs (p : In x xs) :
  exists xs', (forall x', x' <> x -> In x' xs -> In x' xs') /\ (length xs = S (length xs')).
Proof.
induction xs.
- inversion p.
- destruct p.
  + subst x0.
    exists xs.
    split.
    * intros x' neq pin.
      destruct pin.
      -- contradict neq. symmetry. assumption.
      -- assumption.
    * reflexivity.
  + destruct (IHxs H) as [xs' pxs']. clear IHxs.
    destruct pxs' as [p1 plen]. rename x0 into x'.
    exists (x' :: xs').
    split.
    * intros x'' neq pin. 
      destruct pin.
      -- subst x'. left. reflexivity.
      -- right. apply p1. assumption. assumption.
    * simpl.
      rewrite -> plen.
      reflexivity. 
Qed.

Theorem pigeonhole_principle: forall (X:Type) (l1  l2:list X),
  excluded_middle ->
  AllIn l1 l2 ->
  length l2 < length l1 ->
  repeats l1.
Proof.
induction l1; simpl; intros l2 ex_mid Hin Hlen.
- inversion Hlen.
- apply repeats_rest.
  destruct (remove x l2) as [l2' Hl2].
  + apply Hin. left. reflexivity.
  + destruct Hl2 as [Hmap Hlen'].
    rewrite Hlen' in Hlen.
    clear Hlen'.
    apply (IHl1 l2').
    1 : { assumption. }
    2 : { revert Hlen. unfold lt. intros. omega. }
    clear Hlen IHl1.
    revert Hin.
    unfold AllIn.
    intros.
    apply Hmap.
    2 : { apply Hin. right. assumption. }

1 subgoal
X : Type
x : X
l1, l2 : list X
ex_mid : excluded_middle
l2' : list X
Hmap : forall x' : X, x' <> x -> In x' l2 -> In x' l2'
Hin : forall u : X, In u (x :: l1) -> In u l2
u : X
H : In u l1
______________________________________(1/1)
u <> x

I found various solutions floating about for the pigeonhole principle on the net. The above is adapted from the one by Kovacs, but in his proof he proves that there is an absence of no duplicates rather that there are repetitions as in the SF exercise.

The marked difference is that I cannot prove the u <> x goal here because there is less information when the problem is stated in this form.

Since this problem is both hard and optional and there are existing solutions floating around - and I've already been working on it for two days, could somebody describe to me a high level plan of exactly what I need to make this proof.

I am not looking for a solution, but I am hoping that the one with the excluded middle turns out to be elegant, because the Coq proof without the excluded middle is just a mess of rewriting and knowing the source code of a program is far from understanding what it does. Most explanations of the principle just describe what it is which is not enough for me to bridge the intuition gap.

I've never seen the classical laws in action - it does not seem like knowing something is decidable would gain me much and I find it hard to see what the point of them is. This is especially so in this situation, so I am that much more interested to see what their purpose will turn out to be.

Solution

I came across this question when searching for answers while working through SF (Software Foundations), but managed to prove it myself. I'll provide a sketch for the SF version of the pigeonhole principle, using excluded_middle. The statement to prove is that if all elements in list l1 are in l2 and length l2 is less than length l1, then l1 contains repeated elements.

The proof as SF suggests begins by induction on l1. I'll omit the straightforward empty case. In the inductive case, destruct l2. The case where l2 is empty is straightforward; we consider the other case.

Now, because of induction on l1 and destructuring on l2, your statement to prove is about lists with a first member, let's call them x1::l1 and x2::l2. Your membership hypothesis now looks like: all elements in x1::l1 are in x2::l2.

But first, let us use excluded middle to state that either x1 in l1 or x1 not in l1. If x1 in l1, then trivially we can prove that x1::l1 has repeats. So moving forward, we may assume that x1 is not in l1.

It suffices to match the inductive case that there exists an l2' of the same length as l2 such that all elements of l1 are in l2'.

Now consider the membership hypothesis on x1::l1, with the forall variable introduced as x:

By hypothesis, we know that x1 in x2::l2. Now consider l2', where l2' is x2::l2 with one instance of x1 removed (use in_split). Now because x1 not in l1, we can conclude that all members of l1 are also in l2' and not the removed element. Then this satisfies the membership hypothesis in the induction, and some wrangling with lengths gives us length l2 = S (length l2'), so that the length hypothesis is satisfied. We thus conclude that l1 contains repeats, and thus also x1::l1.

Edit: previously, I also did a case analysis on whether or not x1 = x2 or x1 in l2, and whether or not x = x2 or x in l2, and solved the special cases in a more straightforward manner. It's not needed, and the general case also covers them.

Edit: correct typo length l2 = length l2