Why do these two queries return different values when their subquery has the same output?

I'm making my way through an SQL workbook, and I was wondering why these two queries return a different result? What are the differences in behavior of WHERE IN and inline views?

Query I wrote

SELECT SUM(invoice_total) AS sum_of_maxes
FROM invoices
WHERE invoice_total IN
    (SELECT MAX(invoice_total)
    FROM invoices
    WHERE invoice_total - credit_total - payment_total > 0
    GROUP BY vendor_id)

Whole query output:

	SUM_OF_MAXES
1	52680.48

Subquery output:

	MAX(INVOICE_TOTAL)
1	224
2	21842
3	90.36
4	1575
5	2433
6	17.5
7	46.21
8	662
9	9.95
10	41.8
11	503.2
12	20551.18
13	6
14	1962.13
15	2312.2
16	158

Textbook solution

SELECT SUM(invoice_max) AS sum_of_maximums
FROM (SELECT vendor_id, MAX(invoice_total) AS invoice_max
      FROM invoices
      WHERE invoice_total - credit_total - payment_total > 0
      GROUP BY vendor_id)

Whole query output:

	SUM_OF_MAXIMUMS
1	52434.53

Subquery output:

	VENDOR_ID	INVOICE_MAX
1	37	224
2	72	21842
3	80	90.36
4	83	1575
5	86	2433
6	94	17.5
7	95	46.21
8	96	662
9	97	9.95
10	102	41.8
11	106	503.2
12	110	20551.18
13	115	6
14	121	1962.13
15	122	2312.2
16	123	158

When I ran the subqueries on their own, they return the same values for MAX(invoice_total), but running the scripts in their entirety creates two different final outputs, which surprised me. Sure the methods are different but I don't know enough to understand why the results should be. Have I done something erroneous or should I expect that subqueries behave differently under FROM and WHERE?

Solution

The problem you observe when executing your incorrect query is that the same invoice which you determined as highest invoice of one vendor id appears for other vendor id's where it is not the maximum. The correct query only fetches the highest invoice per vendor id and sums them, but your query sums all other occurences, too.

You can find invoices and their vendor id's that appear multiple times with following query:

SELECT
  invoice_total, vendor_id
FROM invoices
WHERE invoice_total
IN
(SELECT 
  invoice_total
FROM invoices
GROUP BY 
  invoice_total
HAVING 
  COUNT(vendor_id) > 1)
ORDER BY invoice_total;

Or you can change your wrong query and select all columns, then sort by invoice total:

SELECT *
FROM invoices
WHERE invoice_total IN
    (SELECT MAX(invoice_total)
    FROM invoices
    WHERE invoice_total - credit_total - payment_total > 0
    GROUP BY vendor_id)
ORDER BY invoice_total;

Then you will also see the "duplicated" rows that your query includes and the correct one does not.

I created a sample fiddle based on your data where you can reproduce this.