Keras: Why do loss functions have to return one scalar per batch item rather than just one scalar?

I'm writing a custom loss function in Keras and just tripped over the following:

Why do Keras loss functions have to return one scalar per batch item rather than just one scalar?

I care about the cumulative loss for the whole batch, not about the loss per item, don't I?

Solution

I think I figured it out: fit() has a sample_weight argument that lets you assign a different weight to each sample in the batch. For that to work, the loss function has to return the loss per batch item, so that each item's loss can be scaled by its weight before the batch is reduced to a single number.
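To illustrate why the per-item shape matters, here is a minimal NumPy sketch of the mechanism. It is not Keras's actual code: the helper names per_sample_mse and weighted_batch_loss are made up for this example, and the final reduction (a plain mean over the batch) is an assumption, since the exact reduction Keras applies can depend on the version and settings.

```python
import numpy as np

def per_sample_mse(y_true, y_pred):
    # Hypothetical custom loss: one value per batch item,
    # obtained by averaging over the feature axis only.
    return np.mean((y_true - y_pred) ** 2, axis=-1)

def weighted_batch_loss(y_true, y_pred, sample_weight=None):
    # Assumed reduction step: scale each item's loss by its weight,
    # then average over the batch to get a single scalar.
    losses = per_sample_mse(y_true, y_pred)
    if sample_weight is not None:
        losses = losses * sample_weight
    return losses.mean()

y_true = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
y_pred = np.zeros((3, 2))

per_item = per_sample_mse(y_true, y_pred)          # shape (3,), one loss per item
unweighted = weighted_batch_loss(y_true, y_pred)
# A weight of 0 on the last item removes its contribution entirely.
weighted = weighted_batch_loss(y_true, y_pred, np.array([1.0, 1.0, 0.0]))
```

If per_sample_mse had already collapsed the batch into one scalar, there would be nothing left for the weights to multiply item by item, which is the point made above.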
