I was reviewing the train function in the quickstart-pytorch example and noticed a logical issue in how the average training loss is calculated.

In the current implementation (`task.py`), `running_loss` is initialized outside the epoch loop and accumulates the loss across all epochs. However, when calculating the average at the end, it is divided only by `len(trainloader)` (the number of batches in a single epoch), not by the total number of batches processed.

While the default configuration uses `epochs = 1` (making the current calculation correct), if a user increases the `epochs` hyperparameter, the reported `avg_trainloss` will be inflated by a factor of `epochs`.
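To make the issue concrete, here is a minimal, torch-free sketch of the averaging logic. The per-batch loss values and the `avg_loss` helper are hypothetical stand-ins for what the example's `train` function does; the point is only that dividing the accumulated `running_loss` by `len(trainloader)` alone over-reports the average once `epochs > 1`:

```python
def avg_loss(per_batch_losses, epochs, fix=False):
    """Mimic the example's loop: running_loss accumulates across ALL epochs."""
    running_loss = 0.0
    for _ in range(epochs):
        for loss in per_batch_losses:  # one pass over the "trainloader"
            running_loss += loss
    num_batches = len(per_batch_losses)  # len(trainloader)
    # Buggy version divides by batches in ONE epoch; the fix divides by
    # the total number of batches actually processed.
    divisor = num_batches * epochs if fix else num_batches
    return running_loss / divisor

batch_losses = [1.0, 2.0, 3.0]  # hypothetical losses; len(trainloader) == 3
print(avg_loss(batch_losses, epochs=2))            # 4.0 -> inflated by 2x
print(avg_loss(batch_losses, epochs=2, fix=True))  # 2.0 -> true mean loss
```

With `epochs = 1` both divisors coincide, which is why the default configuration masks the bug.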


Hi @lefthook, thanks for posting this here!

This is definitely something we should take a look at. Could you open an issue for this on our GitHub and reference the relevant examples?

I noticed the same thing and didn't see an issue yet, so I opened one and included a PR in case it helps.


Thank you, @junsimons!
