YOLOv5 on COCO using Flower on GCP

Hello, I am running Flower federated learning on GCP with one VM acting as the server and five VMs acting as clients. Each client VM hosts two Docker containers, for a total of ten clients. The application trains a YOLOv5 model on the COCO dataset. The server VM has 32 GB RAM and 9 vCPUs, while each client VM has 64 GB RAM and 16 vCPUs.

I start from a pretrained YOLOv5s model with 10 server rounds, 5 local epochs, a batch size of 64, and a learning rate of 0.005. To simulate a realistic federated scenario, each of the 10 clients is assigned a random number of training images, ranging from 1,000 to 10,000. Each client's validation set is fixed at 50% of its training-set size and follows the same class distribution.

The main issues are that training is extremely slow (exceeding 12 hours, and sometimes failing, which forces me to stop the application) and that training accuracy is consistently lower than validation accuracy. I would appreciate guidance on how to tune the federated and training parameters to reduce training time while improving model accuracy in this setup. Thanks.
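For reference, a minimal sketch of the partitioning I described (the variable names and the use of `random.randint` are just illustrative assumptions, not my actual data-loading code):

```python
import random

random.seed(0)  # reproducible example only

NUM_CLIENTS = 10

# Each client gets a random training-set size between 1,000 and 10,000 images,
# and a validation set fixed at 50% of its training size.
train_sizes = [random.randint(1_000, 10_000) for _ in range(NUM_CLIENTS)]
val_sizes = [n // 2 for n in train_sizes]
```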

Hi @itistawfik ,

Training YOLOv5 on COCO is already quite challenging even in a fully centralized setup, so it’s not too surprising that the federated version is both slow and unstable. FL adds significant communication overhead and optimization difficulty on top of an already heavy workload.

You could try reducing local epochs and increasing the number of server rounds, which often helps convergence for large models. Also, even if each client's train/val split has the same class distribution, the data across clients is likely still non-IID. In that case, algorithms like FedProx may perform better than plain FedAvg. I'd recommend trying out different FL algorithms (strategies in Flower) to see which one converges best.
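To make the FedProx idea concrete: each client minimizes its local loss plus a proximal term `(mu/2) * ||w - w_global||^2`, which pulls local updates back toward the global model and limits client drift under non-IID data. A pure-Python sketch of just that penalty (flattened weight lists and the function name are illustrative assumptions, not an actual Flower or YOLOv5 API):

```python
def fedprox_penalty(local_w, global_w, mu=0.01):
    """Proximal term (mu/2) * ||w - w_global||^2 added to a client's local loss.

    local_w / global_w: flattened model weights as plain floats (illustrative).
    mu: proximal coefficient; larger values keep clients closer to the server model.
    """
    return 0.5 * mu * sum((lw - gw) ** 2 for lw, gw in zip(local_w, global_w))
```

In practice you would not implement this by hand: Flower ships a FedProx strategy (`flwr.server.strategy.FedProx`, with a `proximal_mu` argument, if I recall the parameter name correctly) that you can swap in for FedAvg on the server side, although the penalty itself still has to be added to the client's training loss.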


Thank you, @pan-h, got it!

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.