Model aggregation before all clients have finished the round

johanrubak · January 22, 2025, 8:14am

Hi,
I am simulating 5 clients, and I have set up their resources like this so that only one client runs at a time, in order to maximize the batch size:

[tool.flwr.federations.local-sim-gpu]
options.num-supernodes = 5
options.backend.client-resources.num-cpus = 4
options.backend.client-resources.num-gpus = 1.0
options.backend.init_args.num_cpus = 4 # Only expose 4 CPU to the simulation
options.backend.init_args.num_gpus = 1 # Expose a single GPU to the simulation
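
As a sanity check on the resource arithmetic, here is a minimal sketch of how many clients can run at once under these settings. It assumes the backend schedules a client only when both its CPU and GPU requests fit into what the simulation exposes; the helper name `max_concurrent_clients` is made up for illustration and is not part of Flower.

```python
import math


def max_concurrent_clients(total_cpus, total_gpus, cpus_per_client, gpus_per_client):
    """Upper bound on clients that can run simultaneously (hypothetical helper).

    Assumes a client is scheduled only if both its CPU and GPU requests fit.
    """
    limits = []
    if cpus_per_client > 0:
        limits.append(math.floor(total_cpus / cpus_per_client))
    if gpus_per_client > 0:
        limits.append(math.floor(total_gpus / gpus_per_client))
    # The scarcest resource decides how many clients fit at once.
    return min(limits) if limits else None


# Values from the config above: 4 CPUs and 1 GPU exposed,
# each client asking for 4 CPUs and 1.0 GPU.
print(max_concurrent_clients(4, 1, 4, 1.0))  # 1
```

With these numbers the clients should run strictly one after another, which is the intended behaviour.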

I have set all of the following parameters in the hope that a round does not end, and the model is not aggregated, before all clients are done:

    strategy = CustomFedAvg(
        run_config=context.run_config,
        use_wandb=context.run_config["use-wandb"],
        project_name=project_name,
        fraction_fit=1.0,
        fraction_evaluate=1.0,
        min_fit_clients=5,
        min_evaluate_clients=5,
        min_available_clients=5,
        initial_parameters=parameters,
        on_fit_config_fn=on_fit_config,
        accept_failures=False,
        evaluate_fn=gen_evaluate_fn(testloader, device=server_device),
        evaluate_metrics_aggregation_fn=weighted_average,
    )
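
To show what these settings should mean for sampling, here is a simplified restatement of the rule that, as far as I understand it, FedAvg uses to decide how many clients to select for training in a round. It is a sketch for illustration, not the library's code, and the function below is a standalone helper rather than a Flower API.

```python
def num_fit_clients(num_available, fraction_fit, min_fit_clients):
    """Number of clients sampled for fit in one round (simplified sketch)."""
    # Take the requested fraction of available clients,
    # but never go below the configured minimum.
    return max(int(num_available * fraction_fit), min_fit_clients)


# With 5 supernodes, fraction_fit=1.0 and min_fit_clients=5:
print(num_fit_clients(5, 1.0, 5))  # 5
```

So all 5 clients should be selected every round, and the round should only complete once each of them has returned either a result or a failure.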

When I run the simulation, the first two clients complete all 10 epochs of the first round without problems. But before the third client starts, the server begins aggregating, and the log shows only two results received and three failures (an exception is raised in aggregate_fit if any failures are detected).

I cannot figure out why the server is not waiting for all clients to finish the round. I suspected some kind of timeout, but round_timeout in the ServerConfig is None.
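
One way to narrow this down would be to log what the failures actually contain before raising. Below is a minimal sketch, assuming the `failures` argument of aggregate_fit is a list whose entries are either exceptions or (client, result) tuples. The helper `describe_failures` is hypothetical, and the sample inputs are invented purely to show the output format.

```python
def describe_failures(failures):
    """Turn a list of failures into readable log lines (hypothetical helper)."""
    lines = []
    for item in failures:
        if isinstance(item, BaseException):
            # The client raised, or the call to it failed outright.
            lines.append(f"{type(item).__name__}: {item}")
        else:
            # Otherwise assume a (client, result) pair with a non-OK result.
            client, res = item
            lines.append(f"client={getattr(client, 'cid', client)!r} result={res!r}")
    return lines


# Invented sample input, only to demonstrate the formatting:
for line in describe_failures([RuntimeError("example error"), ("client-3", "fit_res")]):
    print(line)
```

Calling something like this at the top of aggregate_fit in CustomFedAvg, before the exception is raised, should show whether the three clients failed with an error of their own or never ran at all.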

Does anyone know how to fix this?

Hope to hear from someone!

// Johan
