Hello everyone,
I have a problem with federated training: the first round goes great, but an error occurs in the second one. Could someone help me?
The first round works well:
INFO : Flower ECE: gRPC server running (20 rounds), SSL is disabled
INFO : [INIT]
INFO : Requesting initial parameters from one random client
Number of availbale Clients 1
INFO : Received initial parameters from one random client
INFO : Evaluating initial global parameters
INFO :
INFO : [ROUND 1]
INFO : configure_fit: strategy sampled 2 clients (out of 2)
Number of availbale Clients 2
INFO : aggregate_fit: received 2 results and 0 failures
In the server file I get the following error:
    hist = run_fl(
  File ".../miniconda/envs/f1/lib/python3.10/site-packages/flwr/server/server.py", line 483, in run_fl
    hist, elapsed_time = server.fit(
  File ".../miniconda/envs/f1/lib/python3.10/site-packages/flwr/server/server.py", line 113, in fit
    res_fit = self.fit_round(
  File ".../multisensor_data_preparation_federated_learning/examples/flowers/testtt/testscaffold/server_scaffold.py", line 237, in fit_round
    aggregated_result_combined = parameters_to_ndarrays(aggregated_result[0])
  File "/.../miniconda/envs/f1/lib/python3.10/site-packages/flwr/common/parameter.py", line 34, in parameters_to_ndarrays
    return [bytes_to_ndarray(tensor) for tensor in parameters.tensors]
AttributeError: 'NoneType' object has no attribute 'tensors'
I think this error occurs because num_batches_tracked is empty. The client reports:
  File ".../miniconda/envs/f1/lib/python3.10/site-packages/flwr/client/client.py", line 234, in maybe_call_fit
    return client.fit(fit_ins)
  File "/home/mwalczewski/miniconda/envs/f1/lib/python3.10/site-packages/flwr/client/numpy_client.py", line 238, in _fit
    results = self.numpy_client.fit(parameters, ins.config) # type: ignore
  File ".../multisensor_data_preparation_federated_learning/examples/flowers/testtt/testscaffold/client_scaffold.py", line 451, in fit
    self.set_parameters(model_parameter)
  File ".../multisensor_data_preparation_federated_learning/examples/flowers/testtt/testscaffold/client_scaffold.py", line 428, in set_parameters
    raise ValueError(f"Parameter {k} is empty.")
ValueError: Parameter conv.batch.num_batches_tracked is empty.
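One possible reading of the two tracebacks, offered as an assumption rather than a confirmed diagnosis: if the clients raise the ValueError in round 2, the strategy has no results to aggregate. Flower's built-in FedAvg returns None for the parameters in that situation, and if the custom strategy behaves the same way, that would explain the AttributeError in server_scaffold.py. Below is a minimal sketch of a guard for that spot. The helper name require_aggregated is hypothetical, and the (parameters, metrics) tuple layout is assumed from aggregated_result[0] in the traceback.

```python
from typing import Any, Optional, Tuple


def require_aggregated(
    aggregated_result: Tuple[Optional[Any], dict], server_round: int
) -> Any:
    """Return the aggregated parameters or fail with a readable message.

    Hypothetical helper: assumes aggregated_result is (parameters, metrics)
    and that parameters is None when no client result could be aggregated.
    """
    parameters = aggregated_result[0]
    if parameters is None:
        raise RuntimeError(
            f"Round {server_round}: aggregation returned no parameters; "
            "check the client logs for failures before converting to ndarrays."
        )
    return parameters
```

In fit_round this could be called just before parameters_to_ndarrays(...), so that a client-side failure shows up as a clear message instead of an AttributeError.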
Here is my code in the client file:
class FlowerClient1(fl.client.NumPyClient):
    def __init__(self, client_index):
        args = parse_args()
        self.client_index = args.client_index
        self.model = model
        train1, test2 = train_loader, test_loader
        self.train = train1[self.client_index]
        self.test = test2[self.client_index]
        self.client_cvalue = []
        for param in self.model.parameters():
            self.client_cvalue.append(torch.zeros(param.shape))
        # save_dir = ""
        # if save_dir == "":
        #     save_dir = "clients_cvs"
        self.dir = "client_cvs"
        if not os.path.exists(self.dir):
            os.makedirs(self.dir)

    def get_parameters(self, config):
        return [val.cpu().numpy() for _, val in self.model.state_dict().items()]

    def set_parameters(self, parameters):
        params_dict = zip(self.model.state_dict().keys(), parameters)
        state_dict = OrderedDict({k: torch.tensor(v) for k, v in params_dict})
        # Check for empty tensors
        for k, v in state_dict.items():
            if v.numel() == 0:
                raise ValueError(f"Parameter {k} is empty.")
            print(f"Parameter {k} shape: {v.shape}")
        self.model.load_state_dict(state_dict, strict=True)
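A minimal sketch of a stricter variant of the check above, assuming the goal is to find out which received array does not match the model. It compares the shape of each incoming array with the shape stored in the model's own state_dict instead of testing numel() alone. The small Conv1d + BatchNorm1d model and the free functions are stand-ins for illustration, not the real network or client class.

```python
from collections import OrderedDict

import torch
from torch import nn

# Hypothetical stand-in for the real model: a conv layer followed by BatchNorm.
model = nn.Sequential(nn.Conv1d(4, 8, kernel_size=3), nn.BatchNorm1d(8))


def get_parameters(model):
    """Export every state_dict entry (weights and buffers) as a NumPy array."""
    return [val.cpu().numpy() for val in model.state_dict().values()]


def set_parameters(model, parameters):
    """Load arrays into the model, rejecting any count or shape mismatch."""
    reference = model.state_dict()
    if len(reference) != len(parameters):
        raise ValueError(
            f"Expected {len(reference)} arrays, got {len(parameters)}."
        )
    state_dict = OrderedDict()
    for (name, ref), array in zip(reference.items(), parameters):
        tensor = torch.tensor(array)
        if tensor.shape != ref.shape:
            raise ValueError(
                f"Parameter {name}: expected shape {tuple(ref.shape)}, "
                f"got {tuple(tensor.shape)}."
            )
        state_dict[name] = tensor
    model.load_state_dict(state_dict, strict=True)


# Round trip on the client side: export, then load the same arrays again.
set_parameters(model, get_parameters(model))
```

If the arrays coming back from the server in round 2 fail this check, the message names the first entry whose shape changed on the way through aggregation.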
As mentioned before, I believe the error occurs because num_batches_tracked is empty. An excerpt of the printed shapes is shown here:
Parameter resblock5.diconv1.weight shape: torch.Size([256, 64, 3])
Parameter resblock5.diconv1.bias shape: torch.Size([256])
Parameter resblock5.batch1.weight shape: torch.Size([256])
Parameter resblock5.batch1.bias shape: torch.Size([256])
Parameter resblock5.batch1.running_mean shape: torch.Size([256])
Parameter resblock5.batch1.running_var shape: torch.Size([256])
Parameter resblock5.batch1.num_batches_tracked shape: torch.Size([])
It's empty, but the first round works great. Does anyone have an idea what to do? …num_batches_tracked is essential for a functioning BatchNorm layer. I don't know what can be done. If further code is needed, please let me know. Outside the Flower framework the model works well. It also works when I set only 1 round on the server and more rounds in the clients. But it doesn't work when I want, for example, more rounds on the server.
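A note for reading the excerpt: a shape of torch.Size([]) denotes a zero-dimensional (scalar) tensor that holds exactly one value, which is not the same as an empty tensor such as one with shape torch.Size([0]). The same distinction applies to the NumPy arrays that NumPyClient exchanges. The sketch below uses made-up names and arrays to show how the two cases can be told apart, for example on the server before the parameters are sent out again.

```python
import numpy as np


def describe(name, array):
    """Summarise one array and classify it as empty, scalar or array."""
    array = np.asarray(array)
    if array.size == 0:
        kind = "empty"
    elif array.ndim == 0:
        kind = "scalar"
    else:
        kind = "array"
    return f"{name}: shape={array.shape}, size={array.size}, kind={kind}"


def find_empty(names, arrays):
    """Return the names of all arrays that contain no elements at all."""
    return [n for n, a in zip(names, arrays) if np.asarray(a).size == 0]


# Made-up example data: a weight, a healthy counter, and a truly empty array.
names = ["batch.weight", "batch.num_batches_tracked", "broken.num_batches_tracked"]
arrays = [
    np.ones(4, dtype=np.float32),
    np.array(7, dtype=np.int64),   # shape (), one element
    np.array([], dtype=np.int64),  # shape (0,), no elements
]

for n, a in zip(names, arrays):
    print(describe(n, a))
print(find_empty(names, arrays))
```

Only the last array would trip a numel() == 0 check; a counter with shape () passes it.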