Both this scikit-learn regression example and this Torch classifier example use tabular data, but they rely on Flower Datasets to handle partitioning instead of loading CSV files manually. You can check out this guide to see how to use a local CSV file with Flower Datasets.
Hello,
I have modified the code of the first tutorial on the official Flower website.
This is what I have now:
import numpy as np
import pandas as pd
import torch
from datasets import Dataset
from flwr_datasets.partitioner import IidPartitioner

# IRIS_DATASET_PATH, NUM_CLIENTS and BATCH_SIZE are defined elsewhere in the script.

def get_dataloaders( partition_id: int ):
    train_data = pd.read_csv( IRIS_DATASET_PATH / 'Train' / 'Train.csv' )
    test_data = pd.read_csv( IRIS_DATASET_PATH / 'Test' / 'Test.csv' )
    # Note: building the frame from raw values discards the CSV column names.
    df_data = pd.DataFrame( np.concatenate( [ train_data.values, test_data.values ], axis = 0 ) )

    partitioner = IidPartitioner( num_partitions = NUM_CLIENTS )
    partitioner.dataset = Dataset.from_pandas( df_data )
    partition = partitioner.load_partition( partition_id = partition_id )

    #******************************************************
    # DIVIDE DATA ON EACH NODE: 80% TRAIN, 20% VALID + TEST
    #******************************************************
    partition_train_test = partition.train_test_split( test_size = 0.2,
                                                       seed = 42 )
    # Split the held-out 20% in half: "train" becomes validation, "test" stays test.
    partition_valid_test = partition_train_test['test'].train_test_split( test_size = 0.5,
                                                                          seed = 42 )

    def apply_transforms( batch ):
        # Convert every column of the batch to a torch tensor.
        # This only works if all columns are numeric.
        for key in batch.keys():
            batch[key] = torch.tensor( batch[key] )
        return batch

    partition_train_test = partition_train_test.with_transform( apply_transforms )
    # Transform the valid/test split itself, not the train/test split.
    partition_valid_test = partition_valid_test.with_transform( apply_transforms )

    train_dataloader = torch.utils.data.DataLoader( dataset = partition_train_test["train"],
                                                    batch_size = BATCH_SIZE,
                                                    shuffle = True )
    valid_dataloader = torch.utils.data.DataLoader( dataset = partition_valid_test["train"],
                                                    batch_size = BATCH_SIZE )
    test_dataloader = torch.utils.data.DataLoader( dataset = partition_valid_test["test"],
                                                   batch_size = BATCH_SIZE )
    return train_dataloader, valid_dataloader, test_dataloader
When I run Flower, it freezes.
I do not understand what I am doing wrong.
Regards
I’d advise checking the change in a centralized setup first.
Dear @adam-narozniak ,
many thanks for your feedback, but I only changed the function “load_datasets”, which I renamed to “get_dataloaders”.
Best regards,
Well, if it worked before and now it does not, the problem is likely in that change. Debugging in an FL setup is hard, so if you want a clearer picture of what’s going on, and possibly an error or symptom you can share, testing the function on its own is the way to go. Feel free to try solving it inside the whole FL pipeline if that suits you, but I can’t help further without any new information :((
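A rough sketch of such a centralized check: pull a couple of batches from a loader outside of Flower and print what comes back. The helper name `smoke_test_loader` is hypothetical, and it assumes each batch is a dict of sized values, which is what a DataLoader over a transformed partition would typically yield. A plain list stands in for the real loader here so the snippet runs on its own.

```python
import time
from itertools import islice


def smoke_test_loader(loader, max_batches=2):
    """Pull a few batches and report the size of each column in every batch."""
    summaries = []
    start = time.perf_counter()
    for index, batch in enumerate(islice(loader, max_batches)):
        elapsed = time.perf_counter() - start
        # Record how many items each key holds in this batch.
        summary = {key: len(value) for key, value in batch.items()}
        print(f"batch {index}: {summary} after {elapsed:.3f}s")
        summaries.append(summary)
    if not summaries:
        raise ValueError("loader yielded no batches")
    return summaries


# Stand-in for a real DataLoader: two batches with columns "0" and "1".
fake_loader = [
    {"0": [5.1, 4.9], "1": [3.5, 3.0]},
    {"0": [4.7], "1": [3.2]},
]
summaries = smoke_test_loader(fake_loader)
```

If this hangs or raises when pointed at a loader returned by `get_dataloaders(0)`, the problem is in the data loading and not in the federated part.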
Dear @adam-narozniak ,
I think you are right. As the first debugging step, I will follow your suggestion and, eventually, I will let you know.
Thanks again!