Are there Flower examples with CSV files?

Both this scikit-learn regression example and this Torch classifier example use tabular data, but they rely on Flower Datasets for partitioning instead of manual CSV loading. You can check out this guide to see how to use a local CSV file with Flower Datasets.
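For intuition, here is a rough, framework-free sketch of what IID partitioning of a CSV-backed DataFrame amounts to: shuffle the rows once, then split them into equally sized chunks, one per client. This is only a plain pandas/NumPy stand-in for Flower Datasets' `IidPartitioner` (whether the real class shuffles internally is a detail glossed over here), and the column names are invented for illustration:

```python
import numpy as np
import pandas as pd


def iid_partitions(df: pd.DataFrame, num_partitions: int, seed: int = 42):
    """Shuffle the rows once, then split them into near-equal chunks.

    A rough stand-in for what an IID partitioner does with a DataFrame
    loaded from a local CSV file.
    """
    shuffled = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)
    index_chunks = np.array_split(np.arange(len(shuffled)), num_partitions)
    return [shuffled.iloc[idx] for idx in index_chunks]


# Stand-in for pd.read_csv("my_data.csv"): 12 rows, 2 made-up columns.
df = pd.DataFrame({"x": range(12), "y": range(12, 24)})
parts = iid_partitions(df, num_partitions=3)
print([len(p) for p in parts])  # → [4, 4, 4]
```

Every row ends up in exactly one partition, and the shuffle before splitting is what makes the partitions (approximately) identically distributed.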

Hello,
I have modified the code from the first tutorial on the official Flower website.

I got the following:

import numpy as np
import pandas as pd
import torch
from datasets import Dataset
from flwr_datasets.partitioner import IidPartitioner

# IRIS_DATASET_PATH, NUM_CLIENTS and BATCH_SIZE are defined elsewhere in the script.


def get_dataloaders(partition_id: int):
    # Load the local CSV files and merge them into a single DataFrame,
    # preserving the original column names.
    train_data = pd.read_csv(IRIS_DATASET_PATH / 'Train' / 'Train.csv')
    test_data = pd.read_csv(IRIS_DATASET_PATH / 'Test' / 'Test.csv')
    df_data = pd.concat([train_data, test_data], ignore_index=True)

    # Partition the merged data IID across the clients.
    partitioner = IidPartitioner(num_partitions=NUM_CLIENTS)
    partitioner.dataset = Dataset.from_pandas(df_data)
    partition = partitioner.load_partition(partition_id=partition_id)

    #******************************************************
    # DIVIDE DATA ON EACH NODE: 80% TRAIN, 20% VALID + TEST
    #******************************************************
    partition_train_test = partition.train_test_split(test_size=0.2, seed=42)

    # Split the held-out 20% in half: 10% validation, 10% test.
    partition_valid_test = partition_train_test['test'].train_test_split(test_size=0.5, seed=42)

    def apply_transforms(batch):
        # Convert every column of the batch to a tensor. Applied lazily
        # via dataset.with_transform(apply_transforms) instead of the
        # transform= argument used in the image-based tutorial.
        for key in batch.keys():
            batch[key] = torch.tensor(batch[key])
        return batch

    partition_train_test = partition_train_test.with_transform(apply_transforms)
    # Apply the transform to the valid/test split as well (not to
    # partition_train_test a second time).
    partition_valid_test = partition_valid_test.with_transform(apply_transforms)

    train_dataloader = torch.utils.data.DataLoader(
        dataset=partition_train_test["train"],
        batch_size=BATCH_SIZE,
        shuffle=True,
    )
    valid_dataloader = torch.utils.data.DataLoader(
        dataset=partition_valid_test["train"],
        batch_size=BATCH_SIZE,
    )
    test_dataloader = torch.utils.data.DataLoader(
        dataset=partition_valid_test["test"],
        batch_size=BATCH_SIZE,
    )

    return train_dataloader, valid_dataloader, test_dataloader
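As a quick sanity check on the chained splits in the function above: `test_size=0.2` followed by `test_size=0.5` on the held-out part gives roughly an 80/10/10 train/valid/test ratio. A plain-Python sketch of the arithmetic (the partition size of 100 rows is hypothetical):

```python
def chained_split_sizes(n: int, first_test: float = 0.2, second_test: float = 0.5):
    """Mirror a train_test_split(test_size=0.2) followed by a
    train_test_split(test_size=0.5) on the held-out rows."""
    held_out = round(n * first_test)      # rows removed from the train split
    train = n - held_out
    test = round(held_out * second_test)  # half of the held-out rows
    valid = held_out - test
    return train, valid, test


print(chained_split_sizes(100))  # → (80, 10, 10)
```

The exact per-split counts may differ by a row or two from what `train_test_split` produces, since libraries round fractional sizes in their own way.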

Well, when I run Flower, it freezes.

I have not been able to figure out what I am doing wrong.

Regards

I’d advise checking the change in a centralized setup first.
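One way to do that is to exercise the data-loading step in isolation, before starting any Flower clients. A minimal sketch of such a smoke test, using only pandas (the CSV files and their columns are faked with a temporary directory; a real check would point at the actual `Train.csv`/`Test.csv`):

```python
import tempfile
from pathlib import Path

import pandas as pd

# Fake stand-ins for the Train.csv / Test.csv files used in the thread.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    pd.DataFrame({"sepal_len": [5.1, 4.9], "label": [0, 1]}).to_csv(
        root / "Train.csv", index=False
    )
    pd.DataFrame({"sepal_len": [6.0], "label": [2]}).to_csv(
        root / "Test.csv", index=False
    )

    # Centralized check: does loading + merging behave as expected?
    train = pd.read_csv(root / "Train.csv")
    test = pd.read_csv(root / "Test.csv")
    merged = pd.concat([train, test], ignore_index=True)

    assert list(merged.columns) == ["sepal_len", "label"]  # names survive
    assert len(merged) == len(train) + len(test)
    print(merged.shape)  # → (3, 2)
```

If a check like this passes but the federated run still hangs, the problem is more likely in the partitioning or client code than in the CSV loading itself.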

Dear @adam-narozniak ,
many thanks for your feedback, but I only changed the function “load_datasets”, which I renamed “get_dataloaders”.

Best regards,

Well, if it worked before and does not work now, the problem is likely in that change. Debugging an FL setup is hard, so if you want a clearer picture of what’s going on (and possibly want to share the error or problem here), that would be the way to go. Feel free to keep debugging the whole FL pipeline if that suits you, but I can’t help further without any new information.

Dear @adam-narozniak ,

I think you are right. As a first debugging step, I will follow your suggestion, and I will let you know how it goes.

Thanks again!

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.