Hello guys,
I have a flwr-datasets question.
I am looking at the flwrlabs/shakespeare dataset on Hugging Face.
Its page mentions that it is part of the LEAF benchmark.
The only code given there to load and use the data is:
from flwr_datasets import FederatedDataset
from flwr_datasets.partitioner import NaturalIdPartitioner

fds = FederatedDataset(
    dataset="flwrlabs/shakespeare",
    partitioners={"train": NaturalIdPartitioner(partition_by="character_id")},
)
partition = fds.load_partition(partition_id=0)
In the LEAF benchmark, the raw data has to be pre-processed with their scripts before it can be used.
Is the flwrlabs version of this data already pre-processed and ready to use?
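
For context, here is a rough sketch of how I would try to check this myself by inspecting one partition. The helper names summarize_rows and inspect_first_partition are my own, not part of flwr-datasets. I am also assuming the partition behaves like a Hugging Face Dataset (supports len() and integer indexing), and I make no assumption about the column names.

```python
def summarize_rows(rows):
    """Map each column to the value types and string lengths seen in rows."""
    summary = {}
    for row in rows:
        for column, value in row.items():
            entry = summary.setdefault(column, {"types": set(), "lengths": set()})
            entry["types"].add(type(value).__name__)
            if isinstance(value, str):
                entry["lengths"].add(len(value))
    return summary


def inspect_first_partition(num_rows=100):
    """Hypothetical helper: load partition 0 and summarize its first rows."""
    # Imported lazily so summarize_rows works without flwr-datasets installed.
    from flwr_datasets import FederatedDataset
    from flwr_datasets.partitioner import NaturalIdPartitioner

    fds = FederatedDataset(
        dataset="flwrlabs/shakespeare",
        partitioners={"train": NaturalIdPartitioner(partition_by="character_id")},
    )
    partition = fds.load_partition(partition_id=0)
    # Assumption: the partition supports len() and integer indexing.
    rows = [partition[i] for i in range(min(num_rows, len(partition)))]
    return summarize_rows(rows)


if __name__ == "__main__":
    # Downloads the dataset, so this needs network access.
    for column, info in inspect_first_partition().items():
        print(column, sorted(info["types"]), sorted(info["lengths"]))
```

As far as I understand, LEAF's pre-processing produces fixed-length character windows with a next-character target. If the text columns all have the same length, I would guess the data is already pre-processed, but I would still like confirmation.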