LEAF datasets pre-processing

Hello guys,
I have a flwr-datasets question.
I am looking at flwrlabs/shakespeare on Hugging Face Datasets,
and the page mentions that it is part of the LEAF benchmark.
The only code given for loading and using this data is:

from flwr_datasets import FederatedDataset
from flwr_datasets.partitioner import NaturalIdPartitioner

fds = FederatedDataset(
    dataset="flwrlabs/shakespeare",
    partitioners={"train": NaturalIdPartitioner(partition_by="character_id")}
)
partition = fds.load_partition(partition_id=0)

In the LEAF benchmark, the data has to be pre-processed with their scripts before use.
Is the flwrlabs version of this data already pre-processed and ready to use?

Hi @oabuhamdan, I believe this is the raw data. I can take a closer look in a couple of days to verify it. Could you help us check this in the meantime? Either way, the dataset can be loaded and used as shown above. There are 1129 partitions, just like in the original LEAF.
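
One rough way to check this is to look at what a partition actually contains. The sketch below assumes, without having verified it, that the text and label columns are named "x" and "y", and that LEAF's Shakespeare pre-processing produces 80-character input windows with a single next-character target. The helper names looks_leaf_preprocessed and check_partition are made up for illustration and are not part of flwr-datasets.

```python
def looks_leaf_preprocessed(rows, x_col="x", y_col="y", seq_len=80):
    """Return True if every row looks like a (seq_len-char window, 1-char target) pair.

    The column names and seq_len are assumptions about the LEAF format,
    not something confirmed in this thread.
    """
    rows = list(rows)
    if not rows:
        return False
    for row in rows:
        x, y = row.get(x_col), row.get(y_col)
        # Raw data would more likely be long text passages or lack a target column.
        if not isinstance(x, str) or not isinstance(y, str):
            return False
        if len(x) != seq_len or len(y) != 1:
            return False
    return True


def check_partition(partition_id=0, sample_size=100):
    """Load one partition as in the snippet above and inspect it (needs network access)."""
    from flwr_datasets import FederatedDataset
    from flwr_datasets.partitioner import NaturalIdPartitioner

    fds = FederatedDataset(
        dataset="flwrlabs/shakespeare",
        partitioners={"train": NaturalIdPartitioner(partition_by="character_id")},
    )
    partition = fds.load_partition(partition_id=partition_id)
    print(partition.features)  # column names and types
    print(partition[0])        # one example row
    sample = partition.select(range(min(sample_size, len(partition))))
    return looks_leaf_preprocessed(sample)
```

Calling check_partition() prints the columns and one row, which is usually enough to tell by eye. If it returns True, the rows are at least consistent with the pre-processed LEAF format. If it returns False, either the data is raw or the assumed column names do not match.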