Announcing Flower Datasets 0.4.0

The Flower Team is excited to announce the release of Flower Datasets 0.4.0!

Flower Datasets (flwr-datasets) is a library designed to quickly and easily create datasets for federated learning, federated evaluation, and federated analytics. It was developed by Flower Labs, the creators of Flower: A Friendly Federated Learning Framework, to make working with federated data easier.

Thanks to our contributors

We would like to give our special thanks to all the external contributors who made the new version of Flower Datasets possible:

Carlos Marí for the idea and initial draft of the GroupedNaturalIdPartitioner

What’s new?

  • New Partitioning Schemes:
    • Add SizePartitioner (#4111)
    • Add GroupedNaturalIdPartitioner (#4051)
  • New dataset integration tests and test improvements:
    • Add tests for pacs, cinic10, caltech101, office-home (#4103)
  • Docs Improvements
    • Fix Flower Datasets tutorials naming and formatting (#3897)
    • Add information about heterogeneity to FDS description (#3901)
    • Add info about dataset availability to FDS docs (#3902)
    • Add info about distinguishing features of Flower Datasets (#3903)
    • Improve docs code examples copying experience (#3904)
    • Clarify the partitioner parameter docs of FederatedDataset (#3907)
    • Clarify how the dataloader works with HF dataset (#3910)
    • Clarify the split docs in FederatedDataset (#3912)
    • Fix Flower Datasets docs redirects and formatting (#3929)
    • Update information on how to handle DatasetDict local data (#4057)
    • Add examples section to concatenate divisions (#4104)
    • Embed an HF Space tool for dynamic code creation and visualization in the flwr-datasets docs page (#4260, #4266)
  • General Improvements
    • Add dataset type check for dataset assignment (#4058)
  • Fixes
    • Fix PathologicalPartitioner in “first-deterministic” class assignment mode when working with string labels. (#4253)
    • Fix the y-axis scale when plotting with plot_comparison_label_distribution and size_unit="absolute" (#4255)
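
To give a feel for what partitioning by explicit sizes means, here is a minimal pure-Python sketch. It is not the flwr-datasets implementation and does not use its API: `partition_by_sizes` is a hypothetical helper, and contiguous index ranges are an assumption made purely for illustration.

```python
from typing import List, Sequence


def partition_by_sizes(num_samples: int, partition_sizes: Sequence[int]) -> List[range]:
    """Split sample indices 0..num_samples-1 into contiguous partitions.

    Hypothetical helper for illustration only; not part of flwr-datasets.
    """
    if any(size < 0 for size in partition_sizes):
        raise ValueError("partition sizes must be non-negative")
    if sum(partition_sizes) > num_samples:
        raise ValueError("requested sizes exceed the number of samples")

    partitions = []
    start = 0
    for size in partition_sizes:
        # Each partition takes the next `size` indices.
        partitions.append(range(start, start + size))
        start += size
    return partitions


parts = partition_by_sizes(num_samples=100, partition_sizes=[10, 30, 60])
print([len(p) for p in parts])  # [10, 30, 60]
```

For the actual SizePartitioner parameters and behavior, please check the Flower Datasets documentation.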

Breaking changes

  • **Rename SizePartitioner to IdToSizeFunctionPartitioner (#4109)** This partitioner remains a base class for LinearPartitioner, SquarePartitioner, and ExponentialPartitioner.
  • **Remove the support for Python 3.8 (#4213, #4341)** Python 3.8 is no longer supported in Flower Datasets.
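
The new name hints at the idea behind the base class: a function maps each partition id to a relative partition size. The sketch below illustrates that idea in plain Python. It is an assumption-laden illustration, not the library's code: `sizes_from_id_function` is a hypothetical helper, and both the 1-based id shift and the rounding rule are choices made here for the example.

```python
import math
from typing import Callable, List


def sizes_from_id_function(
    num_samples: int,
    num_partitions: int,
    id_to_size_fn: Callable[[int], float],
) -> List[int]:
    """Derive partition sizes proportional to id_to_size_fn(partition_id + 1).

    Hypothetical helper for illustration only; not part of flwr-datasets.
    """
    # Shift ids to start at 1 so the first partition is not empty (assumption).
    weights = [id_to_size_fn(pid + 1) for pid in range(num_partitions)]
    total = sum(weights)
    sizes = [int(num_samples * w / total) for w in weights]
    # Give samples lost to rounding down to the last partition.
    sizes[-1] += num_samples - sum(sizes)
    return sizes


linear = sizes_from_id_function(100, 4, lambda x: x)
square = sizes_from_id_function(100, 4, lambda x: x**2)
exponential = sizes_from_id_function(100, 4, math.exp)
print(linear)  # [10, 20, 30, 40]
```

In this sketch, swapping the function changes how quickly partition sizes grow with the partition id, which is roughly the relationship the linear, square, and exponential names suggest.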

We’d love to hear from you! Share your thoughts, questions, or feedback about Flower Datasets 0.4.0 in the comments below. :slightly_smiling_face:
