Hello, I am just starting to study federated learning (FL), and something about labels is very unclear to me.
From what I understand of FL, the idea is to train models on devices so that only model weights are shared with a central server. What is unclear to me is how devices would label their data to then train their models?
For example, let’s say I have a federated vision system with multiple cameras for object detection. Cameras have their initial model, then acquire new data for training, but how would they generate the labels? With the global model? Wouldn’t that make them learn possible errors?
FL doesn’t do anything unique with data labelling. The paradigm shift of FL is that the training is done on multiple different devices (here, the cameras) and the resulting model weights are then aggregated on a central server. That process is then repeated iteratively. The actual training, be it supervised or unsupervised, is unchanged.
So in your example on the vision dataset, you would need to provide labeled training data for each of the cameras if you wanted to do supervised learning.
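To make the loop described above concrete, here is a minimal sketch of federated averaging in plain Python. Everything in it is an illustrative assumption rather than something from this thread: a toy linear model stands in for the detector, `make_client_data` fabricates labeled points from y = 2x + 1, and the function names are made up. The point it shows is that each client runs ordinary supervised training on its own labeled data, and only the weights travel to the server.

```python
import random

def local_train(weights, data, lr=0.05, epochs=5):
    """Plain supervised SGD on one client's own labeled (x, y) pairs.
    The raw data never leaves this function; only weights are returned."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y   # prediction error on a labeled sample
            w -= lr * err * x
            b -= lr * err
    return (w, b)

def fed_avg(client_weights, client_sizes):
    """Server step: average client weights, weighted by local dataset size."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)

def make_client_data(n, rng):
    """Hypothetical labeled data for one client, drawn from y = 2x + 1."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + 1.0) for x in xs]

rng = random.Random(0)
clients = [make_client_data(n, rng) for n in (20, 50, 30)]

global_weights = (0.0, 0.0)
for _ in range(30):  # communication rounds
    # Each client starts from the current global model and trains locally.
    updates = [local_train(global_weights, data) for data in clients]
    # The server only ever sees the resulting weights.
    global_weights = fed_avg(updates, [len(d) for d in clients])

print(global_weights)
```

Note that `local_train` is exactly what you would write for centralized training. The labels have to already exist on each client, which is the question this thread is about.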
Thank you a lot for your explanation. To delve into the decentralized nature of federated learning a bit more, could you elaborate on how the devices in an FL setup, such as cameras in a vision system, generate or receive labels for their data in a decentralized manner? Given that there is no central authority for labelling, how do these devices annotate their data for training without relying on a centralized labelling process? After all, sending data to a server for labelling would go against the whole premise of keeping data decentralized.
I’ve seen many examples online in which they simply split a dataset (labels+images) among different clients, and while this was helpful to understand the nature of FL from a theoretical point of view, it still leaves me puzzled about a production-ready architecture and how clients would produce data for training by themselves.
There’s no fundamental difference between federated learning and centralized learning when it comes to data labelling.
The biggest real-world question is which labelling approach is practical in which setting. Federated learning scenarios can be very different, and an approach that is practical in one may not be in another. One dimension here is cross-device FL (many small devices such as phones or cameras) vs cross-silo FL (a small number of organizations, each with a larger dataset).
The example you gave is a typical cross-device setting. In that case, manually labelling the data on the device might not be practical. One option that would not require data labelling would be to use self-supervised learning to pre-train a useful backbone on real-world video data.
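As a rough illustration of how self-supervised learning avoids manual labelling, here is a sketch of one well-known pretext task, rotation prediction. The choice of task and the helper names are my own assumptions, not something prescribed above, and for video data a temporal task (such as predicting frame order) may be a better fit. The device rotates each unlabeled image by a random multiple of 90 degrees and uses the rotation index as the training target, so the "label" comes from the data itself.

```python
import random

def rotate90(image):
    """Rotate a 2D grid (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def make_pretext_example(image, rng):
    """Turn one unlabeled image into a (rotated_image, label) training pair.
    The label k means the image was rotated by k * 90 degrees."""
    k = rng.randrange(4)
    rotated = [list(row) for row in image]  # copy so the original is untouched
    for _ in range(k):
        rotated = rotate90(rotated)
    return rotated, k

rng = random.Random(0)
unlabeled_frame = [[1, 2], [3, 4]]  # stand-in for a camera frame
rotated, label = make_pretext_example(unlabeled_frame, rng)
print(rotated, label)
```

A backbone trained to predict `label` from `rotated` never needs a human annotation. It can then be trained federatedly like any supervised model and later fine-tuned on a small labeled set for the actual detection task.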