I am trying to apply Differential Privacy to the xgboost-comprehensive example on GitHub.
I started by wrapping the server strategy FedXgbBagging in DifferentialPrivacyServerSideFixedClipping, but when I run it I get the following error:
ERROR : ServerApp thread raised an exception: Cannot load file containing pickled data when allow_pickle=False
I know I provided little information but before moving on I would like to know whether Flower Differential Privacy is supposed to work with FedXgbBagging. Is that the case? Can I use DP with XGBoost?
Flower’s central DP was designed around strategies that exchange simple numeric arrays (for example FedAvg on neural network weights). Under the hood, the DP wrappers convert the model parameters into NumPy arrays using parameters_to_ndarrays before clipping and adding noise. This conversion works only when the parameters are encoded as NumPy arrays.
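To make the "clipping and adding noise" step concrete, here is a minimal NumPy-only sketch of fixed clipping followed by Gaussian noise on a list of arrays. It is not Flower's code: the function names clip_update and add_gaussian_noise, the toy shapes, and the noise standard deviation are all assumptions chosen for illustration.

```python
import numpy as np


def clip_update(update, clipping_norm):
    """Scale a list of arrays so that its overall L2 norm is at most clipping_norm."""
    flat = np.concatenate([layer.ravel() for layer in update])
    total_norm = float(np.linalg.norm(flat))
    # Updates already inside the norm ball are left unchanged (scale == 1.0).
    scale = min(1.0, clipping_norm / (total_norm + 1e-12))
    return [layer * scale for layer in update]


def add_gaussian_noise(arrays, std, rng):
    """Add independent Gaussian noise with the given standard deviation to every entry."""
    return [layer + rng.normal(0.0, std, size=layer.shape) for layer in arrays]


rng = np.random.default_rng(0)

# Hypothetical global model and one client's locally trained model.
global_params = [np.zeros((2, 2)), np.zeros(3)]
client_params = [np.full((2, 2), 3.0), np.full(3, 4.0)]

# The update is the difference between the client model and the global model.
update = [c - g for c, g in zip(client_params, global_params)]

clipped = clip_update(update, clipping_norm=1.0)
noisy = add_gaussian_noise(clipped, std=0.1, rng=rng)

clipped_norm = float(np.linalg.norm(np.concatenate([a.ravel() for a in clipped])))
```

Every step here is arithmetic on numeric arrays, which is why the approach assumes the parameters can be decoded into arrays in the first place.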
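XGBoost-based strategies such as FedXgbBagging do not exchange NumPy arrays. They send the serialised XGBoost model (a tree ensemble) as a single raw byte blob; in the xgboost-comprehensive example this is the booster saved in JSON form. When you wrap FedXgbBagging with DifferentialPrivacyServerSideFixedClipping, the wrapper still tries to call parameters_to_ndarrays, which hands those bytes to numpy.load with allow_pickle=False. Because the bytes are not in NumPy's .npy format, numpy.load treats them as pickled data and refuses to read them, raising the error "Cannot load file containing pickled data when allow_pickle=False". The message is therefore misleading: the model is not actually pickled, it is simply not a NumPy array. In other words, the current DP implementation is not compatible with XGBoost strategies.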
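The failure can be reproduced without Flower or XGBoost. The sketch below uses local stand-ins for the array (de)serialisation helpers, on the assumption that Flower's helpers use numpy.save and numpy.load with allow_pickle=False in the same way; the JSON payload is a made-up placeholder, not a real booster.

```python
import json
from io import BytesIO

import numpy as np


def ndarray_to_bytes(arr):
    """Serialise an array in .npy format (local stand-in, not imported from Flower)."""
    buf = BytesIO()
    np.save(buf, arr, allow_pickle=False)
    return buf.getvalue()


def bytes_to_ndarray(data):
    """Deserialise .npy bytes, refusing anything that is not a plain array."""
    return np.load(BytesIO(data), allow_pickle=False)


# Round trip of a real array works as expected.
weights = np.arange(4, dtype=np.float32)
restored = bytes_to_ndarray(ndarray_to_bytes(weights))

# Placeholder for a serialised tree ensemble: JSON bytes, not .npy bytes.
fake_model_bytes = json.dumps({"learner": {"trees": []}}).encode("utf-8")

try:
    bytes_to_ndarray(fake_model_bytes)
    error_message = None
except ValueError as exc:
    # numpy.load does not recognise the header and refuses to fall back to pickle.
    # The exact wording of the message depends on the NumPy version.
    error_message = str(exc)

print(error_message)
```

The array round trip succeeds, while the JSON bytes raise a ValueError, which is the same kind of failure the DP wrapper hits when it receives the FedXgbBagging parameters.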
In order to make this work, you would have to implement a custom DP mechanism (e.g., local DP on gradients) or look at specialized approaches for federated XGBoost.