RandomForestRegression

helenklim · January 19, 2025, 3:41pm

Hi dear community! I’m trying to train sklearn’s RandomForestRegressor with Flower. I have 1 server machine and 1 client machine, but I am getting this error:

File "/home/elena/project/venv3/lib/python3.9/site-packages/flwr/server/strategy/aggregate.py", line 59, in _try_inplace
np_binary_op(x, y, out=x)
numpy._core._exceptions._UFuncNoLoopError: ufunc 'multiply' did not contain a loop with signature matching types (dtype('<U400'), dtype('float64')) -> None

So I started wondering whether it is possible to train and aggregate forest models with Flower. Is there a proper aggregation method implemented for them? Forests seem to be more complex than, say, linear regression. Thank you in advance for your replies.
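
For readers hitting the same traceback: the dtypes in the message suggest that one of the arrays reaching the server holds strings ('<U400') rather than numbers, so the aggregation step cannot multiply it by a float weight. Below is a minimal NumPy sketch of that mismatch, outside of Flower. The array contents and the 0.5 weight are made up for illustration and are not taken from the original setup.

```python
import numpy as np

# Hypothetical "parameter" that is actually text, e.g. a hyperparameter name
# or a serialized object stored as a string.
string_param = np.array(["squared_error"])
weight = np.float64(0.5)  # illustrative aggregation weight

try:
    np.multiply(string_param, weight)
    failed = False
except TypeError as exc:
    # NumPy's "no loop" ufunc errors are subclasses of TypeError.
    failed = True
    print(type(exc).__name__, exc)

# The same operation works once the parameter is numeric.
numeric_param = np.array([100.0, 10.0])
scaled = np.multiply(numeric_param, weight)
print(scaled)
```

If this is the cause, the fix is to make sure every array a client returns is numeric, or to use an aggregation strategy that does not do arithmetic on the arrays at all.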

sasikumar · January 27, 2025, 8:52am

Hi - Yes, it is possible to train random forest models with Flower. See this example of XGBoost implemented in Flower, which also has a YouTube tutorial: Quickstart XGBoost - Flower Framework

Having said that, can you please share more details about your implementation setup? Having just 1 client isn’t really FL per se, because you have only one source of data. I assume you have the data split on the server machine as well? Also, which aggregation method were you using?

helenklim · January 28, 2025, 8:41am

Hi, thank you for answering and for the link! I am trying to use sklearn RFs. I only have 1 client because I want to learn the Flower framework not only in simulation mode but also in deployment, and I only have 2 machines at home.
My implementation mostly looks like this example: GitHub - Hongwei-Z/Federated-Random-Forest: Using Flower federated learning with scikit-learn random forest, except that I only have 1 client and I tried various ways of implementing the set_params() and get_params() functions. But I believe this is not completely correct anyway, because in that example I can’t see a valid way of passing and aggregating Random Forest estimators… Or perhaps I just misunderstand the whole concept. Could you please comment on the example? If it is correct, how does the aggregation work in it? And how are the estimators passed between client and server? Shouldn’t we pass them using set_params() and get_params()?
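
To make the question concrete, here is one possible way to pass and combine fitted forests. It is only a sketch and not what the linked example or Flower itself does. It assumes each fitted RandomForestRegressor is pickled into a single uint8 NumPy array for transport, and that the server "aggregates" by pooling the trees from all clients instead of averaging numbers. The helper names forest_to_ndarrays, ndarrays_to_forest and merge_forests are hypothetical, and the data is synthetic.

```python
import copy
import pickle

import numpy as np
from sklearn.ensemble import RandomForestRegressor


def forest_to_ndarrays(model):
    """Pack a fitted forest into a list holding one uint8 array (hypothetical helper)."""
    return [np.frombuffer(pickle.dumps(model), dtype=np.uint8)]


def ndarrays_to_forest(arrays):
    """Inverse of forest_to_ndarrays. Only unpickle data from trusted peers."""
    return pickle.loads(arrays[0].tobytes())


def merge_forests(forests):
    """Combine fitted forests by pooling their trees (hypothetical helper)."""
    merged = copy.deepcopy(forests[0])
    for other in forests[1:]:
        merged.estimators_ = merged.estimators_ + copy.deepcopy(other.estimators_)
    merged.n_estimators = len(merged.estimators_)
    return merged


# Two simulated "clients", each fitting on its own synthetic data.
rng = np.random.default_rng(0)
client_models = []
for _ in range(2):
    X = rng.normal(size=(50, 3))
    y = X[:, 0] + 0.1 * rng.normal(size=50)
    client_models.append(
        RandomForestRegressor(n_estimators=5, random_state=0).fit(X, y)
    )

# Round trip through the array form, then merge on the simulated "server".
received = [ndarrays_to_forest(forest_to_ndarrays(m)) for m in client_models]
global_model = merge_forests(received)
predictions = global_model.predict(rng.normal(size=(4, 3)))
```

A strategy that averages arrays element-wise would treat these byte arrays as numbers, which is meaningless for pickled data, so an approach like this would need a custom strategy whose aggregation step calls something like merge_forests. Note also that sklearn's get_params() and set_params() handle hyperparameters, not the fitted trees, which live in estimators_.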

williamlm · February 6, 2025, 2:59pm

Hi @helenklim, thank you for posting here.

To run a federated example, you will need a minimum of 2 clients. Conventionally there are more clients than servers; otherwise you don’t need the server.

The example you are referring to is good, and I generally like tree-based models; however, it is built on legacy/outdated code. The newer versions of Flower use a different way of running ClientApps and ServerApps. What you could do is check out our sklearn example. Together, we can work to integrate the sklearn random forest here.

Best regards
William

helenklim · February 20, 2025, 10:29am

Hi @williamlm! Thank you for your reply. So, as I understand it, I can rely on the example I referred to, except that I should use the superlink and supernode methods instead of run_server() and run_client(). I’ll try it on my machines.

williamlm · February 20, 2025, 7:35pm

Yes, that should work! Let me know how it works out.
