Dimensionality Reduction for Robust Federated Learning: A Theoretical Analysis and Convergence Guarantee
Mirrored from arXiv — Machine Learning for archival readability. Support the source by reading on the original site.
Computer Science > Machine Learning
Title:Dimensionality Reduction for Robust Federated Learning: A Theoretical Analysis and Convergence Guarantee
Abstract:Federated Learning (FL) enables multiple clients to collaboratively train models without sharing raw data, but it is highly vulnerable to Byzantine attacks. Existing robust approaches can neutralize these threats but incur substantial computational overhead during high-dimensional gradient aggregation, an overhead that scales poorly with model size and increasingly dominates the training cost as modern models grow larger. To address this computational bottleneck, we propose Projected Dimensionality Reduction (PDR), a universal acceleration framework for vector-level distance-based robust aggregators, which performs robust aggregation by compressing gradients into a drastically smaller subspace via sparse random projection to efficiently compute reliability weights. This approach reduces the server computational complexity to an optimal $ \mathcal{O}(Mp) $, where $ M $ is the number of clients and $ p $ is the model dimension, matching the theoretical lower bound required merely to read the gradients. We establish convergence guarantees under standard FL assumptions in prior Byzantine-robust FL analyses. By leveraging the Subspace Embedding Theorem, we show that PDR achieves optimal convergence rates of $ \mathcal{O}(1/\sqrt{T}) $ for non-convex functions and $ \mathcal{O}(1/T) $ for strongly convex functions, where $ T $ denotes the number of iterations. Crucially, we mathematically demonstrate that this massive acceleration comes almost for free, merely inflating the inherent Byzantine error floor by a bounded, tunable factor of $ \frac{1+\epsilon}{1-\epsilon} $. Experimental results on benchmark datasets confirm that integrating PDR with existing aggregators yields orders of magnitude speedups in time efficiency while maintaining highly competitive convergence performance.
| Subjects: | Machine Learning (cs.LG) |
| Cite as: | arXiv:2605.28335 [cs.LG] |
| (or arXiv:2605.28335v1 [cs.LG] for this version) | |
| https://doi.org/10.48550/arXiv.2605.28335
arXiv-issued DOI via DataCite (pending registration)
|
Access Paper:
- View PDF
- HTML (experimental)
- TeX Source
References & Citations
Bibliographic and Citation Tools
Code, Data and Media Associated with this Article
Demos
Recommenders and Search Tools
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.
More from arXiv — Machine Learning
-
Representation as a Bottleneck for Mechanistic Interpretability: The Manifestation Unit Protocol
Jul 2
-
SNAP-FM: Sparse Nonlinear Accelerated Projection for Physics-Constrained Generative Modeling
Jul 2
-
SemiScope: Disentangling Classifier Tuning and Joint Optimization in Semi-Supervised Security Classification
Jul 2
-
A Filtered Mixture-of-Generators for Fully Synthetic Survival Training
Jul 2
Discussion (0)
Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.
Sign in →No comments yet. Sign in and be the first to say something.