Multi-party Data Federation Based on Federated Learning
In 2019, federated learning began to gain traction in China as an emerging technology for data collaboration. I studied it and applied it to JD's Data Federation and Mu Media Data Platform.
Traditional Data Integration: Feature or label data from both parties is consolidated at one party, which then trains a model on the combined data. This risks leaking private data and letting data assets flow out of their owner's control.
Federated Learning: Data owners train a model jointly by exchanging encrypted training parameters rather than raw data. The resulting model is adequately accurate, with only a small gap compared to a model trained through traditional data integration, and no party discloses its original data. The training target is either non-individual information or information that users have authorized, and no party can infer the other's original data.
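The contrast between the two approaches can be illustrated with a minimal sketch. It is not the implementation used on either platform. It assumes a horizontal setting (each party holds different samples of the same features), plain federated averaging, and a toy linear model. The "encryption" is stood in for by a pairwise additive mask that cancels when the two messages are summed, which is not real cryptography. All function names, hyperparameters, and the synthetic data are hypothetical.

```python
import random


def make_party(n, seed):
    """Synthetic private dataset for one party: y = 2x + 1 plus small noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.05)) for x in xs]


def local_train(w, b, data, lr=0.1, steps=5):
    """Run a few gradient-descent steps on one party's own data only."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def federated_fit(party_a, party_b, rounds=300, seed=0):
    """Federated averaging: only masked parameters leave each party."""
    # Assumption: in a real deployment the mask would be agreed between the
    # two parties and hidden from the coordinator; here one RNG simulates it.
    rng = random.Random(seed)
    n_a, n_b = len(party_a), len(party_b)
    w, b = 0.0, 0.0
    for _ in range(rounds):
        # Each party trains locally, starting from the shared global model.
        w_a, b_a = local_train(w, b, party_a)
        w_b, b_b = local_train(w, b, party_b)
        # Party A adds the mask and party B subtracts it, so each message
        # alone hides the party's parameters but the sum is unchanged.
        mask_w = rng.uniform(-1000.0, 1000.0)
        mask_b = rng.uniform(-1000.0, 1000.0)
        msg_a = (n_a * w_a + mask_w, n_a * b_a + mask_b)
        msg_b = (n_b * w_b - mask_w, n_b * b_b - mask_b)
        # The coordinator sees only masked messages and their sum, which is
        # the sample-weighted average of the two local models.
        w = (msg_a[0] + msg_b[0]) / (n_a + n_b)
        b = (msg_a[1] + msg_b[1]) / (n_a + n_b)
    return w, b


def centralized_fit(data, rounds=300):
    """Traditional data integration: all raw data pooled at one party."""
    w, b = 0.0, 0.0
    for _ in range(rounds):
        w, b = local_train(w, b, data)
    return w, b


if __name__ == "__main__":
    party_a = make_party(50, seed=1)
    party_b = make_party(50, seed=2)
    print("federated  :", federated_fit(party_a, party_b))
    print("centralized:", centralized_fit(party_a + party_b))
```

Under these assumptions the two fits should land close to each other, which mirrors the "small gap" described above. A production system would replace the mask with a real scheme such as homomorphic encryption or secure aggregation, and would likely need a vertical protocol if one party holds the features and the other holds the labels.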