Securing private data at scale with differentially private partition selection

By Emirates Insight, September 23, 2025

Large, user-based datasets are invaluable for advancing AI and machine learning models. They drive innovation that directly benefits users through improved services, more accurate predictions, and personalized experiences. Collaborating on and sharing such datasets can accelerate research, foster new applications, and contribute to the broader scientific community. However, leveraging these powerful datasets also comes with potential data privacy risks.

“Differentially private (DP) partition selection” is the process of identifying a meaningful subset of unique items that can be shared safely from a vast collection, based on how frequently or prominently they appear across many individual contributions (like finding all the common words used across a huge set of documents). Applying differential privacy protections makes it possible to perform that selection in a way that prevents anyone from knowing whether any single individual’s data contributed a specific item to the final list. This is done by adding controlled noise and selecting only items that remain sufficiently common even after that noise is included. DP partition selection is the first step in many important data science and machine learning tasks, including extracting vocabulary (or n-grams) from a large private corpus (a necessary step in many textual analysis and language modeling applications), analyzing data streams in a privacy-preserving way, obtaining histograms over user data, and increasing efficiency in private model fine-tuning.
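The noise-and-threshold idea described above can be illustrated with a minimal Python sketch. This is a basic baseline, not the algorithm from the publication. The function name `select_partitions` and the parameters `max_items_per_user`, `noise_scale`, and `threshold` are hypothetical. Calibrating the noise scale and threshold to a target privacy guarantee is outside the scope of the sketch, so both are supplied by the caller.

```python
import random
from collections import Counter


def select_partitions(user_items, max_items_per_user, noise_scale, threshold, seed=None):
    """Return items whose noisy user count exceeds the threshold.

    Illustrative only: noise_scale and threshold are assumed to have been
    calibrated elsewhere for the desired privacy guarantee.
    """
    rng = random.Random(seed)
    counts = Counter()
    for items in user_items:
        # Deduplicate so each user counts at most once per item.
        unique = sorted(set(items))
        # Bound each user's influence by keeping at most max_items_per_user items.
        if len(unique) > max_items_per_user:
            unique = rng.sample(unique, max_items_per_user)
        counts.update(unique)

    released = []
    # Only items that appear in the data are considered; unseen items are never released.
    for item in sorted(counts):
        noisy_count = counts[item] + rng.gauss(0.0, noise_scale)
        if noisy_count > threshold:
            released.append(item)
    return released


users = [["the", "cat"], ["the", "dog"], ["the", "cat"], ["zebra"]]
print(select_partitions(users, max_items_per_user=2, noise_scale=1.0, threshold=2.5, seed=0))
```

Items contributed by only one or two users, such as "zebra" here, sit below the threshold before noise is added and are therefore unlikely to be released, which is what keeps an individual contribution from being revealed.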

In the context of massive datasets like user queries, a parallel algorithm is crucial. Instead of processing data one piece at a time (like a sequential algorithm would), a parallel algorithm breaks the problem down into many smaller parts that can be computed simultaneously across multiple processors or machines. This practice isn’t just for optimization; it’s a fundamental necessity when dealing with the scale of modern data. Parallelization allows the processing of vast amounts of information all at once, enabling researchers to handle datasets with hundreds of billions of items. With this, it’s possible to achieve robust privacy guarantees without sacrificing the utility derived from large datasets.
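As a rough illustration of splitting the work into parts that can be computed at the same time, the sketch below shards the users, counts items in each shard independently, and then merges the partial counts. The names `count_shard` and `parallel_item_counts` are hypothetical, and a thread pool stands in for the cluster of machines a production system would use. It shows only the structure of the computation and is not the algorithm from the publication.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_shard(shard):
    """Count, for one shard of users, how many users contributed each item."""
    counts = Counter()
    for items in shard:
        counts.update(set(items))  # each user counts once per item
    return counts


def parallel_item_counts(user_items, num_shards=4):
    """Split users into shards, count each shard independently, then merge."""
    shards = [user_items[i::num_shards] for i in range(num_shards)]
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        partial_counts = list(pool.map(count_shard, shards))

    total = Counter()
    for partial in partial_counts:
        total.update(partial)  # Counter.update adds counts together
    return total


users = [["the", "cat"], ["the", "dog"], ["the", "cat"], ["zebra"]]
print(parallel_item_counts(users, num_shards=2))
```

Because the shards share no state, the counting step needs no coordination until the final merge. In a basic noise-and-threshold scheme, each item's noisy comparison is also independent of the others, so that step can be sharded in the same way.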

In our recent publication, “Scalable Private Partition Selection via Adaptive Weighting”, which appeared at ICML 2025, we introduce an efficient parallel algorithm that makes it possible to apply DP partition selection to various data releases. Our algorithm provides the best results across the board among parallel algorithms and scales to datasets with hundreds of billions of items, up to three orders of magnitude larger than those analyzed by prior sequential algorithms. To encourage collaboration and innovation by the research community, we are open-sourcing DP partition selection on GitHub.
