164 results sorted by ID
PPSA: Polynomial Private Stream Aggregation for Time-Series Data Analysis
Antonia Januszewicz, Daniela Medrano Gutierrez, Nirajan Koirala, Jiachen Zhao, Jonathan Takeshita, Jaewoo Lee, Taeho Jung
Cryptographic protocols
Modern data analytics requires computing functions on streams of data points from many users that are challenging to calculate, due to both the high scale and nontrivial nature of the computation at hand. The need for data privacy complicates this matter further, as general-purpose privacy-enhancing technologies face limitations in at least scalability or utility. Existing work has attempted to improve this by designing purpose-built protocols for the use case of Private Stream Aggregation;...
$Shortcut$: Making MPC-based Collaborative Analytics Efficient on Dynamic Databases
Peizhao Zhou, Xiaojie Guo, Pinzhi Chen, Tong Li, Siyi Lv, Zheli Liu
Applications
Secure Multi-party Computation (MPC) provides a promising solution for privacy-preserving multi-source data analytics. However, existing MPC-based collaborative analytics systems (MCASs) have unsatisfying performance for scenarios with dynamic databases. Naively running an MCAS on a dynamic database would lead to significant redundant costs and raise performance concerns, due to the substantial duplicate contents between the pre-updating and post-updating databases.
In this paper, we...
Mario: Multi-round Multiple-Aggregator Secure Aggregation with Robustness against Malicious Actors
Truong Son Nguyen, Tancrède Lepoint, Ni Trieu
Cryptographic protocols
Federated Learning (FL) enables multiple clients to collaboratively train a machine learning model while keeping their data private, eliminating the need for data sharing. Two common approaches to secure aggregation (SA) in FL are the single-aggregator and multiple-aggregator models.
Existing multiple-aggregator protocols such as Prio (NSDI 2017), Prio (SCN 2022), Elsa (S\&P 2023) either offer robustness only in the presence of semi-honest servers or provide security without...
Privacy Comparison for Bitcoin Light Client Implementations
Arad Kotzer, Ori Rottenstreich
Applications
Light clients implement a simple solution for Bitcoin's scalability problem, as they do not store the entire blockchain but only the state of particular addresses of interest. To be able to keep track of the updated state of their addresses, light clients rely on full nodes to provide them with the required information. To do so, they must reveal information about the addresses they are interested in. This paper studies the two most common light client implementations, SPV and Neutrino with...
Hekaton: Horizontally-Scalable zkSNARKs via Proof Aggregation
Michael Rosenberg, Tushar Mopuri, Hossein Hafezi, Ian Miers, Pratyush Mishra
Cryptographic protocols
Zero-knowledge Succinct Non-interactive ARguments of Knowledge (zkSNARKs) allow a prover to convince a verifier of the correct execution of a large computation in private and easily-verifiable manner. These properties make zkSNARKs a powerful tool for adding accountability, scalability, and privacy to numerous systems such as blockchains and verifiable key directories. Unfortunately, existing zkSNARKs are unable to scale to large computations due to time and space complexity requirements...
ArcEDB: An Arbitrary-Precision Encrypted Database via (Amortized) Modular Homomorphic Encryption
Zhou Zhang, Song Bian, Zian Zhao, Ran Mao, Haoyi Zhou, Jiafeng Hua, Yier Jin, Zhenyu Guan
Cryptographic protocols
Fully homomorphic encryption (FHE) based database outsourcing is drawing growing research interests. At its current state, there exist two primary obstacles against FHE-based encrypted databases (EDBs): i) low data precision, and ii) high computational latency. To tackle the precision-performance dilemma, we introduce ArcEDB, a novel FHE-based SQL evaluation infrastructure that simultaneously achieves high data precision and fast query evaluation. Based on a set of new plaintext encoding...
Efficient Verifiable Differential Privacy with Input Authenticity in the Local and Shuffle Model
Tariq Bontekoe, Hassan Jameel Asghar, Fatih Turkmen
Cryptographic protocols
Local differential privacy (LDP) is an efficient solution for providing privacy to client's sensitive data while simultaneously releasing aggregate statistics without relying on a trusted central server (aggregator) as in the central model of differential privacy. The shuffle model with LDP provides an additional layer of privacy, by disconnecting the link between clients and the aggregator, further improving the utility of LDP. However, LDP has been shown to be vulnerable to malicious...
FASIL: A challenge-based framework for secure and privacy-preserving federated learning
Ferhat Karakoç, Betül Güvenç Paltun, Leyli Karaçay, Ömer Tuna, Ramin Fuladi, Utku Gülen
Applications
Enhancing privacy in federal learning (FL) without considering robustness can create an open door for attacks such as poisoning attacks on the FL process. Thus, addressing both the privacy and security aspects simultaneously becomes vital. Although, there are a few solutions addressing both privacy and security in the literature in recent years, they have some drawbacks such as requiring two non-colluding servers, heavy cryptographic operations, or peer-to-peer communication topology. In...
CoGNN: Towards Secure and Efficient Collaborative Graph Learning
Zhenhua Zou, Zhuotao Liu, Jinyong Shan, Qi Li, Ke Xu, Mingwei Xu
Applications
Collaborative graph learning represents a learning paradigm where multiple parties jointly train a graph neural network (GNN) using their own proprietary graph data. To honor the data privacy of all parties, existing solutions for collaborative graph learning are either based on federated learning (FL) or secure machine learning (SML). Although promising in terms of efficiency and scalability due to their distributed training scheme, FL-based approaches fall short in providing provable...
Willow: Secure Aggregation with One-Shot Clients
James Bell-Clark, Adrià Gascón, Baiyu Li, Mariana Raykova, Phillipp Schoppmann
Cryptographic protocols
A common drawback of secure vector summation protocols in the single-server model is that they impose at least one synchronization point between all clients contributing to the aggregation. This results in clients waiting on each other to advance through the rounds of the protocol, leading to large latency even if the protocol is computationally efficient. In this paper we propose protocols in the single-server model where clients contributing data to the aggregation send a single message...
Summation-based Private Segmented Membership Test from Threshold-Fully Homomorphic Encryption
Nirajan Koirala, Jonathan Takeshita, Jeremy Stevens, Taeho Jung
Cryptographic protocols
In many real-world scenarios, there are cases where a client wishes
to check if a data element they hold is included in a set segmented
across a large number of data holders. To protect user privacy, the
client’s query and the data holders’ sets should remain encrypted
throughout the whole process. Prior work on Private Set Intersection (PSI), Multi-Party PSI (MPSI), Private Membership Test (PMT),
and Oblivious RAM (ORAM) falls short in this scenario in many
ways. They either require...
Private Computations on Streaming Data
Vladimir Braverman, Kevin Garbe, Eli Jaffe, Rafail Ostrovsky
Cryptographic protocols
We present a framework for privacy-preserving streaming algorithms which combine the memory-efficiency of streaming algorithms with strong privacy guarantees. These algorithms enable some number of servers to compute aggregate statistics efficiently on large quantities of user data without learning the user's inputs. While there exists limited prior work that fits within our model, our work is the first to formally define a general framework, interpret existing methods within this general...
Private Analytics via Streaming, Sketching, and Silently Verifiable Proofs
Mayank Rathee, Yuwen Zhang, Henry Corrigan-Gibbs, Raluca Ada Popa
Cryptographic protocols
We present Whisper, a system for privacy-preserving collection of aggregate statistics. Like prior systems, a Whisper deployment consists of a small set of non-colluding servers; these servers compute aggregate statistics over data from a large number of users without learning the data of any individual user. Whisper’s main contribution is that its server- to-server communication cost and its server-side storage costs scale sublinearly with the total number of users. In particular, prior...
NodeGuard: A Highly Efficient Two-Party Computation Framework for Training Large-Scale Gradient Boosting Decision Tree
Tianxiang Dai, Yufan Jiang, Yong Li, Fei Mei
Cryptographic protocols
The Gradient Boosting Decision Tree (GBDT) is a well-known machine learning algorithm, which achieves high performance and outstanding interpretability in real-world scenes such as fraud detection, online marketing and risk management. Meanwhile, two data owners can jointly train a GBDT model without disclosing their private dataset by executing secure Multi-Party Computation (MPC) protocols. In this work, we propose NodeGuard, a highly efficient two party computation (2PC) framework for...
RSA-Based Dynamic Accumulator without Hashing into Primes
Victor Youdom Kemmoe, Anna Lysyanskaya
Public-key cryptography
A cryptographic accumulator is a compact data structure for representing a set of elements coming from some domain. It allows for a compact proof of membership and, in the case of a universal accumulator, non-membership of an element x in the data structure. A dynamic accumulator, furthermore, allows elements to be added to and deleted from the accumulator.
Previously known RSA-based dynamic accumulators were too slow in practice because they required that an element in the domain be...
Reckle Trees: Updatable Merkle Batch Proofs with Applications
Charalampos Papamanthou, Shravan Srinivasan, Nicolas Gailly, Ismael Hishon-Rezaizadeh, Andrus Salumets, Stjepan Golemac
Cryptographic protocols
We propose Reckle trees, a new vector commitment based on succinct RECursive arguments and MerKLE trees. Reckle trees' distinguishing feature is their support for succinct batch proofs that are updatable - enabling new applications in the blockchain setting where a proof needs to be computed and efficiently maintained over a moving stream of blocks. Our technical approach is based on embedding the computation of the batch hash inside the recursive Merkle verification via a hash-based...
Atlas-X Equity Financing: Unlocking New Methods to Securely Obfuscate Axe Inventory Data Based on Differential Privacy
Antigoni Polychroniadou, Gabriele Cipriani, Richard Hua, Tucker Balch
Applications
Banks publish daily a list of available securities/assets (axe list) to selected clients to help them effectively locate Long (buy) or Short (sell) trades at reduced financing rates. This reduces costs for the bank, as the list aggregates the bank's internal firm inventory per asset for all clients of long as well as short trades. However, this is somewhat problematic: (1) the bank's inventory is revealed; (2) trades of clients who contribute to the aggregated list, particularly those deemed...
Anonymous Complaint Aggregation for Secure Messaging
Connor Bell, Saba Eskandarian
Applications
Private messaging platforms provide strong protection against platform eavesdropping, but malicious users can use privacy as cover for spreading abuse and misinformation. In an attempt to identify the sources of misinformation on private platforms, researchers have proposed mechanisms to trace back the source of a user-reported message (CCS '19,'21). Unfortunately, the threat model considered by initial proposals allowed a single user to compromise the privacy of another user whose...
A data aggregation protocol based on TFHE
Maria Ferrara, Antonio Tortora, Maria Tota
Cryptographic protocols
Torus Fully Homomorphic Encryption (TFHE) is a probabilistic cryptosytem over the real torus which allows one to operate directly on encrypted data without first decrypting them. We present an aggregation protocol based on a variant of TFHE for computing the sum of sensitive data, working only with the corresponding ciphertexts. Our scheme is an ideal choice for a system of smart meters - electronic devices for measuring energy consumption - that demands consumers’
privacy. In contrast to...
Mastic: Private Weighted Heavy-Hitters and Attribute-Based Metrics
Dimitris Mouris, Christopher Patton, Hannah Davis, Pratik Sarkar, Nektarios Georgios Tsoutsos
Cryptographic protocols
Insight into user experience and behavior is critical to the success of large software systems and web services. Gaining such insights, while preserving user privacy, is a significant challenge. Recent advancements in multi-party computation have made it practical to securely compute aggregates over secret shared data. Two such protocols have emerged as candidates for standardization at the IETF: Prio (NSDI 2017) for general-purpose statistics; and Poplar (IEEE S&P 2021) for heavy hitters,...
Secure Statistical Analysis on Multiple Datasets: Join and Group-By
Gilad Asharov, Koki Hamada, Dai Ikarashi, Ryo Kikuchi, Ariel Nof, Benny Pinkas, Junichi Tomida
Cryptographic protocols
We implement a secure platform for statistical analysis over multiple organizations and multiple datasets. We provide a suite of protocols for different variants of JOIN and GROUP-BY operations. JOIN allows combining data from multiple datasets based on a common column. GROUP-BY allows aggregating rows that have the same values in a column or a set of columns, and then apply some aggregation summary on the rows (such as sum, count, median, etc.). Both operations are fundamental tools for...
SuperFL: Privacy-Preserving Federated Learning with Efficiency and Robustness
Yulin Zhao, Hualin Zhou, Zhiguo Wan
Applications
Federated Learning (FL) accomplishes collaborative model training without the need to share local training data. However, existing FL aggregation approaches suffer from inefficiency, privacy vulnerabilities, and neglect of poisoning attacks, severely impacting the overall performance and reliability of model training. In order to address these challenges, we propose SuperFL, an efficient two-server aggregation scheme that is both privacy preserving and secure against poisoning attacks. The...
PRIDA: PRIvacy-preserving Data Aggregation with multiple data customers
Beyza Bozdemir, Betül Aşkın Özdemir, Melek Önen
Cryptographic protocols
We propose a solution for user privacy-oriented privacy-preserving data aggregation with multiple data customers. Most existing state-of-the-art approaches present too much importance on performance efficiency and seem to ignore privacy properties except for input privacy. Most solutions for data aggregation do not generally discuss the users’ birthright, namely their privacy for their own data control and anonymity when they search for something on the browser or volunteer to participate in...
Conan: Distributed Proofs of Compliance for Anonymous Data Collection
Mingxun Zhou, Elaine Shi, Giulia Fanti
Cryptographic protocols
We consider how to design an anonymous data collection protocol that enforces compliance rules.
Imagine that each client contributes multiple data items (e.g., votes, location crumbs, or secret shares of its input) to an anonymous network, which mixes all clients' data items so that the receiver cannot determine which data items belong to the same user. Now, each user must prove to an auditor
that the set it contributed satisfies a compliance predicate, without identifying which items it...
Reverie: an end-to-end accumulation scheme from Cyclefold
Lev Soukhanov
Foundations
Recent advances in SNARK recursion and incrementally-verifiable computation are vast, but most of the efforts seem to be focused on a particular design goal - proving the result of a large computation known completely in advance.
There are other possible applications, requiring different design tradeoffs. Particularly interesting direction is a case with a swarm of collaborating provers, communicating over a peer-to-peer network - which requires to also optimize the amount of data...
Decentralized Private Steam Aggregation from Lattices
Uddipana Dowerah, Aikaterini Mitrokotsa
Cryptographic protocols
As various industries and government agencies increasingly seek to build quantum computers, the development of post-quantum constructions for different primitives becomes crucial. Lattice-based cryptography is one of the top candidates for constructing quantum-resistant primitives. In this paper, we propose a decentralized Private Stream Aggregation (PSA) protocol based on the Learning with Errors (LWE) problem. PSA allows secure aggregation of time-series data over multiple users without...
Distributed Differential Privacy via Shuffling vs Aggregation: a Curious Study
Yu Wei, Jingyu Jia, Yuduo Wu, Changhui Hu, Changyu Dong, Zheli Liu, Xiaofeng Chen, Yun Peng, Shaowei Wang
Applications
How to achieve distributed differential privacy (DP) without a trusted central party is of great interest in both theory and practice. Recently, the shuffle model has attracted much attention. Unlike the local DP model in which the users send randomized data directly to the data collector/analyzer, in the shuffle model an intermediate untrusted shuffler is introduced to randomly permute the data, which have already been randomized by the users, before they reach the analyzer. The most...
KiloNova: Non-Uniform PCD with Zero-Knowledge Property from Generic Folding Schemes
Tianyu Zheng, Shang Gao, Yu Guo, Bin Xiao
Cryptographic protocols
Most existing accumulation/folding schemes focus on implementing Incrementally Verifiable Computation (IVC). Proof-carrying Data (PCD), as a generalization of IVC, enables sequential computation performance by multiple distrusting parties, thereby offering a robust primitive tool in real-world applications. However, building non-uniform PCD from folding schemes faces many technical challenges, particularly in handling cross items and preserving zero knowledge.
This paper introduces...
A Privacy-preserving Central Bank Ledger for Central Bank Digital Currency
Chan Wang Mong Tikvah
Applications
Central banks around the world are actively exploring the issuance of retail central bank digital currency (rCBDC), which is widely seen as a key upgrade of the monetary system in the 21st century. However, privacy concerns are the main impediment to rCBDC’s development and roll-out. A central bank as the issuer of rCBDC would typically need to keep a digital ledger to record all the balances and transactions of citizens. These data, when combined with other data, could possibly disclose the...
HE$^3$DB: An Efficient and Elastic Encrypted Database Via Arithmetic-And-Logic Fully Homomorphic Encryption
Song Bian, Zhou Zhang, Haowen Pan, Ran Mao, Zian Zhao, Yier Jin, Zhenyu Guan
Cryptographic protocols
As concerns are increasingly raised about data privacy, encrypted database management system (DBMS) based on fully homomorphic encryption (FHE) attracts increasing research attention, as FHE permits DBMS to be directly outsourced to cloud servers without revealing any plaintext data. However, the real-world deployment of FHE-based DBMS faces two main challenges: i) high computational latency, and ii) lack of elastic query processing capability, both of which stem from the inherent...
HEIR: A Unified Representation for Cross-Scheme Compilation of Fully Homomorphic Computation
Song Bian, Zian Zhao, Zhou Zhang, Ran Mao, Kohei Suenaga, Yier Jin, Zhenyu Guan, Jianwei Liu
Applications
We propose a new compiler framework that automates code generation over multiple fully homomorphic encryption (FHE) schemes. While it was recently shown that algorithms combining multiple FHE schemes (e.g., CKKS and TFHE) achieve high execution efficiency and task utility at the same time, developing fast cross-scheme FHE algorithms for real-world applications generally require heavy hand-tuned optimizations by cryptographic experts, resulting in either high usability costs or low...
zkDL: Efficient Zero-Knowledge Proofs of Deep Learning Training
Haochen Sun, Tonghe Bai, Jason Li, Hongyang Zhang
Applications
The recent advancements in deep learning have brought about significant changes in various aspects of people's lives. Meanwhile, these rapid developments have raised concerns about the legitimacy of the training process of deep neural networks. To protect the intellectual properties of AI developers, directly examining the training process by accessing the model parameters and training data is often prohibited for verifiers.
In response to this challenge, we present zero-knowledge deep...
TVA: A multi-party computation system for secure and expressive time series analytics
Muhammad Faisal, Jerry Zhang, John Liagouris, Vasiliki Kalavri, Mayank Varia
Applications
We present TVA, a multi-party computation (MPC) system for secure analytics on secret-shared time series data. TVA achieves strong security guarantees in the semi-honest and malicious settings, and high expressivity by enabling complex analytics on inputs with unordered and irregular timestamps. TVA is the first system to support arbitrary composition of oblivious window operators, keyed aggregations, and multiple filter predicates, while keeping all data attributes private, including record...
Intmax2: A ZK-rollup with Minimal Onchain Data and Computation Costs Featuring Decentralized Aggregators
Erik Rybakken, Leona Hioki, Mario Yaksetig
Cryptographic protocols
We present a blockchain scaling solution called Intmax2, which is a Zero-Knowledge rollup (ZK-rollup) protocol with stateless and permissionless block production, while minimizing the usage of data and computation on the underlying blockchain. Our architecture distinctly diverges from existing ZK-rollups since essentially all of the data and computational costs are shifted to the client-side as opposed to imposing heavy requirements on the block producers or the underlying Layer 1...
Differentially Private Selection from Secure Distributed Computing
Ivan Damgård, Hannah Keller, Boel Nelson, Claudio Orlandi, Rasmus Pagh
Cryptographic protocols
Given a collection of vectors $\mathbf{x}^{(1)},\dots,\mathbf{x}^{(n)} \in \{0,1\}^d$, the selection problem asks to report the index of an "approximately largest" entry in $\mathbf{x}=\sum_{j=1}^n \mathbf{x}^{(j)}$. Selection abstracts a host of problems; in machine learning it can be used for hyperparameter tuning, feature selection, or to model empirical risk minimization. We study selection under differential privacy, where a released index guarantees privacy for individual vectors....
Falkor: Federated Learning Secure Aggregation Powered by AES-CTR GPU Implementation
Mariya Georgieva Belorgey, Sofia Dandjee, Nicolas Gama, Dimitar Jetchev, Dmitry Mikushin
Cryptographic protocols
We propose a novel protocol, Falkor, for secure aggregation for Federated Learning in the multi-server scenario based on masking of local models via a stream cipher based on AES in counter mode and accelerated by GPUs running on the aggregating servers. The protocol is resilient to client dropout and has reduced clients/servers communication cost by a factor equal to the number of aggregating servers (compared to the naïve baseline method). It scales simultaneously in the two major...
Efficient and Secure Quantile Aggregation of Private Data Streams
Xiao Lan, Hongjian Jin, Hui Guo, Xiao Wang
Applications
Computing the quantile of a massive data stream has been a crucial task in networking and data management. However, existing solutions assume a centralized model where one data owner has access to all data. In this paper, we put forward a study of secure quantile aggregation between private data streams, where data streams owned by different parties would like to obtain a quantile of the union of their data without revealing anything else about their inputs. To this end, we designed efficient...
FedVS: Straggler-Resilient and Privacy-Preserving Vertical Federated Learning for Split Models
Songze Li, Duanyi Yao, Jin Liu
Cryptographic protocols
In a vertical federated learning (VFL) system consisting of a central server and many distributed clients, the training data are vertically partitioned such that different features are privately stored on different clients. The problem of split VFL is to train a model split between the server and the clients. This paper aims to address two major challenges in split VFL: 1) performance degradation due to straggling clients during training; and 2) data and model privacy leakage from clients’...
Squirrel: A Scalable Secure Two-Party Computation Framework for Training Gradient Boosting Decision Tree
Wen-jie Lu, Zhicong Huang, Qizhi Zhang, Yuchen Wang, Cheng Hong
Applications
Gradient Boosting Decision Tree (GBDT) and its variants are widely used in industry, due to their strong interpretability. Secure multi-party computation allows multiple data owners to compute a function jointly while keeping their input private. In this work, we present Squirrel, a two-party GBDT training framework on a vertically split dataset, where two data owners each hold different features of the same data samples. Squirrel is private against semi-honest adversaries, and no sensitive...
Flamingo: Multi-Round Single-Server Secure Aggregation with Applications to Private Federated Learning
Yiping Ma, Jess Woods, Sebastian Angel, Antigoni Polychroniadou, Tal Rabin
Applications
This paper introduces Flamingo, a system for secure aggregation of data across a large set of clients. In secure aggregation, a server sums up the private inputs of clients and obtains the result without learning anything about the individual inputs beyond what is implied by the final sum. Flamingo focuses on the multi-round setting found in federated learning in which many consecutive summations (averages) of model weights are performed to derive a good model. Previous protocols, such as...
Interoperable Private Attribution: A Distributed Attribution and Aggregation Protocol
Benjamin Case, Richa Jain, Alex Koshelev, Andy Leiserson, Daniel Masny, Thurston Sandberg, Ben Savage, Erik Taubeneck, Martin Thomson, Taiki Yamaguchi
Cryptographic protocols
Measuring people’s interactions that span multiple websites can provide unique insight that enables better products and improves people’s experiences, but directly observing people’s individual journeys creates privacy risks that conflict with the newly emerging privacy model for the web. We propose a protocol that uses the combination of multi-party computation and differential privacy that enables the processing of peoples’ data such that only aggregate measurements are revealed, strictly...
Verifiable Decentralized Multi-Client Functional Encryption for Inner Product
Dinh Duy Nguyen, Duong Hieu Phan, David Pointcheval
Public-key cryptography
Joint computation on encrypted data is becoming increasingly crucial with the rise of cloud computing. In recent years, the development of multi-client functional encryption (MCFE) has made it possible to perform joint computation on private inputs, without any interaction. Well-settled solutions for linear functions have become efficient and secure, but there is still a shortcoming: if one user inputs incorrect data, the output of the function might become meaningless for all other users...
DIPSAUCE: Efficient Private Stream Aggregation Without Trusted Parties
Joakim Brorsson, Martin Gunnarsson
Cryptographic protocols
Private Stream Aggregation (PSA) schemes are efficient protocols for distributed data analytics. In a PSA scheme, a set of data producers can encrypt data for a central party so that it learns the sum of all encrypted values, but nothing about each individual value. Thus, a trusted aggregator is avoided. However, all known PSA schemes still require a trusted party for key generation. In this paper we propose the first PSA scheme that does not rely on a trusted party. We argue its security...
zkTree: A Zero-Knowledge Recursion Tree with ZKP Membership Proofs
Sai Deng, Bo Du
Implementation
We introduce zkTree, a general framework for constructing a tree by recursively verifying children's zero-knowledge proofs (ZKPs) in a parent ZKP node, while enabling the retrieval of membership proofs for user-supplied zk proofs. We also outline a construction pipeline that allows zkTree to be built and verified on-chain with constant gas cost and low data processing pipeline overhead. By aggregating a large number of user proofs into a single root proof, zkTree makes ZKP on-chain...
A Secure Bandwidth-Efficient Treatment for Dropout-Resistant Time-Series Data Aggregation
Reyhaneh Rabaninejad, Alexandros Bakas, Eugene Frimpong, Antonis Michalas
Cryptographic protocols
Aggregate statistics derived from time-series data collected by individual users are extremely beneficial in diverse fields, such as e-health applications, IoT-based smart metering networks, and federated learning systems. Since user data are privacy-sensitive in many cases, the untrusted aggregator may only infer the aggregation without breaching individual privacy. To this aim, secure aggregation techniques have been extensively researched over the past years. However, most existing...
Prism: Private Set Intersection and Union with Aggregation over Multi-Owner Outsourced Data
Shantanu Sharma, Yin Li, Sharad Mehrotra, Nisha Panwar, Dhrubajyoti Ghosh, Peeyush Gupta
Applications
This paper proposes Prism, Private Verifiable Set Computation over Multi-Owner Outsourced Databases, a secret sharing based approach to compute private set operations (i.e., intersection and union), as well as aggregates over outsourced databases belonging to multiple owners. Prism enables data owners to pre-load the data onto non-colluding servers and exploits the additive and multiplicative properties of secret-shares to compute the above-listed operations in (at most) two rounds of...
Verifiable Distributed Aggregation Functions
Hannah Davis, Christopher Patton, Mike Rosulek, Phillipp Schoppmann
Cryptographic protocols
The modern Internet is built on systems that incentivize collection of information about users. In order to minimize privacy loss, it is desirable to prevent these systems from collecting more information than is required for the application. The promise of multi-party computation is that data can be aggregated without revealing individual measurements to the data collector. This work offers a provable security treatment for "Verifiable Distributed Aggregation Functions (VDAFs)", a class of...
Privacy-Preserving Payment System With Verifiable Local Differential Privacy
Danielle Movsowitz Davidow, Yacov Manevich, Eran Toch
Applications
Privacy-preserving transaction systems on blockchain networks like Monero or Zcash provide complete transaction anonymity through cryptographic commitments or encryption. While this secures privacy, it inhibits the collection of statistical data, which current financial markets heavily rely on for economic and sociological research conducted by central banks, statistics bureaus, and research companies. Differential privacy techniques have been proposed to preserve individuals' privacy while...
PLASMA: Private, Lightweight Aggregated Statistics against Malicious Adversaries
Dimitris Mouris, Pratik Sarkar, Nektarios Georgios Tsoutsos
Cryptographic protocols
Private heavy-hitters is a data-collection task where multiple clients possess private bit strings, and data-collection servers aim to identify the most popular strings without learning anything about the clients' inputs. In this work, we introduce PLASMA: a private analytics framework in the three-server setting that protects the privacy of honest clients and the correctness of the protocol against a coalition of malicious clients and a malicious server.
Our core primitives are a...
2023/078
Last updated: 2023-06-23
An Efficient Multi-Signature Scheme for Blockchain
Mostefa Kara, Abdelkader Laouid, Mohammad Hammoudeh
Cryptographic protocols
Blockchain is a newly emerging technology, however, it has proven effective in many applications because it provides multiple advantages, mainly as it represents a trust system in which data is encrypted in a way that cannot be tampered with or forged. Because it contains many details such as smart contracts, consensus, authentication, etc. the blockchain is a fertile ground for researchers where they can continually improve previous versions of these concepts. This paper introduces a new...
Threshold Signatures in the Multiverse
Leemon Baird, Sanjam Garg, Abhishek Jain, Pratyay Mukherjee, Rohit Sinha, Mingyuan Wang, Yinuo Zhang
Applications
We introduce a new notion of {\em multiverse threshold signatures} (MTS). In an MTS scheme, multiple universes -- each defined by a set of (possibly overlapping) signers, their weights, and a specific security threshold -- can co-exist. A universe can be (adaptively) created via a non-interactive asynchronous setup. Crucially, each party in the multiverse holds constant-sized keys and releases compact signatures with size and computation time both independent of the number of universes....
PoRt: Non-Interactive Continuous Availability Proof of Replicated Storage
Reyhaneh Rabaninejad, Bin Liu, Antonis Michalas
Cryptographic protocols
Secure cryptographic storage is one of the most important issues
that both businesses and end-users take into account before moving
their data to either centralized clouds or blockchain-based decen-
tralized storage marketplace. Recent work [4 ] formalizes the notion
of Proof of Storage-Time (PoSt) which enables storage servers to
demonstrate non-interactive continuous availability of outsourced
data in a publicly verifiable way. The work also proposes a stateful
compact PoSt...
Towards Efficient Decentralized Federated Learning
Christodoulos Pappas, Dimitrios Papadopoulos, Dimitris Chatzopoulos, Eleni Panagou, Spyros Lalis, Manolis Vavalis
Applications
We focus on the problem of efficiently deploying a federated learning training task in a decentralized setting with multiple aggregators. To that end, we introduce a number of improvements and modifications to the recently proposed IPLS protocol. In particular, we relax its assumption for direct communication across participants, using instead indirect communication over a decentralized storage system, effectively turning it into a partially asynchronous protocol. Moreover, we secure it...
Registered Attribute-Based Encryption
Susan Hohenberger, George Lu, Brent Waters, David J. Wu
Public-key cryptography
Attribute-based encryption (ABE) generalizes public-key encryption and enables fine-grained control to encrypted data. However, ABE upends the traditional trust model of public-key encryption by requiring a single trusted authority to issue decryption keys. If an adversary compromises the central authority and exfiltrates its secret key, then the adversary can decrypt every ciphertext in the system.
This work introduces registered ABE, a primitive that allows users to generate secret keys...
aPlonK : Aggregated PlonK from Multi-Polynomial Commitment Schemes
Miguel Ambrona, Marc Beunardeau, Anne-Laure Schmitt, Raphaël R. Toledo
Cryptographic protocols
PlonK is a prominent universal and updatable zk-SNARK for general circuit satisfiability. We present aPlonK, a variant of PlonK that reduces the proof size and verification time when multiple statements are proven in a batch. Both the aggregated proof size and the verification complexity of aPlonK are logarithmic in the number of aggregated statements. Our main building block, inspired by the techniques developed in SnarkPack (Gailly, Maller, Nitulescu, FC 2022), is a multi-polynomial...
Privacy-preserving Federated Singular Value Decomposition
Bowen LIU, Balázs Pejó, Qiang TANG
Applications
Modern Singular Value Decomposition (SVD) computation dates back to the 1960s when the basis for the eigensystem package and linear algebra package routines was created. Since then, SVD has gained attraction and been widely applied in various scenarios, such as recommendation systems and principal component analyses. Federated SVD has recently emerged, where different parties could collaboratively compute SVD without exchanging raw data. Besides its inherited privacy protection, noise...
PEA: Practical private epistasis analysis using MPC
Kay Hamacher, Tobias Kussel, Thomas Schneider, Oleksandr Tkachenko
Applications
Due to the significant drop in prices for genome sequencing in the last decade, genome databases were constantly growing. This enabled genome analyses such as Genome-Wide Association Studies (GWAS) that study associations between a gene and a disease and allow to improve medical treatment. However, GWAS fails at the analysis of complex diseases caused by non-linear gene-gene interactions such as sporadic breast cancer or type 2 diabetes. Epistasis Analysis (EA) is a more powerful approach...
Ibex: Privacy-preserving ad conversion tracking and bidding (full version)
Ke Zhong, Yiping Ma, Sebastian Angel
Applications
This paper introduces Ibex, an advertising system that reduces the amount of data that is collected on users while still allowing advertisers to bid on real-time ad auctions and measure the effectiveness of their ad campaigns. Specifically, Ibex addresses an issue in recent proposals such as Google’s Privacy Sandbox Topics API in which browsers send information about topics that are of interest to a user to advertisers and demand-side platforms (DSPs). DSPs use this information to (1)...
Distributed, Private, Sparse Histograms in the Two-Server Model
James Bell, Adria Gascon, Badih Ghazi, Ravi Kumar, Pasin Manurangsi, Mariana Raykova, Phillipp Schoppmann
Cryptographic protocols
We consider the computation of sparse, $(\varepsilon, \delta)$-differentially private~(DP) histograms in the two-server model of secure multi-party computation~(MPC), which has recently gained traction in the context of privacy-preserving measurements of aggregate user data.
We introduce protocols that enable two semi-honest non-colluding servers to compute histograms over the data held by multiple users, while only learning a private view of the data. Our solution achieves the same...
The State of the Union: Union-Only Signatures for Data Aggregation
Diego F. Aranha, Felix Engelmann, Sebastian Kolby, Sophia Yakoubov
Public-key cryptography
A union-only signature (UOS) scheme (informally introduced by Johnson et al. at CT-RSA 2002) allows signers to sign sets of messages in such a way that (1) any third party can merge two signatures to derive a signature on the union of the message sets, and (2) no adversary, given a signature on some set, can derive a valid signature on any strict subset of that set (unless it has seen such a signature already).
Johnson et al. originally posed building a UOS as an open problem. In this...
TERSE: Tiny Encryptions and Really Speedy Execution for Post-Quantum Private Stream Aggregation
Jonathan Takeshita, Zachariah Carmichael, Ryan Karl, Taeho Jung
Cryptographic protocols
The massive scale and performance demands of privacy-preserving data aggregation make integration of security and privacy difficult. Traditional tools in private computing are not well-suited to handle these challenges, especially for more limited client devices. Efficient primitives and protocols for secure and private data aggregation are a promising approach for private data analytics with resource-constrained devices. However, even such efficient primitives may be much slower than...
Fully Privacy-Preserving Federated Representation Learning via Secure Embedding Aggregation
Jiaxiang Tang, Jinbao Zhu, Songze Li, Kai Zhang, Lichao Sun
Cryptographic protocols
We consider a federated representation learning framework, where with the assistance of a central server, a group of $N$ distributed clients train collaboratively over their private data, for the representations (or embeddings) of a set of entities (e.g., users in a social network). Under this framework, for the key step of aggregating local embeddings trained at the clients in a private manner, we develop a secure embedding aggregation protocol named SecEA, which provides...
2022/390
Last updated: 2022-06-17
An Efficient and Robust Multidimensional Data Aggregation Scheme for Smart Grid Based on Blockchain
Lin You, Xinhua Zhang, Gengran Hu, Longbo Han
Applications
In order to analyze real-time power data without revealing user's privacy, privacy-preserving data aggregation has been extensively researched in smart grid. However, most of the existing schemes either have too much computation overhead and cannot achieve dynamic users, or require a trusted center. In this paper, we propose an efficient and robust multidimensional data aggregation scheme based on blockchain. In our scheme, a leader election algorithm in Raft protocol is used to select a...
Verifiably Distributed Multi-User Secret Sharing schemes
Likang Lu, Jianzhu Lu
Applications
Distributed secret sharing techniques, where a specific secret is encoded into its shares which are conveyed to the IoT device or
its user via storage nodes, are considered. A verifiably distributed secret sharing (VDSS) provides a way for a legitimate user to verify the secret he reconstructs through the downloaded shares while the secrecy condition is satisfied in a weak or a perfect sense. This article examines the impact of minimizing verification information in a VDSS on the...
The Power of the Differentially Oblivious Shuffle in Distributed Privacy Mechanisms
Mingxun Zhou, Elaine Shi
Cryptographic protocols
The shuffle model has been extensively investigated in the distributed differential privacy (DP) literature. For a class of useful computational tasks, the shuffle model allows us to achieve privacy-utility tradeoff similar to those in the central model, while shifting the trust from a central data curator to a ``trusted shuffle'' which can be implemented through either trusted hardware or cryptography. Very recently, several works explored cryptographic instantiations of
a new type of...
Precio: Private Aggregate Measurement via Oblivious Shuffling
F. Betül Durak, Chenkai Weng, Erik Anderson, Kim Laine, Melissa Chase
Cryptographic protocols
We introduce Precio, a new secure aggregation method for computing layered histograms and sums over secret shared data in a client-server setting.
Precio is motivated by ad conversion measurement scenarios, where online advertisers and ad networks want to measure the performance of ad campaigns without requiring privacy-invasive techniques, such as third-party cookies.
Precio has linear (time and communication) complexity in the number of data points and guarantees differentially private...
A General Framework of Homomorphic Encryption for Multiple Parties with Non-Interactive Key-Aggregation
Hyesun Kwak, Dongwon Lee, Yongsoo Song, Sameer Wagh
Public-key cryptography
Homomorphic Encryption (HE) is a useful primitive for secure computation, but it is not generally applicable when multiple parties are involved, as the authority is solely concentrated in a single party, the secret key owner.
To solve this issue, several variants of HE have emerged in the context of multiparty setting, resulting in two major lines of work -- Multi-Party HE (MPHE) and Multi-Key HE (MKHE).
In short, MPHEs tend to be more efficient, but all parties should be specified at the...
Masquerade: Verifiable Multi-Party Aggregation with Secure Multiplicative Commitments
Dimitris Mouris, Nektarios Georgios Tsoutsos
Cryptographic protocols
In crowd-sourced data aggregation, participants share their data points with curators. However, the lack of privacy guarantees may discourage participation, which motivates the need for privacy-preserving aggregation protocols. Unfortunately, existing solutions do not support public auditing without revealing the participants' data. In real-world applications, there is a need for public verifiability (i.e., verifying the protocol correctness) while preserving the privacy of the participants'...
GPS: Integration of Graphene, PALISADE, and SGX for Large-scale Aggregations of Distributed Data
Jonathan Takeshita, Colin McKechney, Justin Pajak, Antonis Papadimitriou, Ryan Karl, Taeho Jung
Implementation
Secure computing methods such as fully homomorphic encryption and hardware solutions such as Intel Software Guard Extension (SGX) have been applied to provide security for user input in privacy-oriented computation outsourcing. Fully homomorphic encryption is amenable to parallelization and hardware acceleration to improve its scalability and latency, but is limited in the complexity of functions it can efficiently evaluate. SGX is capable of arbitrarily complex calculations, but due to...
OmniLytics: A Blockchain-based Secure Data Market for Decentralized Machine Learning
Jiacheng Liang, Songze Li, Wensi Jiang, Bochuan Cao, Chaoyang He
Applications
We propose OmniLytics, a blockchain-based secure data trading marketplace for machine learning applications. Utilizing OmniLytics, many distributed data owners can contribute their private data to collectively train an ML model requested by some model owners, and receive compensation for data contribution. OmniLytics enables such model training while simultaneously providing 1) model security against curious data owners; 2) data security against the curious model and data owners; 3)...
Fragment and Forge: Breaking Wi-Fi Through Frame Aggregation and Fragmentation
Mathy Vanhoef
Implementation
In this paper, we present three design flaws in the 802.11 standard that underpins Wi-Fi. One design flaw is in the frame aggregation functionality, and another two are in the frame fragmentation functionality. These design flaws enable an adversary to forge encrypted frames in various ways, which in turn enables exfiltration of sensitive data. We also discovered common implementation flaws related to aggregation and fragmentation, which further worsen the impact of our attacks. Our results...
Non-Interactive, Secure Verifiable Aggregation for Decentralized, Privacy-Preserving Learning
Carlo Brunetta, Georgia Tsaloli, Bei Liang, Gustavo Banegas, Aikaterini Mitrokotsa
We propose a novel primitive called NIVA that allows the distributed aggregation of multiple users' secret inputs by multiple untrusted servers.
The returned aggregation result can be publicly verified in a non-interactive way, i.e. the users are not required to participate in the aggregation except for providing their secret inputs.
NIVA allows the secure computation of the sum of a large amount of users' data and can be employed, for example, in the federated learning setting in order to...
Securing Parallel-chain Protocols under Variable Mining Power
Xuechao Wang, Viswa Virinchi Muppirala, Lei Yang, Sreeram Kannan, Pramod Viswanath
Applications
Several emerging proof-of-work (PoW) blockchain protocols rely on a “parallel-chain” architecture for scaling, where instead of a single chain, multiple chains are run in parallel and aggregated. A key requirement of practical PoW blockchains is to adapt to mining power variations over time (Bitcoin’s total mining power has increased by a $10^{14}$ factor over the decade). In this paper, we consider the design of provably secure parallel-chain protocols which can adapt to such mining power...
Prio : Privacy Preserving Aggregate Statistics via Boolean Shares
Surya Addanki, Kevin Garbe, Eli Jaffe, Rafail Ostrovsky, Antigoni Polychroniadou
Cryptographic protocols
This paper introduces Prio , a privacy-preserving system for the collection of aggregate statistics, with the same model and goals in mind as the original and highly influential Prio paper by Henry Corrigan-Gibbs and Dan Boneh (USENIX 2017). As in the original Prio, each client holds a private data value (e.g. number of visits to a particular website) and a small set of servers privately compute statistical functions over the set of client values (e.g. the ...
Cryptonomial: A Framework for Private Time-Series Polynomial Calculations
Ryan Karl, Jonathan Takeshita, Alamin Mohammed, Aaron Striegel, Taeho Jung
Cryptographic protocols
In modern times, data collected from multi-user distributed applications must be analyzed on a massive scale to support critical business objectives. While analytics often requires the use of personal data, it may compromise user privacy expectations if this analysis is conducted over plaintext data. Private Stream Aggregation (PSA) allows for the aggregation of time-series data, while still providing strong privacy guarantees, and is significantly more efficient over a network than related...
CryptoGram: Fast Private Calculations of Histograms over Multiple Users’ Inputs
Ryan Karl, Jonathan Takeshita, Alamin Mohammed, Aaron Striegel, Taeho Jung
Cryptographic protocols
Histograms have a large variety of useful applications in data analysis, e.g., tracking the spread of diseases and analyzing public health issues. However, most data analysis techniques used in practice operate over plaintext data, putting the privacy of users’ data at risk. We consider the problem of allowing an untrusted aggregator to privately compute a histogram over multiple users’ private inputs (e.g., number of contacts at a place) without learning anything other than the final...
SAFELearn: Secure Aggregation for private FEderated Learning
Hossein Fereidooni, Samuel Marchal, Markus Miettinen, Azalia Mirhoseini, Helen Möllering, Thien Duc Nguyen, Phillip Rieger, Ahmad Reza Sadeghi, Thomas Schneider, Hossein Yalame, Shaza Zeitouni
Applications
Federated learning (FL) is an emerging distributed machine learning paradigm which addresses critical data privacy issues in machine learning by enabling clients, using an aggregation server (aggregator), to jointly train a global model without revealing their training data. Thereby, it improves not only privacy but is also efficient as it uses the computation power and data of potentially millions of clients for training in parallel.
However, FL is vulnerable to so-called inference attacks...
Privacy-Preserving Video Classification with Convolutional Neural Networks
Sikha Pentyala, Rafael Dowsley, Martine De Cock
Cryptographic protocols
Many video classification applications require access to personal data, thereby posing an invasive security risk to the users' privacy. We propose a privacy-preserving implementation of single-frame method based video classification with convolutional neural networks that allows a party to infer a label from a video without necessitating the video owner to disclose their video to other entities in an unencrypted manner. Similarly, our approach removes the requirement of the classifier owner...
Private Stream Aggregation from Labeled Secret Sharing Schemes
Hendrik Waldner, Tilen Marc, Miha Stopar, Michel Abdalla
Foundations
The concept of private stream aggregation (PSA) has been proposed by Shi et al. (NDSS 2011) to allow for data analysis in a privacy-preserving manner. In this work, we introduce the notion of labeled secret sharing (LaSS) schemes and show how to use it to construct PSA schemes. We also show how to realize LaSS using pseudorandom functions or alternatively with a hash function modeled as a random oracle and how it can be used to construct PSA schemes. Additionally, we revisit the security...
EPISODE: Efficient Privacy-PreservIng Similar Sequence Queries on Outsourced Genomic DatabasEs
Thomas Schneider, Oleksandr Tkachenko
Applications
Nowadays, genomic sequencing has become much more affordable for many people and, thus, many people own their genomic data in a digital format. Having paid for genomic sequencing, they want to make use of their data for different tasks that are possible only using genomics, and they share their data with third parties to achieve these tasks, e.g., to find their relatives in a genomic database. As a consequence, more genomic data get collected worldwide. The upside of the data collection is...
FLAME: Taming Backdoors in Federated Learning
Thien Duc Nguyen, Phillip Rieger, Huili Chen, Hossein Yalame, Helen Möllering, Hossein Fereidooni, Samuel Marchal, Markus Miettinen, Azalia Mirhoseini, Shaza Zeitouni, Farinaz Koushanfar, Ahmad-Reza Sadeghi, Thomas Schneider
Applications
Federated Learning (FL) is a collaborative machine learning approach allowing participants to jointly train a model without having to share their private, potentially sensitive local datasets with others. Despite its benefits, FL is vulnerable to so-called backdoor attacks, in which an adversary injects manipulated model updates into the federated model aggregation process so that the resulting model will provide targeted false predictions for specific adversary-chosen inputs. Proposed...
SLAP: Simple Lattice-Based Private Stream Aggregation Protocol
Jonathan Takeshita, Ryan Karl, Ting Gong, Taeho Jung
Cryptographic protocols
Private Stream Aggregation (PSA) protocols allow for the secure aggregation of time-series data, affording security and privacy to users' private data, with significantly better efficiency than general secure computation such as homomorphic encryption, multiparty computation, and secure hardware based approaches.
Earlier PSA protocols face limitations including needless complexity, a lack of post-quantum security, or other practical issues. In this work, we present SLAP, a Simple...
Function Secret Sharing for PSI-CA: With Applications to Private Contact Tracing
Samuel Dittmer, Yuval Ishai, Steve Lu, Rafail Ostrovsky, Mohamed Elsabagh, Nikolaos Kiourtis, Brian Schulte, Angelos Stavrou
Cryptographic protocols
In this work we describe a token-based solution to Contact Tracing via Distributed Point Functions (DPF) and, more generally, Function Secret Sharing (FSS). The key idea behind the solution is that FSS natively supports secure keyword search on raw sets of keywords without a need for processing the keyword sets via a data structure for set membership. Furthermore, the FSS functionality enables adding up numerical payloads associated with multiple matches without additional interaction. These...
Cryptonite: A Framework for Flexible Time-Series Secure Aggregation with Online Fault Tolerance
Ryan Karl, Jonathan Takeshita, Nirajan Koirla, Taeho Jung
Cryptographic protocols
Private stream aggregation (PSA) allows an untrusted data aggregator to compute statistics over a set of multiple participants' data while ensuring the data remains private. Existing works rely on a trusted third party to enable an aggregator to achieve fault tolerance, that requires interactive recovery, but in the real world this may not be practical or secure. We develop a new formal framework for PSA that accounts for user faults, and can support non-interactive recovery, while still...
2020/1537
Last updated: 2020-12-15
Comments on “ Multi Recipient Aggregate Signcryption Scheme Based on Elliptic Curve”
Nizamud Din, Abdul Waheed, Nasir Saeed
Public-key cryptography
Aggregate signcryption combines the functionalities of aggregate signature and encryption. Very recently, Zia & Ali [1] (Wireless Personal Communications, https://doi.org/10.1007/s11277-020-07637-z) proposed an elliptic curve cryptography (ECC) based multi-recipient aggregate signcryption scheme. The authors claimed that their scheme is correct, efficient, and secure against known attacks. However, by this comment, we show that their scheme is incorrect and the receiver(s) is unable to...
Halo Infinite: Recursive zk-SNARKs from any Additive Polynomial Commitment Scheme
Dan Boneh, Justin Drake, Ben Fisch, Ariel Gabizon
Cryptographic protocols
Polynomial commitment schemes (PCS) have recently been in the spotlight for their key role in building SNARKs. A PCS provides the ability to commit to a polynomial over a finite field and prove its evaluation at points. A succinct PCS has commitment and evaluation proof size sublinear in the degree of the polynomial. An efficient PCS has sublinear proof verification. Any efficient and succinct PCS can be used to construct a SNARK with similar security and efficiency characteristics (in the...
Federated Learning in Side-Channel Analysis
Huanyu Wang, Elena Dubrova
Recently introduced federated learning is an attractive framework for the distributed training of deep learning models with thousands of participants. However, it can potentially be used with malicious intent. For example, adversaries can use their smartphones to jointly train a classifier for extracting secret keys from the smartphones' SIM cards without sharing their side-channel measurements with each other. With federated learning, each participant might be able to create a strong model...
Toward Comparable Homomorphic Encryption for Crowd-sensing Network
Daxin Huang, Qingqing Gan, Xiaoming Wang, Chengpeng Huang, Yijian Lin
Cryptographic protocols
As a popular paradigm, crowd-sensing network emerges to achieve sensory data collection and task allocation to mobile users. On one hand these sensory data could be private and sensitive, and on the other hand, data transmission separately could incur heavy communication overhead. Fortunately, the technique of homomorphic encryption (HE) allows the addictive and/or multiplicative operations over the encrypted data as well as privacy protection. Therefore, several data aggregation schemes...
A Generalization of Paillier's Public-Key System With Fast Decryption
Ying Guo, Zhenfu Cao, Xiaolei Dong
Public-key cryptography
Paillier's scheme is a homomorphic public key encryption scheme which is widely used in practical. For instance, Paillier's scheme can be used in the data aggregation in smart grid. Damg$\mathring{a}$rd and Jurik generalized Paillier's scheme to reduce the ciphertext expansion factor. However, the decryption scheme of Damg$\mathring{a}$rd and Jurik's scheme is more complicated than Paillier's original scheme. In this paper, we propose a new generalization of Paillier's scheme and all the...
Aggregatable Subvector Commitments for Stateless Cryptocurrencies
Alin Tomescu, Ittai Abraham, Vitalik Buterin, Justin Drake, Dankrad Feist, Dmitry Khovratovich
Public-key cryptography
An aggregatable subvector commitment (aSVC) scheme is a vector commitment (VC) scheme that can aggregate multiple proofs into a single, small subvector proof. In this paper, we formalize aSVCs and give a construction from constant-sized polynomial commitments. Our construction is unique in that it has linear-sized public parameters, it can compute all constant-sized proofs in quasilinear time, it updates proofs in constant time and it can aggregate multiple proofs into a constant-sized...
Privately Connecting Mobility to Infectious Diseases via Applied Cryptography
Alexandros Bampoulidis, Alessandro Bruni, Lukas Helminger, Daniel Kales, Christian Rechberger, Roman Walch
Cryptographic protocols
Recent work has shown that cell phone mobility data has the unique potential to create accurate models for human mobility and consequently the spread of infected diseases. While prior studies have exclusively relied on a mobile network operator's subscribers' aggregated data in modelling disease dynamics, it may be preferable to contemplate aggregated mobility data of infected individuals only. Clearly, naively linking mobile phone data with health records would violate privacy by either...
BioLocker: A Practical Biometric Authentication Mechanism based on 3D Fingervein
F. Betül Durak, Loïs Huguenin-Dumittan, Serge Vaudenay
Applications
We design a consecution of protocols which allows organizations to have secure strong access control of their users to their desktop machines based on biometry. It provides both strong secure authentication and privacy. Moreover, our mechanism allows the system admins to grant a various level of access to their end-users by fine tuning access control policy. Our system implements privacy-by-design. It separates biometric data from identity information. It is practical: we fully implemented...
Dynamic Decentralized Functional Encryption
Jérémy Chotard, Edouard Dufour-Sans, Romain Gay, Duong Hieu Phan, David Pointcheval
Public-key cryptography
We introduce Dynamic Decentralized Functional Encryption (DDFE), a generalization of Functional Encryption which allows multiple users to join the system dynamically, without relying on a trusted third party or on expensive and interactive Multi-Party Computation protocols.
This notion subsumes existing multi-user extensions of Functional Encryption, such as Multi-Input, Multi-Client, and Ad Hoc Multi-Input Functional Encryption.
We define and construct schemes for various functionalities...
Turbo-Aggregate: Breaking the Quadratic Aggregation Barrier in Secure Federated Learning
Jinhyun So, Basak Guler, A. Salman Avestimehr
Cryptographic protocols
Federated learning is gaining significant interests as it enables model training over a large volume of data that is distributedly stored over many users, while protecting the privacy of the individual users. However, a major bottleneck in scaling federated learning to a large number of users is the overhead of secure model aggregation across many users. In fact, the overhead of state-of-the-art protocols for secure model aggregation grows quadratically with the number of users. We propose a...
Differentially-Private Multi-Party Sketching for Large-Scale Statistics
Seung Geol Choi, Dana Dachman-Soled, Mukul Kulkarni, Arkady Yerukhimovich
Cryptographic protocols
We consider a scenario where multiple organizations holding large amounts of sensitive data from their users wish to compute aggregate statistics on this data while protecting the privacy of individual users. To support large-scale analytics we investigate how this privacy can be provided for the case of sketching algorithms running in time sub-linear of the input size.
We begin with the well-known LogLog sketch for computing the number of unique elements in a data stream. We show that this...
PrivFL: Practical Privacy-preserving Federated Regressions on High-dimensional Data over Mobile Networks
Kalikinkar Mandal, Guang Gong
Cryptographic protocols
Federated Learning (FL) enables a large number of users to jointly learn a shared machine learning (ML) model, coordinated by a centralized server, where the data is distributed across multiple devices. This approach enables the server or users to train and learn an ML model using gradient descent, while keeping all the training data on users' devices. We consider training an ML model over a mobile network where user dropout is a common phenomenon. Although federated learning was aimed at...
There Are 10 Types of Vectors (and Polynomials): Efficient Zero-Knowledge Proofs of "One-Hotness" via Polynomials with One Zero
William Black, Ryan Henry
Cryptographic protocols
We present a new 4-move special honest-verifier zero-knowledge proof of knowledge system for proving that a vector of Pedersen commitments opens to a so-called "one-hot" vector (i.e., to a vector from the standard orthonormal basis) from $\mathbb{Z}_p^n$. The need for such proofs arises in the contexts of symmetric private information retrieval (SPIR), end-to-end verifiable voting (E2E), and privacy-preserving data aggregation and analytics, among others. The key insight underlying the new...
From Usability to Secure Computing and Back Again
Lucy Qin, Andrei Lapets, Frederick Jansen, Peter Flockhart, Kinan Dak Albab, Ira Globus-Harris, Shannon Roberts, Mayank Varia
Applications
Secure multi-party computation (MPC) allows multiple parties to jointly compute the output of a function while preserving the privacy of any individual party's inputs to that function. As MPC protocols transition from research prototypes to real-world applications, the usability of MPC-enabled applications is increasingly critical to their successful deployment and wide adoption.
Our Web-MPC platform, designed with a focus on usability, has been deployed for privacy-preserving data...
Ad Hoc Multi-Input Functional Encryption
Shweta Agrawal, Michael Clear, Ophir Frieder, Sanjam Garg, Adam O’Neill, Justin Thaler
Cryptographic protocols
Consider sources that supply sensitive data to an aggregator. Standard encryption only hides the data from eavesdroppers, but using specialized encryption one can hope to hide the data (to the extent possible) from the aggregator itself. For flexibility and security, we envision schemes that allow sources to supply encrypted data, such that at any point a dynamically chosen subset of sources can allow an agreed-upon joint function of their data to be computed by the aggregator. A primitive...
Semi-parallel Logistic Regression for GWAS on Encrypted Data
Miran Kim, Yongsoo Song, Baiyu Li, Daniele Micciancio
Applications
The sharing of biomedical data is crucial to enable scientific discoveries across institutions and improve health care. For example, genome-wide association studies (GWAS) based on a large number of samples can identify disease-causing genetic variants. The privacy concern, however, has become a major hurdle for data management and utilization. Homomorphic encryption is one of the most powerful cryptographic primitives which can address the privacy and security issues. It supports the...
Modern data analytics requires computing functions on streams of data points from many users that are challenging to calculate, due to both the high scale and nontrivial nature of the computation at hand. The need for data privacy complicates this matter further, as general-purpose privacy-enhancing technologies face limitations in at least scalability or utility. Existing work has attempted to improve this by designing purpose-built protocols for the use case of Private Stream Aggregation;...
Secure Multi-party Computation (MPC) provides a promising solution for privacy-preserving multi-source data analytics. However, existing MPC-based collaborative analytics systems (MCASs) have unsatisfying performance for scenarios with dynamic databases. Naively running an MCAS on a dynamic database would lead to significant redundant costs and raise performance concerns, due to the substantial duplicate contents between the pre-updating and post-updating databases. In this paper, we...
Federated Learning (FL) enables multiple clients to collaboratively train a machine learning model while keeping their data private, eliminating the need for data sharing. Two common approaches to secure aggregation (SA) in FL are the single-aggregator and multiple-aggregator models. Existing multiple-aggregator protocols such as Prio (NSDI 2017), Prio (SCN 2022), Elsa (S\&P 2023) either offer robustness only in the presence of semi-honest servers or provide security without...
Light clients implement a simple solution for Bitcoin's scalability problem, as they do not store the entire blockchain but only the state of particular addresses of interest. To be able to keep track of the updated state of their addresses, light clients rely on full nodes to provide them with the required information. To do so, they must reveal information about the addresses they are interested in. This paper studies the two most common light client implementations, SPV and Neutrino with...
Zero-knowledge Succinct Non-interactive ARguments of Knowledge (zkSNARKs) allow a prover to convince a verifier of the correct execution of a large computation in private and easily-verifiable manner. These properties make zkSNARKs a powerful tool for adding accountability, scalability, and privacy to numerous systems such as blockchains and verifiable key directories. Unfortunately, existing zkSNARKs are unable to scale to large computations due to time and space complexity requirements...
Fully homomorphic encryption (FHE) based database outsourcing is drawing growing research interests. At its current state, there exist two primary obstacles against FHE-based encrypted databases (EDBs): i) low data precision, and ii) high computational latency. To tackle the precision-performance dilemma, we introduce ArcEDB, a novel FHE-based SQL evaluation infrastructure that simultaneously achieves high data precision and fast query evaluation. Based on a set of new plaintext encoding...
Local differential privacy (LDP) is an efficient solution for providing privacy to client's sensitive data while simultaneously releasing aggregate statistics without relying on a trusted central server (aggregator) as in the central model of differential privacy. The shuffle model with LDP provides an additional layer of privacy, by disconnecting the link between clients and the aggregator, further improving the utility of LDP. However, LDP has been shown to be vulnerable to malicious...
Enhancing privacy in federal learning (FL) without considering robustness can create an open door for attacks such as poisoning attacks on the FL process. Thus, addressing both the privacy and security aspects simultaneously becomes vital. Although, there are a few solutions addressing both privacy and security in the literature in recent years, they have some drawbacks such as requiring two non-colluding servers, heavy cryptographic operations, or peer-to-peer communication topology. In...
Collaborative graph learning represents a learning paradigm where multiple parties jointly train a graph neural network (GNN) using their own proprietary graph data. To honor the data privacy of all parties, existing solutions for collaborative graph learning are either based on federated learning (FL) or secure machine learning (SML). Although promising in terms of efficiency and scalability due to their distributed training scheme, FL-based approaches fall short in providing provable...
A common drawback of secure vector summation protocols in the single-server model is that they impose at least one synchronization point between all clients contributing to the aggregation. This results in clients waiting on each other to advance through the rounds of the protocol, leading to large latency even if the protocol is computationally efficient. In this paper we propose protocols in the single-server model where clients contributing data to the aggregation send a single message...
In many real-world scenarios, there are cases where a client wishes to check if a data element they hold is included in a set segmented across a large number of data holders. To protect user privacy, the client’s query and the data holders’ sets should remain encrypted throughout the whole process. Prior work on Private Set Intersection (PSI), Multi-Party PSI (MPSI), Private Membership Test (PMT), and Oblivious RAM (ORAM) falls short in this scenario in many ways. They either require...
We present a framework for privacy-preserving streaming algorithms which combine the memory-efficiency of streaming algorithms with strong privacy guarantees. These algorithms enable some number of servers to compute aggregate statistics efficiently on large quantities of user data without learning the user's inputs. While there exists limited prior work that fits within our model, our work is the first to formally define a general framework, interpret existing methods within this general...
We present Whisper, a system for privacy-preserving collection of aggregate statistics. Like prior systems, a Whisper deployment consists of a small set of non-colluding servers; these servers compute aggregate statistics over data from a large number of users without learning the data of any individual user. Whisper’s main contribution is that its server- to-server communication cost and its server-side storage costs scale sublinearly with the total number of users. In particular, prior...
The Gradient Boosting Decision Tree (GBDT) is a well-known machine learning algorithm, which achieves high performance and outstanding interpretability in real-world scenes such as fraud detection, online marketing and risk management. Meanwhile, two data owners can jointly train a GBDT model without disclosing their private dataset by executing secure Multi-Party Computation (MPC) protocols. In this work, we propose NodeGuard, a highly efficient two party computation (2PC) framework for...
A cryptographic accumulator is a compact data structure for representing a set of elements coming from some domain. It allows for a compact proof of membership and, in the case of a universal accumulator, non-membership of an element x in the data structure. A dynamic accumulator, furthermore, allows elements to be added to and deleted from the accumulator. Previously known RSA-based dynamic accumulators were too slow in practice because they required that an element in the domain be...
We propose Reckle trees, a new vector commitment based on succinct RECursive arguments and MerKLE trees. Reckle trees' distinguishing feature is their support for succinct batch proofs that are updatable - enabling new applications in the blockchain setting where a proof needs to be computed and efficiently maintained over a moving stream of blocks. Our technical approach is based on embedding the computation of the batch hash inside the recursive Merkle verification via a hash-based...
Banks publish daily a list of available securities/assets (axe list) to selected clients to help them effectively locate Long (buy) or Short (sell) trades at reduced financing rates. This reduces costs for the bank, as the list aggregates the bank's internal firm inventory per asset for all clients of long as well as short trades. However, this is somewhat problematic: (1) the bank's inventory is revealed; (2) trades of clients who contribute to the aggregated list, particularly those deemed...
Private messaging platforms provide strong protection against platform eavesdropping, but malicious users can use privacy as cover for spreading abuse and misinformation. In an attempt to identify the sources of misinformation on private platforms, researchers have proposed mechanisms to trace back the source of a user-reported message (CCS '19,'21). Unfortunately, the threat model considered by initial proposals allowed a single user to compromise the privacy of another user whose...
Torus Fully Homomorphic Encryption (TFHE) is a probabilistic cryptosytem over the real torus which allows one to operate directly on encrypted data without first decrypting them. We present an aggregation protocol based on a variant of TFHE for computing the sum of sensitive data, working only with the corresponding ciphertexts. Our scheme is an ideal choice for a system of smart meters - electronic devices for measuring energy consumption - that demands consumers’ privacy. In contrast to...
Insight into user experience and behavior is critical to the success of large software systems and web services. Gaining such insights, while preserving user privacy, is a significant challenge. Recent advancements in multi-party computation have made it practical to securely compute aggregates over secret shared data. Two such protocols have emerged as candidates for standardization at the IETF: Prio (NSDI 2017) for general-purpose statistics; and Poplar (IEEE S&P 2021) for heavy hitters,...
We implement a secure platform for statistical analysis over multiple organizations and multiple datasets. We provide a suite of protocols for different variants of JOIN and GROUP-BY operations. JOIN allows combining data from multiple datasets based on a common column. GROUP-BY allows aggregating rows that have the same values in a column or a set of columns, and then apply some aggregation summary on the rows (such as sum, count, median, etc.). Both operations are fundamental tools for...
Federated Learning (FL) accomplishes collaborative model training without the need to share local training data. However, existing FL aggregation approaches suffer from inefficiency, privacy vulnerabilities, and neglect of poisoning attacks, severely impacting the overall performance and reliability of model training. In order to address these challenges, we propose SuperFL, an efficient two-server aggregation scheme that is both privacy preserving and secure against poisoning attacks. The...
We propose a solution for user privacy-oriented privacy-preserving data aggregation with multiple data customers. Most existing state-of-the-art approaches present too much importance on performance efficiency and seem to ignore privacy properties except for input privacy. Most solutions for data aggregation do not generally discuss the users’ birthright, namely their privacy for their own data control and anonymity when they search for something on the browser or volunteer to participate in...
We consider how to design an anonymous data collection protocol that enforces compliance rules. Imagine that each client contributes multiple data items (e.g., votes, location crumbs, or secret shares of its input) to an anonymous network, which mixes all clients' data items so that the receiver cannot determine which data items belong to the same user. Now, each user must prove to an auditor that the set it contributed satisfies a compliance predicate, without identifying which items it...
Recent advances in SNARK recursion and incrementally-verifiable computation are vast, but most of the efforts seem to be focused on a particular design goal - proving the result of a large computation known completely in advance. There are other possible applications, requiring different design tradeoffs. Particularly interesting direction is a case with a swarm of collaborating provers, communicating over a peer-to-peer network - which requires to also optimize the amount of data...
As various industries and government agencies increasingly seek to build quantum computers, the development of post-quantum constructions for different primitives becomes crucial. Lattice-based cryptography is one of the top candidates for constructing quantum-resistant primitives. In this paper, we propose a decentralized Private Stream Aggregation (PSA) protocol based on the Learning with Errors (LWE) problem. PSA allows secure aggregation of time-series data over multiple users without...
How to achieve distributed differential privacy (DP) without a trusted central party is of great interest in both theory and practice. Recently, the shuffle model has attracted much attention. Unlike the local DP model in which the users send randomized data directly to the data collector/analyzer, in the shuffle model an intermediate untrusted shuffler is introduced to randomly permute the data, which have already been randomized by the users, before they reach the analyzer. The most...
Most existing accumulation/folding schemes focus on implementing Incrementally Verifiable Computation (IVC). Proof-carrying Data (PCD), as a generalization of IVC, enables sequential computation performance by multiple distrusting parties, thereby offering a robust primitive tool in real-world applications. However, building non-uniform PCD from folding schemes faces many technical challenges, particularly in handling cross items and preserving zero knowledge. This paper introduces...
Central banks around the world are actively exploring the issuance of retail central bank digital currency (rCBDC), which is widely seen as a key upgrade of the monetary system in the 21st century. However, privacy concerns are the main impediment to rCBDC’s development and roll-out. A central bank as the issuer of rCBDC would typically need to keep a digital ledger to record all the balances and transactions of citizens. These data, when combined with other data, could possibly disclose the...
As concerns are increasingly raised about data privacy, encrypted database management system (DBMS) based on fully homomorphic encryption (FHE) attracts increasing research attention, as FHE permits DBMS to be directly outsourced to cloud servers without revealing any plaintext data. However, the real-world deployment of FHE-based DBMS faces two main challenges: i) high computational latency, and ii) lack of elastic query processing capability, both of which stem from the inherent...
We propose a new compiler framework that automates code generation over multiple fully homomorphic encryption (FHE) schemes. While it was recently shown that algorithms combining multiple FHE schemes (e.g., CKKS and TFHE) achieve high execution efficiency and task utility at the same time, developing fast cross-scheme FHE algorithms for real-world applications generally require heavy hand-tuned optimizations by cryptographic experts, resulting in either high usability costs or low...
The recent advancements in deep learning have brought about significant changes in various aspects of people's lives. Meanwhile, these rapid developments have raised concerns about the legitimacy of the training process of deep neural networks. To protect the intellectual properties of AI developers, directly examining the training process by accessing the model parameters and training data is often prohibited for verifiers. In response to this challenge, we present zero-knowledge deep...
We present TVA, a multi-party computation (MPC) system for secure analytics on secret-shared time series data. TVA achieves strong security guarantees in the semi-honest and malicious settings, and high expressivity by enabling complex analytics on inputs with unordered and irregular timestamps. TVA is the first system to support arbitrary composition of oblivious window operators, keyed aggregations, and multiple filter predicates, while keeping all data attributes private, including record...
We present a blockchain scaling solution called Intmax2, which is a Zero-Knowledge rollup (ZK-rollup) protocol with stateless and permissionless block production, while minimizing the usage of data and computation on the underlying blockchain. Our architecture distinctly diverges from existing ZK-rollups since essentially all of the data and computational costs are shifted to the client-side as opposed to imposing heavy requirements on the block producers or the underlying Layer 1...
Given a collection of vectors $\mathbf{x}^{(1)},\dots,\mathbf{x}^{(n)} \in \{0,1\}^d$, the selection problem asks to report the index of an "approximately largest" entry in $\mathbf{x}=\sum_{j=1}^n \mathbf{x}^{(j)}$. Selection abstracts a host of problems; in machine learning it can be used for hyperparameter tuning, feature selection, or to model empirical risk minimization. We study selection under differential privacy, where a released index guarantees privacy for individual vectors....
We propose a novel protocol, Falkor, for secure aggregation for Federated Learning in the multi-server scenario based on masking of local models via a stream cipher based on AES in counter mode and accelerated by GPUs running on the aggregating servers. The protocol is resilient to client dropout and has reduced clients/servers communication cost by a factor equal to the number of aggregating servers (compared to the naïve baseline method). It scales simultaneously in the two major...
Computing the quantile of a massive data stream has been a crucial task in networking and data management. However, existing solutions assume a centralized model where one data owner has access to all data. In this paper, we put forward a study of secure quantile aggregation between private data streams, where data streams owned by different parties would like to obtain a quantile of the union of their data without revealing anything else about their inputs. To this end, we designed efficient...
In a vertical federated learning (VFL) system consisting of a central server and many distributed clients, the training data are vertically partitioned such that different features are privately stored on different clients. The problem of split VFL is to train a model split between the server and the clients. This paper aims to address two major challenges in split VFL: 1) performance degradation due to straggling clients during training; and 2) data and model privacy leakage from clients’...
Gradient Boosting Decision Tree (GBDT) and its variants are widely used in industry, due to their strong interpretability. Secure multi-party computation allows multiple data owners to compute a function jointly while keeping their input private. In this work, we present Squirrel, a two-party GBDT training framework on a vertically split dataset, where two data owners each hold different features of the same data samples. Squirrel is private against semi-honest adversaries, and no sensitive...
This paper introduces Flamingo, a system for secure aggregation of data across a large set of clients. In secure aggregation, a server sums up the private inputs of clients and obtains the result without learning anything about the individual inputs beyond what is implied by the final sum. Flamingo focuses on the multi-round setting found in federated learning in which many consecutive summations (averages) of model weights are performed to derive a good model. Previous protocols, such as...
Measuring people’s interactions that span multiple websites can provide unique insight that enables better products and improves people’s experiences, but directly observing people’s individual journeys creates privacy risks that conflict with the newly emerging privacy model for the web. We propose a protocol that uses the combination of multi-party computation and differential privacy that enables the processing of peoples’ data such that only aggregate measurements are revealed, strictly...
Joint computation on encrypted data is becoming increasingly crucial with the rise of cloud computing. In recent years, the development of multi-client functional encryption (MCFE) has made it possible to perform joint computation on private inputs, without any interaction. Well-settled solutions for linear functions have become efficient and secure, but there is still a shortcoming: if one user inputs incorrect data, the output of the function might become meaningless for all other users...
Private Stream Aggregation (PSA) schemes are efficient protocols for distributed data analytics. In a PSA scheme, a set of data producers can encrypt data for a central party so that it learns the sum of all encrypted values, but nothing about each individual value. Thus, a trusted aggregator is avoided. However, all known PSA schemes still require a trusted party for key generation. In this paper we propose the first PSA scheme that does not rely on a trusted party. We argue its security...
We introduce zkTree, a general framework for constructing a tree by recursively verifying children's zero-knowledge proofs (ZKPs) in a parent ZKP node, while enabling the retrieval of membership proofs for user-supplied zk proofs. We also outline a construction pipeline that allows zkTree to be built and verified on-chain with constant gas cost and low data processing pipeline overhead. By aggregating a large number of user proofs into a single root proof, zkTree makes ZKP on-chain...
Aggregate statistics derived from time-series data collected by individual users are extremely beneficial in diverse fields, such as e-health applications, IoT-based smart metering networks, and federated learning systems. Since user data are privacy-sensitive in many cases, the untrusted aggregator may only infer the aggregation without breaching individual privacy. To this aim, secure aggregation techniques have been extensively researched over the past years. However, most existing...
This paper proposes Prism, Private Verifiable Set Computation over Multi-Owner Outsourced Databases, a secret sharing based approach to compute private set operations (i.e., intersection and union), as well as aggregates over outsourced databases belonging to multiple owners. Prism enables data owners to pre-load the data onto non-colluding servers and exploits the additive and multiplicative properties of secret-shares to compute the above-listed operations in (at most) two rounds of...
The modern Internet is built on systems that incentivize collection of information about users. In order to minimize privacy loss, it is desirable to prevent these systems from collecting more information than is required for the application. The promise of multi-party computation is that data can be aggregated without revealing individual measurements to the data collector. This work offers a provable security treatment for "Verifiable Distributed Aggregation Functions (VDAFs)", a class of...
Privacy-preserving transaction systems on blockchain networks like Monero or Zcash provide complete transaction anonymity through cryptographic commitments or encryption. While this secures privacy, it inhibits the collection of statistical data, which current financial markets heavily rely on for economic and sociological research conducted by central banks, statistics bureaus, and research companies. Differential privacy techniques have been proposed to preserve individuals' privacy while...
Private heavy-hitters is a data-collection task where multiple clients possess private bit strings, and data-collection servers aim to identify the most popular strings without learning anything about the clients' inputs. In this work, we introduce PLASMA: a private analytics framework in the three-server setting that protects the privacy of honest clients and the correctness of the protocol against a coalition of malicious clients and a malicious server. Our core primitives are a...
Blockchain is a newly emerging technology, however, it has proven effective in many applications because it provides multiple advantages, mainly as it represents a trust system in which data is encrypted in a way that cannot be tampered with or forged. Because it contains many details such as smart contracts, consensus, authentication, etc. the blockchain is a fertile ground for researchers where they can continually improve previous versions of these concepts. This paper introduces a new...
We introduce a new notion of {\em multiverse threshold signatures} (MTS). In an MTS scheme, multiple universes -- each defined by a set of (possibly overlapping) signers, their weights, and a specific security threshold -- can co-exist. A universe can be (adaptively) created via a non-interactive asynchronous setup. Crucially, each party in the multiverse holds constant-sized keys and releases compact signatures with size and computation time both independent of the number of universes....
Secure cryptographic storage is one of the most important issues that both businesses and end-users take into account before moving their data to either centralized clouds or blockchain-based decen- tralized storage marketplace. Recent work [4 ] formalizes the notion of Proof of Storage-Time (PoSt) which enables storage servers to demonstrate non-interactive continuous availability of outsourced data in a publicly verifiable way. The work also proposes a stateful compact PoSt...
We focus on the problem of efficiently deploying a federated learning training task in a decentralized setting with multiple aggregators. To that end, we introduce a number of improvements and modifications to the recently proposed IPLS protocol. In particular, we relax its assumption for direct communication across participants, using instead indirect communication over a decentralized storage system, effectively turning it into a partially asynchronous protocol. Moreover, we secure it...
Attribute-based encryption (ABE) generalizes public-key encryption and enables fine-grained control to encrypted data. However, ABE upends the traditional trust model of public-key encryption by requiring a single trusted authority to issue decryption keys. If an adversary compromises the central authority and exfiltrates its secret key, then the adversary can decrypt every ciphertext in the system. This work introduces registered ABE, a primitive that allows users to generate secret keys...
PlonK is a prominent universal and updatable zk-SNARK for general circuit satisfiability. We present aPlonK, a variant of PlonK that reduces the proof size and verification time when multiple statements are proven in a batch. Both the aggregated proof size and the verification complexity of aPlonK are logarithmic in the number of aggregated statements. Our main building block, inspired by the techniques developed in SnarkPack (Gailly, Maller, Nitulescu, FC 2022), is a multi-polynomial...
Modern Singular Value Decomposition (SVD) computation dates back to the 1960s when the basis for the eigensystem package and linear algebra package routines was created. Since then, SVD has gained attraction and been widely applied in various scenarios, such as recommendation systems and principal component analyses. Federated SVD has recently emerged, where different parties could collaboratively compute SVD without exchanging raw data. Besides its inherited privacy protection, noise...
Due to the significant drop in prices for genome sequencing in the last decade, genome databases were constantly growing. This enabled genome analyses such as Genome-Wide Association Studies (GWAS) that study associations between a gene and a disease and allow to improve medical treatment. However, GWAS fails at the analysis of complex diseases caused by non-linear gene-gene interactions such as sporadic breast cancer or type 2 diabetes. Epistasis Analysis (EA) is a more powerful approach...
This paper introduces Ibex, an advertising system that reduces the amount of data that is collected on users while still allowing advertisers to bid on real-time ad auctions and measure the effectiveness of their ad campaigns. Specifically, Ibex addresses an issue in recent proposals such as Google’s Privacy Sandbox Topics API in which browsers send information about topics that are of interest to a user to advertisers and demand-side platforms (DSPs). DSPs use this information to (1)...
We consider the computation of sparse, $(\varepsilon, \delta)$-differentially private~(DP) histograms in the two-server model of secure multi-party computation~(MPC), which has recently gained traction in the context of privacy-preserving measurements of aggregate user data. We introduce protocols that enable two semi-honest non-colluding servers to compute histograms over the data held by multiple users, while only learning a private view of the data. Our solution achieves the same...
A union-only signature (UOS) scheme (informally introduced by Johnson et al. at CT-RSA 2002) allows signers to sign sets of messages in such a way that (1) any third party can merge two signatures to derive a signature on the union of the message sets, and (2) no adversary, given a signature on some set, can derive a valid signature on any strict subset of that set (unless it has seen such a signature already). Johnson et al. originally posed building a UOS as an open problem. In this...
The massive scale and performance demands of privacy-preserving data aggregation make integration of security and privacy difficult. Traditional tools in private computing are not well-suited to handle these challenges, especially for more limited client devices. Efficient primitives and protocols for secure and private data aggregation are a promising approach for private data analytics with resource-constrained devices. However, even such efficient primitives may be much slower than...
We consider a federated representation learning framework, where with the assistance of a central server, a group of $N$ distributed clients train collaboratively over their private data, for the representations (or embeddings) of a set of entities (e.g., users in a social network). Under this framework, for the key step of aggregating local embeddings trained at the clients in a private manner, we develop a secure embedding aggregation protocol named SecEA, which provides...
In order to analyze real-time power data without revealing user's privacy, privacy-preserving data aggregation has been extensively researched in smart grid. However, most of the existing schemes either have too much computation overhead and cannot achieve dynamic users, or require a trusted center. In this paper, we propose an efficient and robust multidimensional data aggregation scheme based on blockchain. In our scheme, a leader election algorithm in Raft protocol is used to select a...
Distributed secret sharing techniques, where a specific secret is encoded into its shares which are conveyed to the IoT device or its user via storage nodes, are considered. A verifiably distributed secret sharing (VDSS) provides a way for a legitimate user to verify the secret he reconstructs through the downloaded shares while the secrecy condition is satisfied in a weak or a perfect sense. This article examines the impact of minimizing verification information in a VDSS on the...
The shuffle model has been extensively investigated in the distributed differential privacy (DP) literature. For a class of useful computational tasks, the shuffle model allows us to achieve privacy-utility tradeoff similar to those in the central model, while shifting the trust from a central data curator to a ``trusted shuffle'' which can be implemented through either trusted hardware or cryptography. Very recently, several works explored cryptographic instantiations of a new type of...
We introduce Precio, a new secure aggregation method for computing layered histograms and sums over secret shared data in a client-server setting. Precio is motivated by ad conversion measurement scenarios, where online advertisers and ad networks want to measure the performance of ad campaigns without requiring privacy-invasive techniques, such as third-party cookies. Precio has linear (time and communication) complexity in the number of data points and guarantees differentially private...
Homomorphic Encryption (HE) is a useful primitive for secure computation, but it is not generally applicable when multiple parties are involved, as the authority is solely concentrated in a single party, the secret key owner. To solve this issue, several variants of HE have emerged in the context of multiparty setting, resulting in two major lines of work -- Multi-Party HE (MPHE) and Multi-Key HE (MKHE). In short, MPHEs tend to be more efficient, but all parties should be specified at the...
In crowd-sourced data aggregation, participants share their data points with curators. However, the lack of privacy guarantees may discourage participation, which motivates the need for privacy-preserving aggregation protocols. Unfortunately, existing solutions do not support public auditing without revealing the participants' data. In real-world applications, there is a need for public verifiability (i.e., verifying the protocol correctness) while preserving the privacy of the participants'...
Secure computing methods such as fully homomorphic encryption and hardware solutions such as Intel Software Guard Extension (SGX) have been applied to provide security for user input in privacy-oriented computation outsourcing. Fully homomorphic encryption is amenable to parallelization and hardware acceleration to improve its scalability and latency, but is limited in the complexity of functions it can efficiently evaluate. SGX is capable of arbitrarily complex calculations, but due to...
We propose OmniLytics, a blockchain-based secure data trading marketplace for machine learning applications. Utilizing OmniLytics, many distributed data owners can contribute their private data to collectively train an ML model requested by some model owners, and receive compensation for data contribution. OmniLytics enables such model training while simultaneously providing 1) model security against curious data owners; 2) data security against the curious model and data owners; 3)...
In this paper, we present three design flaws in the 802.11 standard that underpins Wi-Fi. One design flaw is in the frame aggregation functionality, and another two are in the frame fragmentation functionality. These design flaws enable an adversary to forge encrypted frames in various ways, which in turn enables exfiltration of sensitive data. We also discovered common implementation flaws related to aggregation and fragmentation, which further worsen the impact of our attacks. Our results...
We propose a novel primitive called NIVA that allows the distributed aggregation of multiple users' secret inputs by multiple untrusted servers. The returned aggregation result can be publicly verified in a non-interactive way, i.e. the users are not required to participate in the aggregation except for providing their secret inputs. NIVA allows the secure computation of the sum of a large amount of users' data and can be employed, for example, in the federated learning setting in order to...
Several emerging proof-of-work (PoW) blockchain protocols rely on a “parallel-chain” architecture for scaling, where instead of a single chain, multiple chains are run in parallel and aggregated. A key requirement of practical PoW blockchains is to adapt to mining power variations over time (Bitcoin’s total mining power has increased by a $10^{14}$ factor over the decade). In this paper, we consider the design of provably secure parallel-chain protocols which can adapt to such mining power...
This paper introduces Prio , a privacy-preserving system for the collection of aggregate statistics, with the same model and goals in mind as the original and highly influential Prio paper by Henry Corrigan-Gibbs and Dan Boneh (USENIX 2017). As in the original Prio, each client holds a private data value (e.g. number of visits to a particular website) and a small set of servers privately compute statistical functions over the set of client values (e.g. the ...
In modern times, data collected from multi-user distributed applications must be analyzed on a massive scale to support critical business objectives. While analytics often requires the use of personal data, it may compromise user privacy expectations if this analysis is conducted over plaintext data. Private Stream Aggregation (PSA) allows for the aggregation of time-series data, while still providing strong privacy guarantees, and is significantly more efficient over a network than related...
Histograms have a large variety of useful applications in data analysis, e.g., tracking the spread of diseases and analyzing public health issues. However, most data analysis techniques used in practice operate over plaintext data, putting the privacy of users’ data at risk. We consider the problem of allowing an untrusted aggregator to privately compute a histogram over multiple users’ private inputs (e.g., number of contacts at a place) without learning anything other than the final...
Federated learning (FL) is an emerging distributed machine learning paradigm which addresses critical data privacy issues in machine learning by enabling clients, using an aggregation server (aggregator), to jointly train a global model without revealing their training data. Thereby, it improves not only privacy but is also efficient as it uses the computation power and data of potentially millions of clients for training in parallel. However, FL is vulnerable to so-called inference attacks...
Many video classification applications require access to personal data, thereby posing an invasive security risk to the users' privacy. We propose a privacy-preserving implementation of single-frame method based video classification with convolutional neural networks that allows a party to infer a label from a video without necessitating the video owner to disclose their video to other entities in an unencrypted manner. Similarly, our approach removes the requirement of the classifier owner...
The concept of private stream aggregation (PSA) has been proposed by Shi et al. (NDSS 2011) to allow for data analysis in a privacy-preserving manner. In this work, we introduce the notion of labeled secret sharing (LaSS) schemes and show how to use it to construct PSA schemes. We also show how to realize LaSS using pseudorandom functions or alternatively with a hash function modeled as a random oracle and how it can be used to construct PSA schemes. Additionally, we revisit the security...
Nowadays, genomic sequencing has become much more affordable for many people and, thus, many people own their genomic data in a digital format. Having paid for genomic sequencing, they want to make use of their data for different tasks that are possible only using genomics, and they share their data with third parties to achieve these tasks, e.g., to find their relatives in a genomic database. As a consequence, more genomic data get collected worldwide. The upside of the data collection is...
Federated Learning (FL) is a collaborative machine learning approach allowing participants to jointly train a model without having to share their private, potentially sensitive local datasets with others. Despite its benefits, FL is vulnerable to so-called backdoor attacks, in which an adversary injects manipulated model updates into the federated model aggregation process so that the resulting model will provide targeted false predictions for specific adversary-chosen inputs. Proposed...
Private Stream Aggregation (PSA) protocols allow for the secure aggregation of time-series data, affording security and privacy to users' private data, with significantly better efficiency than general secure computation such as homomorphic encryption, multiparty computation, and secure hardware based approaches. Earlier PSA protocols face limitations including needless complexity, a lack of post-quantum security, or other practical issues. In this work, we present SLAP, a Simple...
In this work we describe a token-based solution to Contact Tracing via Distributed Point Functions (DPF) and, more generally, Function Secret Sharing (FSS). The key idea behind the solution is that FSS natively supports secure keyword search on raw sets of keywords without a need for processing the keyword sets via a data structure for set membership. Furthermore, the FSS functionality enables adding up numerical payloads associated with multiple matches without additional interaction. These...
Private stream aggregation (PSA) allows an untrusted data aggregator to compute statistics over a set of multiple participants' data while ensuring the data remains private. Existing works rely on a trusted third party to enable an aggregator to achieve fault tolerance, that requires interactive recovery, but in the real world this may not be practical or secure. We develop a new formal framework for PSA that accounts for user faults, and can support non-interactive recovery, while still...
Aggregate signcryption combines the functionalities of aggregate signature and encryption. Very recently, Zia & Ali [1] (Wireless Personal Communications, https://doi.org/10.1007/s11277-020-07637-z) proposed an elliptic curve cryptography (ECC) based multi-recipient aggregate signcryption scheme. The authors claimed that their scheme is correct, efficient, and secure against known attacks. However, by this comment, we show that their scheme is incorrect and the receiver(s) is unable to...
Polynomial commitment schemes (PCS) have recently been in the spotlight for their key role in building SNARKs. A PCS provides the ability to commit to a polynomial over a finite field and prove its evaluation at points. A succinct PCS has commitment and evaluation proof size sublinear in the degree of the polynomial. An efficient PCS has sublinear proof verification. Any efficient and succinct PCS can be used to construct a SNARK with similar security and efficiency characteristics (in the...
Recently introduced federated learning is an attractive framework for the distributed training of deep learning models with thousands of participants. However, it can potentially be used with malicious intent. For example, adversaries can use their smartphones to jointly train a classifier for extracting secret keys from the smartphones' SIM cards without sharing their side-channel measurements with each other. With federated learning, each participant might be able to create a strong model...
As a popular paradigm, crowd-sensing network emerges to achieve sensory data collection and task allocation to mobile users. On one hand these sensory data could be private and sensitive, and on the other hand, data transmission separately could incur heavy communication overhead. Fortunately, the technique of homomorphic encryption (HE) allows the addictive and/or multiplicative operations over the encrypted data as well as privacy protection. Therefore, several data aggregation schemes...
Paillier's scheme is a homomorphic public key encryption scheme which is widely used in practical. For instance, Paillier's scheme can be used in the data aggregation in smart grid. Damg$\mathring{a}$rd and Jurik generalized Paillier's scheme to reduce the ciphertext expansion factor. However, the decryption scheme of Damg$\mathring{a}$rd and Jurik's scheme is more complicated than Paillier's original scheme. In this paper, we propose a new generalization of Paillier's scheme and all the...
An aggregatable subvector commitment (aSVC) scheme is a vector commitment (VC) scheme that can aggregate multiple proofs into a single, small subvector proof. In this paper, we formalize aSVCs and give a construction from constant-sized polynomial commitments. Our construction is unique in that it has linear-sized public parameters, it can compute all constant-sized proofs in quasilinear time, it updates proofs in constant time and it can aggregate multiple proofs into a constant-sized...
Recent work has shown that cell phone mobility data has the unique potential to create accurate models for human mobility and consequently the spread of infected diseases. While prior studies have exclusively relied on a mobile network operator's subscribers' aggregated data in modelling disease dynamics, it may be preferable to contemplate aggregated mobility data of infected individuals only. Clearly, naively linking mobile phone data with health records would violate privacy by either...
We design a consecution of protocols which allows organizations to have secure strong access control of their users to their desktop machines based on biometry. It provides both strong secure authentication and privacy. Moreover, our mechanism allows the system admins to grant a various level of access to their end-users by fine tuning access control policy. Our system implements privacy-by-design. It separates biometric data from identity information. It is practical: we fully implemented...
We introduce Dynamic Decentralized Functional Encryption (DDFE), a generalization of Functional Encryption which allows multiple users to join the system dynamically, without relying on a trusted third party or on expensive and interactive Multi-Party Computation protocols. This notion subsumes existing multi-user extensions of Functional Encryption, such as Multi-Input, Multi-Client, and Ad Hoc Multi-Input Functional Encryption. We define and construct schemes for various functionalities...
Federated learning is gaining significant interests as it enables model training over a large volume of data that is distributedly stored over many users, while protecting the privacy of the individual users. However, a major bottleneck in scaling federated learning to a large number of users is the overhead of secure model aggregation across many users. In fact, the overhead of state-of-the-art protocols for secure model aggregation grows quadratically with the number of users. We propose a...
We consider a scenario where multiple organizations holding large amounts of sensitive data from their users wish to compute aggregate statistics on this data while protecting the privacy of individual users. To support large-scale analytics we investigate how this privacy can be provided for the case of sketching algorithms running in time sub-linear of the input size. We begin with the well-known LogLog sketch for computing the number of unique elements in a data stream. We show that this...
Federated Learning (FL) enables a large number of users to jointly learn a shared machine learning (ML) model, coordinated by a centralized server, where the data is distributed across multiple devices. This approach enables the server or users to train and learn an ML model using gradient descent, while keeping all the training data on users' devices. We consider training an ML model over a mobile network where user dropout is a common phenomenon. Although federated learning was aimed at...
We present a new 4-move special honest-verifier zero-knowledge proof of knowledge system for proving that a vector of Pedersen commitments opens to a so-called "one-hot" vector (i.e., to a vector from the standard orthonormal basis) from $\mathbb{Z}_p^n$. The need for such proofs arises in the contexts of symmetric private information retrieval (SPIR), end-to-end verifiable voting (E2E), and privacy-preserving data aggregation and analytics, among others. The key insight underlying the new...
Secure multi-party computation (MPC) allows multiple parties to jointly compute the output of a function while preserving the privacy of any individual party's inputs to that function. As MPC protocols transition from research prototypes to real-world applications, the usability of MPC-enabled applications is increasingly critical to their successful deployment and wide adoption. Our Web-MPC platform, designed with a focus on usability, has been deployed for privacy-preserving data...
Consider sources that supply sensitive data to an aggregator. Standard encryption only hides the data from eavesdroppers, but using specialized encryption one can hope to hide the data (to the extent possible) from the aggregator itself. For flexibility and security, we envision schemes that allow sources to supply encrypted data, such that at any point a dynamically chosen subset of sources can allow an agreed-upon joint function of their data to be computed by the aggregator. A primitive...
The sharing of biomedical data is crucial to enable scientific discoveries across institutions and improve health care. For example, genome-wide association studies (GWAS) based on a large number of samples can identify disease-causing genetic variants. The privacy concern, however, has become a major hurdle for data management and utilization. Homomorphic encryption is one of the most powerful cryptographic primitives which can address the privacy and security issues. It supports the...