Pollux is implemented and publicly available as part of an open-source project at https://github.com/petuum/adaptdl. Professor Veloso is on leave from Carnegie Mellon University, where she is the Herbert A. Simon University Professor in the School of Computer Science and the past Head of the Machine Learning Department. Pages should be numbered, and figures and tables should be legible in black and white, without requiring magnification. The key to our solution, Horcrux, is to account for the non-determinism intrinsic to web page loads and the constraints placed by the browser's API for parallelism. We present Storm, a web framework that allows developers to build MVC applications with compile-time enforcement of centrally specified data-dependent security policies. Papers not meeting these criteria will be rejected without review, and no deadline extensions will be granted for reformatting. Prepublication versions of the accepted papers from the summer submission deadline are available below. This motivates the need for a new approach to data privacy that can provide strong assurance and control to users. Paper abstracts and proceedings front matter are available to everyone now. Concurrency control algorithms are key determinants of the performance of in-memory databases. Swapnil Gandhi and Anand Padmanabha Iyer, Microsoft Research. If you are uncertain about how to anonymize your submission, please contact the program co-chairs, osdi21chairs@usenix.org, well in advance of the submission deadline. The main contribution of this paper is GoJournal, a verified, concurrent journaling system that provides atomicity for storage applications, together with Perennial 2.0, a framework for formally specifying and verifying concurrent crash-safe systems. This paper presents the design and implementation of CLP, a tool capable of losslessly compressing unstructured text logs while enabling fast searches directly on the compressed data.
For example, talks may be shorter than in prior years, or some parts of the conference may be multi-tracked. Furthermore, such performance can be achieved without any modification in applications, network hardware, kernel CPU schedulers and/or kernel network stack. My paper has been accepted to appear at EuroSys 2020; I will give a talk at HotStorage '19; the paper about GCMA was accepted to TC. Second, it innovates on the underlying cryptographic machinery and constructs a new private information retrieval scheme, FastPIR, that reduces the time to process oblivious access requests for mailboxes. Mothy joined the Computer Science Department at ETH Zurich in January 2007 and was named Fellow of the ACM in 2013 for contributions to operating systems and networking research. At a high level, Addra follows a template in which callers and callees deposit and retrieve messages from private mailboxes hosted at an untrusted server. Existing systems that hide voice call metadata either require trusted intermediaries in the network or scale to only tens of users. Registering abstracts a week before paper submission is an essential part of the paper-reviewing process, as PC members use this time to identify which papers they are qualified to review. Submitted November 12, 2021; accepted January 20, 2022. Acknowledgements: Paper prepared for the post-conference workshop on Food for Thought: Economic Analysis in Anticipation of the Next Farm Bill at the Agricultural and Applied Economics Association annual meeting, Austin, TX. Tej Chajed, MIT CSAIL; Joseph Tassarotti, Boston College; Mark Theng, MIT CSAIL; Ralf Jung, MPI-SWS; M. Frans Kaashoek and Nickolai Zeldovich, MIT CSAIL. Mothy's current research centers on Enzian, a powerful hybrid CPU/FPGA machine designed for research into systems software.
AI enables principled representation of knowledge, complex strategy optimization, learning from data, and support for human decision making. In the Ethereum network, decentralized Ethereum clients reach consensus through transitioning to the same blockchain states according to the Ethereum specification. Jaehyun Hwang and Midhul Vuppalapati, Cornell University; Simon Peter, UT Austin; Rachit Agarwal, Cornell University. Using this property, MAGE calculates the memory access pattern ahead of time and uses it to produce a memory management plan. With an aim to improve time-to-accuracy performance in model training, Oort prioritizes the use of those clients who have both data that offers the greatest utility in improving model accuracy and the capability to run training quickly. Here, we focus on hugepage coverage. In some cases, the quality of these artifacts is as important as that of the document itself. One classical approach is to increase the efficiency of an allocator to minimize the cycles spent in the allocator code. For example, traditional compute resources are replenishable while privacy is not: a CPU can be regained after a model finishes execution while privacy budget cannot. blk-switch uses this insight to adapt techniques from the computer networking literature (e.g., multiple egress queues, prioritized processing of individual requests, load balancing, and switch scheduling) to the Linux kernel storage stack. This distinction forces a re-design of the scheduler. Session Chairs: Deniz Altınbüken, Google, and Rashmi Vinayak, Carnegie Mellon University. Tanvir Ahmed Khan and Ian Neal, University of Michigan; Gilles Pokam, Intel Corporation; Barzan Mozafari and Baris Kasikci, University of Michigan. The conference papers and full proceedings are available to registered attendees now and will be available to everyone beginning Wednesday, July 14, 2021. See the Preview Session page for an overview of the topics covered in the program.
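The client-selection idea attributed to Oort above (prefer clients whose data is most useful and who can also train quickly) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Client` fields, the scoring formula, and the `preferred_seconds` parameter are hypothetical and are not taken from the Oort implementation.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Client:
    client_id: str
    data_utility: float   # hypothetical estimate of how much this client's data helps accuracy
    round_seconds: float  # hypothetical estimate of how long one training round takes


def client_score(client: Client, preferred_seconds: float) -> float:
    """Illustrative score: data utility, discounted for clients slower than the preferred round time."""
    if client.round_seconds <= preferred_seconds:
        return client.data_utility
    return client.data_utility * (preferred_seconds / client.round_seconds)


def select_clients(clients, k: int, preferred_seconds: float):
    """Return the k highest-scoring clients, best first."""
    return heapq.nlargest(k, clients, key=lambda c: client_score(c, preferred_seconds))


clients = [
    Client("a", data_utility=10.0, round_seconds=30.0),
    Client("b", data_utility=12.0, round_seconds=120.0),  # useful data, but slow
    Client("c", data_utility=7.0, round_seconds=20.0),
]
chosen = select_clients(clients, k=2, preferred_seconds=60.0)
print([c.client_id for c in chosen])
```

In this toy setup, client "b" has the most useful data but is discounted for being twice as slow as the preferred round time, so a faster client with less useful data is chosen ahead of it. Random selection, which the text says existing efforts use, would ignore both signals.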
This budget is a scarce resource that must be carefully managed to maximize the number of successfully trained models. DeSearch then introduces a witness mechanism to make sure the completed tasks can be reused across different pipelines, and to make the final search results verifiable by end users. We identify that current systems for learning the embeddings of large-scale graphs are bottlenecked by data movement, which results in poor resource utilization and inefficient training. To this end, we propose GNNAdvisor, an adaptive and efficient runtime system to accelerate various GNN workloads on GPU platforms. You must not improperly identify a PC member as a conflict if none of these three circumstances applies, even if for some other reason you want to avoid them reviewing your paper. Based on this observation, P3 proposes a new approach for distributed GNN training. This kernel is scaled across NUMA nodes using node replication, a scheme inspired by state machine replication in distributed systems. In this paper, we present P3, a system that focuses on scaling GNN model training to large real-world graphs in a distributed setting. These are hard deadlines, and no extensions will be given. We also propose two file system techniques for ZNS+-aware LFS. As a result, the design of a file system with respect to space management and crash consistency is simplified, requiring only 10.8K LOC for full functionality. To adapt to different workloads, prior works mix or switch between a few known algorithms using manual insights or simple heuristics. Prior or concurrent publication in non-peer-reviewed contexts, like arXiv.org, technical reports, talks, and social media posts, is permitted. The papers will be available online to everyone beginning on the first day of the conference, July 14, 2021.
Storm ensures security using a Security Typed ORM that refines the (type) abstractions of each layer of the MVC API with logical assertions that describe the data produced and consumed by the underlying operation and the users allowed access to that data. Qing Wang, Youyou Lu, Junru Li, and Jiwu Shu, Tsinghua University. People often assume that blockchain has Byzantine robustness, so adding it to any system will make that system super robust against any calamity. We observe that, due to their intended security guarantees, SC schemes are inherently oblivious: their memory access patterns are independent of the input data. To remedy this, we introduce DeSearch, the first decentralized search engine that guarantees the integrity and privacy of search results for decentralized services and blockchain apps. Across a wide range of pages, phones, and mobile networks covering web workloads in both developed and emerging regions, Horcrux reduces median browser computation delays by 31-44% and page load times by 18-37%. Penglai also reduces the latency of secure memory initialization by three orders of magnitude and achieves a 3.6× speedup for real-world applications (e.g., MapReduce). High-performance tensor programs are critical for efficiently deploying deep neural network (DNN) models in real-world tasks. Writing a correct operating system kernel is notoriously hard. SC is being increasingly adopted by industry for a variety of applications. Session Chairs: Moshe Gabel, University of Toronto, and Joseph Gonzalez, University of California, Berkeley. John Thorpe, Yifan Qiao, Jonathan Eyolfson, and Shen Teng, UCLA; Guanzhou Hu, UCLA and University of Wisconsin, Madison; Zhihao Jia, CMU; Jinliang Wei, Google Brain; Keval Vora, Simon Fraser; Ravi Netravali, Princeton University; Miryung Kim and Guoqing Harry Xu, UCLA. Uniquely, Dorylus can take advantage of serverless computing to increase scalability at a low cost.
She has also made contributions in network security, including scalable data expiration, distributed algorithms despite malicious participants, and DDoS prevention techniques. This paper presents Dorylus: a distributed system for training GNNs. Computation separation makes it possible to construct a deep, bounded-asynchronous pipeline where graph and tensor parallel tasks can fully overlap, effectively hiding the network latency incurred by Lambdas. With her students, she has led research in AI, with a focus on robotics and machine learning, having concretely researched and developed a variety of autonomous robots, including teams of soccer robots and mobile service robots. Hence, kernel developers are constantly refining synchronization within OS kernels to improve scalability at the risk of introducing subtle bugs. When uploading your OSDI 2021 reviews for your submission to SOSP, you can optionally append a note about how you addressed the reviews and comments. Kyuhwa Han, Sungkyunkwan University and Samsung Electronics; Hyunho Gwak and Dongkun Shin, Sungkyunkwan University; Jooyoung Hwang, Samsung Electronics. GoJournal is implemented in Go, and Perennial is implemented in the Coq proof assistant. Simultaneous submission of the same work to multiple venues, submission of previously published work, or plagiarism constitutes dishonesty or fraud.
Erhu Feng, Xu Lu, Dong Du, Bicheng Yang, and Xueqiang Jiang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Yubin Xia, Binyu Zang, and Haibo Chen, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China. Haojie Wang, Jidong Zhai, Mingyu Gao, Zixuan Ma, Shizhi Tang, and Liyan Zheng, Tsinghua University; Yuanzhi Li, Carnegie Mellon University; Kaiyuan Rong and Yuanyong Chen, Tsinghua University; Zhihao Jia, Carnegie Mellon University and Facebook. Important Dates: abstract registrations due Thursday, December 3, 2020, 3:00 pm PST; complete paper submissions due Thursday, December 10, 2020, 3:00 pm PST. Editor in charge: Daniel Petrolia. Yet, existing efforts randomly select FL participants, which leads to poor model and system efficiency. Last year, 70% of accepted OSDI papers participated in the artifact evaluation. Papers accompanied by nondisclosure agreement forms will not be considered. KEVIN combines a fast, lightweight, and POSIX-compliant file system with a key-value storage device that performs in-storage indexing. (Oct 2018) Awarded an Intel Faculty Grant for research on automated performance optimization. (Sep. 2018) Our paper on Foreshadow is accepted to appear at USENIX Security. NrOS is primarily constructed as a simple, sequential kernel with no concurrency, making it easier to develop and reason about its correctness. Session Chairs: Sebastian Angel, University of Pennsylvania, and Malte Schwarzkopf, Brown University. Ishtiyaque Ahmad, Yuntian Yang, Divyakant Agrawal, Amr El Abbadi, and Trinabh Gupta, University of California, Santa Barbara.
Under different configurations of TPC-C and TPC-E, Polyjuice can achieve throughput numbers higher than the best of existing algorithms by 15% to 56%. Submission of a response is optional. OSDI will provide an opportunity for authors to respond to reviews prior to final consideration of the papers at the program committee meeting. This paper presents Zeph, a system that enables users to set privacy preferences on how their data can be shared and processed. A graph embedding is a fixed length vector representation for each node (and/or edge-type) in a graph and has emerged as the de facto approach to apply modern machine learning on graphs. The co-chairs may then share that paper with the workshop's organizers and discuss it with them. OSDI '20: 14th USENIX Conference on Operating Systems Design and Implementation, November 4-6, 2020. ISBN: 978-1-939133-19-9. Published: 04 November 2020. Sponsors: ORACLE, VMware, Google Inc., Amazon, Microsoft. Authors must limit their responses to (a) correcting factual errors in the reviews or (b) directly addressing questions posed by reviewers. Submissions violating the detailed formatting and anonymization rules will not be considered for review. His work has included the Barrelfish multikernel research OS, as well as work on distributed stream processors, and using formal specifications to describe the hardware/software interfaces of modern computer systems. The key insight in blk-switch is that Linux's multi-queue storage design, along with multi-queue network and storage hardware, makes the storage stack conceptually similar to a network switch.
Copyright to the individual works is retained by the author[s]. Upon these two primitives, our system can scale to thousands of concurrent enclaves with high resource utilization and eliminate the high-cost initialization of secure memory using fork-style enclave creation without weakening the security guarantees. The experimental results show that Penglai can support thousands of enclave instances running concurrently and scale up to 512GB secure memory with both encryption and integrity protection. Sijie Shen, Rong Chen, Haibo Chen, and Binyu Zang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai Artificial Intelligence Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China. Fortunately, we observe that the backups for high availability in modern distributed OLTP systems can be retrofitted to bridge the analytical queries and transactions in HTAP workloads. However, your OSDI submission must use an anonymized name for your project or system that differs from any used in such contexts. If your paper is accepted and you need an invitation letter to apply for a visa to attend the conference, please contact conference@usenix.org as soon as possible. Pollux simultaneously considers both aspects. Our approach outperforms existing file systems on a block SSD by a wide margin, 6.2× on average, for metadata-intensive benchmarks. With the help of thousands of Lambda threads, Dorylus scales GNN training to billion-edge graphs. We present Nap, a black-box approach that converts concurrent persistent memory (PM) indexes into NUMA-aware counterparts. OSDI '21 accepted 31 papers, of which 26 participated in the artifact evaluation (AE), a significant increase in the participation ratio: 84%, compared to OSDI '20 (70%) and SOSP '19 (61%). For instance, FAST '21 and NSDI '21 have author-notification dates after the OSDI '21 abstract-registration deadline.
We introduce a hybrid cryptographic protocol for privacy-adhering transformations of encrypted data. We present NrOS, a new OS kernel with a safer approach to synchronization that runs many POSIX programs. This paper demonstrates that it is possible to achieve μs-scale latency using the Linux kernel storage stack, even when tens of latency-sensitive applications compete for host resources with throughput-bound applications that perform read/write operations at throughput close to hardware capacity. Novel system designs, thorough empirical work, well-motivated theoretical results, and new application areas are all welcome. We discuss the design and implementation of TEMERAIRE, including strategies for hugepage-aware memory layouts to maximize hugepage coverage and to minimize fragmentation overheads. She has a PhD in computer science from MIT. We propose a learning-based framework that instead explicitly optimizes concurrency control via offline training to maximize performance. Machine learning (ML) models trained on personal data have been shown to leak information about users. Authors must make a good faith effort to anonymize their submissions, and they should not identify themselves or their institutions either explicitly or by implication (e.g., through the references or acknowledgments). Poor data locality hurts an application's performance. To resolve the problem, we propose a new LFS-aware ZNS interface, called ZNS+, and its implementation, where the host can offload data copy operations to the SSD to accelerate segment compaction. We observe that scalability challenges in training GNNs are fundamentally different from those in training classical deep neural networks and distributed graph processing, and that commonly used techniques, such as intelligent partitioning of the graph, do not yield desired results.
We first introduce two new hardware primitives: 1) Guarded Page Table (GPT), which protects page table pages to support page-level secure memory isolation; 2) Mountable Merkle Tree (MMT), which supports scalable integrity protection for secure memory. The 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) will take place as a virtual event on July 14–16, 2021. Jiang Zhang, University of Southern California; Shuai Wang, HKUST; Manuel Rigger, Pinjia He, and Zhendong Su, ETH Zurich. We present the results of a 1% experiment at fleet scale as well as the longitudinal rollout in Google's warehouse-scale computers. We convert five state-of-the-art PM indexes using Nap. We demonstrate that KEVIN reduces the amount of I/O traffic between the host and the device, and remains particularly robust as the system ages and the data become fragmented. A graph neural network (GNN) enables deep learning on structured graph data. We also show that Marius can scale training to datasets an order of magnitude beyond a single machine's GPU and CPU memory capacity, enabling training of configurations with more than a billion edges and 550 GB of total parameters on a single machine with 16 GB of GPU memory and 64 GB of CPU memory. Only two types of supplementary material are permitted: source code described in the paper and formal proofs sketched in the paper. Existing algorithms are designed to work well for certain workloads. As an emerging trend in graph-based deep learning, Graph Neural Networks (GNNs) excel for their capability to generate high-quality node feature vectors (embeddings).
Abstract registrations that do not provide sufficient information to understand the topic and contribution (e.g., empty abstracts, placeholder abstracts, or trivial abstracts) will be rejected, thereby precluding paper submission. (Visa applications can take at least 30 working days to process.) Additionally, there is no assurance that data processing and handling comply with the claimed privacy policies. Hence, CLP enables efficient search and analytics on archived logs, something that was impossible without it. We implement a variant of a log-structured merge tree in the storage device that not only indexes file objects, but also supports transactions and manages physical storage space. Fluffy found two new consensus bugs in the most popular Ethereum client, Geth, which were exploitable on the live Ethereum mainnet. In this paper, we propose a software-hardware co-design to support dynamic, fine-grained, large-scale secure memory as well as fast initialization. She has been recognized with many industry honors including induction into the National Academy of Engineering, the Inventor Hall of Fame, the Internet Hall of Fame, Washington State Academy of Science, and lifetime achievement awards from USENIX and SIGCOMM. To enable FL developers to interpret their results in model testing, Oort enforces their requirements on the distribution of participant data while improving the duration of federated testing by cherry-picking clients. Evaluations show that Vegito can perform 1.9 million TPC-C NewOrder transactions and 24 TPC-H-equivalent queries per second simultaneously, retaining the excellent performance of specialized OLTP and OLAP counterparts (e.g., DrTM+H and MonetDB).
Mingyu Li, Jinhao Zhu, and Tianxu Zhang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Cheng Tan, Northeastern University; Yubin Xia, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Sebastian Angel, University of Pennsylvania; Haibo Chen, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China. USENIX Security '21 has three submission deadlines. The device then "calibrates" its interrupts to completions of latency-sensitive requests. The hybrid segment recycling chooses a proper block reclaiming policy between segment compaction and threaded logging based on their costs. Our evaluation shows that PET outperforms existing systems by up to 2.5×, by unlocking previously missed opportunities from partially equivalent transformations. However, Addra improves message latency in this architecture, which is a key performance metric for voice calls. Conference Dates: Apr 12, 2021 - Apr 14, 2021. If you submit a paper to either of those venues, you may not also submit it to OSDI '21. As a result, data characteristics and device capabilities vary widely across clients. Pollux promotes fairness among DL jobs competing for resources based on a more meaningful measure of useful job progress, and reveals a new opportunity for reducing DL cost in cloud environments. Marius is open-sourced at www.marius-project.org.
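The hybrid segment recycling mentioned above picks between segment compaction and threaded logging by comparing their costs. The sketch below shows only the shape of such a decision. The cost model is an assumption made for illustration (compaction pays per valid block copied, threaded logging pays per reused hole), and neither the function name nor the default weights come from the ZNS+ paper.

```python
def choose_reclaim_policy(valid_blocks: int, invalid_blocks: int,
                          copy_cost: float = 1.0, hole_cost: float = 0.4) -> str:
    """Pick the cheaper reclaiming policy for one victim segment.

    Hypothetical cost model:
      - segment compaction copies every valid block out of the segment;
      - threaded logging writes into the invalid (free) holes in place,
        paying a per-hole overhead instead of copying.
    """
    if valid_blocks < 0 or invalid_blocks < 0:
        raise ValueError("block counts must be non-negative")
    compaction_cost = valid_blocks * copy_cost
    threaded_logging_cost = invalid_blocks * hole_cost
    if compaction_cost <= threaded_logging_cost:
        return "segment_compaction"
    return "threaded_logging"


# A mostly-invalid segment is cheap to compact; a mostly-valid one is not.
print(choose_reclaim_policy(valid_blocks=10, invalid_blocks=502))
print(choose_reclaim_policy(valid_blocks=480, invalid_blocks=32))
```

Under these assumed weights, a segment with few valid blocks is compacted because little data has to be copied, while a segment that is still mostly valid is reused in place.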
Manuela will present examples and discuss the scope of AI in her research in the finance domain. The symposium emphasizes innovative research as well as quantified or insightful experiences in systems design and implementation. OSDI '21 Technical Sessions: all the times listed below are in Pacific Daylight Time (PDT). This approach misses possible optimization opportunities as transformations that only preserve equivalence on subsets of the output tensors are excluded. When further combined with a simple caching strategy, our evaluation shows that P3 is able to outperform existing state-of-the-art distributed GNN frameworks by up to 7×. The NVMe zoned namespace (ZNS) is emerging as a new storage interface, where the logical address space is divided into fixed-sized zones, and each zone must be written sequentially for flash-memory-friendly access. We particularly encourage contributions containing highly original ideas, new approaches, and/or groundbreaking results. Existing frameworks optimize tensor programs by applying fully equivalent transformations, which maintain equivalence on every element of output tensors. Based on the observation that real-world workloads always feature skewed access patterns, Nap introduces a NUMA-aware layer (NAL) on top of existing concurrent PM indexes, and steers accesses to hot items to this layer. By monitoring the status of each job during training, Pollux models how their goodput (a novel metric we introduce that combines system throughput with statistical efficiency) would change by adding or removing resources.
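To make the goodput idea concrete, here is a minimal sketch that treats goodput as the product of system throughput and statistical efficiency, then asks how it changes when a GPU is added. The text only says that goodput "combines" the two quantities, so the product form is an assumption, as are the functional forms and the `JobModel` fields. None of this is Pollux's actual model.

```python
from dataclasses import dataclass


@dataclass
class JobModel:
    """Hypothetical per-job model; every formula here is illustrative only."""
    base_throughput: float   # examples/sec on a single GPU
    scaling_loss: float      # communication overhead added per extra GPU
    efficiency_decay: float  # loss of per-example training progress per extra GPU

    def throughput(self, gpus: int) -> float:
        # Sub-linear scaling: each extra GPU contributes a little less.
        return self.base_throughput * gpus / (1.0 + self.scaling_loss * (gpus - 1))

    def efficiency(self, gpus: int) -> float:
        # More GPUs imply a larger total batch, so each example helps less; value in (0, 1].
        return 1.0 / (1.0 + self.efficiency_decay * (gpus - 1))

    def goodput(self, gpus: int) -> float:
        # Useful progress per second = raw throughput x statistical efficiency.
        return self.throughput(gpus) * self.efficiency(gpus)


def marginal_gain(job: JobModel, gpus: int) -> float:
    """Change in goodput from adding one more GPU to a job that has `gpus`."""
    return job.goodput(gpus + 1) - job.goodput(gpus)


job = JobModel(base_throughput=100.0, scaling_loss=0.1, efficiency_decay=0.05)
for g in (1, 2, 4, 8):
    print(g, round(job.throughput(g), 1), round(job.goodput(g), 1))
```

A scheduler built on this kind of model could compare `marginal_gain` across jobs and give the next free GPU to the job that benefits most. Throughput alone would overstate the benefit for jobs whose statistical efficiency drops as they scale.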