The papers will be available online to everyone beginning on the first day of the conference, July 14, 2021. With the help of thousands of Lambda threads, Dorylus scales GNN training to billion-edge graphs. High-performance tensor programs are critical for efficiently deploying deep neural network (DNN) models in real-world tasks. Lukas Burkhalter, Nicolas Küchler, Alexander Viand, Hossein Shafagh, and Anwar Hithnawi, ETH Zürich. Contact your program co-chairs, osdi21chairs@usenix.org, or the USENIX office, submissionspolicy@usenix.org. Evaluation on a four-node machine with Optane DC Persistent Memory shows that Nap can improve the throughput by up to 2.3× and 1.56× under write-intensive and read-intensive workloads, respectively. We have made Fluffy publicly available at https://github.com/snuspl/fluffy to contribute to the security of Ethereum. We present selective profiling, a technique that locates data locality problems with overhead low enough for production use. MAGE outperforms the OS virtual memory system by up to an order of magnitude, and in many cases, runs SC computations that do not fit in memory at nearly the same speed as if the underlying machines had unbounded physical memory to fit the entire computation. Across a wide range of pages, phones, and mobile networks covering web workloads in both developed and emerging regions, Horcrux reduces median browser computation delays by 31-44% and page load times by 18-37%. Mothy received a PhD in 1995 from the Computer Laboratory of the University of Cambridge, where he was a principal designer and builder of the Nemesis OS. Based on the observation that invariants are often concise in practice, DistAI starts with small invariant formulas and enumerates all strongest possible invariants that hold for all samples. The conference papers and full proceedings are available to registered attendees now and will be available to everyone beginning Wednesday, July 14, 2021.
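The DistAI idea mentioned above (start from small candidate formulas and keep only the strongest ones that hold on every sampled state) can be illustrated with a small propositional sketch. This is not DistAI's implementation: the real system works on first-order formulas and validates candidates with an SMT solver, whereas this toy version uses Boolean state variables and plain disjunctions. The names `holds`, `strongest_invariants`, `held_a`, and `held_b` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def holds(clause, sample):
    """A clause is a frozenset of (variable, polarity) literals read as a disjunction."""
    return any(sample[var] == polarity for var, polarity in clause)


def strongest_invariants(samples, variables, max_size=2):
    """Enumerate disjunctions from small to large and keep those that hold on all samples.

    A clause is skipped when a subset of it was already kept, because the
    smaller clause is logically stronger and makes the larger one redundant.
    """
    literals = [(v, p) for v in variables for p in (True, False)]
    kept = []
    for size in range(1, max_size + 1):
        for combo in combinations(literals, size):
            clause = frozenset(combo)
            # Skip tautologies such as "x or not x".
            if len({var for var, _ in clause}) < size:
                continue
            # Skip clauses implied by a stronger clause that was already kept.
            if any(k <= clause for k in kept):
                continue
            if all(holds(clause, s) for s in samples):
                kept.append(clause)
    return kept


# Hypothetical samples from a two-client lock protocol: never both holders at once.
samples = [
    {"held_a": False, "held_b": False},
    {"held_a": True, "held_b": False},
    {"held_a": False, "held_b": True},
]
candidates = strongest_invariants(samples, ["held_a", "held_b"])
```

Here the only surviving candidate is the mutual-exclusion clause "not held_a or not held_b". Candidates found this way are only consistent with the samples; a real system would still need to check that they are inductive.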
The overhead of GPT is 5% for memory-intensive workloads (e.g., Redis) and negligible for CPU-intensive workloads (e.g., RV8 and Coremarks). Devices employ adaptive interrupt coalescing heuristics that try to balance between these opposing goals. However, existing enclave designs fail to meet the requirements of scalability demanded by new scenarios like serverless computing, mainly due to the limitations in their secure memory protection mechanisms, including static allocation, restricted capacity and high-cost initialization. Advisor: You have a past or present association as thesis advisor or advisee. Second, it innovates on the underlying cryptographic machinery and constructs a new private information retrieval scheme, FastPIR, that reduces the time to process oblivious access requests for mailboxes. Registering abstracts a week before paper submission is an essential part of the paper-reviewing process, as PC members use this time to identify which papers they are qualified to review. This change is receiving considerable attention in the architecture and security communities, for example, but in contrast, so-called OS researchers are mostly in denial. Second, GNNAdvisor implements a novel and highly-efficient 2D workload management tailored for GNN computation to improve GPU utilization and performance under different application settings. As an emerging trend in graph-based deep learning, Graph Neural Networks (GNNs) excel in their ability to generate high-quality node feature vectors (embeddings). Submitted November 12, 2021; accepted January 20, 2022. Dorylus is up to 3.8× faster and 10.7× cheaper compared to existing sampling-based systems. Professor Veloso has been recognized with multiple honors, including being a Fellow of the ACM, IEEE, AAAS, and AAAI. Conference Dates: Apr 12, 2021 - Apr 14, 2021. HotNets provides a venue for discussing innovative ideas and for debating future research agendas in networking.
Prior or concurrent publication in non-peer-reviewed contexts, like arXiv.org, technical reports, talks, and social media posts, is permitted. This year, there were only 2 accepted papers from UK institutes. See the Preview Session page for an overview of the topics covered in the program. Our evaluation shows that PET outperforms existing systems by up to 2.5× by unlocking previously missed opportunities from partially equivalent transformations. All papers will be available online to registered attendees before the conference. A scientific paper consists of a constellation of artifacts that extend beyond the document itself: software, hardware, evaluation data and documentation, raw survey results, mechanized proofs, models, test suites, benchmarks, and so on. In particular, I'll argue for re-engaging with what computer hardware really is today and give two suggestions (among many) about how the OS research community can usefully do this, and exploit what is actually a tremendous opportunity. Authors should email the program co-chairs, osdi21chairs@usenix.org, a copy of the related workshop paper and a short explanation of the new material in the conference paper beyond that published in the workshop version. We present NrOS, a new OS kernel with a safer approach to synchronization that runs many POSIX programs. Authors of each accepted paper must ensure that at least one author registers for the conference, and that their paper is presented in person at the conference. Typically, monolithic kernels share state across cores and rely on one-off synchronization patterns that are specialized for each kernel structure or subsystem. OSDI brings together professionals from academic and industrial backgrounds in what has become a premier forum for discussing the design, implementation, and implications of systems software. Qing Wang, Youyou Lu, Junru Li, and Jiwu Shu, Tsinghua University.
When uploading your OSDI 2021 reviews for your submission to SOSP, you can optionally append a note about how you addressed the reviews and comments. Shaghayegh Mardani, UCLA; Ayush Goel, University of Michigan; Ronny Ko, Harvard University; Harsha V. Madhyastha, University of Michigan; Ravi Netravali, Princeton University. Horcrux's JavaScript scheduler then uses this information to judiciously parallelize JavaScript execution on the client-side so that the end-state is identical to that of a serial execution, while minimizing coordination and offloading overheads. Memory allocation represents significant compute cost at the warehouse scale and its optimization can yield considerable cost savings. Message from the Program Co-Chairs. However, memory allocation decisions also impact overall application performance via data placement, offering opportunities to improve fleetwide productivity by completing more units of application work using fewer hardware resources. DistAI generates data by simulating the distributed protocol at different instance sizes and recording states as samples. Of the 26 submitted artifacts: 26 artifacts received the Artifacts Available badge (100%). This fast path contains programmable hardware support for low latency transport and congestion control as well as hardware support for efficient load balancing of RPCs to cores. She developed the technology for making network routing self-stabilizing, largely self-managing, and scalable. A glance at this year's OSDI program shows that Operating Systems are a small niche topic for this conference, not even meriting their own full session. An evaluation of Addra on a cluster of 80 machines on AWS demonstrates that it can serve 32K users with a 99th percentile message latency of 726 ms, a 7× improvement over a prior system for text messaging in the same threat model.
Author Response Period. The biennial ACM Symposium on Operating Systems Principles is the world's premier forum for researchers, developers, programmers, vendors and teachers of operating system technology. The OSDI Symposium emphasizes innovative research as well as quantified or insightful experiences in systems design and implementation. For example, traditional compute resources are replenishable while privacy is not: a CPU can be regained after a model finishes execution while privacy budget cannot. Under different configurations of TPC-C and TPC-E, Polyjuice can achieve throughput numbers higher than the best of existing algorithms by 15% to 56%. In 2023 I started another two-year term on the . The co-chairs may then share that paper with the workshop's organizers and discuss it with them. Acknowledgements: Paper prepared for the post-conference workshop on Food for Thought: Economic Analysis in Anticipation of the Next Farm Bill at the Agricultural and Applied Economics Association annual meeting, Austin, TX. Mothy's current research centers on Enzian, a powerful hybrid CPU/FPGA machine designed for research into systems software. Existing frameworks optimize tensor programs by applying fully equivalent transformations, which maintain equivalence on every element of output tensors. We also propose two file system techniques for ZNS+-aware LFS. Submissions may include as many additional pages as needed for references but not for appendices. J.P. Morgan AI Research partners with applied data analytics teams across the firm as well as with leading academic institutions globally. USENIX discourages program co-chairs from submitting papers to the conferences they organize, although they are allowed to do so. You must not improperly identify a PC member as a conflict if none of these three circumstances applies, even if for some other reason you want to avoid them reviewing your paper. Submission of a response is optional.
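The contrast drawn above between replenishable compute and a non-replenishable privacy budget can be made concrete with a minimal sketch. It is not taken from any of the systems described here; the class names `CpuPool` and `PrivacyBudget` and their methods are hypothetical, and the budget is modeled as a single epsilon value purely for illustration.

```python
class CpuPool:
    """Replenishable resource: cores go back to the pool when a job finishes."""

    def __init__(self, cores: int) -> None:
        self.free = cores

    def acquire(self, n: int) -> None:
        if n > self.free:
            raise RuntimeError("not enough free cores")
        self.free -= n

    def release(self, n: int) -> None:
        # Releasing restores capacity, so the same cores can serve later jobs.
        self.free += n


class PrivacyBudget:
    """Non-replenishable resource: consumed budget is never returned."""

    def __init__(self, epsilon: float) -> None:
        self.remaining = epsilon

    def consume(self, epsilon: float) -> None:
        if epsilon > self.remaining:
            raise RuntimeError("privacy budget exhausted")
        self.remaining -= epsilon

    # Deliberately no release(): finishing a training job does not restore budget.


cpus = CpuPool(cores=8)
budget = PrivacyBudget(epsilon=1.0)

cpus.acquire(4)       # a training job takes 4 cores...
budget.consume(0.5)   # ...and spends half of the privacy budget
cpus.release(4)       # the cores come back; the budget does not
```

A scheduler built on this distinction can reassign cores freely, but it has to treat each grant of privacy budget as permanent.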
To resolve the problem, we propose a new LFS-aware ZNS interface, called ZNS+, and its implementation, where the host can offload data copy operations to the SSD to accelerate segment compaction. This is unfortunate because good OS design has always been driven by the underlying hardware, and right now that hardware is almost unrecognizable from ten years ago, let alone from the 1960s when Unix was written. Graph Neural Networks (GNNs) have gained significant attention in the recent past, and have become one of the fastest growing subareas in deep learning. 23 artifacts received the Artifacts Functional badge (88%). OSDI'21 accepted 31 papers, and 26 papers participated in the AE, a significant increase in the participation ratio: 84%, compared to OSDI'20 (70%) and SOSP'19 (61%). Our evaluation on the SPEC benchmarks shows that SanRazor can reduce the overhead of sanitizers significantly, from 73.8% to 28.0-62.0% for AddressSanitizer, and from 160.1% to 36.6-124.4% for UndefinedBehaviorSanitizer (depending on the applied reduction scheme). Starting with small invariant formulas and strongest possible invariants avoids large SMT queries, improving SMT solver performance. OSDI '21 Technical Sessions. All the times listed below are in Pacific Daylight Time (PDT).