Parveen Patel

San Francisco Bay Area
5K followers 500 connections

Experience

  • Google

    San Francisco Bay Area

  • -

    San Francisco Bay Area

  • -

    San Francisco Bay Area

  • -

    Mountain View, California

  • -

    Redmond, Washington

  • -

    Redmond, Washington


Publications

  • Ananta: Cloud Scale Load Balancing

    SIGCOMM, Association for Computing Machinery, Inc.

    Layer-4 load balancing is fundamental to creating scale-out web
    services. We designed and implemented Ananta, a scale-out layer-4
    load balancer that runs on commodity hardware and meets the performance,
    reliability and operational requirements of multi-tenant
    cloud computing environments. Ananta combines existing techniques
    in routing and distributed systems in a unique way and splits
    the components of a load balancer into a consensus-based reliable
    control plane and a decentralized scale-out data plane. A key component
    of Ananta is an agent in every host that can take over the
    packet modification function from the load balancer, thereby enabling
    the load balancer to naturally scale with the size of the data
    center. Due to its distributed architecture, Ananta provides direct
    server return (DSR) and network address translation (NAT) capabilities
    across layer-2 boundaries. Multiple instances of Ananta
    have been deployed in the Windows Azure public cloud with combined
    bandwidth capacity exceeding 1Tbps. It is serving traffic
    needs of a diverse set of tenants, including the blob, table and relational
    storage services. With its scale-out data plane we can easily
    achieve more than 100Gbps throughput for a single public IP address.
    In this paper, we describe the requirements of a cloud-scale
    load balancer, the design of Ananta and lessons learnt from its implementation
    and operation in the Windows Azure public cloud.

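The flow-to-backend spreading the abstract describes can be illustrated with a minimal sketch: hash a connection's 5-tuple so every packet of a flow reaches the same backend (DIP) behind a virtual IP (VIP). This is not Ananta's actual code; the hash choice, the `VIP_MAP` table, and all addresses are hypothetical.

```python
import hashlib

# Hypothetical VIP-to-backend table; a real deployment would populate
# this from the load balancer's control plane.
VIP_MAP = {"10.0.0.1": ["192.168.0.10", "192.168.0.11", "192.168.0.12"]}

def pick_dip(vip: str, src_ip: str, src_port: int, dst_port: int,
             proto: str = "tcp") -> str:
    """Hash the 5-tuple so all packets of one flow land on the same DIP."""
    dips = VIP_MAP[vip]
    key = f"{src_ip}:{src_port}->{vip}:{dst_port}/{proto}".encode()
    h = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return dips[h % len(dips)]
```

Because the choice depends only on the flow's 5-tuple, any data-plane instance (or host agent) computes the same answer, which is what lets a data plane like this scale out without shared per-flow state.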
  • Data center TCP (DCTCP)

    SIGCOMM, Association for Computing Machinery, Inc.

    Cloud data centers host diverse applications, mixing workloads that require small predictable latency with others requiring large sustained throughput. In this environment, today's state-of-the-art TCP protocol falls short. We present measurements of a 6000 server production cluster and reveal impairments that lead to high application latencies, rooted in TCP's demands on the limited buffer space available in data center switches. For example, bandwidth hungry "background" flows build up queues at the switches, and thus impact the performance of latency sensitive "foreground" traffic.

    To address these problems, we propose DCTCP, a TCP-like protocol for data center networks. DCTCP leverages Explicit Congestion Notification (ECN) in the network to provide multi-bit feedback to the end hosts. We evaluate DCTCP at 1 and 10Gbps speeds using commodity, shallow buffered switches. We find DCTCP delivers the same or better throughput than TCP, while using 90% less buffer space. Unlike TCP, DCTCP also provides high burst tolerance and low latency for short flows. In handling workloads derived from operational measurements, we found DCTCP enables the applications to handle 10X the current background traffic, without impacting foreground traffic. Further, a 10X increase in foreground traffic does not cause any timeouts, thus largely eliminating incast problems.

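The multi-bit feedback described above drives DCTCP's window rule: the sender keeps a running estimate of the fraction of ECN-marked packets and cuts its window in proportion to that estimate. A minimal sketch, with the gain g and the example numbers purely illustrative:

```python
def update_alpha(alpha: float, marked: int, total: int,
                 g: float = 1 / 16) -> float:
    """EWMA of the fraction F of ECN-marked packets in the last window:
    alpha <- (1 - g) * alpha + g * F."""
    frac_marked = marked / total if total else 0.0
    return (1 - g) * alpha + g * frac_marked

def update_cwnd(cwnd: float, alpha: float) -> float:
    """Cut the window in proportion to the extent of congestion:
    cwnd <- cwnd * (1 - alpha / 2)."""
    return cwnd * (1 - alpha / 2)

# Example: 3 of the last 10 packets carried ECN marks.
alpha = update_alpha(alpha=0.0, marked=3, total=10)
cwnd = update_cwnd(100.0, alpha)
```

With alpha near 0 the window shrinks only slightly; with alpha = 1 the rule degenerates to TCP's halving, which is how DCTCP keeps queues short without sacrificing throughput.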
  • The Nature of Data Center Traffic: Measurements and Analysis

    Internet Measurement Conference, Association for Computing Machinery, Inc.

    We explore the nature of traffic in data centers, designed to support the mining of massive data sets. We instrument the servers to collect socket-level logs, with negligible performance impact. In a 1500 server operational cluster, we thus amass roughly a petabyte of measurements over two months, from which we obtain and report detailed views of traffic and congestion conditions and patterns. We further consider whether traffic matrices in the cluster might be obtained instead via tomographic inference from coarser-grained counter data.

  • Change Is Hard: Adapting Dependency Graph Models For Unified Diagnosis in Wired/Wireless Networks

    Workshop: Research on Enterprise Networking, Association for Computing Machinery, Inc.

    Organizations world-wide are adopting wireless networks at an impressive
    rate, and a new industry has sprung up to provide tools to
    manage these networks. Unfortunately, these tools do not integrate
    cleanly with traditional wired network management tools, leading
    to unsolved problems and frustration among the IT staff. We explore
    the problem of unifying wireless and wired network management
    and show that simple merging of tools and strategies, and/or
    their trivial extension from one domain to another does not work.
    Building on previous research on network service dependency extraction,
    fault diagnosis, and wireless network management, we
    introduce MnM, an end-to-end network management system that
    unifies wired and wireless network management. MnM treats the
    physical location of end devices as a core component of its management
    strategy. It also dynamically adapts to the frequent topology
    changes brought about by end-node mobility. We have a prototype
    deployment in a large organization that shows that MnM’s root-cause
    analysis engine out-performs systems that do not take user
    mobility into account when localizing faults or attributing blame.

  • VL2: A Scalable and Flexible Data Center Network

    SIGCOMM, Association for Computing Machinery, Inc.

    To be agile and cost effective, data centers should allow dynamic resource allocation across large server pools. In particular, the data center network should enable any server to be assigned to any service. To meet these goals, we present VL2, a practical network architecture that scales to support huge data centers with uniform high capacity between servers, performance isolation between services, and Ethernet layer-2 semantics. VL2 uses (1) flat addressing to allow service instances to be placed anywhere in the network, (2) Valiant Load Balancing to spread traffic uniformly across network paths, and (3) end-system based address resolution to scale to large server pools, without introducing complexity to the network control plane. VL2’s design is driven by detailed measurements of traffic and fault data from a large operational cloud service provider. VL2’s implementation leverages proven network technologies, already available at low cost in high-speed hardware implementations, to build a scalable and reliable network architecture. As a result, VL2 networks can be deployed today, and we have built a working prototype. We evaluate the merits of the VL2 design using measurement, analysis, and experiments. Our VL2 prototype shuffles 2.7 TB of data among 75 servers in 395 seconds – sustaining a rate that is 94% of the maximum possible.

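Valiant Load Balancing, item (2) in the abstract, can be sketched in a few lines: each flow is first forwarded to a randomly chosen intermediate switch and only then to its destination, which spreads any traffic matrix uniformly across the core. The switch names below are made up for illustration.

```python
import random

# Hypothetical intermediate (core) switches traffic can bounce off.
INTERMEDIATES = ["int-1", "int-2", "int-3", "int-4"]

def vlb_path(src_tor: str, dst_tor: str, rng: random.Random) -> list:
    """Two-stage route: source ToR -> random intermediate -> destination ToR."""
    return [src_tor, rng.choice(INTERMEDIATES), dst_tor]
```

Randomizing the intermediate hop per flow is what makes the load independent of the offered traffic matrix, at the cost of a path that may be longer than shortest-path routing.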
  • The Cost of a Cloud: Research Problems in Data Center Networks

    Computer Communications Review, Association for Computing Machinery, Inc.

    The data centers used to create cloud services represent a significant investment in capital outlay and ongoing costs. Accordingly, we first examine the costs of cloud service data centers today. The cost breakdown reveals the importance of optimizing work completed per dollar invested. Unfortunately, the resources inside the data centers often operate at low utilization due to resource stranding and fragmentation. To attack this first problem, we propose (1) increasing network agility, and (2) providing appropriate incentives to shape resource consumption. Second, we note that cloud service providers are building out geo-distributed networks of data centers. Geo-diversity lowers latency to users and increases reliability in the presence of an outage taking out an entire site. However, without appropriate design and management, these geo-diverse data center networks can raise the cost of providing service. Moreover, leveraging geo-diversity requires services be designed to benefit from it. To attack this problem, we propose (1) joint optimization of network and data center resources, and (2) new systems and mechanisms for geo-distributing state.

  • Towards Unified Management of Networked Services in Wired and Wireless Networks

    Microsoft Research

    Organizations world-wide are adopting wireless networks at an impressive rate, and a new industry has sprung up to provide tools to manage these networks. Unfortunately, these tools do not integrate cleanly with traditional wired network management tools, leading to unsolved problems and frustration among the IT staff. We explore the problem of unifying wireless and wired network management and show that simple merging of tools and strategies, and/or their trivial extension from one domain to another does not work. Building on previous research on network service dependency extraction, fault diagnosis, and wireless network management, we introduce MnM, an end-to-end network management system that unifies wired and wireless network management. MnM treats physical location of end devices as a core component of its management strategy. It also dynamically adapts to the frequent topology changes brought about by end-node mobility. We have a prototype deployment in a large organization that shows that MnM’s root-cause analysis engine easily out-performs systems that do not take user mobility into account in terms of correctly localizing faults and blame attribution.

  • Towards a Next Generation Data Center Architecture: Scalability and Commoditization

    PRESTO Workshop at SIGCOMM

    Applications hosted in today’s data centers suffer from internal
    fragmentation of resources, rigidity, and bandwidth constraints imposed
    by the architecture of the network connecting the data center’s
    servers. Conventional architectures statically map web services
    to Ethernet VLANs, each constrained in size to a few hundred
    servers owing to control plane overheads. The IP routers used
    to span traffic across VLANs and the load balancers used to spray
    requests within a VLAN across servers are realized via expensive
    customized hardware and proprietary software. Bisection bandwidth
    is low, severely constraining distributed computation. Further, the
    conventional architecture concentrates traffic in a few pieces of
    hardware that must be frequently upgraded and replaced to keep
    pace with demand - an approach that directly contradicts the prevailing
    philosophy in the rest of the data center, which is to scale
    out (adding more cheap components) rather than scale up (adding
    more power and complexity to a small number of expensive components).
    Commodity switching hardware is now becoming available
    with programmable control interfaces and with very high port speeds
    at very low port cost, making this the right time to redesign the data
    center networking infrastructure. In this paper, we describe Monsoon,
    a new network architecture, which scales and commoditizes
    data center networking. Monsoon realizes a simple mesh-like architecture
    using programmable commodity layer-2 switches and
    servers. In order to scale to 100,000 servers or more, Monsoon
    makes modifications to the control plane (e.g., source routing) and
    to the data plane (e.g., hot-spot free multipath routing via Valiant
    Load Balancing). It disaggregates the function of load balancing
    into a group of regular servers, with the result that load balancing
    server hardware can be distributed amongst racks in the data
    center leading to greater agility and less fragmentation.

  • Upgrading transport protocols using Mobile Code

    ACM Symposium on Operating System Principles (SOSP 2003)

    In this paper, we present STP, a system in which communicating
    end hosts use untrusted mobile code to remotely upgrade each
    other with the transport protocols that they use to communicate.
    New transport protocols are written in a type-safe version of C,
    distributed out-of-band, and run in-kernel. Communicating peers
    select a transport protocol to use as part of a TCP-like connection
    setup handshake that is backwards-compatible with TCP and incurs
    minimum connection setup latency. New transports can be
    invoked by unmodified applications. By providing a late binding
    of protocols to hosts, STP removes many of the delays and constraints
    that are otherwise commonplace when upgrading the transport
    protocols deployed on the Internet. STP is simultaneously able
    to provide a high level of security and performance. It allows each
    host to protect itself from untrusted transport code and to ensure
    that this code does not harm other network users by sending significantly
    faster than a compliant TCP. It runs untrusted code with
    low enough overhead that new transport protocols can sustain near
    gigabit rates on commodity hardware. We believe that these properties,
    plus compatibility with existing applications and transports,
    complete the features that are needed to make STP useful in practice.

    Other authors
    • Andrew Whitaker
    • David Wetherall
    • Jay Lepreau
    • Tim Stack
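The TCP-compatible setup handshake described above, in which peers agree on a transport during connection establishment and fall back to plain TCP otherwise, can be sketched as a simple negotiation. The protocol names here are hypothetical, and real STP carries this exchange inside the connection setup rather than as a separate step.

```python
def negotiate(offered: list, supported: set) -> str:
    """Pick the first transport (in the initiator's preference order)
    that the peer also supports; otherwise fall back to plain TCP."""
    for proto in offered:
        if proto in supported:
            return proto
    return "tcp"  # backwards-compatible default
```

Folding the choice into the existing handshake is what keeps the extra connection-setup latency near zero: a peer that knows nothing about STP simply completes an ordinary TCP connection.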
  • TCP Meets Active Networks

    IEEE Workshop on Hot Topics in Operating Systems (HotOS IX)

    This paper argues that transport protocols such as TCP provide a rare domain in which protocol extensibility by untrusted parties is both valuable and practical. TCP continues to be refined despite more than two decades of progress, and the difficulties due to deployment delays and backwards-compatibility are well-known. Remote extensibility, by which a host can ship the transport protocol code and dynamically load it on another node in the network on a per-connection basis, directly tackles both of these problems. At the same time, the unicast transport protocol domain is much narrower than other domains that use mobile code, such as active networking, which helps to make extensibility feasible. The transport level provides a well understood notion of global safety—TCP friendliness—while local safety can be guaranteed by isolation of per-protocol state and use of recent safe-language technologies. We support these arguments by outlining the design of XTCP, our extensible TCP framework.

    Other authors
    • David Wetherall
    • Jay Lepreau
    • Andrew Whitaker
  • Hybrid Resource Control of Active Extensions

    IEEE International Conference on Open Architectures and Network Programming (OpenArch 2003)

    The ability of active networks technology to allow customized router computation critically depends on having resource control techniques that prevent buggy, malicious, or greedy code from affecting the integrity or availability of node resources. It is hard to choose between static and dynamic checking for resource control. Dynamic checking has the advantage of basing its decisions on precise real-time information about what the extension is doing but causes runtime overhead and asynchronous termination. Static checking, on the other hand, has the advantage of avoiding asynchronous termination and runtime overhead, but is overly conservative. This paper presents a hybrid solution: static checking is used to reject extremely resource-greedy code from the kernel fast path, while dynamic checking is used to enforce overall resource control. This hybrid solution reduces runtime overhead and avoids the problem of asynchronous termination by delaying extension termination until times when no extension code is running, i.e., between processing of packets.

    This paper also presents the design and initial implementation of the key parts of a hybrid resource control technique, called RBClick. RBClick is an extension of the Click modular router, customized for active networking in Janos, an active network operating system. RBClick uses a modified version of Cyclone, a type-safe version of C, to allow users to download new router extensions directly into the Janos kernel. Our measurements of forwarding rates indicate that hybrid resource control can improve the performance of router extensions by up to a factor of two.

    Other authors
    • Jay Lepreau
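The hybrid scheme in the abstract, a static admission check up front plus a dynamic budget enforced only between packets so that extensions are never terminated asynchronously, can be sketched as follows. The cycle threshold and budget numbers are illustrative, not RBClick's actual policy.

```python
STATIC_CYCLE_BOUND = 10_000  # hypothetical admission threshold

def admit(declared_worst_case_cycles: int) -> bool:
    """Static check: reject extremely resource-greedy code before it runs."""
    return declared_worst_case_cycles <= STATIC_CYCLE_BOUND

class Extension:
    """Dynamic check: a cycle budget inspected only between packets."""

    def __init__(self, budget: int):
        self.budget = budget
        self.terminated = False

    def after_packet(self, cycles_used: int) -> None:
        # Checked between packets, so no extension code is ever
        # running when termination happens (no asynchronous kill).
        self.budget -= cycles_used
        if self.budget < 0:
            self.terminated = True
```

Deferring the dynamic check to packet boundaries trades a small lag in enforcement for the simplicity of never having to unwind a half-executed extension, which is the core of the paper's argument.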
