Modern datacenter applications must often access thousands of servers to answer a single request while still responding quickly to the user. In these situations, the user's overall request is not complete until the slowest of its subrequests has completed, so network services must offer not just low latency but predictable latency. We are developing operating-system and application-level techniques for building systems with predictable response times. We have identified factors across the system stack that contribute to tail latency; by mitigating these factors, we can reduce tail latency to within a few percent of optimal.
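
To illustrate why the slowest subrequest dominates, the following minimal sketch computes the chance that a fanned-out request is delayed by at least one slow server. It assumes subrequest latencies are independent and identically distributed, which is a simplification not stated above, and the function name `prob_request_hits_tail` and the 99th-percentile threshold are hypothetical choices made only for this example.

```python
def prob_request_hits_tail(fanout: int, percentile: float = 0.99) -> float:
    """Probability that at least one of `fanout` subrequests is slower than
    the given per-server latency percentile.

    Assumption (illustrative only): subrequest latencies are independent,
    so all of them are fast with probability percentile ** fanout.
    """
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    if not 0.0 <= percentile <= 1.0:
        raise ValueError("percentile must be between 0 and 1")
    return 1.0 - percentile ** fanout


if __name__ == "__main__":
    # Show how the risk of hitting a slow server grows with fan-out.
    for n in (1, 10, 100, 1000):
        print(f"fanout={n:5d}  P(some subrequest exceeds p99)={prob_request_hits_tail(n):.4f}")
```

Under these assumptions, a request that fans out to 100 servers waits on at least one response slower than the per-server 99th percentile roughly 63% of the time, which is why the per-server tail, not the median, determines the latency the user sees.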