Distributed 3D Real-time Rendering at Washington
Overview
We consider the problem of increasing the real-time rendering capabilities available to users of typical workstations. In particular, we propose to exploit clusters of workstations/PCs, where each node is equipped with a hardware graphics accelerator, to provide real-time rendering performance that is one to two generations ahead of what is achievable on a single machine.
While the commodity cluster is an attractive platform because of its ubiquity, low cost, and ease of expandability, it presents a number of challenges. In particular, a cluster-based distributed real-time renderer must be structured to:
- leverage the multiple hardware graphics accelerators in the cluster,
- impose only overheads that are compatible with the 30-100 ms per-frame compute load,
- minimize the frame rate variance, and
- decouple communication bandwidth from scene complexity and cluster size to achieve application and system scalability.
To meet these challenges, we have designed a novel work partitioning technique called Image Layer Decomposition (ILD). ILD is well suited to commodity clusters because:
- the rendering problem at each node looks exactly the same as if it were an independent rendering application, facilitating the use of hardware rendering and all attendant optimizations;
- using a small amount of preprocessing, we can predict the amount of data that each node must send per frame, allowing us to factor in the transmission time of each node when partitioning the work to minimize load imbalances;
- by replicating the scene on all nodes, we can avoid the need to communicate polygons, thereby decoupling the bandwidth requirement from scene complexity; and
- each node need only send the pixels that it has rendered, so that, in the best case, the required bandwidth is proportional only to the size of the final image and not to the size of the network of workstations (NOW).
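The idea of factoring each node's transmission time into the partitioning can be illustrated with a minimal sketch. This is not the ILD algorithm itself: it is a generic greedy (longest-cost-first) partitioner, and the function name `partition`, the per-item `(render_ms, send_ms)` estimates, and the example numbers are all assumptions made for illustration. The only point it shows is that the quantity being balanced is rendering time plus predicted send time, rather than rendering time alone.

```python
def partition(items, num_nodes):
    """Greedily assign work items to nodes, balancing render + send time.

    items: list of (render_ms, send_ms) estimates per work item (hypothetical
           inputs; in practice they would come from preprocessing).
    Returns (assignment, loads): per-node lists of item indices, and the
    estimated per-node cost in milliseconds.
    """
    loads = [0.0] * num_nodes
    assignment = [[] for _ in range(num_nodes)]

    # Place the most expensive items first, each on the least-loaded node.
    order = sorted(range(len(items)), key=lambda i: -(items[i][0] + items[i][1]))
    for i in order:
        node = min(range(num_nodes), key=lambda k: loads[k])
        assignment[node].append(i)
        loads[node] += items[i][0] + items[i][1]
    return assignment, loads


if __name__ == "__main__":
    # Four hypothetical work items: (estimated render ms, estimated send ms).
    items = [(20, 5), (10, 5), (10, 0), (5, 5)]
    assignment, loads = partition(items, num_nodes=2)
    print(assignment, loads)
```

Ignoring the send term would treat the second and third items as equally expensive, although one of them is estimated to cost 5 ms more once transmission is counted.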
We have implemented ILD in a prototype distributed rendering toolkit called DDDDRRaW. Results from this implementation (and a simulation study) show that:
- ILD effectively exploits cluster resources to increase real-time rendering performance on two small clusters despite several significant trade-offs of performance for implementation simplicity, and
- a carefully implemented ILD-based distributed renderer should scale well to moderately-sized clusters (~16 nodes).
We have also considered the abstract problem of scheduling tasks with unpredictable service times on distinct processing nodes so as to meet a real-time deadline, given that all communication among nodes entails some overhead. In DDDDRRaW, this corresponds to the problem of intra-frame scheduling, i.e., how to maximize the likelihood that all rendering tasks will be completed on time for the current frame.
Despite the load balancing that ILD performs at the beginning of each frame, significant load imbalances can still develop because of rendering optimizations such as levels of detail and object culling.
To address this problem, we have studied two distinct classes of scheduling policies: static, in which task reassignments can occur only at specific times, and dynamic, in which reassignments are triggered by some node going idle. For both classes, we have further examined global reassignment, in which all nodes are rescheduled at a rescheduling moment, and local reassignment, in which only a subset of the nodes engage in rescheduling at any one time.
We show that, over a range of parameterizations appropriate to
commodity clusters, global dynamic policies work best.
We have also designed a new policy, Dynamic with Shadowing, that assigns each of a small number of tasks to the schedules of multiple nodes to reduce the amount of communication required for load balancing. This policy dominates all other alternatives considered over most of the parameter space.
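The dynamic class of policies described above, in which a reassignment is triggered by a node going idle, can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the policies evaluated in the papers: it models only a local reassignment (the idle node takes about half of the waiting tasks from the node with the longest queue), it charges a fixed hypothetical `overhead` per reassignment, and the function name `frame_finish_time` and all numbers are invented for illustration. It does not model global reassignment or Dynamic with Shadowing.

```python
import heapq


def frame_finish_time(queues, overhead=None):
    """Return the time at which the last task of a frame completes.

    queues:   per-node lists of task service times (unknown to the scheduler,
              which only looks at queue lengths).
    overhead: cost paid by an idle node for one reassignment; None disables
              reassignment entirely (tasks stay where they were placed).
    """
    queues = [list(q) for q in queues]          # waiting tasks only
    events = [(0.0, n) for n in range(len(queues))]
    heapq.heapify(events)                       # (time node is free, node)
    finish = 0.0

    while events:
        t, n = heapq.heappop(events)
        if not queues[n] and overhead is not None:
            # Node n went idle: take half of the longest waiting queue.
            victim = max(range(len(queues)), key=lambda i: len(queues[i]))
            k = (len(queues[victim]) + 1) // 2
            if k > 0:
                queues[n] = queues[victim][-k:]
                del queues[victim][-k:]
                t += overhead
        if queues[n]:
            t += queues[n].pop(0)               # run the next waiting task
            finish = max(finish, t)
            heapq.heappush(events, (t, n))
        # A node with nothing to run and nothing to take simply retires.
    return finish


if __name__ == "__main__":
    # Node 0 turns out to be heavily loaded; node 1 finishes almost at once.
    queues = [[10, 10, 10, 10], [1]]
    print(frame_finish_time(queues))               # no reassignment
    print(frame_finish_time(queues, overhead=1))   # idle-triggered reassignment
```

With these example numbers the frame finishes at time 40 without reassignment and at time 22 with it, so a deadline of, say, 33 time units is met only in the second case. Raising `overhead` narrows that gap, which is the trade-off between communication cost and load balance that the policies above navigate.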
- Abstract from the original NSF proposal.
- Scheduling Policies to Support Distributed 3D Multimedia Applications, Thu D. Nguyen and John Zahorjan, In Proceedings of the SIGMETRICS '98/PERFORMANCE '98 Joint International Conference on Measurement and Modeling of Computer Systems, June 1998.
- Image Layer Decomposition for Distributed Rendering on NOWs, Thu D. Nguyen and John Zahorjan, In Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS), May 2000.
- DDDDRRaW: A Prototype Distributed 3D Real-Time Rendering Toolkit for Commodity Clusters, Thu D. Nguyen, C. Peery, and John Zahorjan, In Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS), May 2001.
- Technical report (includes all details, although slightly out of date compared to the IPDPS version)
Current People
People Who Have Participated in DDDDRRaW
- Ruth E. Anderson
- Jason Griffith
- Edward Sumanaseni