APSIPA Transactions on Signal and Information Processing > Vol 8 > Issue 1

Reliable multicast using remote direct memory access (RDMA) over a passive optical cross-connect fabric enhanced with wavelength division multiplexing (WDM)

Industrial Technology Advances

Kin-Wai Leong, Viscore Technologies Inc, Canada and Rockport Networks Inc., Canada; Zhilong Li, Viscore Technologies Inc, Canada; Yunqu Leon Liu, Viscore Technologies Inc, Canada, leon.liu@viscore.com
 
Suggested Citation
Kin-Wai Leong, Zhilong Li and Yunqu Leon Liu (2019), "Reliable multicast using remote direct memory access (RDMA) over a passive optical cross-connect fabric enhanced with wavelength division multiplexing (WDM)", APSIPA Transactions on Signal and Information Processing: Vol. 8: No. 1, e25. http://dx.doi.org/10.1017/ATSIP.2019.17

Publication Date: 23 Oct 2019
© 2019 Kin-Wai Leong, Zhilong Li and Yunqu Leon Liu
 
Keywords
Passive optical cross-connect; Optical switch; RDMA over Converged Ethernet; Reliable multicast; Fault tolerance
 

Open Access

This is published under the terms of the Creative Commons Attribution licence.

In this article:
I. INTRODUCTION 
II. PACKET LOSS CHALLENGES OF MULTICAST AND PROPOSED SCALABLE SOLUTIONS 
III. LOW LATENCY AND LOW LOSS IMPLEMENTATION 
IV. RELATED WORK 
V. CONCLUSION AND FUTURE WORK 

Abstract

Reliable multicast is well known to enable consistency protocols, including Byzantine Fault Tolerant protocols, for distributed systems. However, no transport-layer reliable multicast is in use today because of limitations in existing switch fabrics and transport-layer protocols. In this paper, we introduce a layer-4 (L4) transport based on remote direct memory access (RDMA) datagrams that achieves reliable multicast over a shared optical medium. A cluster of networking nodes is connected by a passive optical cross-connect fabric enhanced with wavelength division multiplexing (WDM), so that every message is broadcast to all nodes. This mechanism allows consistency in a distributed system to be maintained at a low latency cost. By further using RDMA datagrams as the L4 protocol, we achieve a message loss ratio low enough (better than one in 68 billion) to make a simple Negative Acknowledgment (NACK)-based L4 multicast practical to deploy. To our knowledge, this is the first multicast architecture to demonstrate such a low message loss ratio. Furthermore, with this reliable multicast transport, end-to-end latencies of eight microseconds or less (< 8 µs) are routinely achieved using an enhanced software RDMA implementation on a variety of commodity 10G Ethernet network adapters.
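The abstract does not spell out the NACK mechanism, so the following is only a minimal, generic sketch of how a NACK-based receiver might detect gaps in a sequence-numbered multicast stream. It assumes each message carries a per-sender sequence number; the class name `NackReceiver` and its fields are hypothetical and do not come from the paper, and no RDMA or network calls are involved.

```python
from dataclasses import dataclass, field


@dataclass
class NackReceiver:
    """Hypothetical receiver: delivers messages in order and reports gaps."""
    next_expected: int = 0                          # lowest sequence number not yet delivered
    buffered: dict = field(default_factory=dict)    # out-of-order messages awaiting delivery
    delivered: list = field(default_factory=list)   # payloads handed to the application, in order

    def on_message(self, seq: int, payload: bytes) -> list[int]:
        """Accept one message; return the sequence numbers to NACK."""
        if seq < self.next_expected or seq in self.buffered:
            return []  # duplicate of something already delivered or buffered
        self.buffered[seq] = payload
        # Any sequence number below this one that is neither delivered
        # nor buffered is missing and should be requested again.
        missing = [s for s in range(self.next_expected, seq)
                   if s not in self.buffered]
        # Deliver the contiguous in-order prefix, if any.
        while self.next_expected in self.buffered:
            self.delivered.append(self.buffered.pop(self.next_expected))
            self.next_expected += 1
        return missing
```

Because receivers stay silent unless they see a gap, the control traffic in a scheme like this scales with the loss rate rather than with the number of receivers, which is why a very low loss ratio makes the approach practical. This sketch repeats a NACK each time a later message arrives while the gap is still open; a real transport would presumably rate-limit or time those requests.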

DOI:10.1017/ATSIP.2019.17