A multi-computer system (MCS) offers the high speed and throughput
needed to solve computing-intensive problems. An MCS has two main
components: the processing elements (PEs) and the interconnection network
(IN). A faulty IN can lead to data losses and/or
throughput degradation. Hence, it is necessary to consider the fault
tolerance and reliability aspects in assessing the efficacy of INs. This
paper provides coverage of the fault tolerance and reliability aspects of
hypercube multi-computer networks (HCNs). In particular, we cover three
broad aspects: task-based reliability, fault-tolerant routing, and
communication in faulty HCNs. Our coverage includes introducing the
hypercube (HC) architecture, analyzing its reliability, and assessing its
fault tolerance. The analysis is intended to help HCN designers make
informed decisions about the appropriate approaches for assessing the
reliability and fault tolerance of existing HCNs.