ABSTRACT Modular arithmetic units for friendly moduli are crucial for residue number system (RNS) applications. Choosing a moduli set of the form is expected to provide maximum benefits for modular arithmetic performance. In this respect, this paper proposes a hardware architecture for the moduli set (with even), in particular an efficient architecture for the reverse conversion, which is often the bottleneck of RNS processors. The main idea is to remove the Chinese remainder theorem (CRT) multiplicative inverse from the reverse conversion stage, achieving scalable hardware with reduced complexity. We show that the modular multiplication by the multiplicative inverse can be transferred to the arithmetic channels with minimal overhead while yielding substantial gains in the scalable reverse conversion stage. Furthermore, the presented method enables the use of recent hardware reuse strategies to attain further improvements. Results show that the proposed reverse converter outperforms the state of the art by 17.48% in delay and 53.47% in area for . Advantages become even more substantial for larger values of .
Fernandes et al. (Mon,) studied this question.
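The idea of moving the CRT multiplicative inverse out of the reverse converter can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the moduli set {15, 16, 17} is a hypothetical stand-in (the set used in the paper is not recoverable from the text above), and the function names and the "scaled residue" representation, in which each channel stores its residue already multiplied by the CRT inverse, are illustrative choices. Under that representation, additions need no change, multiplications need one extra constant factor per channel, and the reverse converter reduces to a weighted sum modulo M.

```python
from math import prod


def crt_constants(moduli):
    """Return M, the weights M_i = M / m_i, and the inverses |M_i^-1| mod m_i."""
    M = prod(moduli)
    weights = [M // m for m in moduli]
    # pow(x, -1, m) computes a modular inverse (Python 3.8+); the moduli
    # must be pairwise coprime for the inverses to exist.
    inverses = [pow(w % m, -1, m) for w, m in zip(weights, moduli)]
    return M, weights, inverses


def to_scaled_rns(x, moduli, inverses):
    """Forward conversion that pre-multiplies each residue by its CRT inverse."""
    return [(x % m) * k % m for m, k in zip(moduli, inverses)]


def scaled_add(a, b, moduli):
    """Channel-wise addition; the scaling factor is preserved automatically."""
    return [(ai + bi) % m for ai, bi, m in zip(a, b, moduli)]


def scaled_mul(a, b, moduli, weights):
    """Channel-wise multiplication; one factor |M_i| mod m_i cancels the
    extra inverse introduced by multiplying two scaled residues."""
    return [ai * bi * (w % m) % m for ai, bi, w, m in zip(a, b, weights, moduli)]


def reverse_convert(r, M, weights):
    """Reverse conversion with no multiplicative inverse: a weighted sum mod M."""
    return sum(w * ri for w, ri in zip(weights, r)) % M


if __name__ == "__main__":
    moduli = [15, 16, 17]  # hypothetical pairwise-coprime set, assumed for illustration
    M, weights, inverses = crt_constants(moduli)
    x, y = 123, 29
    xs = to_scaled_rns(x, moduli, inverses)
    ys = to_scaled_rns(y, moduli, inverses)
    print(reverse_convert(scaled_add(xs, ys, moduli), M, weights) == (x + y) % M)
    print(reverse_convert(scaled_mul(xs, ys, moduli, weights), M, weights) == (x * y) % M)
```

In this model the cost of the inverse is paid once in the forward conversion and as a constant multiplier in each multiplication channel, which is one plausible reading of "transferred to the arithmetic channels"; how the paper realizes this in hardware may differ.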