# Low Power Dynamically Reconfigurable Rd4a-Bk Stone-Based Hybrid Adder

## Midhila Sundaran A<sup>1</sup>, Shahaziya Parvez M<sup>2</sup>

<sup>1</sup>Student, ECE, IES College of Engineering, Chittilappilly, Kerala, India.
<sup>2</sup>Assistant Professor, ECE, IES College of Engineering, Chittilappilly, Kerala, India.

#### How to cite this paper:

Midhila Sundaran A¹, Shahaziya Parvez M², "Low Power Dynamically Reconfigurable Rd4a-Bk Stone-Based Hybrid Adder", IJIRE-V4103-39-44.

Copyright © 2023 by author(s) and 5th Dimension Research Publication. This work is licensed under the Creative Commons Attribution International License (CC BY 4.0).

http://creativecommons.org/licenses/by/4.0/

Abstract: In applications such as image processing, image recognition, machine learning, and digital signal processing, approximate computing effectively lowers computational cost. The approach trades computational accuracy against the circuit's area, delay, and power. Accuracy requirements, however, vary between applications: some need precise results, whereas others tolerate errors in exchange for lower power and higher speed. This work proposes a dynamically reconfigurable hybrid adder that combines the radix-4 adder (RD4A) with the Brent-Kung (BK) adder. The accuracy-configurable radix-4 adder (ACRA) is a dynamically reconfigurable adder based on the RD4A. In the LSB section of the hybrid adder, the ACRA or RD4A uses power gating to dynamically turn off part of the logic gates of an adder element so that either accurate or approximate results are computed. The Brent-Kung adder, a parallel prefix adder (PPA), is used in the MSB section; it improves performance and requires less chip area because it has less wiring congestion and a more regular structure. When the adder operates in approximate mode, the partial sum of one adder element is modified to reduce the error distance between the approximate and accurate results. The performance of the proposed adder is also assessed in an image processing application, image smoothening.

Keywords: Approximate computing, Radix-4 adder, ACRA, Brent-Kung adder, Image smoothening, Cost-efficient

## **I. INTRODUCTION**

Image-related applications such as image processing, image recognition, machine learning, and digital signal processing demand ever more computation. Power consumption and performance are therefore the key design constraints for such workloads. Approximate computing reduces a circuit's area, delay, and power consumption. It targets error-tolerant applications and trades computational accuracy against design cost during the design process.

Adders are the fundamental elements of arithmetic operations and are found in most integrated circuits. Common types include half adders, full adders, ripple carry adders, and carry look-ahead adders. Conventional designs perform calculations one bit at a time. The radix-4 adder (RD4A) instead calculates two bits simultaneously to improve the power-delay product (PDP), and the accuracy-configurable radix-4 adder (ACRA) further reduces the circuit's power consumption in approximate calculations. In ACRA, the partial sum of one RD4A element is dynamically changed to reduce the error distance between the approximate and accurate results, and power gating is employed on the logic gates linked to the carry.

ACRA has two modes of operation, accurate and approximate, so either kind of result can be generated depending on the application. It is configured as an accurate or approximate adder by controlling the power supply of specific logic gates in the modified RD4A elements. ACRA achieves low power consumption, propagation delay, and error distance. Its main disadvantage is delay; to overcome this, we propose combining ACRA with a Brent-Kung adder, a parallel prefix adder (PPA), to improve efficiency and performance. In the hybrid adder, the ACRA or RD4A is used in the LSB part and the PPA in the MSB part.



Fig 1.1 Architecture of Conventional 2-Bit RD4A

ISSN No: 2582-8746

Fig 1.1 shows the architecture of the conventional 2-bit RD4A. The inputs are $A_i$, $A_{i+1}$, $B_i$, $B_{i+1}$, and $C_{in}$, and the outputs are $Sum_i$, $Sum_{i+1}$, and $C_{out}$. All inputs of the first and second stages operate in parallel without waiting for a carry. According to the design rule of the RD4A, the output Boolean functions can be formulated as shown in Eq. (1.1) to Eq. (1.3). To reuse logic gates, as displayed in Fig 1.1, an XOR gate is divided into one AND gate and two NOR gates.

$$C_{out} = A_{i+1}B_{i+1} + (A_iB_i)(A_{i+1} + B_{i+1}) + C_{in}((A_i + B_i)(A_{i+1} + B_{i+1})) \tag{1.1}$$

$$Sum_{i+1} = (A_{i+1} \oplus B_{i+1}) \oplus (A_iB_i + C_{in}A_i + C_{in}B_i) \tag{1.2}$$

$$Sum_i = (A_i \oplus B_i) \oplus C_{in} \tag{1.3}$$
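As a quick sanity check, the three Boolean output functions above can be evaluated exhaustively in software. The following is a minimal sketch, not the authors' implementation; the function names `rd4a` and `check_rd4a` are chosen here for illustration only. It confirms that the outputs match ordinary 2-bit addition with carry-in for all 32 input combinations.

```python
from itertools import product


def rd4a(a0, a1, b0, b1, cin):
    """2-bit radix-4 adder element (illustrative name).

    a0/b0 are bit i, a1/b1 are bit i+1; every argument is 0 or 1.
    Returns (sum_i, sum_i_plus_1, c_out).
    """
    cout = (a1 & b1) | ((a0 & b0) & (a1 | b1)) | (cin & ((a0 | b0) & (a1 | b1)))
    sum1 = (a1 ^ b1) ^ ((a0 & b0) | (cin & a0) | (cin & b0))
    sum0 = (a0 ^ b0) ^ cin
    return sum0, sum1, cout


def check_rd4a():
    """Compare the Boolean functions with integer addition for all inputs."""
    for a0, a1, b0, b1, cin in product((0, 1), repeat=5):
        s0, s1, cout = rd4a(a0, a1, b0, b1, cin)
        expected = (a0 + 2 * a1) + (b0 + 2 * b1) + cin
        if s0 + 2 * s1 + 4 * cout != expected:
            return False
    return True


print(check_rd4a())  # True
```

Because both sum bits and the carry-out depend only on the primary inputs and $C_{in}$, the second sum bit does not wait for an internal carry to ripple from the first bit position.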

## **II. RELATED WORK AND MOTIVATION**

For error-tolerant applications, approximate computing can reduce design complexity while improving performance and power efficiency. In most multimedia applications, useful information can be extracted from slightly incorrect outputs, so exact results are not required. Exploiting this relaxation of numerical exactness, [1] presents a gate-level logic modification method for approximate full adders. The multiplier, a necessary logic component, contributes greatly to the overall power consumption of microprocessor and signal processing systems, and the adder is both its basic building block and its speed-limiting component. The carry-based approximate adder is suited to low-power applications and is reported to give 98% power savings compared with implementations using accurate adders [1].

Digital Signal Processing (DSP) blocks form the foundation of many multimedia applications used in portable devices. Most of these blocks run image and video processing algorithms whose final output is a picture or video intended for human consumption. Low power is a crucial requirement for portable multimedia devices, whatever signal-processing algorithms and architectures they use. Because people can extract useful information from slightly inaccurate output, exact numerical results are not required. The work in [2] proposes several approximate full adder cells with reduced transistor-level complexity and uses them to build multi-bit adders with very short critical paths, which makes voltage scaling feasible. The proposed approximate adders give power savings of up to 69% compared with implementations employing exact adders [2].

The Discrete Tchebichef Transform (DTT), a transform based on the discrete Chebyshev orthogonal polynomials, is an alternative to the Discrete Cosine Transform, which is frequently used in picture coding. State-of-the-art approximate DTT matrices consist of the elements 0, ±1, and ±2. The work in [3] presents a practical approximate 8-point DTT hardware design that combines trimming and coefficient approximation. The new 8-point DTT approximation reduces power dissipation by up to 65.4% and circuit size by up to 43.6%. FPGA results show an 83.4% increase in maximum clock frequency, a 46.3% reduction in area, and a reduction in power dissipation of up to 84.2% [3].

Approximate multiplier designs offer higher power-delay-area product (PDAP) gains than state-of-the-art works at equivalent accuracy levels. Their dynamic power has been measured on a Field Programmable Gate Array (FPGA) board, and the multipliers are well suited to low-power, error-tolerant, high-performance applications. Adders are the basic components of multipliers [8].

The approximate radix-4 adder in [4] modifies the Karnaugh map so that errors compensate one another during loop accumulation. Compared with a fully accurate adder, the approximate adder uses less power and less energy.

The high-performance multiplexer-based radix-4 adder in [5] refines the carry to shorten propagation delay. Its multiplexers are controlled by the carry signal of the previous stage, which avoids a long carry chain and yields high performance. All outputs have full voltage swing, which improves their drivability.

A carry-select-based accuracy-configurable approximate adder offers improved accuracy through its longer carry chains, and its Carry Select Unit (CSU) gives it better delay characteristics. In approximate mode, the design exhibits higher accuracy [6].

Adders are the fundamental building blocks of arithmetic operations, are used in many applications, and account for a significant share of the computational effort. Their main challenges are circuit complexity and power consumption. To address these, several approximate adders have been developed, including ACA, the radix-4 adder (RD4A), and AC-CLA. Each has its own advantages and disadvantages in terms of area, power, and delay. Among existing adders, the most advantageous is the low-power dynamically reconfigurable RD4A, known as the accuracy-configurable radix-4 adder (ACRA). It operates in either approximate or accurate mode, chosen by a user selection line or control signal, so it serves both approximate and accurate applications with reduced computing effort. In our experiments, however, ACRA consumes more power and has a longer delay than the proposed ACRA-BK, which combines the accuracy-configurable radix-4 adder with a Brent-Kung adder. ACRA-BK can be used in many image-related applications such as image processing, image recognition, machine learning, image smoothening, and digital signal processing.

Fig 2.1 depicts the architecture of the existing ACRA. When Sapp is 0, the ACRA operates in accurate mode (typical RD4A operation); when Sapp is 1, it operates in approximate mode. In approximate mode, gates G5, G6, and G7 are switched off through transistors P1 and P2, and G1 and G2 are also turned off by the control signal. Power consumption is therefore lower in approximate mode, whereas in accurate mode all gates are active.



Fig 2.1 Architecture of Existing ACRA

## III. ACRA WITH PARALLEL PREFIX ADDER BLOCK DIAGRAM

A hybrid accuracy-configurable radix-4 adder is proposed that combines the ACRA with a Parallel Prefix Adder (PPA) in order to lower the delay of the ACRA. The ACRA uses the power gating technique to dynamically turn on or off part of the logic gates of an adder element to produce accurate or approximate results.

Parallel prefix adders use the prefix operation to perform efficient addition and are appropriate for wide-word binary addition. They descend from the carry look-ahead adder. The Sklansky, Kogge-Stone, Brent-Kung, and Ladner-Fischer adders are a few examples of parallel prefix adder types. The proposed method uses the Brent-Kung adder, which is regarded as one of the better tree adders for reducing wiring tracks, fan-out, and gate count and is the foundation for numerous other networks.

In the proposed adder, the ACRA forms the LSB part and the parallel prefix adder forms the MSB part. Error correction proceeds from LSB to MSB so that the computed result is rectified gradually. The LSB part uses approximation to obtain SUMlsp with the least error, and the MSB part combines the carry from the LSB part with its own inputs to produce SUMmsp. On the LSB side, RD4A can be used if only accurate results are required, and ACRA if both accurate and approximate results are needed.
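The LSB/MSB partition described above can be modeled behaviorally. The sketch below is a simplified illustration under stated assumptions, not the proposed circuit: the 8-bit width, the 4-bit split point, and the names `hybrid_add`, `exact_lsb`, and `or_lsb` are all hypothetical. In particular, `or_lsb` is a generic bitwise-OR stand-in for an approximate mode and does not reproduce the ACRA's gate-level approximation, which this paper does not specify at the equation level.

```python
def exact_lsb(a_low, b_low, k):
    """Accurate mode: exact k-bit addition. Returns (sum_lsp, carry_out)."""
    total = a_low + b_low
    return total & ((1 << k) - 1), total >> k


def or_lsb(a_low, b_low, k):
    """Hypothetical approximate stand-in (NOT the ACRA logic):
    bitwise OR of the low parts, with no carry passed to the MSB part."""
    return (a_low | b_low) & ((1 << k) - 1), 0


def hybrid_add(a, b, width=8, k=4, lsb_adder=exact_lsb):
    """Split the operands into a k-bit LSB part and a (width - k)-bit MSB part."""
    mask_low = (1 << k) - 1
    mask_high = (1 << (width - k)) - 1
    # LSB part: produces SUMlsp and the carry handed to the MSB part.
    sum_lsp, carry = lsb_adder(a & mask_low, b & mask_low, k)
    # MSB part: exact addition of the upper bits plus the LSB carry
    # (a Brent-Kung adder in the proposed design; plain integers here).
    high = ((a >> k) & mask_high) + ((b >> k) & mask_high) + carry
    sum_msp = high & mask_high
    cout = high >> (width - k)
    return (cout << width) | (sum_msp << k) | sum_lsp


exact = hybrid_add(183, 77)
approx = hybrid_add(183, 77, lsb_adder=or_lsb)
print(exact, approx, abs(exact - approx))  # 260 255 5
```

The last line prints the error distance between the two modes for one input pair. With the accurate LSB function the model reduces to ordinary addition, which mirrors the accurate mode of the hybrid adder.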



Fig 3.1 Block Diagram of Proposed ACRA with Parallel Prefix Adder

A parallel prefix adder works in three steps: (i) Pre-processing: computation of the carry generate and carry propagate signals from the input bits. (ii) Prefix computation: also known as the carry graph, in which all carry signals are computed in parallel. (iii) Post-processing: computation of the final sum from the carries and the propagate signals.
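The three steps can be illustrated with a short behavioral model. This is a sketch only: the names `combine` and `prefix_add` are illustrative, and the prefix step is evaluated here as a simple left-to-right scan, whereas hardware evaluates the same associative operator as a tree so that the carries are available in parallel.

```python
def combine(hi, lo):
    """Prefix operator: merge a more significant (g, p) group with a less significant one."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def prefix_add(a, b, width=8, cin=0):
    # (i) Pre-processing: per-bit generate (a AND b) and propagate (a XOR b).
    gp = [(((a >> i) & (b >> i)) & 1, ((a >> i) ^ (b >> i)) & 1) for i in range(width)]

    # (ii) Prefix computation: groups[i] is the (G, P) pair covering bits i..0.
    groups = []
    acc = None
    for pair in gp:
        acc = pair if acc is None else combine(pair, acc)
        groups.append(acc)
    # carries[i] is the carry into bit i; carries[width] is the carry-out.
    carries = [cin] + [g | (p & cin) for g, p in groups]

    # (iii) Post-processing: each sum bit is propagate XOR incoming carry.
    total = carries[width] << width
    for i in range(width):
        total |= (gp[i][1] ^ carries[i]) << i
    return total


print(prefix_add(200, 100))  # 300
```

The operator in `combine` is associative, which is what allows the different prefix networks (Sklansky, Kogge-Stone, Brent-Kung, Ladner-Fischer) to group the same computation in different tree shapes.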



Fig 3.2 PPA Mechanism

#### III (a) Internal Structure of BK-Adder

The Brent-Kung parallel prefix adder has a low fan-out from each prefix cell, but it has a long critical path and does not support extremely high-speed addition. Nevertheless, it is an optimized and regular design that eases the interconnection of gates and so reduces chip area. As a result, it is regarded as one of the better tree adders for reducing wiring tracks, fan-out, and gate count, and it is the foundation for numerous other networks. Table 3.1 shows that the Brent-Kung adder uses less power than the other adders, which is why we combine the BK adder with ACRA.



Fig 3.3 Internal Architecture of BK Adder

The internal structure of the BK adder has three stages: pre-processing, carry graph, and post-processing. It is built from diamond-shaped cells, black cells, grey cells, and rectangular cells. In the first stage, each diamond-shaped cell produces one propagate signal and one generate signal. The middle stage consists of black cells and grey cells: a black cell combines two generate and two propagate signals into a group generate and a group propagate signal, whereas a grey cell produces only a generate signal. In the last stage, each rectangular cell combines the generate and propagate signals to form a sum output, such as S2. These cells are depicted in the block diagram above.
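A behavioral model can make the Brent-Kung carry graph concrete. The sketch below uses the standard up-sweep and down-sweep formulation of the Brent-Kung network and assumes the word width is a power of two; the names `combine`, `brent_kung_prefix`, and `bk_add` are illustrative and do not come from the authors' HDL code. For simplicity every prefix cell keeps both group signals, whereas a hardware grey cell would drop the group propagate output where it is no longer needed.

```python
def combine(hi, lo):
    """Black-cell operator: merge a more significant (g, p) group with a less significant one."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def brent_kung_prefix(gp):
    """Return the group (G, P) for bits i..0 at each index i.

    len(gp) is assumed to be a power of two.
    """
    n = len(gp)
    x = list(gp)
    # Up-sweep: build groups of width 2, 4, 8, ... at the right edge of each block.
    d = 1
    while d < n:
        for i in range(2 * d - 1, n, 2 * d):
            x[i] = combine(x[i], x[i - d])
        d *= 2
    # Down-sweep: fill in the remaining prefixes from already complete ones.
    d = n // 4
    while d >= 1:
        for i in range(3 * d - 1, n, 2 * d):
            x[i] = combine(x[i], x[i - d])
        d //= 2
    return x


def bk_add(a, b, width=8, cin=0):
    # Pre-processing (diamond cells): per-bit generate and propagate.
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    # Carry graph (black and grey cells).
    groups = brent_kung_prefix(list(zip(g, p)))
    carries = [cin] + [gg | (pp & cin) for gg, pp in groups]
    # Post-processing (rectangular cells): sum bit = propagate XOR incoming carry.
    total = carries[width] << width
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i
    return total


print(bk_add(0b1011, 0b0110, width=4, cin=1))  # 18
```

The `cin` argument models the carry that the MSB section receives from the LSB section in the hybrid arrangement.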

| Sl. No. | Adder type           | Delay (ns) | Area (µm²) | Power (mW) |
|---------|----------------------|------------|------------|------------|
| 1       | Sklansky Adder       | 8.395      | 10         | 36.14      |
| 2       | Kogge-Stone Adder    | 8.637      | 10         | 32.02      |
| 3       | Brent-Kung Adder     | 8.397      | 8          | 30.14      |
| 4       | Ladner-Fischer Adder | 8.122      | 7          | 36.14      |

Table 3.1 Performance Analysis of PPA

## IV. EXISTING SYSTEM

Many adders exist, such as the Ripple Carry Adder, Carry Look-Ahead Adder, and Carry Save Adder. Adders are the basic blocks of arithmetic operations, and each type has its own characteristics in terms of area, delay, and power consumption. The most challenging issues for the computational effort in image-related applications are power consumption and circuit complexity, and many adders have been designed to address them. The RD4A shown in Fig 1.1 produces accurate results and calculates two bits simultaneously without waiting for a carry to propagate, which reduces delay. To build on this, a reconfigurable RD4A named ACRA was devised. ACRA is controlled by a user selection line and has two modes, accurate and approximate. As depicted in Fig 2.1, it operates in accurate mode when Sapp is zero and in approximate mode when Sapp is one. Compared with the RD4A, ACRA has lower area, power, and delay. Approximate computation lets ACRA achieve better results, which is useful for image-related applications.

## **V. PROPOSED SYSTEM**

The proposed system is designed to reduce power consumption in image-related applications. It combines the accuracy-configurable radix-4 adder (ACRA), a dynamically reconfigurable adder, with the Brent-Kung (BK) adder, a parallel prefix adder. Fig 3.1 shows the block diagram of ACRA with the Brent-Kung adder. The MSB side uses the BK adder (PPA), and the LSB side uses the RD4A or ACRA. The LSB part uses approximation to obtain SUMlsp with the least error, and the carry generated by the LSB part goes to the MSB part, where it is combined with the MSB inputs to produce SUMmsp. The BK adder is a high-speed adder with lower delay and better performance. The experimental results confirm that ACRA-BK has lower delay and power consumption than the existing method.

Xilinx ISE 8.1i was used to compare the existing and proposed systems. Table 5.1 compares them in terms of area, delay, and power consumption. The table shows that the delay and power of the proposed system improve on the existing method without any increase in area.

| Metric | Existing ACRA | Proposed BK-ACRA |
|--------|---------------|------------------|
| Area   | 96            | 96               |
| Delay  | 18.457 ns     | 16.293 ns        |
| Power  | 81 mW         | 74 mW            |

Table 5.1 Performance Comparison
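The relative improvements implied by Table 5.1 follow from simple arithmetic. The short sketch below uses only the delay and power values reported in the table; the helper name `percent_reduction` is illustrative, and the power-delay product is derived here for convenience rather than being a figure reported by the tools.

```python
def percent_reduction(existing, proposed):
    """Relative reduction of the proposed value with respect to the existing one, in percent."""
    return 100.0 * (existing - proposed) / existing


# Values taken from Table 5.1 (delay in ns, power in mW).
delay_gain = percent_reduction(18.457, 16.293)
power_gain = percent_reduction(81.0, 74.0)
# Power-delay product (ns x mW = pJ), computed from the two rows above.
pdp_gain = percent_reduction(18.457 * 81.0, 16.293 * 74.0)

print(f"delay: {delay_gain:.2f}%  power: {power_gain:.2f}%  PDP: {pdp_gain:.2f}%")
```

This works out to roughly an 11.7% reduction in delay, an 8.6% reduction in power, and a 19.4% reduction in power-delay product, with the area unchanged at 96.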

## VI. EXPERIMENTAL RESULTS

The proposed and existing designs are coded and simulated using ModelSim. The ISE Design Suite could also be used, but ModelSim is easier to use and comes with a simulation environment, which most earlier editions of the ISE Design Suite lacked.



Figure 6.1 Simulated Output of 8-Bit ACRA



Figure 6.2 Simulated Output of Proposed BK-ACRA



Fig 6.3 Simulation output of Image Smoothening Application.



Fig 6.4 Design Summary of Existing ACRA Power



Fig 6.5 Design Summary of Proposed ACRA-BK Power

## VII. PROS AND CONS

The main advantage of the method is that it uses approximate computation to reduce the computational effort. The area-delay-power (ADP) factor is lower than that of the existing ACRA, and Table 5.1 shows that power consumption and delay are reduced without an increase in area. Among the adders compared in Table 3.1, the BK adder has the lowest power consumption, so the proposed system gives better results. Multimode operation is possible through the user selection input, and the design is cost-effective. The proposed ACRA uses power gating and forced control-signal settings to minimize circuit overhead. Future hybrid designs could also try other parallel prefix adders such as Han-Carlson and Ladner-Fischer. For a variety of image processing applications, the carry speculative adder and other low-complexity approximate structures could be used in the LSB part of ACRA. The main drawback of the method is that area is not reduced, unlike power and delay.

## VIII. CONCLUSION

Power consumption and performance are key circuit design issues for the growing number of image-related applications, which involve heavy computation. BK-ACRA is presented as a method to reduce this computational effort. ACRA, the existing technique, improves on the traditional RD4A and operates in two modes, approximate and accurate, selected by a user selection line or control signal. In the LSB section, the ACRA or radix-4 adder uses power gating to dynamically turn off part of the logic gates of an adder element in order to compute accurate or approximate results. In the MSB section, the Brent-Kung adder, a parallel prefix adder (PPA), enhances performance and needs less chip area because of its lower wiring congestion and more regular structure. The area, power, and delay of the existing and proposed designs were compared using Xilinx ISE 8.1i. The comparison shows that the proposed BK-ACRA reduces delay from 18.457 ns to 16.293 ns and power from 81 mW to 74 mW at the same area.

#### References

- 1. M. Ramasamy, G. Narmadha, and S. Deivasigamani, "Carry-based approximate full adder for low power approximate computing," in Proc. 7th Int. Conf. Smart Comput. Commun. (ICSCC), Jun. 2019, pp. 1–4.
- 2. V. Gupta, D. Mohapatra, A. Raghunathan, and K. Roy, "Low-power digital signal processing using approximate adders," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 32, no. 1, pp. 124–137, Jan. 2013.
- 3. G. Paim, L. M. G. Rocha, G. M. Santana, L. B. Soares, E. A. C. da Costa, and S. Bampi, "Power-, area-, and compression efficient eight-point approximate 2-D discrete Tchebichef transform hardware design combining truncation pruning and efficient transposition buffers," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 66, no. 2, pp. 680–693, Feb. 2019, doi: 10.1109/TCSI.2018.2868513.
- 4. C. Yang and H. Jiao, "Low power Karnaugh map approximate adder for error compensation in loop accumulations," in Proc. Int. Conf. IC Design Technol. (ICICDT), Jun. 2019, pp. 1–4.
- 5. C.-H. Lai, Y.-C. Cheng, T.-C. Wu, and Y.-J. Chang, "Radix-4 adder design with refined carry," in Proc. IEEE Conf. Dependable Secure Comput., Aug. 2017, pp. 300–304.
- 6. S. Natarajan, Imprecise and Approximate Computation, vol. 318. New York, NY, USA: Springer, 1995.
- 7. M. Biasielli, L. Cassano, and A. Miele, "An approximation-based fault detection scheme for image processing applications," in Proc. Design, Autom. Test Eur. Conf. Exhib. (DATE), Mar. 2020, pp. 1331–1334.
- 8. L. Jin and H. Liang, "Deep learning for underwater image recognition in small sample size situations," in Proc. OCEANS Aberdeen, Jun. 2017, pp. 1–4.
- 9. N.-C. Huang, S.-Y. Chen, and K.-C. Wu, "Sensor-based approximate adder design for accelerating error-tolerant and deep-learning applications," in Proc. Design, Autom. Test Eur. Conf. Exhib. (DATE), Mar. 2019, pp. 692–697.
- 10. O. Akbari, M. Kamal, A. Afzali-Kusha, and M. Pedram, "RAP-CLA: A reconfigurable approximate carry look-ahead adder," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 65, no. 8, pp. 1089–1093, Aug. 2018.