A Review on Clock Gating Methodologies for Power Minimization in VLSI Circuits

Harpreet Singh\(^1\), Dr. Sukhwinder Singh\(^2\)

\(^1\) PEC University of Technology, Chandigarh, India
\(^2\) PEC University of Technology, Chandigarh, India

Abstract
This paper introduces the main clock gating techniques. It covers the basic principles of clock gating, its benefits and limitations, and enhancements to the traditional clock gating scheme, and it describes the parameters that affect a clock gating implementation. The clock signal is a major source of power consumption, which is a critical problem in every synchronous circuit. Clock gating is an effective way of reducing the dynamic power dissipation in digital circuits. In a typical synchronous circuit such as a general-purpose microprocessor, only a portion of the circuit is active at any given time. Hence, by shutting down the idle portion of the circuit, unnecessary power consumption can be prevented. One way to achieve this is to mask the clock that goes to the idle portion of the circuit. In this work we review existing clock gating approaches and make a comparative analysis of these techniques on synchronous digital designs such as an ALU (arithmetic logic unit) and a FIFO (first in, first out) buffer.

Keywords: clock gating, synchronous circuit, VLSI, EDA, DSP

1. Introduction
The clock is the most fundamental signal in any digital circuit. In the past, it was held that the clock signal should be kept as clean as possible and that no designer should manipulate it. More recently it was realized that the clock accounts for the largest share of the total power, both through the signal itself and through the unnecessary activity it causes in the underlying logic. Up to 70\% or even more of the dynamic power can be spent in the clock buffers. This result makes intuitive sense, since these buffers have the highest toggle rate in the system, there are many of them, and they often have a high drive strength to minimize clock delay. In addition, the registers receiving the clock dissipate some dynamic power even when there is no change at their inputs and outputs.

Clock gating is particularly useful for registers that need to maintain the same logic values over many clock cycles. The main challenges of clock gating are finding the best places to use it and creating the logic to shut off and turn on the clock at the proper times.

In the early days of RTL design, engineers would code clock gating circuits explicitly in the RTL. This approach is error prone in the sense that it is very easy to create a clock gating circuit that glitches during gating, producing functional errors. Today, most libraries include specific clock gating cells that are recognized by the synthesis tool. The combination of explicit clock gating cells and automatic insertion makes clock gating a simple and reliable way of reducing power. No change to the RTL is required to implement this style of clock gating.

When clock network power is evaluated under clock gating, four contributions are considered: the input capacitance of the module, the input capacitance of the AND gate, the capacitance switched by the interconnect in the clock tree, and the capacitance switched by the interconnect that feeds the control signal to the gating logic.
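As a rough illustration of how these four contributions combine, the following Python sketch applies the standard dynamic power relation P = C V^2 f to each capacitance. The function name, the parameters `p_enable` and `ctrl_activity`, the assumption that the clock tree and the gate input toggle on every cycle, and all numerical values are illustrative assumptions and are not taken from the reviewed work.

```python
def clock_network_power(c_module, c_gate, c_tree, c_ctrl,
                        vdd, freq, p_enable=1.0, ctrl_activity=0.0):
    """Estimate dynamic clock-network power in watts (simplified model).

    Capacitances are in farads, vdd in volts, freq in hertz.
    p_enable is the assumed fraction of cycles in which the clock is passed.
    ctrl_activity is the assumed switching activity of the control wire,
    expressed relative to the clock.
    """
    gated = p_enable * c_module        # module clock inputs see only enabled cycles
    ungated = c_gate + c_tree          # assumed to toggle on every clock cycle
    control = ctrl_activity * c_ctrl   # control wire switches only when en changes
    return (gated + ungated + control) * vdd ** 2 * freq


# Illustrative numbers only.
baseline = clock_network_power(2e-12, 0.0, 1e-12, 0.0, vdd=1.0, freq=100e6)
gated = clock_network_power(2e-12, 0.01e-12, 1e-12, 0.05e-12,
                            vdd=1.0, freq=100e6,
                            p_enable=0.25, ctrl_activity=0.1)
print(f"ungated: {baseline * 1e6:.1f} uW, gated: {gated * 1e6:.1f} uW")
```

In this model the saving comes entirely from the module input capacitance, while the gate input and the control interconnect appear as overhead terms.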

2. Review on existing clock gating
The different clock gating techniques are discussed below:
2.1 AND GATES
In a sequential circuit, one two-input AND gate is inserted into the logic for clock gating. One input of the AND gate is the clock, and the second input is a control signal that determines whether the clock reaches the sequential circuit. Figure 2 shows this technique applied to a counter by inserting one AND gate. Figure 3 shows the output of the counter when it is negative edge triggered and the enable ('en') changes from one negative clock edge to the next; in this case the counter output changes one clock cycle after en becomes '1'. Figure 4 shows that when the counter is positive edge triggered and the enable changes from one positive edge to the next, the counter increments one extra time. The cause is a tiny glitch that arises from the long fall time of the enable when it goes low, and the output in this case is wrong.

Figure 5 shows a major problem with hazards: any hazard on the enable can pass to Gclk while clk='1'. This situation is particularly dangerous and could jeopardize the correct functioning of the entire system [11]. In the circuit below, AND-gate clock gating is applied to an 8-bit ALU. GCLK is the gated clock signal that drives the clock input of the ALU. The en signal enables and disables the clock to reduce the power consumption of the ALU. GCLK is high only when both en and clk are high, so the ALU is active only during the high phase of the clock while en is asserted.
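The hazard problem of the plain AND gate can be reproduced with a small sampled-waveform model. The sketch below is only an illustration: the helper names, the four-samples-per-phase resolution, and the waveforms are assumptions chosen for this example, not the circuits simulated in the figures.

```python
def rising_edges(signal):
    """Count 0 -> 1 transitions in a sampled waveform."""
    return sum(1 for prev, cur in zip(signal, signal[1:])
               if prev == 0 and cur == 1)


def and_gated_clock(clk, en):
    """Gated clock of a plain AND gate: Gclk = clk AND en."""
    return [c & e for c, e in zip(clk, en)]


# Four samples per clock phase: low, high, low, high.
clk = [0] * 4 + [1] * 4 + [0] * 4 + [1] * 4

# Enable asserted during the second cycle without any hazard.
en_clean = [0] * 8 + [1] * 8
# Same enable with a short low-going hazard while clk is high (sample 13).
en_hazard_high = [0] * 8 + [1] * 5 + [0] + [1] * 2
# Same enable with a short hazard while clk is low (sample 9).
en_hazard_low = [0] * 8 + [1] + [0] + [1] * 6

for name, en in [("clean", en_clean),
                 ("hazard while clk=1", en_hazard_high),
                 ("hazard while clk=0", en_hazard_low)]:
    edges = rising_edges(and_gated_clock(clk, en))
    print(f"{name}: {edges} rising edge(s) on Gclk")
```

Only the hazard that occurs while clk='1' produces a second rising edge on the gated clock, which a positive-edge-triggered counter would count as an extra clock pulse.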

2.2 NOR GATES
NOR-gate clock gating is well suited to circuits that must act on the positive edge of the global clock [11][13]. The circuit connection for the NOR-gate analysis is shown in Figure 7, where the counter operates when the enable is turned on. Figure 8 shows the waveform of the incorrect counter output when the enable changes to '1' at the negative edge of the clock. The output is incorrect because of a small glitch when the enable goes low at the negative clock edge, which makes the counter increment one extra time. Figure 9 shows the counter output when the enable changes from one positive edge to the next but the counter is negative edge triggered. Figure 10 shows the correct output of the positive-edge-triggered counter, obtained because the enable changes from one positive clock edge to the next. Figure 11 shows a major problem with hazards: any hazard on the enable can pass to Gclk while clk='0'. This situation is particularly dangerous and could jeopardize the correct functioning of the entire system [11].
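The complementary behaviour of the NOR gate can be illustrated with the same kind of sampled-waveform model. In this sketch the control input is modelled as a hypothetical `hold` signal (1 stops the clock, 0 passes it); this polarity, the helper names, and the waveforms are assumptions made for the example.

```python
def rising_edges(signal):
    """Count 0 -> 1 transitions in a sampled waveform."""
    return sum(1 for prev, cur in zip(signal, signal[1:])
               if prev == 0 and cur == 1)


def nor_gated_clock(clk, hold):
    """Gated clock of a plain NOR gate: Gclk = NOT (clk OR hold)."""
    return [1 - (c | h) for c, h in zip(clk, hold)]


# Four samples per clock phase: low, high, low, high.
clk = [0] * 4 + [1] * 4 + [0] * 4 + [1] * 4

# Control released while clk is high, where the NOR output is forced to 0.
hold_clean = [1] * 6 + [0] * 10
# Same control with a short hazard while clk is low (sample 9).
hold_hazard_low = [1] * 6 + [0] * 3 + [1] + [0] * 6
# Same control with a short hazard while clk is high (sample 13).
hold_hazard_high = [1] * 6 + [0] * 7 + [1] + [0] * 2

for name, hold in [("clean", hold_clean),
                   ("hazard while clk=0", hold_hazard_low),
                   ("hazard while clk=1", hold_hazard_high)]:
    edges = rising_edges(nor_gated_clock(clk, hold))
    print(f"{name}: {edges} rising edge(s) on Gclk")
```

Here the situation is mirrored with respect to the AND gate: a hazard is masked while clk='1' and reaches the gated clock while clk='0'.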

En might have hazards that must not propagate through the AND gate when the global clock is '1' [11][13][15]. However, the logic that computes En may lie on the critical path of the circuit, and the effect of its delay must be taken into account during timing verification [11][14][15][13]. The counter takes one extra clock cycle to start changing its state and then works normally until En is de-asserted, at which point it again takes one extra clock cycle to stop changing its state. When the controlling latch is positive and the counter is also positive edge triggered, the output of the counter is incorrect: it increments once more after the enable is turned off, because of a tiny glitch.
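The role of the latch in blocking hazards on En, and the effect of choosing the wrong latch polarity, can be sketched with a simple behavioural model. The function names, the `transparent_level` parameter, and the waveforms are illustrative assumptions; the model ignores gate and latch delays.

```python
def rising_edges(signal):
    """Count 0 -> 1 transitions in a sampled waveform."""
    return sum(1 for prev, cur in zip(signal, signal[1:])
               if prev == 0 and cur == 1)


def latch_and_gated_clock(clk, en, transparent_level=0):
    """AND-gated clock whose enable passes through a level-sensitive latch.

    The latch follows en while clk == transparent_level and holds its
    last value otherwise.
    """
    gclk, held = [], 0
    for c, e in zip(clk, en):
        if c == transparent_level:
            held = e                 # latch is transparent
        gclk.append(c & held)        # AND gate on the latched enable
    return gclk


# Four samples per clock phase: low, high, low, high.
clk = [0] * 4 + [1] * 4 + [0] * 4 + [1] * 4
# Enable with a short hazard while clk is high (sample 13).
en_hazard = [0] * 8 + [1] * 5 + [0] + [1] * 2

plain = rising_edges([c & e for c, e in zip(clk, en_hazard)])
negative_latch = rising_edges(latch_and_gated_clock(clk, en_hazard, 0))
positive_latch = rising_edges(latch_and_gated_clock(clk, en_hazard, 1))
print(plain, negative_latch, positive_latch)
```

With a latch that is transparent while the clock is low, the hazard is held off and the gated clock carries a single clean pulse. With the opposite polarity the latch is transparent exactly when the AND gate is sensitive, so the hazard still reaches the gated clock.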

2.3 LATCH BASED AND GATE CLOCK GATING

The latch-based AND-gated clock circuit is shown in Figure 12. Instead of being connected directly to the AND gate, the enable signal 'En' is applied through a latch to overcome the incorrect-output problems described earlier. The latch is needed for correct behaviour, because En might have hazards that must not propagate through the AND gate while the global clock is '1'.

2.4 LATCH BASED NOR GATE CLOCK GATING

The latch-based NOR-gated clock scheme is shown in Figure 17. Here the enable signal is applied through a latch instead of being connected directly to the NOR gate [11][13][18]. The figure shows that the counter takes one extra clock cycle to start changing its state and then works normally until En is de-asserted, at which point it again takes one extra clock cycle to stop changing its state. Figure 19 verifies that unwanted outputs due to glitches on En are avoided. The waveform in Figure 21 shows the case in which the controlling latch is negative and the counter is also negative edge triggered. The output of the counter is incorrect because it increments once more after the enable is turned off, owing to a tiny glitch caused by the fall-time delay of the enable.
2.5 MUX BASED CLOCK GATING
In mux-based clock gating [13][11], a multiplexer closes and opens a feedback loop around a basic D-type flip-flop under control of the enable signal, as shown in Figure 22. Because the resulting circuit is simple, robust, and compliant with the rules of synchronous design, it is a safe and often reasonable choice. On the negative side, this approach takes one fairly expensive multiplexer per bit and consumes more power, because any toggling of the clock input of a disabled flip-flop wastes energy in discharging and recharging the associated node capacitances for nothing. The figures show the waveforms of the negative-edge-triggered and the positive-edge-triggered counter. When En is turned on, the counter increments at each negative or positive clock edge, respectively, and when En goes low the counter holds its state.
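A cycle-level sketch of the mux feedback loop makes the power drawback visible: the stored value is held correctly, but the clock still reaches the flip-flops on every cycle. The function name, the counter width, and the enable pattern below are illustrative assumptions.

```python
def mux_enabled_counter(en_values, width=4, q0=0):
    """Cycle-level model of a counter built from mux-feedback flip-flops."""
    q = q0
    trace = []
    clock_edges = 0     # the ungated clock reaches the flip-flops every cycle
    useful_edges = 0    # edges on which the stored value actually changes
    for en in en_values:
        clock_edges += 1
        d = (q + 1) % (1 << width)   # next count value
        next_q = d if en else q      # multiplexer: new value or feedback of Q
        if next_q != q:
            useful_edges += 1
        q = next_q
        trace.append(q)
    return trace, clock_edges, useful_edges


trace, clock_edges, useful_edges = mux_enabled_counter([1, 1, 0, 0, 1])
print(trace, clock_edges, useful_edges)
```

The difference between `clock_edges` and `useful_edges` corresponds to the clock toggles that only discharge and recharge node capacitances without changing the state.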

3. Clock Gating with Explicit Enabling signal
These techniques require an explicit clock enable signal, generated by some algorithm.
3.1 LOCAL EXPLICIT CLOCK GATING (LECG):

Here the clock of the flip-flop is gated explicitly with an enable signal, which gives explicit control over the circuit. As long as en=0, no clock is passed to the flip-flop and hence no power is consumed; power consumption starts when en is high (en=1). If the period with en=1 is long, the overall power consumption increases, because the overhead of the additional circuitry outweighs the savings.
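The trade-off between the gating overhead and the fraction of time with en=1 can be made concrete with a first-order model. The sketch below assumes that the gated flip-flop consumes its full clock power only while en=1 and that the gating circuitry adds a constant overhead; the function names and the numbers are hypothetical.

```python
def lecg_saving(p_clk, p_overhead, duty):
    """Power saved by gating under a first-order model.

    p_clk:      clock power of the flip-flop when clocked every cycle
    p_overhead: constant power of the added gating circuitry
    duty:       fraction of time with en = 1
    A negative result means gating costs more than it saves.
    """
    return p_clk - (duty * p_clk + p_overhead)


def break_even_duty(p_clk, p_overhead):
    """Largest en = 1 fraction for which gating still saves power."""
    return 1.0 - p_overhead / p_clk


# Illustrative numbers in arbitrary power units.
print(lecg_saving(10.0, 2.0, 0.5))
print(lecg_saving(10.0, 2.0, 0.9))
print(break_even_duty(10.0, 2.0))
```

Under these assumptions gating pays off only while the enable duty stays below the break-even value.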

3.2 ENHANCED CLOCK GATING (ECG):

This method combines BSCG and LECG and exploits the advantages of both. In BSCG, switching activity increases the power dissipation; ECG eliminates this by using the en signal to gate the circuit for that period of time. If that situation does not arise, however, this method consumes more power than the previous two methods because its circuit is more complex.

4. Some clock gated flip flop designs

In FF-based clock gating, a flip-flop is used as the control element. When the negative edge of the clock arrives, a change on Enable is reflected at the FF output. If the FF output is high, the clock is applied to the sequential circuit. The sleep period is longer in FF-based clock gating than in latch-based clock gating, which means there is a greater chance of missing a change on the Enable signal.

4.1 AUTO GATED FLIP-FLOP (AGFF)

In auto-gated flip-flops, the master latch becomes transparent on the falling edge of the clock, and its output must stabilize no later than a setup time before the arrival of the clock’s rising edge. At the rising edge the master latch becomes opaque, and the XOR gate indicates whether or not the slave latch should change its state. If it should not, its clock pulse is stopped; otherwise it is passed. A significant power reduction was reported for small register-based circuits, such as counters, where the input of each FF depends on the output of its predecessor in the register. AGFF can also be used for general logic.
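The XOR-controlled gating of the slave latch can be sketched at cycle level as follows. The model abstracts away the master latch and all timing, and the function name and the input sequence are illustrative assumptions.

```python
def agff_simulate(d_values, q0=0):
    """Cycle-level model of an auto-gated flip-flop.

    Returns the output sequence and the number of clock pulses that
    actually reach the slave latch.
    """
    q = q0
    outputs, slave_pulses = [], 0
    for d in d_values:
        if d ^ q:               # XOR comparator: new data differs from state
            slave_pulses += 1   # clock pulse is passed to the slave latch
            q = d
        # Otherwise the pulse is stopped and the slave keeps its state.
        outputs.append(q)
    return outputs, slave_pulses


d_values = [0, 1, 1, 1, 0, 0]
outputs, slave_pulses = agff_simulate(d_values)
print(outputs, slave_pulses, len(d_values))
```

The output sequence equals that of an ungated flip-flop, while the slave latch is clocked only on the cycles in which the data actually changes.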

4.2 DOUBLE GATED FLIP-FLOP

In the double gating approach, both the master and the slave latches are gated, which further increases the opportunity for power reduction.
The technique uses two gated latches in a master-slave configuration. In Figure 29, the first gated latch is positive level-sensitive, and so is AND gated with the XOR comparator, and the second one is negative level-sensitive. Since, as previously shown, gated latches do not present timing failure problems, the technique yields a reliable gated flip-flop that can be used with any clock duty cycle and is suitable for standard cell design.

4.3 DATA DRIVEN CLOCK GATED FLIP-FLOP

The data-driven flip-flop is shown in the figure above. This flip-flop can be built without the latch, but the latch is included here to hold the value of the OR output for a complete cycle and to avoid any glitches or hazards. This flip-flop saves more power than the AGFF because both latches of the D flip-flop are gated, compared with only the slave latch in the AGFF.
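The enable generation of data-driven gating can be sketched at cycle level. The model below assumes a group of flip-flops that share one gated clock whose enable is the OR of per-bit XOR comparators; this grouping, the function name, and the data words are assumptions made for illustration, and the latch that holds the OR output is abstracted into the once-per-cycle evaluation.

```python
def data_driven_group(d_words, q0=0):
    """Cycle-level model of a flip-flop group with data-driven clock gating.

    Each word holds the D inputs of the whole group as bits of an integer.
    Returns the stored words and the number of clock pulses passed to the group.
    """
    q = q0
    outputs, pulses = [], 0
    for d in d_words:
        diff = d ^ q          # per-bit XOR comparators (D versus Q)
        enable = diff != 0    # OR of all XOR outputs, held for the whole cycle
        if enable:
            pulses += 1       # clock pulse is passed to the group
            q = d
        outputs.append(q)
    return outputs, pulses


d_words = [0b0000, 0b0101, 0b0101, 0b0111, 0b0111]
outputs, pulses = data_driven_group(d_words)
print([format(w, "04b") for w in outputs], pulses)
```

The group is clocked only in the cycles where at least one bit changes, and the stored words match those of an ungated register.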

4.4 LOOK AHEAD CLOCK GATED FLIP-FLOP (LACG)

Look-ahead clock gating computes the clock enabling signal of each FF one cycle ahead of time, based on the present-cycle data of the FFs on which it depends. Like data-driven gating, it is capable of stopping the majority of redundant clock pulses. It has, however, the big advantage of avoiding the tight timing constraints of AGFF and data-driven gating, by allotting a full clock cycle for the enabling signals to be computed and to propagate to their gates. Furthermore, unlike data-driven gating, whose optimization requires knowledge of the FFs’ data toggling vectors, LACG is independent of them. The embedding of LACG logic in the RTL functional code is uniquely defined and easily derived from the underlying logic, independently of the target application. This is advantageous, as it significantly simplifies the gating implementation.

5. Limitations of clock gating techniques

The major limitations of clock gating techniques are listed below:

1. The main limitation is the timing of the clock signal and the ability to group latches with identical gating conditions. Sometimes the groups of latches are too small to be worth clock gating, given the penalties compared with the amount of power saved. Also, with increasing wire delays, placing and routing the latches close to their cone of logic may conflict with the placement needed to group a set of latches for clock gating.

2. Timing closure can be difficult to reach if the clock gating signal has a large fan-out and drives many clock drivers, as happens when the latch group is very large.
3. Another problem in deep-submicron technology is inductive noise on the supply voltage rails. To counter this effect, designers sometimes use on-chip decoupling capacitors, which can increase the leakage power significantly.

4. Traditional clock gating does not take into account the switching activity of the registers it involves.

5. It also does not consider the possibility that one part of a functional unit is in use while another part is not.

6. Clock gating reduces test-coverage of the circuit because clock-gated registers are not clocked unless the enable signal is high.

6. Conclusions

The first two techniques, AND-based and NOR-based clock gating, suffer from output correctness problems due to glitches and hazards. The latch-based AND and NOR techniques remove the hazard problem, but the glitch problem still exists in them. The fifth technique, mux-based clock gating, does not have these problems, but it still cannot be considered a very good power-saving technique: it takes one fairly expensive multiplexer per bit and consumes more power. In FF-based clock gating the sleep period is longer than in latch-based clock gating, which means there is a greater chance of missing a change on the Enable signal. All of these approaches face a size trade-off: the approaches that require less area suffer from glitches, while the larger structures avoid glitches but increase static power. Only a few of the approaches actually reduce clock power, while the others still suffer from it. Some of the architectures also need extra input and output pins, and for a VLSI chip additional pins increase the cost of the whole process.

References


[3] Hai Li, Swarup Bhunia, Yiran Chen, T. N. Vijaykumar, and Kaushik Roy, "Deterministic Clock Gating for Microprocessor Power Reduction," 1285 EE Building, ECE Department, Purdue University


**Harpreet Singh** is a research scholar at PEC University of Technology, Chandigarh (e-mail: harpreetsingh20789@gmail.com)

---

**Dr. Sukhwinder Singh** is an Assistant Professor at PEC University of Technology, Chandigarh (e-mail: sukhwindersingh@pec.ac.in)