Negative gate delay - is it possible

As discussed in our post 'propagation delay', propagation delay is the difference between the time the transitioning input reaches 50% of its final value and the time the output does the same. A negative value of propagation delay seems a bit absurd at first, as it gives the impression of the effect happening before the cause; common sense says that the output should change only after the input. However, under certain special conditions, a negative delay is possible. In most such cases, one or more of the following conditions hold:
i) A high drive strength transistor
ii) Slow transition at the input
iii) Small load at the output

Under the above mentioned conditions, the output is expected to transition faster than the input signal, which can result in a negative propagation delay. An example negative delay scenario is shown in the figure below. The output signal starts to change only after the input signal; however, the faster transition of the output causes it to attain the 50% level before the input does, thus resulting in a negative propagation delay. In other words, negative delay is a relative concept.
Figure 1: Input and output transitions showing negative propagation delay


Propagation Delay


What is propagation delay: The propagation delay of a logic gate is defined as the time it takes for the effect of a change at the input to become visible at the output. In other words, propagation delay is the time required for an input change to propagate to the output. Normally, it is measured as the difference between the time the transitioning input reaches 50% of its final value and the time the output reaches 50% of its final value in response to that input change. Here, 50% is defined as the logic threshold at which the output (or, in general, any signal) is assumed to switch its state.


Figure 1: 2-input AND gate

Propagation delay example: Let us consider a 2-input AND gate as shown in figure 1, with input 'I2' making a transition from logic '0' to logic '1' and 'I1' stable at logic '1'. This causes the output 'O' to make a transition as well. The output will not show the effect immediately, but only after a certain time interval. The timing diagram for these transitions is shown in figure 2. The propagation delay, in this case, is the time interval between 'I2' reaching its 50% level while rising and 'O' reaching its 50% level while rising as a result of the transition on 'I2'. The propagation delay is labeled as "TP" in figure 2.

Figure 2: Propagation delay


On what factors propagation delay depends: The propagation delay of a logic gate is not a constant value; it depends upon two factors:

  1. Transition time of the input causing the transition at the output: The larger the transition time at the input, the larger the propagation delay of the cell. For smaller propagation delays, the input signals should switch faster.
  2. The output load seen by the logic gate: The greater the capacitive load sitting at the output of the cell, the more time it takes to charge it, and hence the greater the propagation delay.
How propagation delay of logic gates is calculated: In physical design tools, the propagation delay can come from the following sources:

  • Liberty file: The liberty file contains a lookup table for each input-to-output path (also called a cell arc) of a logic gate, as part of its .lib model. The table contains cell delay values for different combinations of input transition time and output load. Depending upon the input transition and output load seen by the logic gate under consideration in the design, physical design tools interpolate between these values to calculate the cell delay. An illustrative snippet of such a table is shown after this list.
  • SDF file: SDF (Standard Delay Format) is the extracted delay information of a design. The delay information, once calculated, can be dumped into an SDF file and later read back. When an SDF file is read, delays are not recalculated; the SDF delays take precedence.
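For illustration, the relevant portion of a liberty file for one such cell arc may look somewhat like the snippet below. The template name, index points and delay values here are made up purely for illustration, assuming a table template in which index_1 is the input transition time and index_2 is the output load:

      pin (Z) {
        timing () {
          related_pin : "A";
          cell_rise (delay_template_3x3) {
            index_1 ("0.01, 0.10, 0.50");     /* input transition time (ns) -- illustrative */
            index_2 ("0.001, 0.010, 0.050");  /* output load (pF) -- illustrative */
            values ("0.030, 0.055, 0.120", \
                    "0.045, 0.070, 0.135", \
                    "0.090, 0.115, 0.180");   /* cell delay (ns) */
          }
        }
      }

For an input transition and output load lying between the index points, the tool interpolates between the neighbouring entries of the values table to get the cell delay.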

Output transition time: The output transition time is governed by the same two factors as propagation delay. In other words, a larger input transition time and a larger output load both increase the transition time of the signal at the output of the logic gate. So, for better output transition times, both of these should be small.

Read next : Negative delay- How is it possible


Multicycle paths: The architectural perspective


Definition of multicycle paths: By definition, a multicycle path is one in which data launched from one flop is allowed (by architecture definition) to take more than one clock cycle to reach the destination flop. This is ensured architecturally, either by gating the data or by gating the clock reaching the destination flop. There are many scenarios inside a System on Chip where multicycle paths can be applied, as discussed later. In this post, we discuss the architectural aspects of multicycle paths. For timing aspects like application, analysis etc., please refer to Multicycle paths handling in STA.

Why multi-cycle paths are introduced in designs: A typical System on Chip consists of many components working in tandem. Each of these works at a different frequency depending upon performance and other requirements. Ideally, the designer would want the maximum possible throughput from each component while paying proper respect to power, timing and area constraints. The designer may think of introducing multi-cycle paths in the design in one of the following scenarios:
      
       1)      Very large data-path limiting the frequency of the entire component: Let us take a hypothetical case in which a component is to be designed to work at 500 MHz; however, one of its data-paths is too large to work at this frequency. Let us say the minimum delay this data-path can achieve is 3 ns. Thus, if we treat all paths as single cycle, the component cannot work at more than 333 MHz; however, if we ignore this one path, the rest of the design can attain 500 MHz without much difficulty. We can therefore make this particular path a multi-cycle path of two cycles, so that the rest of the component works at 500 MHz while this one path effectively works at 250 MHz, sacrificing performance for that one path only (a hedged SDC sketch for this scenario is shown after this list).
     
     2)      Paths starting from a slow clock and ending at a fast clock: For simplicity, let us suppose there is a data-path with one start-point and one end-point, where the start-point receives a clock that is half the frequency of the end-point's clock. Now, the start-point can send data only at half the rate at which the end-point can receive it, so there is no gain in capturing new data on every cycle of the faster clock. Since the data is launched only once every two cycles, we can modify the architecture such that the data is required to be received only after a gap of one cycle. In other words, instead of a single-cycle data-path, we can afford a two-cycle data-path in such a case. This actually saves power, as the data-path now has two cycles to traverse to the end-point, so cells with lower drive strength, area and power can be used. Also, if the multi-cycle has been implemented through a clock enable (discussed later), clock power will also be saved.
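Although the constraints themselves belong to the STA post referred above, a minimal SDC sketch for scenario 1 could look as below. The register names are hypothetical; the point to note is that relaxing the setup check to two cycles is normally accompanied by a hold multiplier that pulls the hold check back to the launch edge (zero-cycle hold check):

      # 500 MHz clock (2 ns period); the 3 ns data-path gets 2 cycles for setup
      set_multicycle_path 2 -setup -from [get_cells u_src_reg] -to [get_cells u_dst_reg]
      # Bring the hold check back to the launch edge (zero-cycle hold check)
      set_multicycle_path 1 -hold  -from [get_cells u_src_reg] -to [get_cells u_dst_reg]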

Implementation of multi-cycle paths in architecture: Let us discuss some of the ways of introducing multi-cycle paths in the design:

      1)      Through gating in the data-path: Refer to figure 1 below, wherein an 'Enable' signal gates the data-path towards the capturing flip-flop. By controlling the waveform of the enable signal, we can make the path multicycle. As shown in the waveform, if the enable signal toggles once every three cycles, the data at the end-point can change only once every three cycles. Hence, the data launched at edge '1' can arrive at the capturing flop only at edge '4'. Thus, we get a multicycle path of 3, giving the data a total of 3 cycles to traverse to the capture flop. In this case, the setup check is of 3 cycles and the hold check is a zero-cycle check.
Figure 1: Introducing multicycle paths in design by gating data path



    Now let us extend this discussion to the case wherein the launch clock is half the frequency of the capture clock. Let us say the enable changes once every two cycles of the capture clock. Here, the intention is to make the data-path a multicycle of 2 relative to the faster clock (the capture clock). As is evident from the figure below, it is important that the enable signal takes the proper waveform, as shown on the right-hand side of figure 2. In this case, the setup check will be two cycles of the capture clock and the hold check will be a zero-cycle check.
   
   
Figure 2: Introducing multi-cycle path where launch clock is half in frequency to capture clock


        2) Through gating in the clock path: Similarly, we can make the capturing flop capture data only once every few cycles by clipping its clock; in other words, by sending to the capturing flip-flop only those clock pulses at which we want the data to be captured. This can be done similar to the data-path masking discussed in point 1, with the only difference being that the enable now masks the clock signal going to the capturing flop. This kind of gating is more advantageous in terms of power saving, since the capturing flip-flop does not receive the clock signal in the intermediate cycles.
    
Figure 3: Introducing multi-cycle paths through gating the clock path
      Figure 3 above shows how multicycle paths can be achieved with the help of clock gating. The enable signal, in this case, launches from a negative edge-triggered register for architectural reasons. With the enable waveform as shown in figure 3, the flop gets a clock pulse only once every four cycles. Thus, we get a multicycle path of 4 cycles from launch to capture. The setup and hold checks for this case are also shown in figure 3: the setup check is a 4-cycle check, whereas the hold check is a zero-cycle check.

Pipelining v/s introducing multi-cycle paths: Making a long data-path reach its destination in two cycles can alternatively be implemented by pipelining the logic. In most cases this is a much simpler approach than making the path multi-cycle. Pipelining means splitting the data-path into two halves and putting a flop between them, essentially making the data-path take two cycles. This approach also eases timing at the cost of the latency of that data-path; looking at the component as a whole, however, it allows us to run the whole component at a higher frequency. But in some situations it is not economical to insert pipeline flops, as there may not be suitable points available in the data-path. In such a scenario, we have to go with the approach of making the path multi-cycle.

False paths basics and examples

False path is a very common term used in STA. It refers to a timing path that is not required to be optimized for timing, because it is never required to be captured within a limited time under the normal working conditions of the chip. In the normal scenario, a signal launched from a flip-flop has to be captured at another flip-flop within one clock cycle. However, there are certain scenarios where it does not matter at what time the signal originating from the transmitting flop arrives at the receiving flop. A timing path arising from such a scenario is labeled as a false path and is not optimized for timing by the optimization tool.
Definition of false path: A timing path that can get captured even after a very large interval of time has passed, and still produce the required output, is termed a false path. A false path, thus, does not need to be timed and can be ignored during timing analysis.
Common false path scenarios: Below, we list some of the examples where false paths can be applied:
Synchronized signals: Let us say we have a two-flop synchronizer placed between a sending and a receiving flop (the sending and receiving flops may be working on different clocks or on the same clock). In this scenario, it is not required to meet timing from the launching flop to the first stage of the synchronizer. Figure 1 below shows a two-flop synchronizer. We can consider the signal coming to flop1 as false, since even if it causes flop1 to become metastable, the metastability will resolve before the next clock edge arrives, with a success rate governed by the MTBF of the synchronizer. This kind of crossing is also known as a clock domain crossing (CDC).

Figure 1: A two-flop synchronizer
However, this does not mean that wherever you see a chain of two flops, the path to the first flop is false. The two flops may simply be pipelining the logic. So, only once it is confirmed that the structure is a synchronizer should you specify the path to it as false. Similarly, false paths can be specified for other types of synchronizers as well.
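As an illustration, assuming the first stage of the synchronizer is an instance named sync_ff1_reg (a hypothetical name), the corresponding constraint could be written as below. Some teams prefer a set_max_delay budget on such crossings instead of an outright false path, so treat this only as a sketch:

      # Cut timing to the first stage of the two-flop synchronizer (hypothetical instance name)
      set_false_path -to [get_pins sync_ff1_reg/D]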

False paths for static signals arising due to merging of modes: Suppose you have a structure as shown in figure 2 below. You have two modes, and the path to the multiplexer output is different depending upon the mode. However, in order to cover timing for both modes with a single merged set of constraints, you have to keep the "mode select bit" unconstrained. This results in paths being formed through the multiplexer select as well. You can specify a false path through the select of the multiplexer, as this signal is static in both modes, provided there are no special timing requirements related to mode transition on it. Specifically, for the scenario shown in figure 2:
          Mode 1 : set_case_analysis 0 MUX/SEL
          Mode 2 : set_case_analysis 1 MUX/SEL
          Mode with Mode1 and Mode2 merged together : set_false_path -through MUX/SEL
Figure 2: Mode selection signal selecting between mode1 and mode2 paths
Architectural false paths: There are some timing paths that can never actually occur. Let us illustrate this with a hypothetical but very simple example. Suppose the select signals of two 2:1 multiplexers are tied to the same signal, as depicted in figure 3 below. Then there can be no scenario in which data entering through the in0 pin of MUX0 traverses through the in1 pin of MUX1. Hence, this path is false by design architecture.
Figure 3: A hypothetical example showing an architectural false path
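For the structure of figure 3, such a path can be cut using stacked -through options, which require the path to pass through all the listed pins in order. The instance and pin names below simply follow the figure and are illustrative:

      # Data through in0 of MUX0 can never propagate through in1 of MUX1 (common select)
      set_false_path -through [get_pins MUX0/in0] -through [get_pins MUX1/in1]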
Specifying false path: The SDC command to specify a timing path as false is "set_false_path". We can apply a false path in the following cases:
  •      From register to register paths:
    • set_false_path -from regA -to regB

  •      Paths launched by one clock and captured by another clock:
    • set_false_path -from [get_clocks clk1] -to [get_clocks clk2]

  •      Through a specific pin of a gate:
    • set_false_path -through [get_pins AND1/B]
