
RTL data transfer power estimation

Since we use the three-line local power model, we simulate the CDFG with its typical input traces to obtain all the three-line switching patterns for every pair of CDFG edges whose data transfers happen consecutively. (Note that each CDFG edge corresponds to one data transfer, that is, to multi-bit wires in the RTL implementation.) Using the pattern-power tables, we derive through profiling the unit-length switched capacitance for each transition from transferring one variable to transferring another. This needs to be done only once for each behavior. Some of the unit-length switched capacitance values for the Diffeq benchmark are shown in Table II. Each entry gives the unit-length switched capacitance when the row variable is transferred right after the column variable during multiple executions of the Diffeq behavior. The pattern-power table for metal layer one is used; we call the resulting values basic unit-length switched capacitances. When two variables are output by the same DPU, they have to be transferred through the same output network. The table therefore illustrates how CDFG binding affects the energy consumed by data transfers.
Table II: Profiling results of Diffeq.
      t1    t2    t3    t4    t5    t6
t1    0.24  0.27  0.25  0.23  0.23  0.27
t2    0.25  0.25  0.28  0.25  0.27  0.25
t3    0.23  0.30  0.29  0.25  0.27  0.22
t4    0.25  0.28  0.29  0.24  0.25  0.24
t5    0.25  0.31  0.30  0.25  0.24  0.23
t6    0.25  0.22  0.20  0.21  0.22  0.28
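
To make the role of Table II concrete, the following minimal sketch looks up the basic unit-length switched capacitance accumulated when several variables are sent back to back over one shared output network. The names `CAP` and `sequence_capacitance` are hypothetical, and the transfer orders are illustrative only; they are not bindings taken from the Diffeq experiments.

```python
# Table II values: CAP[(row, col)] is the basic unit-length switched
# capacitance when variable `row` is transferred right after variable `col`.
VARS = ["t1", "t2", "t3", "t4", "t5", "t6"]
ROWS = [
    [0.24, 0.27, 0.25, 0.23, 0.23, 0.27],  # t1
    [0.25, 0.25, 0.28, 0.25, 0.27, 0.25],  # t2
    [0.23, 0.30, 0.29, 0.25, 0.27, 0.22],  # t3
    [0.25, 0.28, 0.29, 0.24, 0.25, 0.24],  # t4
    [0.25, 0.31, 0.30, 0.25, 0.24, 0.23],  # t5
    [0.25, 0.22, 0.20, 0.21, 0.22, 0.28],  # t6
]
CAP = {
    (row_var, col_var): ROWS[i][j]
    for i, row_var in enumerate(VARS)
    for j, col_var in enumerate(VARS)
}


def sequence_capacitance(order):
    """Sum the table entries for each consecutive pair of transfers.

    `order` lists the variables in the order they cross the same output
    network (a hypothetical input; the real order follows from the
    schedule and the binding).
    """
    return sum(CAP[(cur, prev)] for prev, cur in zip(order, order[1:]))


# Two illustrative ways of sharing output networks between four variables.
option_a = sequence_capacitance(["t2", "t5"]) + sequence_capacitance(["t3", "t6"])
option_b = sequence_capacitance(["t6", "t2"]) + sequence_capacitance(["t5", "t3"])
print(option_a, option_b)
```

Comparing such sums for alternative groupings of variables is one simple way to see why the choice of which variables share a DPU output changes the switched capacitance.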

Our high-level synthesis tool, as discussed later, is based on variable-depth iterative improvement of an initial RTL implementation of the behavior. Thus, at each stage of optimization, a complete RTL description of the circuit is available. We floorplan the RTL DPUs to obtain each DPU's position, then use the global model described earlier to estimate wire length and metal layer assignment. During floorplanning, the communication cost between two units is the sum of the basic unit-length switched capacitances of the data transfers between them. We partition the datapath into a binary tree hierarchy with balanced area and minimal communication cost between the resulting partitions. The algorithm is an extended version of the one given in [52]. It tends to place DPUs that have high unit-length switched capacitance data transfers between them in the same partition, thus reducing the total switched capacitance. After floorplanning, the length and metal layer assignment of each output network are estimated using the global model. This information is used with the pattern-power tables to estimate the power consumed by the entire wire, as shown in Fig. 6.

Binding of CDFG nodes to DPUs is most important to RTL interconnect power consumption. The switching activity of a wire depends on the switching activity of the outputs of the DPUs it is connected to, which, for a given schedule, depends on binding. Binding also determines the topology of the output network, which affects wire length through floorplanning and routing.
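
The partitioning objective used during floorplanning can be illustrated with a small sketch. It is not the extended algorithm of [52]: it substitutes an exhaustive search over two-way splits, which is only practical for a handful of DPUs. The function names, the DPU names, the areas, and the capacitance values are all hypothetical.

```python
from itertools import combinations


def communication_cost(transfers):
    """Sum basic unit-length switched capacitance per pair of DPUs.

    `transfers` is a list of (dpu_a, dpu_b, capacitance) tuples, one per
    data transfer between two units (hypothetical input format).
    """
    cost = {}
    for a, b, cap in transfers:
        pair = frozenset((a, b))
        cost[pair] = cost.get(pair, 0.0) + cap
    return cost


def best_balanced_bipartition(areas, cost, max_imbalance):
    """Return (cut, left, right) with minimal cut cost among splits whose
    area difference is at most `max_imbalance`, or None if none qualifies."""
    dpus = sorted(areas)
    total_area = sum(areas.values())
    first, rest = dpus[0], dpus[1:]  # pin one DPU to skip mirrored splits
    best = None
    for k in range(len(rest)):  # k < len(rest) keeps the right side non-empty
        for extra in combinations(rest, k):
            left = {first, *extra}
            left_area = sum(areas[d] for d in left)
            if abs(2 * left_area - total_area) > max_imbalance:
                continue
            # A pair is cut when exactly one of its DPUs is on the left.
            cut = sum(c for pair, c in cost.items() if len(pair & left) == 1)
            if best is None or cut < best[0]:
                best = (cut, left, set(dpus) - left)
    return best


# Illustrative data: four equally sized DPUs and five transfers.
areas = {"A": 4, "B": 4, "C": 4, "D": 4}
cost = communication_cost([
    ("A", "B", 0.5), ("A", "B", 0.4),  # two transfers between A and B
    ("C", "D", 0.8), ("A", "C", 0.2),
    ("B", "D", 0.1), ("A", "D", 0.3),
])
cut, left, right = best_balanced_bipartition(areas, cost, max_imbalance=0)
print(cut, sorted(left), sorted(right))
```

Applying the same split recursively to each side would yield a binary tree hierarchy of the kind described above; units with heavy mutual traffic, such as A and B here, end up in the same partition.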
Lin Zhong 2003-10-11