Next: Rent's rule
Up: Data transfer wires
Previous: Global power model
Since we use the three-line local power model, we simulate the
CDFG with its typical input traces to obtain all three-line
switching patterns for any two CDFG edges whose data transfers
happen consecutively. (Note that each CDFG edge corresponds to one
data transfer, that is, to multi-bit wires in the RTL implementation.)
With information from the pattern power tables, we generate,
through profiling, the unit-length switched capacitance for each
transition from transferring one variable to transferring another.
This needs to be done only once for each behavior. Some of the
unit-length switched capacitances for the Diffeq benchmark are
shown in Table II. It shows the unit-length switched
capacitance when a row variable is transferred right after a
column variable during multiple executions of the Diffeq
behavior. The pattern power table for metal layer one is used; we
call the resulting values basic unit-length switched capacitances.
When two variables are output by the same DPU, they have to be
transferred through the same output network. The table therefore
illustrates how CDFG binding affects the data transfer energy
consumption.
Table II: Profiling results of .

|  | t1 | t2 | t3 | t4 | t5 | t6 |
|---|---|---|---|---|---|---|
|  | 0.24 | 0.27 | 0.25 | 0.23 | 0.23 | 0.27 |
|  | 0.25 | 0.25 | 0.28 | 0.25 | 0.27 | 0.25 |
|  | 0.23 | 0.30 | 0.29 | 0.25 | 0.27 | 0.22 |
|  | 0.25 | 0.28 | 0.29 | 0.24 | 0.25 | 0.24 |
|  | 0.25 | 0.31 | 0.30 | 0.25 | 0.24 | 0.23 |
|  | 0.25 | 0.22 | 0.20 | 0.21 | 0.22 | 0.28 |
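The profiling step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the tool's actual implementation: the pattern power table here is a toy stand-in (its values, the `toy_pattern_table` helper, the quiet wires assumed at the bus edges, and the averaging over executions are all hypothetical), and the trace format is invented for the example.

```python
from collections import defaultdict
from itertools import product


def toy_pattern_table():
    """Hypothetical three-line pattern table (NOT the real pattern power table).

    Keys are (left, victim, right) transitions: +1 rising, -1 falling, 0 stable.
    Values are made-up unit-length switched capacitances.
    """
    table = {}
    for left, mid, right in product((-1, 0, 1), repeat=3):
        if mid == 0:
            table[(left, mid, right)] = 0.0
        else:
            # Toy coupling term: 0 if a neighbour switches the same way,
            # 1 if it is stable, 2 if it switches the opposite way.
            coupling = sum(abs(mid - n) for n in (left, right))
            table[(left, mid, right)] = 1.0 + 0.5 * coupling
    return table


def bit_transitions(prev, curr, width):
    """Per-bit transition when the bus value changes from prev to curr."""
    return [((curr >> i) & 1) - ((prev >> i) & 1) for i in range(width)]


def transfer_cost(prev, curr, width, table):
    """Sum the three-line pattern cost over all wires of one bus."""
    padded = [0] + bit_transitions(prev, curr, width) + [0]  # assumed quiet edges
    return sum(table[(padded[i], padded[i + 1], padded[i + 2])]
               for i in range(width))


def profile(traces, width, table):
    """Average cost of sending `second` right after `first`, per variable pair.

    traces: list of executions, each a dict mapping variable name -> value.
    """
    totals = defaultdict(float)
    for values in traces:
        for first, second in product(values, repeat=2):
            if first != second:
                totals[(first, second)] += transfer_cost(
                    values[first], values[second], width, table)
    return {pair: total / len(traces) for pair, total in totals.items()}


table = toy_pattern_table()
result = profile([{"t1": 0b01, "t2": 0b10}], width=2, table=table)
```

In this toy run both wires switch in opposite directions, which is the most expensive pattern in the stand-in table. A real flow would replace `toy_pattern_table` with the characterized table for the relevant metal layer.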
Our high-level synthesis tool, as discussed later, is based on
variable-depth iterative improvement of an initial RTL
implementation of the behavior. Thus, at each stage of
optimization, a complete RTL description of the circuit is
available. We floorplan the RTL DPUs to get each DPU's position on
the floorplan, then use the global model described earlier to
estimate the wire length and metal layer assignment.
The sum of the basic unit-length switched capacitances of the data
transfers between two units is used as their communication cost
during floorplanning. We partition the datapath into a binary tree
hierarchy with balanced area and minimal communication cost
between the resulting partitions. The algorithm is an extended
version of the one given in [52]. It tends to put DPUs that have
high unit-length switched capacitance data transfers between
them into the same partition, thus reducing the total switched
capacitance. After floorplanning, the length and metal layer
assignment of each output network are estimated using the global
model. This information is used with the pattern power tables to
estimate the power consumed by the entire wire, as shown in Fig. 6.
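To make the communication cost and the balanced partitioning concrete, here is a minimal sketch. It is not the extended algorithm of [52]: it swaps in an exhaustive bipartition that is only practical for a handful of DPUs. The DPU names, areas, capacitance values, and the `max_imbalance` tolerance are all hypothetical.

```python
from itertools import combinations


def comm_cost(transfers):
    """Sum basic unit-length switched capacitance per unordered DPU pair.

    transfers: iterable of (src_dpu, dst_dpu, basic_unit_cap).
    """
    cost = {}
    for src, dst, cap in transfers:
        key = frozenset((src, dst))
        cost[key] = cost.get(key, 0.0) + cap
    return cost


def cut_cost(part_a, part_b, cost):
    """Communication cost crossing the boundary between two partitions."""
    return sum(cost.get(frozenset((a, b)), 0.0) for a in part_a for b in part_b)


def bipartition(dpus, area, cost, max_imbalance=0.25):
    """Exhaustive search: prefer area-balanced splits, then the cheapest cut."""
    dpus = sorted(dpus)
    total = sum(area[d] for d in dpus)  # assumes positive areas
    anchor, rest = dpus[0], dpus[1:]
    best = None
    for k in range(len(rest)):  # keeps the right side non-empty
        for extra in combinations(rest, k):
            left = (anchor,) + extra
            right = tuple(d for d in rest if d not in extra)
            imbalance = abs(sum(area[d] for d in left)
                            - sum(area[d] for d in right)) / total
            score = (imbalance > max_imbalance,
                     cut_cost(left, right, cost), imbalance)
            if best is None or score < best[0]:
                best = (score, left, right)
    return best[1], best[2]


def partition_tree(dpus, area, cost):
    """Recursively bipartition until every leaf is a single DPU."""
    dpus = tuple(dpus)
    if len(dpus) == 1:
        return dpus[0]
    left, right = bipartition(dpus, area, cost)
    return (partition_tree(left, area, cost), partition_tree(right, area, cost))


# Hypothetical datapath: four unit-area DPUs and a few profiled transfers.
area = {"alu1": 1.0, "alu2": 1.0, "mul1": 1.0, "mul2": 1.0}
cost = comm_cost([
    ("alu1", "mul1", 0.30), ("alu1", "mul1", 0.29),
    ("alu2", "mul2", 0.31),
    ("alu1", "alu2", 0.22), ("mul1", "mul2", 0.20),
])
tree = partition_tree(area, area, cost)
```

With these made-up numbers, the DPU pairs that exchange the most switched capacitance end up as siblings in the tree, which mirrors the tendency described above.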
Binding of CDFG nodes to DPUs is the most important factor in RTL
interconnect power consumption. The switching activity of a wire
depends on the switching activity of the outputs of the DPUs it is
connected to. For a given schedule, the latter depends on the binding.
Binding also impacts the topology of the output network, which
affects wire length through floorplanning and routing.
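The dependence on binding can be illustrated with a small sketch that compares two bindings of the same schedule. Everything here is an assumption made for illustration: the variable names, the DPU names, the pairwise capacitance values (standing in for profiled basic unit-length switched capacitances), and the simplification that each variable is output exactly once.

```python
def network_switched_cap(schedule, binding, pair_cap):
    """Unit-length switched capacitance accumulated on each DPU's output network.

    schedule: list of (control_step, variable) in any order.
    binding:  dict mapping variable -> DPU that outputs it.
    pair_cap: dict mapping (first, second) -> cost of sending `second`
              right after `first` on the same output network.
    """
    last_var = {}
    total = {}
    for _, var in sorted(schedule):
        dpu = binding[var]
        total.setdefault(dpu, 0.0)
        if dpu in last_var:
            total[dpu] += pair_cap[(last_var[dpu], var)]
        last_var[dpu] = var
    return total


# Hypothetical schedule and profiled pair costs.
schedule = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
pair_cap = {("a", "b"): 0.30, ("c", "d"): 0.31,
            ("a", "c"): 0.22, ("b", "d"): 0.20}

binding_x = {"a": "add1", "b": "add1", "c": "add2", "d": "add2"}
binding_y = {"a": "add1", "c": "add1", "b": "add2", "d": "add2"}

caps_x = network_switched_cap(schedule, binding_x, pair_cap)
caps_y = network_switched_cap(schedule, binding_y, pair_cap)
```

In this example the second binding yields lower unit-length switched capacitance on both output networks. These are per-unit-length figures only; a full estimate would still need each network's length and metal layer from floorplanning.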
Lin Zhong
2003-10-11