Hi,
I've been reading the code in simulator/accelerator.py, and I'm confused about how compute_cycles is calculated.
In get_core_compute_cycles:
    """
    Compute instruction
    args:
        ic: Input Channels
        oc: Output Channels
        ow: Output Width
        oh: Output Height
        kw: Output Height
        kh: Output Height
        b: Batch Size
        im2col: boolean. If true, we assume the cpu does im2col. Otherwise,
                we do convolutions channel-wise
    """
    overhead = 0
    if im2col:
        ni = kw * kh * ic
        no = oc
        batch = b * oh * ow
        compute_cycles = batch * ceil_a_by_b(no, self.M) * \
                (ceil_a_by_b(ni, self.N) + overhead)
    else:
        compute_cycles = b * ceil_a_by_b(oc, self.M) * \
                ow * oh * kw * kh * \
                (ceil_a_by_b(ic, self.N) + overhead)
    return compute_cycles
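
To check my reading of the two branches, here is a minimal standalone sketch of the same arithmetic. The free function `core_compute_cycles`, the explicit `N` and `M` parameters, and the sizes used in the example are my own assumptions for illustration, not values from the repository; `ceil_a_by_b` is reimplemented as plain ceiling division, which is what I assume the helper does.

```python
def ceil_a_by_b(a, b):
    # Ceiling division for positive integers (assumed behaviour of the helper).
    return -(-a // b)


def core_compute_cycles(ic, oc, ow, oh, kw, kh, b, N, M,
                        im2col=False, overhead=0):
    # Standalone copy of the quoted arithmetic; N and M replace self.N / self.M.
    if im2col:
        ni = kw * kh * ic          # reduction length after im2col
        no = oc
        batch = b * oh * ow        # one output pixel per "batch" element
        return batch * ceil_a_by_b(no, M) * (ceil_a_by_b(ni, N) + overhead)
    # Channel-wise: the kw * kh kernel positions are iterated separately.
    return (b * ceil_a_by_b(oc, M) * ow * oh * kw * kh *
            (ceil_a_by_b(ic, N) + overhead))


# Hypothetical layer: 3x3 kernel, 14x14 output, 128 output channels, 16x16 array.
shape = dict(oc=128, ow=14, oh=14, kw=3, kh=3, b=1, N=16, M=16)

# With ic = 3 the im2col branch packs 27 inputs into 2 passes over N,
# while the channel-wise branch spends one pass per kernel position.
print(core_compute_cycles(ic=3, im2col=True, **shape))    # 3136
print(core_compute_cycles(ic=3, im2col=False, **shape))   # 14112

# With ic = 64 (a multiple of N) both branches give the same count.
print(core_compute_cycles(ic=64, im2col=True, **shape))   # 56448
print(core_compute_cycles(ic=64, im2col=False, **shape))  # 56448
```

In both branches `overhead` is added once per group of `N` inputs, and with `overhead = 0` neither count contains a term that depends on the depth of the array.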
My questions are:
- In a systolic array, the partial sums produced by the PEs need to propagate downward to the bottom each cycle. Is the forwarding latency considered (for example, in a 3×3 systolic array, the first output would need to wait three cycles, corresponding to the array’s height)?
- If the above assumption is correct, does the overhead account for this? If not, what exactly is the purpose of the overhead?
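
To make the two questions concrete, here is a small sketch of the two interpretations I can think of. Everything in it is hypothetical: the function `im2col_cycles`, the `fill_latency` parameter, and the assumption that `N` is the array height are mine and do not come from the simulator.

```python
def ceil_a_by_b(a, b):
    # Ceiling division for positive integers (assumed behaviour of the helper).
    return -(-a // b)


def im2col_cycles(ni, no, batch, N, M, overhead=0, fill_latency=0):
    # Steady-state count, same shape as the im2col branch quoted above.
    steady = batch * ceil_a_by_b(no, M) * (ceil_a_by_b(ni, N) + overhead)
    # Hypothetical one-time cost for the first partial sum to reach the bottom.
    return steady + fill_latency


# Hypothetical 3x3 array (N = M = 3) with a tiny workload.
work = dict(ni=9, no=3, batch=4, N=3, M=3)

# As written in the source (overhead = 0, no fill term).
print(im2col_cycles(**work))                  # 12

# Interpretation A: the pipeline fills once, costing N cycles in total.
print(im2col_cycles(**work, fill_latency=3))  # 15

# Interpretation B: overhead is paid on every pass, e.g. N cycles each.
print(im2col_cycles(**work, overhead=3))      # 24
```

Interpretation A adds a constant that becomes negligible for large layers, while interpretation B scales with the number of passes, so it would help to know which of the two (if either) `overhead` is meant to model.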