Maximum possible parallelization design (in circuit logic) #1287
-
Hey, for the MVAU_hls IP, I believe I have set the required attributes correctly; however, I am not sure about ConvolutionInputGenerator_hls and VVAU_hls. For ConvolutionInputGenerator there is a SIMD setting. Additionally, I am wondering about the differences between Pool, StreamingMaxPool, and GlobalAccPool_hls, and how to configure each properly. Which one should I use for maximum performance on max-pooling layers? The code for the experiments is available at: https://github.com/jurevreca12/c4ml_test_runs/blob/phd-experiments/test_finn.py More generally, is there a paper or documentation that explains the convolution and max-pooling IPs? Thank you for your help and best regards.
Replies: 1 comment 8 replies
-
Hi, the folding of MVAU, VVAU, and ConvInpGen is described in the documentation: https://finn-dev.readthedocs.io/en/latest/internals.html#constraints-to-folding-factors-per-layer For the MaxPool you can use Pool (set PE to #Channels) or StreamingMaxPool (set PE to #Channels, or don't set it at all, because the 2D case always uses the maximum PE anyway (see here)).
Yes, there is also a more detailed description at the bottom of that page: https://finn-dev.readthedocs.io/en/latest/internals.html#folding
So, for the ConvInpGen you want parallel_window=1 and SIMD=C.
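To make the settings from this thread concrete, here is a minimal sketch in plain Python that collects them into a per-node dictionary. The helper `max_parallel_attrs` and the node names (`ConvolutionInputGenerator_hls_0`, `Pool_hls_0`) are made up for illustration; only the attribute names `SIMD`, `PE`, and `parallel_window` come from the discussion above. How you apply the values to your graph (folding config file or per-node attributes) depends on your FINN version, so treat this as a checklist rather than FINN API usage.

```python
import json


def max_parallel_attrs(op_type, channels):
    """Hypothetical helper: attribute values for the most parallel
    configuration mentioned in this thread, given the channel count C."""
    if op_type == "ConvolutionInputGenerator":
        # Thread advice: parallel_window=1 and SIMD=C.
        return {"SIMD": channels, "parallel_window": 1}
    if op_type in ("Pool", "StreamingMaxPool"):
        # Thread advice: PE = number of channels.
        return {"PE": channels}
    raise ValueError(f"no advice recorded for op type {op_type!r}")


# Illustrative node names and channel counts (assumptions, not from the thread).
layers = {
    "ConvolutionInputGenerator_hls_0": ("ConvolutionInputGenerator", 16),
    "Pool_hls_0": ("Pool", 32),
}

folding = {name: max_parallel_attrs(op, c) for name, (op, c) in layers.items()}
print(json.dumps(folding, indent=2))
```

The resulting dictionary can be compared against the constraints table linked above before applying it to the model.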
Parallelizing further would require parallelization across the spatial dimensions (H, W), which FINN does not support yet.
Some time ago I worked on this PR (#789) to add this degree of parallelization (controlled via the new folding parameter "M" or "MMV"), but it is currently outdated and I don't know when/if I'll find the time to rework it.
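A rough way to see why the spatial dimensions become the limit: the sketch below estimates cycles per frame for an MVAU that follows a convolution, assuming one output pixel is processed at a time and folding only divides the work per pixel. The function name `mvau_cycles` and the example layer shape are assumptions for illustration, and pipeline latency is ignored, so the numbers are only indicative.

```python
def mvau_cycles(mh, mw, pe, simd, out_h, out_w):
    """Rough cycles-per-frame estimate (hypothetical helper).

    mh: matrix height (output channels), mw: matrix width (k_h * k_w * C_in).
    PE must divide mh and SIMD must divide mw.
    """
    if mh % pe != 0 or mw % simd != 0:
        raise ValueError("PE must divide MH and SIMD must divide MW")
    # Folding reduces the per-pixel work; the pixel count itself is not divided.
    return (mh // pe) * (mw // simd) * out_h * out_w


# Example (assumed shape): 3x3 conv, 16 input channels, 32 output channels,
# 28x28 output feature map, so MW = 3 * 3 * 16 = 144 and MH = 32.
folded = mvau_cycles(mh=32, mw=144, pe=8, simd=36, out_h=28, out_w=28)
unfolded = mvau_cycles(mh=32, mw=144, pe=32, simd=144, out_h=28, out_w=28)
print(folded, unfolded)
```

Even with PE and SIMD at their maximum, the estimate stays at out_h * out_w cycles per frame under this model, which is the remaining factor that spatial parallelization would address.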