
CUDA vs Naive Speedup? #1

Closed
glenn-jocher opened this issue Mar 11, 2021 · 1 comment

glenn-jocher commented Mar 11, 2021

@d-li14 hi, thanks for your contributions and for this amazing idea!

I'd like to try your involution() module in a non-mmdetection repo (YOLOv5), and I'm trying to figure out the best technical way to do this using your existing code.

The naive implementation seems easier to integrate into new projects, so I'd like to use that. My main question is:
how much of a speed change do you see in training (and inference) when moving from the naive implementation to the CUDA one? Thanks!

d-li14 (Owner) commented Mar 12, 2021

Thanks for your feedback!
We have not tried involution with the YOLO framework, and the practical speedup may depend on the specific platform and test settings. For reference, we evaluate another one-stage detector, RetinaNet, in our work: the inference speedup on a single NVIDIA V100 GPU is roughly 40%.
Another major drawback of the naive implementation is that it consumes a lot of GPU memory, due to the expensive unfold operation.
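To make the memory cost concrete, here is a minimal sketch of what a naive, unfold-based involution layer looks like in PyTorch. This is an illustrative reimplementation, not the repo's actual code: the class name, hyperparameter defaults, and stride-1 assumption are mine. The `unfold` call materializes a `(B, C*K*K, H*W)` tensor, i.e. K*K copies of the feature map, which is exactly where the extra GPU memory goes; the custom CUDA kernel avoids materializing it.

```python
import torch
import torch.nn as nn

class NaiveInvolution(nn.Module):
    """Illustrative unfold-based involution (stride 1, odd kernel size)."""
    def __init__(self, channels, kernel_size=7, groups=16, reduction=4):
        super().__init__()
        self.k = kernel_size
        self.g = groups
        # kernel-generating branch: two 1x1 convs produce a K*K kernel
        # per spatial position, shared within each channel group
        self.reduce = nn.Conv2d(channels, channels // reduction, 1)
        self.span = nn.Conv2d(channels // reduction,
                              kernel_size * kernel_size * groups, 1)
        self.unfold = nn.Unfold(kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        b, c, h, w = x.shape
        # per-pixel kernels: (B, G, 1, K*K, H, W)
        weight = self.span(self.reduce(x)) \
            .view(b, self.g, 1, self.k * self.k, h, w)
        # unfolded patches: (B, G, C//G, K*K, H, W) -- the memory-heavy step,
        # since every pixel's K*K neighborhood is materialized explicitly
        patches = self.unfold(x).view(b, self.g, c // self.g,
                                      self.k * self.k, h, w)
        # multiply-accumulate over the kernel dimension
        return (weight * patches).sum(dim=3).view(b, c, h, w)
```

For a K=7 kernel this intermediate tensor is 49x the size of the input feature map, which is why the unfold-free CUDA implementation matters at detector-scale resolutions.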
