For those not familiar with the cuDNN library: it supports several algorithms for running a forward convolution. Some minimize the GPU memory footprint, while others maximize performance without regard for memory use. The idea is presumably to support GPUs with different specifications (number of CUDA cores, memory size, etc.) as well as different use cases.
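As a rough sketch of how those algorithms can be enumerated and compared, cuDNN provides cudnnFindConvolutionForwardAlgorithm, which times every supported forward algorithm for a given convolution and reports both speed and workspace (memory) requirements. The snippet below assumes a v7-style cuDNN API and that the handle and descriptors have already been created and configured; it is illustrative, not the benchmark utility's actual code.

```cpp
#include <cudnn.h>
#include <cstdio>

// Time all supported forward-convolution algorithms for the convolution
// described by the given descriptors, and print speed vs. memory trade-offs.
void list_forward_algorithms(cudnnHandle_t handle,
                             cudnnTensorDescriptor_t xDesc,
                             cudnnFilterDescriptor_t wDesc,
                             cudnnConvolutionDescriptor_t convDesc,
                             cudnnTensorDescriptor_t yDesc) {
    cudnnConvolutionFwdAlgoPerf_t results[CUDNN_CONVOLUTION_FWD_ALGO_COUNT];
    int returned = 0;

    // Runs each algorithm and returns results ranked fastest-first.
    cudnnFindConvolutionForwardAlgorithm(handle, xDesc, wDesc, convDesc, yDesc,
                                         CUDNN_CONVOLUTION_FWD_ALGO_COUNT,
                                         &returned, results);

    for (int i = 0; i < returned; ++i) {
        printf("algo=%d  time=%.3f ms  workspace=%zu bytes\n",
               static_cast<int>(results[i].algo),
               results[i].time, results[i].memory);
    }
}
```

The per-algorithm `memory` field is what lets a caller pick the fastest algorithm that still fits in the available GPU memory.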
Sample output:
n, w, h, c, k, filter_dim, avg_time(us), max_time(us)
1, 3840, 2160, 1, 1, 3, 74383, 75142
1, 3840, 2160, 1, 1, 5, 88465, 88819
1, 3840, 2160, 1, 1, 9, 159752, 160324
Total time taken=322600 us.
Since the output is comma-separated, it can easily be parsed by a spreadsheet application.
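The rows can just as easily be parsed programmatically. A minimal sketch in Python, using the sample output above inlined as a string (reading from the utility's actual output file would work the same way):

```python
import csv
import io

# Sample output copied from above.
raw = """n, w, h, c, k, filter_dim, avg_time(us), max_time(us)
1, 3840, 2160, 1, 1, 3, 74383, 75142
1, 3840, 2160, 1, 1, 5, 88465, 88819
1, 3840, 2160, 1, 1, 9, 159752, 160324"""

# skipinitialspace strips the space that follows each comma.
reader = csv.reader(io.StringIO(raw), skipinitialspace=True)
header = next(reader)
rows = [[int(v) for v in row] for row in reader]

# E.g. report average forward-convolution time per filter size.
avg_idx = header.index("avg_time(us)")
dim_idx = header.index("filter_dim")
for row in rows:
    print(f"filter_dim={row[dim_idx]}: {row[avg_idx]} us")
```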
Check the code out here:
https://github.com/blacksoil/CUDNN-BenchmarkUtility