Dataset ID No. | Naïve CUDA: Global Load Efficiency (%) | Naïve CUDA: Global Load Transactions per Request | Naïve CUDA: Global Store Efficiency (%) | Naïve CUDA: Global Store Transactions per Request | Optimized CUDA: Global Load Efficiency (%) | Optimized CUDA: Global Load Transactions per Request | Optimized CUDA: Global Store Efficiency (%) | Optimized CUDA: Global Store Transactions per Request |
1 | 31.21 | 4.82 | 13.33 | 29.98 | 52.75 | 2.83 | 82.75 | 4.83 |
2 | 28.16 | 8.47 | 13.33 | 30 | 41.57 | 5.74 | 82.76 | 4.83 |
3 | 28.14 | 8.58 | 13.33 | 30 | 38.71 | 6.24 | 82.76 | 4.83 |
4 | 28.18 | 8.54 | 13.33 | 30 | 38.85 | 6.19 | 100 | 4 |
5 | 28.07 | 8.79 | 13.33 | 30 | 34.08 | 7.24 | 82.76 | 4.83 |
6 | 28.22 | 8.45 | 13.33 | 30 | 33.83 | 7.07 | 88.89 | 4.5 |
7 | 28.32 | 8.26 | 13.33 | 30 | 33.77 | 6.92 | 82.76 | 4.83 |
8 | 27.47 | 10.41 | 13.33 | 30 | 32.65 | 8.65 | 82.76 | 4.83 |
9 | 27.81 | 9.34 | 13.33 | 30 | 30.01 | 8.47 | 88.89 | 4.5 |
10 | 27.68 | 9.85 | 13.33 | 30 | 29.96 | 8.92 | 82.76 | 4.83 |
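
As a reading aid for the store columns, the following minimal Python sketch checks a relation that the tabulated values appear to follow: in both implementations the global store efficiency is close to 100 × 4 / (store transactions per request). The constant 4 is an assumption inferred from the table itself (for example, it would correspond to a warp of 32 threads writing 4-byte elements through 32-byte transactions); it is not stated in the source. The names `IDEAL_TRANSACTIONS`, `predicted_efficiency`, and `max_deviation` are hypothetical and used only for illustration. The load columns do not follow this simple relation, so they are left out.

```python
# (store efficiency in %, store transactions per request), copied from the table rows 1-10.
naive_store = [(13.33, 29.98)] + [(13.33, 30.0)] * 9
optimized_store = [
    (82.75, 4.83), (82.76, 4.83), (82.76, 4.83), (100.0, 4.0), (82.76, 4.83),
    (88.89, 4.5), (82.76, 4.83), (82.76, 4.83), (88.89, 4.5), (82.76, 4.83),
]

# Assumption (not from the source): a fully coalesced store request needs
# 4 transactions, so any extra transactions count as wasted bandwidth.
IDEAL_TRANSACTIONS = 4.0


def predicted_efficiency(transactions_per_request):
    """Efficiency in % implied by the assumed ideal transaction count."""
    return 100.0 * IDEAL_TRANSACTIONS / transactions_per_request


def max_deviation(rows):
    """Largest gap, in percentage points, between reported and predicted efficiency."""
    return max(abs(eff - predicted_efficiency(tpr)) for eff, tpr in rows)


print("naive max deviation (pp):", round(max_deviation(naive_store), 3))
print("optimized max deviation (pp):", round(max_deviation(optimized_store), 3))
```

Under this assumption, the reported and predicted store efficiencies differ by less than 0.1 percentage points for every dataset, which is consistent with rounding in the reported figures.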