| Dataset ID No. | Naïve: Global Load Efficiency (%) | Naïve: Global Load Transactions per Request | Naïve: Global Store Efficiency (%) | Naïve: Global Store Transactions per Request | Optimized: Global Load Efficiency (%) | Optimized: Global Load Transactions per Request | Optimized: Global Store Efficiency (%) | Optimized: Global Store Transactions per Request |
|---|---|---|---|---|---|---|---|---|
| 1 | 31.21 | 4.82 | 13.33 | 29.98 | 52.75 | 2.83 | 82.75 | 4.83 |
| 2 | 28.16 | 8.47 | 13.33 | 30 | 41.57 | 5.74 | 82.76 | 4.83 |
| 3 | 28.14 | 8.58 | 13.33 | 30 | 38.71 | 6.24 | 82.76 | 4.83 |
| 4 | 28.18 | 8.54 | 13.33 | 30 | 38.85 | 6.19 | 100 | 4 |
| 5 | 28.07 | 8.79 | 13.33 | 30 | 34.08 | 7.24 | 82.76 | 4.83 |
| 6 | 28.22 | 8.45 | 13.33 | 30 | 33.83 | 7.07 | 88.89 | 4.5 |
| 7 | 28.32 | 8.26 | 13.33 | 30 | 33.77 | 6.92 | 82.76 | 4.83 |
| 8 | 27.47 | 10.41 | 13.33 | 30 | 32.65 | 8.65 | 82.76 | 4.83 |
| 9 | 27.81 | 9.34 | 13.33 | 30 | 30.01 | 8.47 | 88.89 | 4.5 |
| 10 | 27.68 | 9.85 | 13.33 | 30 | 29.96 | 8.92 | 82.76 | 4.83 |