This version of the coarse scan skeleton is kept only for comparison purposes. It is the slowest version of a coarsened scan skeleton, since it performs 4N memory accesses (2N accesses in computing the divide-phase scan and 2N accesses in the update phase).
The extra memory accesses slow the algorithm down significantly, but more importantly, the allocation of an intermediate buffer by each divide-phase scan task makes this simple scan skeleton unusable in real applications.
S | the scan skeleton |
Tag | type of exclusive scan to use for the conquering phase |
CoarseTag | a tag to specify the required specialization for coarsening |
ExecutionTag | a tag to specify the execution method used for the coarsened chunks |
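
The access pattern described above can be illustrated with a minimal sequential sketch. It is not the skeleton's actual implementation: the function name `naive_coarse_scan`, its `chunk` parameter, and the use of integer addition as the scan operator are all assumptions made for illustration. Each chunk allocates its own intermediate buffer and scans into it (N reads and N writes in total), then the update phase reads those buffers back and writes the final result (another N reads and N writes).

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical illustration only: an inclusive sum scan computed chunk by
// chunk, with one intermediate buffer allocated per chunk.
std::vector<int> naive_coarse_scan(const std::vector<int>& in, std::size_t chunk)
{
  const std::size_t n = in.size();
  std::vector<int> out(n);
  if (n == 0 || chunk == 0) return out;
  const std::size_t num_chunks = (n + chunk - 1) / chunk;

  // Divide phase: every chunk allocates a buffer and scans locally.
  // Cost: N reads of the input and N writes to the intermediate buffers.
  std::vector<std::vector<int>> partial(num_chunks);
  std::vector<int> totals(num_chunks);
  for (std::size_t c = 0; c < num_chunks; ++c) {
    const std::size_t first = c * chunk;
    const std::size_t last = std::min(first + chunk, n);
    partial[c].resize(last - first);
    std::partial_sum(in.begin() + static_cast<std::ptrdiff_t>(first),
                     in.begin() + static_cast<std::ptrdiff_t>(last),
                     partial[c].begin());
    totals[c] = partial[c].back();
  }

  // Conquer phase: exclusive scan over the per-chunk totals.
  std::vector<int> offsets(num_chunks);
  int running = 0;
  for (std::size_t c = 0; c < num_chunks; ++c) {
    offsets[c] = running;
    running += totals[c];
  }

  // Update phase: add each chunk's offset to its local results.
  // Cost: N reads of the intermediate buffers and N writes to the output.
  for (std::size_t c = 0; c < num_chunks; ++c) {
    const std::size_t first = c * chunk;
    for (std::size_t i = 0; i < partial[c].size(); ++i) {
      out[first + i] = partial[c][i] + offsets[c];
    }
  }
  return out;
}
```

In this sketch the per-chunk `partial` buffers play the role of the intermediate buffers allocated by the divide-phase tasks. A version that scans in place, or that recomputes the local scan during the update phase, would avoid those allocations.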