I have a kernel where I use some shared memory. I copy an automaton to shared memory, execute some pattern matching, get some results and exit.
After the kernel exits, I launch the same kernel again and copy the same automaton to shared memory, but this time the data to be matched is different.
I want to know if I can leave this data (the automaton) in shared memory and copy it only on the first launch, so that my program runs faster.
Is there any sync function that can be called from the device to tell the host that a kernel has finished, so that I can run the kernel again from the beginning without shared memory being cleared?
Any idea? Thanks.
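I do not think this is possible. Shared memory is logically associated with a specific thread block and physically associated with a specific streaming multiprocessor, but a thread block is not bound to a particular streaming multiprocessor from one launch to the next. Shared memory also has the lifetime of its thread block, so its contents are not guaranteed to survive once the kernel exits.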
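One possible workaround, if all the input data can be made available to a single launch, is to keep the kernel running and loop over the inputs inside it, so each block copies the automaton to shared memory only once. The following is a minimal sketch of that idea, not your actual code: the kernel name `matchAllBatches`, the sizes, and the toy automaton (it counts 1-symbols modulo 4) are all hypothetical and only stand in for your pattern matcher.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical sizes, chosen only for illustration.
constexpr int NUM_STATES = 4;          // states of the toy automaton
constexpr int ALPHABET = 2;            // input symbols are 0 or 1
constexpr int STRING_LEN = 8;          // symbols per input string
constexpr int STRINGS_PER_BATCH = 64;  // strings in one "batch" of data

// A single launch processes every batch. Each block copies the transition
// table into shared memory once and reuses it for all the strings it handles.
__global__ void matchAllBatches(const int* table, const unsigned char* input,
                                int* finalState, int numBatches)
{
    __shared__ int sTable[NUM_STATES * ALPHABET];

    // Cooperative copy of the automaton: done once per block, not once per batch.
    for (int i = threadIdx.x; i < NUM_STATES * ALPHABET; i += blockDim.x)
        sTable[i] = table[i];
    __syncthreads();

    // Grid-stride loop over all strings of all batches.
    const int total = numBatches * STRINGS_PER_BATCH;
    for (int s = blockIdx.x * blockDim.x + threadIdx.x; s < total;
         s += gridDim.x * blockDim.x) {
        int state = 0;
        for (int k = 0; k < STRING_LEN; ++k)
            state = sTable[state * ALPHABET + input[s * STRING_LEN + k]];
        finalState[s] = state;
    }
}

// Runs the sketch and returns the number of mismatches against a CPU
// reference, or -1 if a CUDA call fails.
int runDemo()
{
    const int numBatches = 3;
    const int total = numBatches * STRINGS_PER_BATCH;

    // Toy automaton: symbol 0 keeps the state, symbol 1 advances it modulo 4.
    int hTable[NUM_STATES * ALPHABET];
    for (int st = 0; st < NUM_STATES; ++st) {
        hTable[st * ALPHABET + 0] = st;
        hTable[st * ALPHABET + 1] = (st + 1) % NUM_STATES;
    }

    // Arbitrary but deterministic input symbols.
    std::vector<unsigned char> hInput(total * STRING_LEN);
    for (size_t i = 0; i < hInput.size(); ++i)
        hInput[i] = static_cast<unsigned char>((i * 7 + i / 3) & 1);

    int* dTable = nullptr;
    unsigned char* dInput = nullptr;
    int* dFinal = nullptr;
    if (cudaMalloc(&dTable, sizeof(hTable)) != cudaSuccess ||
        cudaMalloc(&dInput, hInput.size()) != cudaSuccess ||
        cudaMalloc(&dFinal, total * sizeof(int)) != cudaSuccess)
        return -1;
    cudaMemcpy(dTable, hTable, sizeof(hTable), cudaMemcpyHostToDevice);
    cudaMemcpy(dInput, hInput.data(), hInput.size(), cudaMemcpyHostToDevice);

    matchAllBatches<<<2, 32>>>(dTable, dInput, dFinal, numBatches);
    if (cudaDeviceSynchronize() != cudaSuccess)
        return -1;

    std::vector<int> hFinal(total);
    cudaMemcpy(hFinal.data(), dFinal, total * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dTable);
    cudaFree(dInput);
    cudaFree(dFinal);

    // CPU reference: the final state is the number of 1-symbols modulo 4.
    int mismatches = 0;
    for (int s = 0; s < total; ++s) {
        int ones = 0;
        for (int k = 0; k < STRING_LEN; ++k)
            ones += hInput[s * STRING_LEN + k];
        if (hFinal[s] != ones % NUM_STATES)
            ++mismatches;
    }
    return mismatches;
}

int main()
{
    std::printf("mismatches: %d\n", runDemo());
    return 0;
}
```

This only helps when the later inputs are already known when the kernel starts. If each new input depends on host-side work done after the previous results come back, the automaton still has to be reloaded on every launch, and whether the single-launch version is actually faster would need to be measured.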