How does the rasterizer generate fragments in OpenGL?

I came across a description of rasterization. It says that once an object has been projected onto the screen, a scan takes place over all the pixels of the window/screen and decides, for each pixel/fragment, whether it lies inside the triangle. Pixels/fragments that are inside the triangle then go on to further processing, such as colouring.
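A minimal Python sketch of that description, assuming a plain "test every pixel center against the three edges" approach. The names `edge` and `covered_pixels` are made up for illustration and are not OpenGL API. Real rasterizers are far more optimized, and they use a fill rule for pixels lying exactly on an edge, which this sketch simply counts as inside.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area term: its sign says on which side of edge a -> b the point p lies."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def covered_pixels(v0, v1, v2, width, height):
    """Return the (x, y) pixels whose centers lie inside triangle v0, v1, v2."""
    # Twice the signed triangle area; its sign gives the winding order.
    area = edge(*v0, *v1, *v2)
    if area == 0:
        return []  # a degenerate triangle covers nothing
    hits = []
    for y in range(height):
        for x in range(width):
            cx, cy = x + 0.5, y + 0.5  # sample at the pixel center
            w0 = edge(*v1, *v2, cx, cy)
            w1 = edge(*v2, *v0, cx, cy)
            w2 = edge(*v0, *v1, cx, cy)
            # Inside when all three edge values agree with the sign of the area.
            if area > 0:
                inside = w0 >= 0 and w1 >= 0 and w2 >= 0
            else:
                inside = w0 <= 0 and w1 <= 0 and w2 <= 0
            if inside:
                hits.append((x, y))
    return hits


if __name__ == "__main__":
    # Triangle given in window coordinates on a hypothetical 4x4 window.
    print(covered_pixels((0, 0), (4, 0), (0, 4), 4, 4))
```

Only the projected 2D positions of the three vertices are needed for the test; the scan over the pixel grid does the rest.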

Since I am studying OpenGL, and I know that OpenGL probably has its own implementation of this process, I was wondering whether the same thing happens in OpenGL, given the "scan conversion" of vertices that I have read about in an OpenGL tutorial.

A related question: I know that the screen/window is an image, that is, a 2D array of pixels, also known as the default framebuffer, which is stored linearly.

If that is the case, how does projecting the 3 vertices of a triangle define which pixels are covered inside it?

Does the rasterizer draw the edges of the triangle first and then scan through each pixel of the 2D array (the default framebuffer) to see whether it lies between the edges using some mathematical method, or does some other, simpler process happen?

Solution

Quoting from GPU Gems 3, Chapter 39, "Parallel Prefix Sum (Scan) with CUDA", which describes how scan has been implemented in OpenGL and compares it with CUDA. I think this suffices as an answer to my question:

Prior to the introduction of CUDA, several researchers implemented scan using graphics APIs such as OpenGL and Direct3D (see Section 39.3.4 for more). To demonstrate the advantages CUDA has over these APIs for computations like scan, in this section we briefly describe the work-efficient OpenGL inclusive-scan implementation of Sengupta et al. (2006). Their implementation is a hybrid algorithm that performs a configurable number of reduce steps as shown in Algorithm 5. It then runs the double-buffered version of the sum scan algorithm previously shown in Algorithm 2 on the result of the reduce step. Finally it performs the down-sweep as shown in Algorithm 6.

Example 5. The Reduce Step of the OpenGL Scan Algorithm

for d = 1 to log2 n do
    for all k = 1 to n/2^d - 1 in parallel do
        a[d][k] = a[d - 1][2k] + a[d - 1][2k + 1]

Example 6. The Down-Sweep Step of the OpenGL Scan Algorithm

for d = log2 n - 1 down to 0 do
    for all k = 0 to n/2^d - 1 in parallel do
        if k > 0 then
            if k mod 2 ≠ 0 then
                a[d][k] = a[d + 1][k/2]
            else
                a[d][k] = a[d + 1][k/2 - 1]

The OpenGL scan computation is implemented using pixel shaders, and each a[d] array is a two-dimensional texture on the GPU. Writing to these arrays is performed using render-to-texture in OpenGL. Thus, each loop iteration in Algorithm 5 and Algorithm 2 requires reading from one texture and writing to another.
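To make the reduce and down-sweep steps concrete, here is a small sequential Python sketch. It rests on several assumptions that are not in the quoted text: each texture `a[d]` is modelled as a plain list, the reduce runs all the way down to a single element (so the intermediate double-buffered scan becomes trivial), the input length is a power of two, and the even-index case of the down-sweep adds the element's own partial sum, which the listing as quoted does not show but which seems to be needed for a correct inclusive scan. The function name `inclusive_scan` is made up for illustration.

```python
def inclusive_scan(values):
    """Inclusive prefix sum via a reduce step followed by a down-sweep."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")

    # Reduce: every level holds the pairwise sums of the level below it.
    # Each level is a new list, like rendering into a separate texture.
    levels = [list(values)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([prev[2 * k] + prev[2 * k + 1]
                       for k in range(len(prev) // 2)])

    # The single element at the top is already its own inclusive scan.
    scanned = levels[-1]

    # Down-sweep: build the scan of each level from the scan of the level above.
    for d in range(len(levels) - 2, -1, -1):
        partial = levels[d]
        out = []
        for k in range(len(partial)):
            if k % 2 != 0:
                # Odd index: the parent's scan already ends at this element.
                out.append(scanned[k // 2])
            elif k > 0:
                # Even index: everything before this pair, plus this element.
                out.append(scanned[k // 2 - 1] + partial[k])
            else:
                out.append(partial[0])
        scanned = out  # read from one buffer, write to another
    return scanned


if __name__ == "__main__":
    print(inclusive_scan([3, 1, 7, 0, 4, 1, 6, 3]))
```

Every level is read from one list and written to a fresh one, which mirrors the read-one-texture, write-another pattern described above.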

The main advantages CUDA has over OpenGL are its on-chip shared memory, thread synchronization functionality, and scatter writes to memory, which are not exposed to OpenGL pixel shaders. CUDA divides the work of a large scan into many blocks, and each block is processed entirely on-chip by a single multiprocessor before any data is written to off-chip memory. In OpenGL, all memory updates are off-chip memory updates. Thus, the bandwidth used by the OpenGL implementation is much higher and therefore performance is lower, as shown previously in Figure 39-7.
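The block-wise idea can be modelled in a few lines of sequential Python. This is only a sketch of the decomposition, not CUDA code: it assumes each block is scanned independently (standing in for on-chip work), that the per-block totals are then scanned, and that each block is shifted by the total of all blocks before it. The function name `blocked_inclusive_scan` and the `block_size` parameter are hypothetical.

```python
from itertools import accumulate


def blocked_inclusive_scan(values, block_size=4):
    """Inclusive prefix sum computed block by block, then stitched together."""
    # Split the input into fixed-size blocks (the last one may be shorter).
    blocks = [values[i:i + block_size]
              for i in range(0, len(values), block_size)]

    # Scan every block on its own; on a GPU this is the on-chip part.
    local = [list(accumulate(block)) for block in blocks]

    # The last element of each local scan is that block's total.
    totals = [block[-1] for block in local]

    # Offset of a block = sum of the totals of all earlier blocks.
    offsets = [0] + list(accumulate(totals))[:-1]

    # Shift each locally scanned block by its offset.
    return [x + off for block, off in zip(local, offsets) for x in block]


if __name__ == "__main__":
    print(blocked_inclusive_scan([3, 1, 7, 0, 4, 1, 6, 3], block_size=4))
```

In this model only the block totals and the final results would need to travel through off-chip memory, which is the bandwidth saving the quoted paragraph attributes to CUDA.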