I have been learning DirectX 11, and the book I am reading states that the Rasterizer outputs Fragments. My understanding is that these Fragments are the output of the Rasterizer (which takes geometric primitives as input) and are in fact just 2D positions (on your 2D Render Target View).
Here is what I think I understand; please correct me if I am wrong.
The Rasterizer takes geometric primitives (spheres, cubes or boxes, toroids, cylinders, pyramids, triangle meshes or polygon meshes) (https://en.wikipedia.org/wiki/Geometric_primitive). It then translates these primitives into pixels (or dots) that are mapped to your Render Target View (which is 2D). This is what a Fragment is. For each Fragment, it executes the Pixel Shader to determine its color.
However, I am only assuming, because I cannot find a simple explanation of what a Fragment is.
So my questions are ...
1: What is a Rasterizer? What are the inputs, and what is the output?
2: What is a fragment, in relation to the Rasterizer output?
3: Why is a fragment a float4 value (SV_Position), if it is just a 2D screen-space position for the Render Target View?
4: How does it correlate to the Render Target Output (the 2D Screen Texture)?
5: Is this why we clear the Render Target View (to whatever color), because the Rasterizer and Pixel Shader will not execute on all X,Y locations of the Render Target View?
Thank you!
I do not use DirectX but OpenGL instead, but the terminology should be similar if not the same. My understanding is this:
(scene geometry) -> [Vertex shader] -> (per vertex data)
(per vertex data) -> [Geometry & Tessellation shader] -> (per primitive data)
(per primitive data) -> [rasterizer] -> (per fragment data)
(per fragment data) -> [Fragment shader] -> (fragment)
(fragment) -> [depth/stencil/alpha/blend...]-> (pixels)
So in the Vertex shader you can perform any per vertex operations, like transforming between coordinate systems, pre-computing needed parameters, etc.
In the Geometry and Tessellation shaders you can compute normals from geometry, emit/convert primitives, and much more.
The Rasterizer then converts geometry (primitives) into fragments. This is done by interpolation. It basically divides the visible part of each primitive into fragments (see convex polygon rasterizer).
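As a rough illustration of "converting a primitive into fragments by interpolation", here is a minimal software sketch in Python. It is not how any GPU or API implements rasterization. The names `edge` and `rasterize_triangle` are made up for this example, each vertex is assumed to be already in screen space as (x, y, attribute), and the sketch skips fill rules, clipping, and perspective-correct interpolation.

```python
def edge(a, b, p):
    """Signed, doubled area of the triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize_triangle(v0, v1, v2, width, height):
    """Yield (x, y, attribute) for every pixel whose center the triangle covers.

    Hypothetical helper: each vertex is (x, y, attribute) in screen space.
    """
    area = edge(v0, v1, v2)
    if area == 0:
        return  # degenerate triangle covers nothing
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            # Barycentric weights; dividing by the signed area makes
            # the inside test work for either winding order.
            w0 = edge(v1, v2, p) / area
            w1 = edge(v2, v0, p) / area
            w2 = edge(v0, v1, p) / area
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                # Interpolate the per-vertex attribute for this fragment.
                attr = w0 * v0[2] + w1 * v1[2] + w2 * v2[2]
                yield x, y, attr


# One fragment per covered pixel, each carrying interpolated data.
fragments = list(rasterize_triangle((0, 0, 0.0), (4, 0, 1.0), (0, 4, 0.0), 4, 4))
```

Pixels the triangle does not cover produce no fragment at all, so nothing downstream ever runs for them.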
Fragments are not pixels nor super pixels, but they are close. The difference is that a fragment may or may not be output, depending on the circumstances and pipeline configuration (pixels are the visible outputs). You can think of them as possible super-pixels.
The Fragment shader converts per fragment data into final fragments. Here you compute per fragment/pixel lighting and shading, do all the texture work, compute colors, etc. The output is also a fragment, which is basically a pixel plus some additional info, so it does not have just a position and color but can carry other properties as well (more colors, depth, alpha, stencil, etc.).
This goes into the final combiner, which performs the depth test and any other enabled tests or functionality such as blending. Only that output goes into the framebuffer as a pixel.
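To make the "fragment may or may not become a pixel" point concrete, here is a small Python sketch of a final combiner step. Everything in it is an assumption for illustration: the name `output_merger`, the dict layout of a fragment, a "less than" depth test, and plain "source over" alpha blending. Real pipelines let you configure all of these.

```python
def output_merger(fragment, color_buffer, depth_buffer):
    """Depth-test one fragment and blend it into the buffers if it passes.

    Hypothetical layout: fragment has x, y, depth and an (r, g, b, a) color.
    Returns True if the fragment ended up as a pixel.
    """
    x, y = fragment["x"], fragment["y"]
    if fragment["depth"] >= depth_buffer[y][x]:
        return False  # something closer is already stored: fragment is dropped
    r, g, b, a = fragment["color"]
    dr, dg, db = color_buffer[y][x]
    # "Source over" blending with the color already in the buffer.
    color_buffer[y][x] = (r * a + dr * (1 - a),
                          g * a + dg * (1 - a),
                          b * a + db * (1 - a))
    depth_buffer[y][x] = fragment["depth"]
    return True


# A 1x1 target cleared to black, with depth cleared to the far value 1.0.
color = [[(0.0, 0.0, 0.0)]]
depth = [[1.0]]
output_merger({"x": 0, "y": 0, "depth": 0.5, "color": (1.0, 0.0, 0.0, 1.0)}, color, depth)
```

A second fragment at the same location with a larger depth would be rejected here and would never show up in the framebuffer, even though the fragment shader ran for it.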
I think that answers #1, #2, and #4.
Now #3 (I may be wrong here, since I do not know DirectX well): in the per fragment data you often need the 3D position of the fragment for lighting or other computations, and because homogeneous coordinates are used, a 4D (x, y, z, w) vector is needed to hold it. The fragment itself has 2D coordinates, but the 3D position is a value interpolated from the geometry passed on by the Vertex shader. So it may hold world coordinates (or any other space) rather than the screen position.
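For reference, the step that takes a 4D homogeneous position to a 2D screen location plus depth can be sketched as below. This is only the textbook perspective divide and viewport mapping, written under assumed Direct3D-style conventions (origin at the top left, y pointing down, depth range 0 to 1, viewport covering the whole target). The function name `clip_to_screen` is made up for this example.

```python
def clip_to_screen(clip, width, height):
    """Map a clip-space (x, y, z, w) position to (screen_x, screen_y, depth).

    Assumes a full-target viewport with a top-left origin and 0..1 depth.
    """
    x, y, z, w = clip
    # Perspective divide: homogeneous clip space -> normalized device coordinates.
    ndc_x, ndc_y, ndc_z = x / w, y / w, z / w
    # Viewport mapping: [-1, 1] -> [0, width] and [0, height], with y flipped.
    screen_x = (ndc_x * 0.5 + 0.5) * width
    screen_y = (0.5 - ndc_y * 0.5) * height
    return screen_x, screen_y, ndc_z


# The center of clip space lands in the middle of an 800x600 target.
center = clip_to_screen((0.0, 0.0, 0.5, 1.0), 800, 600)
```

The third returned value is why even a "2D" fragment position still carries more than two components: the depth is needed later for the depth test.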
#5 Yes. The scene may not cover the whole screen, and you also need to preset buffers such as depth, stencil, and alpha so that rendering works as it should and is not invalidated by the previous frame's results. So we usually need to clear the framebuffers at the start of each frame. Some techniques require multiple clears per frame; others (like a glow effect) clear once per several frames.
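A tiny Python sketch of why the clear matters, using a made-up 2x1 "render target" stored as a list of rows. The helpers `clear` and `draw` are hypothetical stand-ins for clearing a target and for the pixels a draw call happens to cover.

```python
def clear(buffer, value):
    """Overwrite every location of the buffer with one value."""
    for row in buffer:
        for x in range(len(row)):
            row[x] = value


def draw(buffer, covered, color):
    """Write color only at the (x, y) locations that produced a pixel."""
    for x, y in covered:
        buffer[y][x] = color


target = [["black", "black"]]

# Frame 1: the object covers the left pixel.
draw(target, [(0, 0)], "red")

# Frame 2 without a clear: the object moved right, but the old pixel remains.
draw(target, [(1, 0)], "red")
stale = [row[:] for row in target]

# Frame 2 with a clear first: only the currently covered pixel is red.
clear(target, "black")
draw(target, [(1, 0)], "red")
```

In `stale`, the left pixel still holds the color from frame 1, because nothing was rasterized there in frame 2 and so nothing overwrote it.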