How to lay out vertex data for efficient use in a compute shader

I want to write a small toy path tracer, and I wonder what the most performant way to lay out the vertex data is. I need positions, normals and texture coordinates. Is it better to reduce memory usage or to conform to the 16-byte alignment of GPUs? I see three options:

1. The naive way

struct Vertex {
   position: [f32;3],
   normal: [f32;3],
   texcoord: [f32;2],
}

2. The generous way

struct Vertex {
   position: [f32;4],
   normal: [f32;4],
   texcoord: [f32;4],
}

Everything is properly aligned to 16 bytes, but each vertex takes 48 bytes, so more memory bandwidth is used.

3. The hacky aligned way

struct Vertex {
   position: [f32;3],
   texcoord_u: f32,
   normal: [f32;3],
   texcoord_v: f32,
}

This packs u and v into the w components of the position and the normal, but does it sacrifice read speed because of worse locality?
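To make the packed layout concrete, here is a minimal Rust sketch of how such a vertex could be built and read back on the CPU side. The name `PackedVertex` and the helpers `new` and `texcoord` are hypothetical, chosen only for illustration, and `#[repr(C)]` is an assumption so that the field order in memory matches the declaration. Note that u and v end up 16 bytes apart but still inside the same 32-byte vertex record.

```rust
/// Packed layout: u and v ride in the otherwise unused fourth lane
/// of the two 16-byte slots (hypothetical type name).
#[repr(C)] // assumption: fixed field order so the GPU sees the same layout
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct PackedVertex {
    pub position: [f32; 3],
    pub texcoord_u: f32,
    pub normal: [f32; 3],
    pub texcoord_v: f32,
}

impl PackedVertex {
    /// Builds a packed vertex from the three separate attributes.
    pub fn new(position: [f32; 3], normal: [f32; 3], texcoord: [f32; 2]) -> Self {
        Self {
            position,
            texcoord_u: texcoord[0],
            normal,
            texcoord_v: texcoord[1],
        }
    }

    /// Reassembles the texture coordinate from the two w lanes.
    pub fn texcoord(&self) -> [f32; 2] {
        [self.texcoord_u, self.texcoord_v]
    }
}
```

On the shader side, the same idea would be to read two four-component vectors and take the texture coordinate from their w components.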

Solution

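I measured it, and the layout has very little impact on my path tracer's performance:

Layout 2 (the generous way) takes around 20.1 ms per frame on average on an M1.

Layout 3 (the hacky aligned way) takes around 20.2 ms.

Layout 1 (the naive way) should be pretty much the same as layout 2, so I chose layout 3 to save some memory, as it uses only 2/3 of the bytes of layout 2.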
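The size claim can be checked with a small Rust sketch. The type names `Layout1` to `Layout3` and the function `vertex_sizes` are hypothetical, and `#[repr(C)]` is assumed. One caveat: on the CPU side the naive layout is also 32 bytes, but if the shader declares the fields as three-component vectors, the usual GPU layout rules (for example WGSL or GLSL std430, where a `vec3` of floats is aligned to 16 bytes) would pad it to 48 bytes, which may be why it is expected to behave like the generous layout.

```rust
use std::mem::size_of;

// Naive layout: tightly packed on the CPU side.
#[allow(dead_code)]
#[repr(C)]
struct Layout1 {
    position: [f32; 3],
    normal: [f32; 3],
    texcoord: [f32; 2],
}

// Generous layout: every attribute padded to four floats.
#[allow(dead_code)]
#[repr(C)]
struct Layout2 {
    position: [f32; 4],
    normal: [f32; 4],
    texcoord: [f32; 4],
}

// Packed layout: u and v stored in the w lanes.
#[allow(dead_code)]
#[repr(C)]
struct Layout3 {
    position: [f32; 3],
    texcoord_u: f32,
    normal: [f32; 3],
    texcoord_v: f32,
}

/// Returns the CPU-side size in bytes of the three layouts, in order.
pub fn vertex_sizes() -> [usize; 3] {
    [
        size_of::<Layout1>(),
        size_of::<Layout2>(),
        size_of::<Layout3>(),
    ]
}
```

With these definitions the packed layout is 32 bytes against 48 bytes for the generous one, which matches the 2/3 figure.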