The source data keeps throwing values for the field that keep getting bigger and bigger in length. Right now I'm using VARCHAR(200) but I might go for VARCHAR(400)
. Are there any disadvantages using a large number?
What do you mean "downside"? There is a really big downside if you don't make the column big enough -- you can't use it to store the values you want to store there.
As for additional overhead, you don't need to worry about that. A varchar()
type basically only takes up the storage needed for the value, plus a small overhead for the length. Also, "400" is not such a big number, especially when compared to "200".
So, if you need 400 bytes to store the value, change the table to store it. There may be overhead for changing the length of the value. I'm not sure if RedShift will feel the need to copy the data because the type changed. However, the effect on performance should be negligible.