The role of the first observation in backward algorithm

My question is related to the Backward algorithm.
The recursive formula for the algorithm is as follows:

sigma j = 1 to N (t+1(j) * a_ij * b_j(O_t+1))
Where t+1(j) is the recursive element, aij is the transition probability from state i to j, and b_j(O_t+1) is the omission probability of the observation O at time t+1.
Given the above, when I start calculating the backward probabilities, it seems that it does not matter what the first observation is, since the observation at time t is not taken into consideration for calculation of the corresponding backward probability.
For Example, for a sequence of observations A,T,G,A, it does not matter what the first observation , i.e, A in this case, is since its emission probability is not taken into consideration in the backward algorithm.
I just want to know if my intuition is correct. Otherwise please point me to some reference or explanation which will clarify my doubt.

Solution

Given the above, when I start calculating the backward probabilities, it seems that it does not matter what the first observation is, since the observation at time t is not taken into consideration for calculation of the corresponding backward probability.

That's correct.

b[t](i) is the probability of observing the sequence O[t+1], O[t+2], ..., O[T] given the hidden state i at the time t.

One way to think about it is this. The hidden state at the time t independently affects two things: the observation O[t], and the probability distribution of the subsequent states and observations. So, once the state is fixed, O[t] and O[t+1], O[t+2], ..., O[T] are conditionally independent (given the value of that state). That's why O[t] doesn't appear in the calculation of b[t](i).