For some time, I've been exploring and tinkering with Socket Programming in C. I come predominantly from a C# background, with some (older and now very rusty) experience in C++.
Take for example the following line of code:
if(connect(sockfd,(SA*)&servaddr,sizeof(servaddr)) != -1)
I know there would normally be braces and further code, of course, but I've left this out for the purposes of this query.
In my understanding, the Parameters in the line above logically (to me) read as follows:
1: We want to attempt a connection to TCP/IP Server and for communication to be done via the named SOCKET Object (in this case, sockfd).
2: We are defining the Socket Parameters by passing/layering the 'SA struct' Object Data/Parameters to it, which itself we are further defining with the struct sockaddr_in servaddr Data/Parameters. The use of & in this context serves the dual purpose of pointing to the 'SA struct' Object/Memory Address and then copying the struct sockaddr_in servaddr Object Data/Parameters to that Memory Address, thus defining the 'SA struct' Object, making it ready to be used by the Socket Object and the 'connect' Function.
3: To make sure we capture all of the Parameters passed to the struct sockaddr_in servaddr Object, we use a sizeof() and pass a reference of the aforementioned struct Object to it.
If everything stacks up correctly, a Connection should be successful.
Apologies if this seems like common sense to older/more experienced C developers. It makes logical sense to me, but I just want to make sure I am on the right track before I start the further experiments I have planned. As a Senior Co-Worker once said to me:
'It's one thing to know something, it's another thing to understand it.'
For completeness: aside from OS-Specific differences, I haven't experienced any issues with connectivity. Generally speaking, I'm just somewhat obsessed with knowing what is going on under the hood, which is why I switched from C# to C in order to ultimately become a better Programmer.
Thanks in advance,
David
As stated above, aside from minor OS-Specific differences between Linux and Windows, I haven't experienced any major issues with connectivity. This question really is just to enhance my understanding of Socket Programming and perhaps catch some newbie issues and nip them in the bud before they become ongoing bad habits as a result of learning without fully understanding. I've already noticed some cool, quicker methods of doing things that are the result of years of experience, but they come with little explanation of their usage, of what is going on, or of how that approach differs from a more traditional one.
In time, a further step/series of experiments I plan to try is using struct Objects to define Parameters passed as part of HTTP Verbs and Calls and/or REST API interaction, but that is still to come.
The use of & in this context serves the dual purpose of pointing to the 'SA struct' Object/Memory Address and then copying the struct sockaddr_in servaddr Object Data/Parameters to that Memory Address
The above is incorrect; the & operator has no dual purpose and does no implicit data copying. The & operator is merely saying "give me a pointer to the object that follows". So e.g. if you have an object named servaddr, then &servaddr evaluates to a pointer to that object. The (SA*) cast in front of it only changes the type of that pointer; it doesn't create or copy anything either.
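To make the "no copying" point concrete, here is a minimal sketch, assuming a POSIX system and assuming that SA in the question is shorthand for struct sockaddr. The helpers set_port() and same_object_demo() are hypothetical names made up for this illustration; they are not part of the sockets API.

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: writes the port through the pointer it is given. */
void set_port(struct sockaddr_in *addr, unsigned short port)
{
    assert(addr != NULL);
    addr->sin_port = htons(port);   /* modifies the caller's object; nothing is copied */
}

/* Returns 1 if both pointers refer to servaddr itself and the write landed there. */
int same_object_demo(void)
{
    struct sockaddr_in servaddr;
    memset(&servaddr, 0, sizeof(servaddr));

    struct sockaddr_in *p = &servaddr;                        /* address of servaddr */
    struct sockaddr *generic = (struct sockaddr *)&servaddr;  /* same address, different pointer type */

    set_port(p, 8080);

    return (void *)p == (void *)generic        /* both point at the one object */
        && servaddr.sin_port == htons(8080);   /* the change is visible in servaddr */
}
```

Calling same_object_demo() should return 1: the cast produces a second pointer of a different type, but there is only ever one servaddr object in memory.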
To make sure we capture all of the Parameters passed to the struct sockaddr_in servaddr Object, we use a sizeof() and pass a reference of the aforementioned struct Object to it.
The reason for passing the size of the sockaddr_in object is that the connect() API is doing a bit of poor-man's runtime polymorphism: since the BSD sockets API needs to support multiple networking protocols (IPv4, IPv6, IPX, AppleTalk, X.25, etc.), and different protocols require different addressing schemes, the API designers decided to define a "common base class" (aka struct sockaddr) and then define different "subclasses" (i.e. struct sockaddr_in for IPv4, struct sockaddr_in6 for IPv6, and so on) that all begin with the same address-family field that struct sockaddr begins with, followed by whatever protocol-specific data is needed for that particular type of networking.
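As a rough illustration of that "base class" idea, the sketch below (assuming a POSIX system) reads the address family of an IPv4 and an IPv6 address through the same generic struct sockaddr pointer. family_of() and families_match() are hypothetical helper names used only for this example.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Reads the address family through the generic "base" pointer type. */
int family_of(const struct sockaddr *addr)
{
    assert(addr != NULL);
    return addr->sa_family;
}

/* Fills in one IPv4 and one IPv6 address and checks each family via the generic pointer. */
int families_match(void)
{
    struct sockaddr_in  v4;
    struct sockaddr_in6 v6;
    memset(&v4, 0, sizeof(v4));
    memset(&v6, 0, sizeof(v6));
    v4.sin_family  = AF_INET;    /* the "subclass" names its family field sin_family */
    v6.sin6_family = AF_INET6;   /* ...and the IPv6 one calls it sin6_family */

    return family_of((const struct sockaddr *)&v4) == AF_INET
        && family_of((const struct sockaddr *)&v6) == AF_INET6;
}
```

The two structures have different sizes and different trailing fields, yet the same family_of() function works on both, because the family field sits in the same place in each of them.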
That allowed them to define a single connect() call that can be used in conjunction with any networking protocol:
int connect(int socket, const struct sockaddr *address, socklen_t address_len);
... but since the second argument points only to the generic "base class" type (struct sockaddr), the code inside connect() can't know for sure how much valid data that address pointer is actually pointing at, so it relies on the caller to tell it that. connect() can then look at the sa_family member-variable at the start of the struct sockaddr to figure out which networking protocol to use, and internally downcast the pointer to the appropriate type (struct sockaddr_in or struct sockaddr_in6 or whatever) based on that. (Note that you might think that means that passing a sizeof() value separately is therefore unnecessary, since connect() should also be able to determine the appropriate address-object size based on the sa_family value -- and in practice you'd be generally correct, but I think they were considering the possibility of networking protocols whose addresses could have varying sizes, and for those protocols, they would need to know the actual address size on a per-address basis.)
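The dispatch described above can be sketched in user-space code. This is not how any real connect() is implemented; it is only a hypothetical function, port_from_generic(), that performs the same kind of "check the family, check the length, then downcast" sequence on a POSIX system.

```c
#include <assert.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the checks an implementation might make.
 * Returns the port in host byte order, or -1 if the address is unusable. */
int port_from_generic(const struct sockaddr *address, socklen_t address_len)
{
    assert(address != NULL);

    /* The caller must supply at least enough bytes to contain sa_family. */
    if (address_len < (socklen_t)(offsetof(struct sockaddr, sa_family)
                                  + sizeof(address->sa_family)))
        return -1;

    switch (address->sa_family) {
    case AF_INET:
        if (address_len < (socklen_t)sizeof(struct sockaddr_in))
            return -1;   /* too short to be a complete IPv4 address */
        return ntohs(((const struct sockaddr_in *)address)->sin_port);
    case AF_INET6:
        if (address_len < (socklen_t)sizeof(struct sockaddr_in6))
            return -1;   /* too short to be a complete IPv6 address */
        return ntohs(((const struct sockaddr_in6 *)address)->sin6_port);
    default:
        return -1;       /* a family this sketch does not handle */
    }
}
```

Without the address_len argument, the function would have no way to notice that a caller had handed it a truncated structure before reading past the end of it.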
If everything stacks up correctly, a Connection should be successful.
That's a bit of an oversimplification -- in addition to getting the arguments correct, the network also has to be functioning, the device at the specified address (and port) has to be willing to accept the connection, and so on.
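Because a connection can fail for reasons that have nothing to do with the arguments, it is worth looking at errno when connect() returns -1. Below is a minimal sketch assuming a POSIX system and an IPv4 address given as text; try_connect() is a hypothetical wrapper written for this example, not a standard function.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 0 on success, otherwise an errno value
 * describing why the connection could not be made. */
int try_connect(const char *ipv4_text, unsigned short port)
{
    assert(ipv4_text != NULL);

    struct sockaddr_in servaddr;
    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port   = htons(port);

    /* inet_pton() returns 1 only if the text is a valid IPv4 address. */
    if (inet_pton(AF_INET, ipv4_text, &servaddr.sin_addr) != 1)
        return EINVAL;

    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd == -1)
        return errno;

    int result = 0;
    if (connect(sockfd, (struct sockaddr *)&servaddr, sizeof(servaddr)) == -1)
        result = errno;   /* e.g. ECONNREFUSED, ETIMEDOUT, ENETUNREACH */

    close(sockfd);        /* this sketch only tests reachability, so close either way */
    return result;
}
```

A caller could pass a non-zero return value to strerror() to print a readable reason, for example "Connection refused" when nothing is listening on the target port.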