Zero to Hero — How the Internet works & network programming in C/C++
Every day we use one of the biggest resources out there, the Internet, to do a variety of things in our day-to-day lives.
Sometimes the whole Internet seems too complex to understand in a matter of seconds, so I wanted to challenge myself and explain it from the application view down to the hardware, and even show a sample of code for using the Internet through network programming.
What Is the Internet
I don’t know about you but I’m using the Internet on a daily basis.
There are days that I’m in couch potato mode, which means I just watch Netflix all day with my girlfriend :)
It really is amazing that we have tons of products for entertainment and we consume them hour by hour and second by second, some of them you might know:
- Online gaming on PS4, Xbox, etc.
- IPTV and video streaming using Netflix.
- Virtual wallets.
- Online services (banks, post office, telecommunication, shopping, etc.)
In short, there are a lot of applications and services we can use over the Internet. Of course there are more, I just mentioned a few of them because there are so many :)
Before we dive any deeper I want to note a few important definitions:
- LAN: Local Area Network, a small local network such as our home or a small office.
- WAN: Wide Area Network, a network that spans a large area and connects LANs together. You could say it’s the entire network which is not our home.
- Router: Responsible for routing our packets from our LAN to other networks.
- Switch: Responsible for forwarding our frames between devices inside our LAN.
- Wi-Fi: I hope you know this one, it’s basically a wireless LAN (WLAN).
- NIC: Network interface card, the part of our computer/mobile phone that communicates over the available medium (radio, cables, etc.)
There is a lot more equipment out there, but these are the basic ones.
Every type of equipment needs a unified way to talk to the others, which is a standard.
For that purpose, the OSI model and the TCP/IP model were created.
What they do is simply describe a stack of layers, where each layer has its own capabilities and set of protocols.
You don’t necessarily need to use multiple protocols in each layer, but you need to choose at least one, because each layer has its own responsibilities and rules that need to be implemented and verified separately.
TCP/IP Model
The TCP/IP model consists of 4 layers going from the top to the bottom:
- Application Layer
- Transport Layer
- Internet Layer
- Link Layer
You can think of the model as a building: each layer rests on the one below it, from the very basics up to the top.
Let’s start with the lowest level of the TCP/IP model, the Link Layer, which is responsible for the hardware, cables, electrical signals and data transfer.
All of those electrical signals are, in the end, translated into binary data, which is used by the upper layers of the model.
In the TCP/IP model, this layer is also responsible for delivery inside the LAN.
That means it is responsible for delivering frames, according to their MAC addresses, to the right receiver on the LAN. It has even more functionality, but those are the basics.
In the OSI model, the electrical signals and hardware are separate from the LAN delivery functionality: there is a Physical layer for the physical connection and a Data Link layer for the LAN delivery part.
Inside each operating system (Windows, Linux, macOS, etc.), the implementation of the entire networking stack exists as .c/.h files which implement a variety of protocols. In short, you can call all of that the “Netstack”.
The most common and simple way to use the Netstack is through sockets.
We will discuss them further later in this article.
This article is supposed to focus solely on how to do network programming with C/C++, so deeper descriptions of the tools and protocols of the TCP/IP model deserve an article of their own and will not be covered here :)
Operating Systems
The big question these days is: how does our operating system, or any other piece of equipment that has its own OS, implement all the functionality of the OSI or TCP/IP model, or any other functionality for that matter?
We already mentioned it before, but each OS has its own Netstack.
It is worth mentioning that any modern OS is split into two sections:
- Kernel Space: The highest privileges in the OS. Code here can talk to the computer hardware and can do pretty much whatever it wants.
- User Space: The space for everyday work, used by any regular user. It mainly consists of the applications we use every day.
Each of those applications uses system calls, which are requests for the kernel to perform some operation on its behalf.
Of course, if an operation is not permitted, the kernel returns an error.
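To make the idea of a system call that fails a bit more concrete, here is a minimal sketch. It assumes a POSIX system (Linux, macOS), and the helper name errno_of_bad_write is made up for this illustration. It asks the kernel to write to a file descriptor that was never opened, and the kernel answers with -1 and an error code in errno.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: asks the kernel to write to a descriptor that
 * was never opened and returns the error code the kernel reported
 * (0 if, unexpectedly, the call succeeded). */
int errno_of_bad_write(void)
{
    const char msg[] = "hello";

    errno = 0;
    ssize_t n = write(-1, msg, sizeof msg - 1); /* -1 is never a valid fd */
    if (n == -1) {
        int saved = errno; /* save it, later calls may overwrite errno */
        fprintf(stderr, "write failed: %s\n", strerror(saved));
        return saved;
    }
    return 0;
}
```

On a POSIX system the returned value should be EBADF ("bad file descriptor"). The same pattern, check the return value and then look at errno, applies to the socket calls later in the article.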
Every piece of code that runs in kernel space is supposed to take care of itself in every respect: memory (allocation and deallocation), illegal access to memory regions that are off limits and could tamper with another application’s execution or security, memory leaks, etc.
So it is important that the kernel really takes care of itself and is careful with every action it performs, because if it crashes, so do we.
After all that, remember that Socket thingy we talked about?
A socket is treated like a special file inside our OS. In fact, most I/O operations are presented as files.
When we create a socket, a system call asks the kernel to allocate a file descriptor for us, which allows us to read from and write to the socket.
A socket is a networking “file” that holds the configuration of where to send data to and where to receive it on our OS. I know it sounds a little blurry right now, but everything will become clear later in this article.
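As a small illustration of the "socket is a file descriptor" idea, the sketch below assumes a POSIX system and uses socketpair(), which hands back two already connected sockets, so no network setup is needed. The helper name socket_roundtrip is hypothetical. Note that it uses the plain file calls write() and read() on the sockets.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes `msg` into one end of a connected socket
 * pair with write() and reads it back from the other end with read().
 * Returns the number of bytes read, or -1 on error. */
ssize_t socket_roundtrip(const char *msg, char *out, size_t out_size)
{
    int fds[2];
    size_t len = strlen(msg);

    /* An empty message would leave read() waiting forever. */
    if (len == 0 || out_size == 0)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
        return -1;

    ssize_t n = -1;
    if (write(fds[0], msg, len) == (ssize_t)len) {
        n = read(fds[1], out, out_size - 1);
        if (n >= 0)
            out[n] = '\0'; /* make the result a C string */
    }

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Calling socket_roundtrip("ping", buf, sizeof buf) with a large enough buffer should return 4 and leave "ping" in buf.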
To sum up and explain the topic we just discussed more simply:
Imagine a car, for example an Audi Q5 (I wish I had one xD). You have the “user space” parts such as the steering wheel, seats, multimedia system, windows and trunk, while the kernel space is the engine, radiator, spark plugs, etc.
So there are tools that we use all the time without even knowing about them.
Client/Server communications
Imagine that you want to buy a new pair of glasses, a car, a smartphone, etc.
First you need to know which store to go to and where it is, right?
Well, every store has a physical location that is known to the public.
If you don’t know it, you search Google or ask another person for its whereabouts.
Client/Server communication is pretty much the same, only with other names.
We can consider the server as the store we buy items from and the client as ourselves.
The way the client and the server communicate is... you might have guessed it already, but yeah, it’s with a socket.
But as time went by and the network and its services grew every day, people wanted more complex operations than a bare socket offers.
Because of that, sockets got wrapped with data formats and rules for handling particular kinds of data, which basically gave birth to some of the most popular application protocols, such as HTTP, SMTP and FTP.
Down to business — what is C/C++ network programming all about?!
As explained before, a socket is a tool that lets 2 programs, or 2 endpoints, communicate over a network (LAN or WAN).
Spoiler: you can create a socket, together with its file descriptor, using the system call socket().
The usual configuration of a socket is:
- IP Address: The IP address to connect or bind to (binding will be explained later). Usually it’s an IPv4 address, where the 4 is the version of the IP protocol itself.
- Port: The port of the application we want to be associated with or connect to.
There are more settings than these 2, but they are the basis of the socket.
We will also see that the configuration looks complex but is actually simple once you get the hang of it :)
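Here is a minimal sketch of how those two settings, IP address and port, usually end up in code on a POSIX system. struct sockaddr_in, htons() and inet_pton() are the standard pieces; the helper name make_ipv4_address is only made up for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fills `addr` with an IPv4 address and a port.
 * Returns 0 on success, -1 if `ip` is not a valid dotted IPv4 string. */
int make_ipv4_address(const char *ip, uint16_t port, struct sockaddr_in *addr)
{
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;   /* IPv4 */
    addr->sin_port = htons(port); /* port, converted to network byte order */

    /* Convert text such as "127.0.0.1" into its binary form. */
    if (inet_pton(AF_INET, ip, &addr->sin_addr) != 1)
        return -1;
    return 0;
}
```

The filled structure is what gets passed (cast to struct sockaddr *) to calls such as bind() and connect().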
Well, something that must be said: there are no sockets for C++!
You must be asking why I mentioned C/C++ in the title of this article.
Well, it’s because I think it’s worth mentioning that sockets only exist in C.
By the way, what I mean by that is that the socket API the OS provides is a C API, and the C++ standard library has no sockets of its own. Because C++ can call C functions, you can create sockets in C++ just like you would in C.
OS system calls for sockets
Before we drill down into each function and understand why we need it, I want to take a step back and show the “route” the server and the client need to go through in order to accept connections and read/write data :)
Let’s say, for the sake of argument, that we have a hotel.
The server is a sophisticated robot from the future whose sole responsibility is to make sure the hotel is up and running for our clients’ requests.
Basically he’s like our Alfred, who can do pretty much anything. The clients come for some kind of service, for example to request a room to stay in.
The clients don’t know about the other clients inside the hotel unless the robot does matchmaking between them.
Let’s look at the story we just imagined a bit more technically.
The robot needs full access to the entire hotel, while the clients only have what the robot gives them.
The robot must prepare himself to be able to provide all of that, and so must our socket server.
Now for the technical part: let’s see the list of functions we have and what the purpose of each one is.
socket(domain, type, protocol)
Returns a file descriptor that refers to a new socket inside the OS.
With this fd (short for file descriptor) we make the other calls to the OS, so it knows which socket to read, write or otherwise operate on.
bind(sock_fd, address, address_length)
Binds the fd to a specific address and port on our OS.
This step is important so that data arriving at the computer’s NIC for that port is delivered to our socket.
As you can see, the port is important for distinguishing between applications and protocols.
accept(sock_fd, address, address_length)
Every listening fd has a queue of pending incoming connections.
When we call accept() we simply pop the first client that entered the queue and get back a new fd for talking to that client.
If no clients are waiting, the accept() call blocks.
listen(sock_fd, backlog)
Marks the socket as a listening socket. With the backlog argument we can limit the maximum number of pending incoming connections for our fd.
connect(sock_fd, address, address_length)
Tells the OS to connect to the specified address using the sock_fd on our system.
Meaning, the sock_fd will serve as our gateway to the NIC for sending/receiving messages to and from the address given to connect().
send/recv
As their names suggest, these functions allow us to send data to or receive data from the sock_fd.
Note: We mentioned that sockets are files. So I’ll let you in on a little secret: you can also call write/read, the functions for simple files :)
shutdown(sock_fd, type)
Allows us to shut down the socket for specific operations.
type=SHUT_RD (0 on Linux) => the socket will not receive data.
type=SHUT_WR (1 on Linux) => the socket will not send data.
type=SHUT_RDWR (2 on Linux) => the socket will not receive/send data.
For example, if we try to send data after calling shutdown with type=SHUT_WR, we will receive an error.
close(sock_fd)
Releases the fd. Once the last fd that refers to the socket is closed, the effect on the connection is similar to calling shutdown(sock_fd, SHUT_RDWR).
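To see shutdown() in action, here is a small sketch for a POSIX system. It uses the named constant SHUT_WR from sys/socket.h (the portable way to say "no more sending") and a socketpair() so that it runs without any network. The helper name half_close_demo is hypothetical.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: shuts down the sending direction of one end of
 * a socket pair and checks what the other end observes.
 * Returns 0 if everything behaves as described, -1 otherwise. */
int half_close_demo(void)
{
    int fds[2];
    char buf[8];
    int ok = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
        return -1;

    /* fds[0] promises not to send anything more. */
    if (shutdown(fds[0], SHUT_WR) == 0) {
        /* The peer sees "end of stream": read() returns 0. */
        ssize_t n = read(fds[1], buf, sizeof buf);

        /* The opposite direction is still open and usable. */
        if (n == 0 &&
            write(fds[1], "ok", 2) == 2 &&
            read(fds[0], buf, sizeof buf) == 2)
            ok = 0;
    }

    close(fds[0]);
    close(fds[1]);
    return ok;
}
```

This "half close" is a common way for one side to say "I am done sending" while still waiting for the rest of the answer.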
Now that we are networking experts in terms of communication and OS, we can start writing a little bit of code.
See what I did there? ;)
What does the Server have to do?!
What we need to do is simply call the following functions in the right order:
- socket()
- bind()
- listen()
- accept()
- send/recv
We already went over the functions, but in short, we simply need to create a socket and bind it to a port on the OS.
Afterwards we state the maximum number of pending connections we want for that socket.
In the end we accept the clients that are pending on our socket and simply send/receive data.
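The server steps above can be sketched roughly like this. It is a minimal sketch for a POSIX system, not a production server: it handles a single client, listens only on the loopback address (127.0.0.1), and simply echoes back whatever it receives. The helper names open_listener and echo_once are made up for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: socket() + bind() + listen() on 127.0.0.1.
 * Passing port 0 lets the OS pick a free port.
 * Returns the listening fd, or -1 on error. */
int open_listener(uint16_t port, int backlog)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0); /* TCP over IPv4 */
    if (fd == -1)
        return -1;

    /* Best effort: allow quick restarts on the same port. */
    int yes = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(fd, backlog) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: accept() one client, recv() one message and
 * send() it straight back. Returns the number of bytes echoed,
 * 0 if the client closed first, or -1 on error. */
ssize_t echo_once(int listen_fd)
{
    int client = accept(listen_fd, NULL, NULL); /* blocks until a client arrives */
    if (client == -1)
        return -1;

    char buf[256];
    ssize_t n = recv(client, buf, sizeof buf, 0);
    if (n > 0)
        n = send(client, buf, (size_t)n, 0);

    close(client);
    return n;
}
```

A real server would typically call echo_once() (or something like it) in a loop, and would bind to a fixed port that clients know in advance.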
What does the Client have to do?!
Do you know the phrase “The client is always right”?
Just the existence of that phrase makes the life of the client a lot easier than the life of the business owner :)
What we need to do is simply call the following functions in the right order:
- socket()
- connect()
- send/recv
Again in short, we simply need to create a socket and connect to the endpoint we want.
In the end we simply send/receive data.
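The client side can be sketched in the same spirit. Again this is only a minimal sketch for a POSIX system, using TCP over IPv4. The helper names connect_to and request_reply are hypothetical, and the IP and port are whatever server you want to reach.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: socket() + connect() to ip:port.
 * Returns the connected fd, or -1 on error. */
int connect_to(const char *ip, uint16_t port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1; /* not a valid IPv4 string */

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: send() one message, then recv() one reply.
 * Returns the number of reply bytes, 0 if the server closed the
 * connection, or -1 on error. */
ssize_t request_reply(int fd, const char *msg, char *reply, size_t reply_size)
{
    size_t len = strlen(msg);
    if (send(fd, msg, len, 0) != (ssize_t)len)
        return -1;
    return recv(fd, reply, reply_size, 0); /* blocks until data arrives */
}
```

A typical use would be fd = connect_to("127.0.0.1", 8080), then request_reply(fd, "ping", buf, sizeof buf), and finally close(fd). The address and port here are just placeholders.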
Let’s sum things up
We discussed the Internet, its benefits, how to use it, operating systems and their deeper parts, and now we have seen a simple socket example.
We have tons of protocols for each layer of the OSI model, so when we develop with sockets we have a wide variety of choice. For example, a TCP socket and a UDP socket are created with different arguments to socket() and are handled in different ways.
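As a small illustration of that choice, the sketch below (POSIX, IPv4) creates a socket of a requested type and then asks the kernel which type it actually is. SOCK_STREAM gives a TCP socket and SOCK_DGRAM gives a UDP socket. The helper name socket_type_roundtrip is made up for this example.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: creates an IPv4 socket of the given type
 * (SOCK_STREAM for TCP, SOCK_DGRAM for UDP) and asks the kernel,
 * through getsockopt(SO_TYPE), what type of socket it is.
 * Returns that type, or -1 on error. */
int socket_type_roundtrip(int type)
{
    int fd = socket(AF_INET, type, 0); /* 0 = default protocol for the type */
    if (fd == -1)
        return -1;

    int actual = -1;
    socklen_t len = sizeof actual;
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &actual, &len) == -1)
        actual = -1;

    close(fd);
    return actual;
}
```

After creation the two kinds are used differently: a TCP socket goes through connect()/accept() and carries a byte stream, while a UDP socket usually exchanges individual datagrams with sendto()/recvfrom().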
As a deeper example, modern cars have a computer inside that uses the CAN bus protocol to communicate between the car’s components.
The OS of that computer has driver support for CAN bus communication.
What that means, basically, is that the Netstack of that OS has a .c/.h implementation of the CAN bus protocol.
I wanted to note this point because the OS of that computer must have a CAN bus driver in order to expose the commands that let us, the programmers, communicate on that line/bus.
If you’re interested in reading more about it, you can find more in the links below.
Operating System differences
It is important to note that each OS has its own system calls and its own way of handling them.
Windows, for example, has the Winsock API (the winsock2.h header), which provides the functions for network programming.
When you do network programming on Windows, you have to request, or initialize, the “Windows Socket DLL” before you can use it.
Sometimes that request fails, for example when the underlying network subsystem is not ready or a limit on the number of applications using it has been reached, so its result should be checked.
The request to use the Winsock API has its own API, which you can read about on the Microsoft website linked down below.
They have a great tutorial for getting to know their API from scratch.
Today we have the POSIX API, a standard that describes the interface operating systems should offer for their system calls.
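One common way to live with this difference is a tiny portability layer. The sketch below is only an illustration under a few assumptions: the wrapper names net_init and net_cleanup are made up, the Windows branch uses the documented WSAStartup()/WSACleanup() calls and requests Winsock version 2.2, and on POSIX systems nothing needs to be initialized at all.

```c
#ifdef _WIN32
#include <winsock2.h> /* link with Ws2_32.lib */
#else
#include <sys/socket.h>
#endif

/* Hypothetical wrapper: prepares the socket API for use.
 * Returns 0 on success, -1 on failure. */
int net_init(void)
{
#ifdef _WIN32
    WSADATA wsa;
    /* Ask for Winsock 2.2; a non-zero result means the request failed. */
    return WSAStartup(MAKEWORD(2, 2), &wsa) == 0 ? 0 : -1;
#else
    return 0; /* POSIX systems need no start-up call */
#endif
}

/* Hypothetical wrapper: releases whatever net_init() acquired. */
void net_cleanup(void)
{
#ifdef _WIN32
    WSACleanup();
#endif
}
```

The idea is to call net_init() once before the first socket() call and net_cleanup() once when the program is done with networking. Other differences remain, for instance Windows closes a socket with closesocket() rather than close().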
Further knowledge and brainstorming
Now that we know what the Internet is and how the operating system approaches it from the ground up, I want to make sure all the knowledge you acquired so far sinks in :)
Say we want to develop a web server. We already have the code that produces the web pages, and we only need to implement a function that listens for connections and hands each socket to some other dude, who will handle it and return the web page.
All we need to do is create a “server socket” and bind it to the well-known port 80, which is the default port of the HTTP protocol.
By doing so, all incoming connections for HTTP requests will arrive at our socket and we can respond with a web page.
You might be thinking right now, isn’t HTTP a complex protocol that has its own data format, etc.?
Well, you’re right. That’s why it’s called a protocol: it has its own set of rules for how to read and send its data.
That’s the reason we have libraries, written by other developers and communities, that make those protocols easy to use so we can develop faster.
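To give a feeling for what "its own data format" means, here is a minimal sketch of the text a server could send back over the socket for a simple page. It only covers the bare minimum of an HTTP/1.1 response (status line, a few headers, an empty line, then the body) and is not a full HTTP implementation. The helper name build_http_response is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes a minimal HTTP/1.1 response that carries
 * `body` as an HTML page into `out`.
 * Returns the response length, or -1 if it does not fit in `out`. */
int build_http_response(const char *body, char *out, size_t out_size)
{
    int n = snprintf(out, out_size,
                     "HTTP/1.1 200 OK\r\n"        /* status line */
                     "Content-Type: text/html\r\n"
                     "Content-Length: %zu\r\n"    /* size of the body in bytes */
                     "Connection: close\r\n"
                     "\r\n"                       /* empty line ends the headers */
                     "%s",
                     strlen(body), body);

    if (n < 0 || (size_t)n >= out_size)
        return -1; /* formatting error or buffer too small */
    return n;
}
```

The resulting buffer is what would be passed to send() on the accepted client socket. Parsing the incoming request, handling other status codes and so on is exactly the work that HTTP libraries take off our hands.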
Final words
If you have tolerated this article this far, I hope you enjoyed reading it and got the idea from start to finish.
Also, I want to mention that this was a really fast journey, and there is a lot more to learn about each of the subjects mentioned in this article.
Again, I hope you enjoyed this article, and if you have suggestions on how to improve it I would like to hear them. I had a hard time gathering all this knowledge by myself, so I wanted to make it easier for everyone else who enters this subject and give them a softer landing.
Thank you for your time and have a great day! :)
Sources and links:
- GeeksforGeeks, TCP Server & Client communication over socket: https://www.geeksforgeeks.org/tcp-server-client-implementation-in-c/
- Tutorialspoint, Unix Socket tutorial: https://www.tutorialspoint.com/unix_sockets/index.htm
- Microsoft tutorial on how to use the Winsock API: https://docs.microsoft.com/en-us/windows/win32/winsock/getting-started-with-winsock
- SocketCAN, driver for the OS that allows the use of CAN bus communication in the sockets we create: https://medium.com/@idomagor/zero-to-hero-how-the-internet-works-network-programming-in-c-c-68582873730a
- OSI Model: https://en.wikipedia.org/wiki/OSI_model
- CAN bus: https://en.wikipedia.org/wiki/CAN_bus