What is the OSI model anyway?
November 10th 2021 | ~ 5 minute read
Introduction
You might've heard of it somewhere, might've even learned about it, but didn't quite understand it. I'll try to explain the subject as best I can without delving into too much detail.
Definition
Let's start with an obvious question. What is the OSI model (Open Systems Interconnection) anyway?
To put it simply, it's an abstraction of how computer systems transfer data over a network. It's useful to think of it in terms of layers on a stack. Each layer "houses" a number of protocols, and those protocols are responsible for a specific set of tasks. The data of each layer is encapsulated by the layer below it. That's important, and we'll get to what exactly it means later on.
For now, let's focus on what the layers are. There are seven of them, and from top to bottom they are:
- Application
- Presentation
- Session
- Transport
- Network
- Data-Link
- Physical
The layers are listed in this order based on how close they operate to the end user, with Application being the closest and Physical the furthest.
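If it helps to see the stack as data, here's a small sketch of it in Python. The numbers are the conventional OSI layer numbers (7 at the top, 1 at the bottom), which is why you'll hear people talk about "layer 3" or "layer 2" devices. The names `OsiLayer` and `layers_top_down` are made up for this example.

```python
from enum import IntEnum


class OsiLayer(IntEnum):
    """The seven OSI layers with their conventional numbers (7 = closest to the user)."""
    APPLICATION = 7
    PRESENTATION = 6
    SESSION = 5
    TRANSPORT = 4
    NETWORK = 3
    DATA_LINK = 2
    PHYSICAL = 1


def layers_top_down():
    """Return the layers from the one nearest the user down to the wire."""
    return sorted(OsiLayer, reverse=True)


if __name__ == "__main__":
    for layer in layers_top_down():
        # e.g. "DATA_LINK" is printed as "Data-Link"
        print(layer.value, layer.name.replace("_", "-").title())
```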
Let's backtrack for a moment and explain what each of the layers actually does.
Application - As mentioned, this layer sits closest to end users and is responsible for direct interaction with them. Every time you use a program that talks to the network, such as a web browser or an email client, you're operating within the OSI Application layer.
Presentation - This layer is concerned with making sure that the data originating from one application can be understood by another. In practice this means dealing with how the data is represented, how it's stored, which formats are used and so on. An example would be an encoding scheme such as JPEG.
Session - Concerned with establishing, maintaining, controlling and terminating the connections and data flow between computers. In practice, on TCP/IP networks this layer rarely exists as a separate piece; its job is mostly handled by TCP or by the application itself.
Transport - Handles segmentation (the division of data into smaller units) and the transfer of that data across the network. In general there are two modes of transfer: reliable and unreliable. As such, this layer may or may not provide error checking, reordering, retransmission and throttling of the data, as well as formally establishing a connection with a handshake.
Network - Provides the functionality required to transmit data across different networks. This includes routing the data to a particular destination using one or more routing protocols, as well as establishing an addressing scheme such as the one used by IP.
Data-Link - While the Network layer is responsible for the transfer of data across different networks, the Data-Link layer is responsible for the transfer of data within the same network, more specifically between directly connected nodes. It also defines an addressing scheme, but this time the addresses are physical rather than logical: each NIC (Network Interface Card) has a MAC address that is meant to be unique. This layer also has an operation analogous to routing, called switching. Devices that perform switching are accordingly named "switches".
Physical - Finally, all this data needs to be physically transferred over the network as signals, be they electrical, optical or some other sort. This is the job of the Physical layer: representing digital bits as electrical or optical impulses sent across a physical medium such as a copper or fibre-optic cable, or as radio signals in the case of wireless transfer.
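Of the layers above, the Transport layer's segmentation and reordering are probably the easiest to show in code. Below is a minimal sketch, not how TCP actually does it: real TCP numbers individual bytes rather than whole segments and does a lot more besides. The `segment` and `reassemble` functions are invented for this illustration.

```python
import random


def segment(data: bytes, max_size: int) -> list[tuple[int, bytes]]:
    """Split data into (sequence_number, chunk) pairs of at most max_size bytes."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    return [
        (seq, data[start:start + max_size])
        for seq, start in enumerate(range(0, len(data), max_size))
    ]


def reassemble(segments: list[tuple[int, bytes]]) -> bytes:
    """Put segments back in order by sequence number and join the chunks."""
    ordered = sorted(segments, key=lambda item: item[0])
    return b"".join(chunk for _, chunk in ordered)


if __name__ == "__main__":
    message = b"Hello from the transport layer!"
    pieces = segment(message, max_size=8)
    random.shuffle(pieces)  # simulate segments arriving out of order
    print(reassemble(pieces) == message)
```

The sequence number is what lets the receiving side restore the original order no matter how the pieces arrive. An unreliable mode of transfer would simply skip that step and hand over whatever shows up.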
Key concepts
Encapsulation
As stated, the data of each layer is encapsulated by the one below it, but what does that mean exactly? In abstract terms, each layer is independent of every other layer and "knows" nothing about their inner workings. That's to say that each layer could, hypothetically, be pulled from the stack and replaced with something else, and the system would still function.
This is achieved by presenting everything a particular layer produces to the one below it as nothing more than a stream of bytes, nothing more than data. In practice this means, for instance, that an entire TCP segment (header and all) is carried as the payload of an IP packet, with the IP header placed in front of it.
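Here's a rough sketch of that idea in Python. The header layouts are heavily simplified, hypothetical stand-ins that I made up for the example (real TCP, IP and Ethernet headers carry many more fields), and the function names are invented too. What matters is that each function treats whatever it's given as opaque bytes and just puts its own header in front.

```python
import struct

# NOTE: toy header layouts for illustration only, not the real wire formats.


def tcp_segment(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Toy 'TCP' header (two 16-bit ports) followed by the application data."""
    return struct.pack("!HH", src_port, dst_port) + payload


def ip_packet(src_addr: bytes, dst_addr: bytes, payload: bytes) -> bytes:
    """Toy 'IP' header (two 4-byte addresses) followed by the payload."""
    return struct.pack("!4s4s", src_addr, dst_addr) + payload


def ethernet_frame(src_mac: bytes, dst_mac: bytes, payload: bytes) -> bytes:
    """Toy 'Ethernet' header (two 6-byte MAC addresses) followed by the payload."""
    return struct.pack("!6s6s", dst_mac, src_mac) + payload


if __name__ == "__main__":
    http = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
    seg = tcp_segment(49152, 80, http)
    pkt = ip_packet(bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2]), seg)
    frame = ethernet_frame(
        bytes.fromhex("020000000001"), bytes.fromhex("020000000002"), pkt
    )
    # The whole segment sits untouched after the 8-byte toy IP header,
    # and the whole packet sits untouched after the 12-byte toy Ethernet header.
    print(pkt[8:] == seg, frame[12:] == pkt)
```

Notice that `ip_packet` never looks inside the segment it's handed. You could pass it a UDP datagram or random bytes and it would behave the same, which is the independence described above.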
Protocols
Now that I've mentioned TCP and IP, I should reiterate that each OSI layer houses a number of protocols, each of which carries out some of that layer's functions.
Without delving into details, I'll list a number of protocols associated with each layer so as to give you a better understanding of the functions described above.
- Application - HTTP, FTP, SMTP, DHCP
- Presentation - JPEG, HEVC, JSON, XML
- Session - N/A, part of TCP
- Transport - TCP, UDP
- Network - IP, OSPF, EIGRP, RIP
- Data-Link - Ethernet, WiFi
- Physical - Cabling, connectors and signalling standards (copper, fibre, radio), etc.
Some of these should be familiar, like HTTP or FTP. These are the protocols you've most likely interacted with directly (HTTP being the one used to deliver this web page). Explaining the others, if needed, is left as an exercise for the reader.
Conclusion
The key takeaways are the concept of layers, the protocols within them, and the encapsulation of each layer's data by lower-level protocols.
Do note that this article treats encapsulation and its reverse (de-encapsulation) as essentially the same process. The only difference is that the order of operations is, naturally, reversed.
This has been a high-level overview of the OSI model, which I feel every software engineer should, at the very least, be familiar with. While not explicitly required, it's beneficial for us to understand how the applications we build interact with the network on a fundamental level.
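To show what "the same thing in reverse" looks like, here's a small sketch. The bracketed text labels stand in for real headers and are purely hypothetical, as are the names `LAYERS`, `encapsulate` and `decapsulate`. The sender walks down the stack adding headers, and the receiver walks back up removing them in the opposite order.

```python
# Layers a message passes through on the way down, in sending order.
LAYERS = ["transport", "network", "data-link"]


def encapsulate(data: bytes) -> bytes:
    """Prepend one made-up text header per layer, going down the stack."""
    for name in LAYERS:
        data = f"[{name}]".encode() + data
    return data


def decapsulate(data: bytes) -> bytes:
    """Strip the same headers in the opposite order, going up the stack."""
    for name in reversed(LAYERS):
        header = f"[{name}]".encode()
        if not data.startswith(header):
            raise ValueError(f"expected {name} header")
        data = data[len(header):]
    return data


if __name__ == "__main__":
    wire = encapsulate(b"hello")
    print(wire)               # the outermost header belongs to the lowest layer
    print(decapsulate(wire))  # the original data comes back out
```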