An overview of Ethernet 10BASE-T
In this post I provide a high-level overview of how Ethernet works, focusing on the 10BASE-T variant in particular. I'll discuss Ethernet's electrical characteristics, and I'll describe how the Ethernet spec is divided into two major layers: the physical (PHY) and the medium access control (MAC) layers, each of which focuses on a different set of responsibilities. This post is part of a series of posts relating to Niccle, my Ethernet 10BASE-T bit banging project.
Introduction
Ethernet is a family of networking technologies defined in the IEEE 802.3 standard.1 The standard describes the electrical characteristics of the network, as well as the protocol that is carried over the physical network. It describes a number of versions of the technology, ranging from the first version, 10BASE5, which ran over thick coaxial cable at speeds of up to 10 Mbps, to 10BASE-T, which brought 10 Mbps Ethernet to twisted pair cabling, to much more recent versions like 10GBASE-T, which supports speeds of up to 10 Gbps.
Almost all Ethernet implementations commonly found in devices today run over twisted pair cabling like CAT-5 and support, at the very least, the 10BASE-T standard with 10 Mbps speeds. Devices that support 10BASE-T as well as higher-speed versions of Ethernet can generally still be interconnected with devices that only support 10BASE-T.
This makes 10BASE-T Ethernet an appealing target for creating bit banged implementations: it is fairly simple to understand and implement, it uses signal frequencies of at most 10 MHz (within the capabilities of a number of commonly available microcontrollers), and there are many devices available to interface with. Most of the newer versions of Ethernet are a bit harder to bit bang, often requiring signal frequencies of 31.25 MHz or higher, and they're also more difficult to implement due to their use of more complex signal encoding techniques to achieve ever increasing data rates.2
Ethernet can be considered to consist of two major layers: the physical (PHY) layer and the medium access control (MAC) layer. The PHY layer maps to the physical layer of the OSI model, while the MAC layer for the most part corresponds to the data link layer in the OSI model.3 While in the past it was common to have separate discrete components handling the PHY and MAC layers, these days it's much more common for a single component to handle both layers' functions in a single package.
In the next few sections I'll describe the PHY and MAC layers' responsibilities in more detail. Note that many of these details, especially for the physical layer, are specific to the 10BASE-T version of Ethernet, and that later versions operate in significantly different ways (such as using different signaling voltage levels, signal frequencies, and line codes), which I won't discuss here. If you're interested in how those other versions differ, the Wikipedia page on Ethernet over twisted pair is a great place to start.
Physical layer
The physical (PHY) layer defines how signals are actually transmitted over the physical medium (the cable). It can be described in broad strokes as follows (with footnotes referencing the sections of the IEEE 802.3 standard, for some of the key details):
-
The signal is transmitted over twisted pair wiring like CAT-3 ("telephone cables") or CAT-5 ("Ethernet cables"), which can be up to 100 m in length.4 It only uses two of the four pairs available in a CAT-5 cable, one for transmitting outgoing data (TX) and one for receiving incoming data (RX). There's nominally a convention that specifies which pair serves as TX and which as RX for each type of device (see "straight" vs "crossover" cables), but in practice all modern devices support a feature called Auto MDI-X which means that the TX/RX wires are interchangeable and automatically agreed upon when two devices are first connected.
-
Each twisted pair carries a differential signal, and during data transmissions the signals are driven to two nominal voltage levels: one high/positive level between +2.2 V and +2.8 V, and one low/negative level between -2.2 V and -2.8 V.5 Do note that by the time the signals reach the receiver on the other end of the cable they are allowed to have a much smaller voltage (the peak voltages could be as small as +0.585 V and -0.585 V), to account for signal attenuation across the cable.6
- Side note: An energy-efficient version of the 10BASE-T standard called 10BASE-Te takes advantage of this fact by allowing lower nominal output voltages of between 1.54 V and 1.96 V, as long as a CAT-5 cable is used (i.e. it disallows CAT-3 cabling, which has fewer twists per unit of distance and therefore higher signal attenuation).4 A 10BASE-Te compliant transmitter is still considered backward compatible with a 10BASE-T receiver, since the signal levels on the other end of the cable will still cross the 0.585 V minimum voltage threshold expected by such a receiver.
In a standard-compliant implementation, the cable must also be galvanically isolated from the host device.7
-
When the line is not being used to transmit any actual data, devices are required to emit periodic link test pulses (LTPs).8 Later versions of Ethernet also call these Normal Link Pulses or NLPs. These are short pulses of a positive differential signal between 0.585 V and 3.1 V that last at least 60 ns, and which are repeated every 16 ms +/- 8 ms. These let the device on the other side know that the cable has been connected. It's these pulses that will make the lights on a router/switch light up the moment you connect to it, or will pop up the "A cable has been connected" notification on your computer, even before any IP configuration has been provided.
-
Data is transmitted over the wire using Manchester line code. In this scheme, each `0` or `1` data bit gets encoded into a "bit symbol" which consists of two halves, with a transition in the middle. A `0` data bit gets encoded as a transition from a high to a low signal (or `10` for short), and a `1` data bit gets encoded as a low to high transition (or `01` for short).9 Each bit symbol takes 100 ns to transmit (hence the data transfer rate is 10 Mbps), with the signal transition happening at the 50 ns mark.
-
With this encoding, the fastest-changing signals that are ever transmitted on the wire are `01 01 01 ...` or `10 10 10 ...`, corresponding to a string of encoded `1` data bits or a string of encoded `0` data bits. In those cases, the signal period (the time it takes to complete one cycle from `0` to `1` and back to `0`, or vice versa) is 100 ns, and therefore the signal's highest possible fundamental frequency, or its bandwidth, is 10 MHz.10
-
The slowest-changing signal would be `01 10 01 10 ...`, corresponding to a string of encoded alternating `1` and `0` data bits. The signal in this case has a frequency of exactly 5 MHz. Hence for Ethernet 10BASE-T the signal frequencies will always range between 5 MHz and 10 MHz. See the image below for an illustration of these encoded waveforms.
-
-
After the PHY transmits a packet, it transmits a "start of idle" signal (also called TP_IDL) consisting of at least 250 ns of a positive signal between 0.585 V and 3.1 V.8 This indicates the end of the transmission, and the start of an idle period, to the receiver.
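The Manchester mapping from the list above can be sketched in a few lines of Python. This is only an illustration: the names `manchester_encode` and `manchester_decode` are hypothetical, and the sketch works on strings of half-bit levels (`1` for the high level, `0` for the low level) rather than on real voltages or timing.

```python
# Bit symbols as described above: each data bit becomes two half-bit levels,
# written "1" for the high level and "0" for the low level.
SYMBOLS = {0: "10", 1: "01"}  # 0 = high-to-low, 1 = low-to-high
BITS = {symbol: bit for bit, symbol in SYMBOLS.items()}


def manchester_encode(bits):
    """Encode an iterable of data bits (0 or 1) into space-separated bit symbols."""
    return " ".join(SYMBOLS[bit] for bit in bits)


def manchester_decode(signal):
    """Decode space-separated bit symbols back into a list of data bits."""
    decoded = []
    for symbol in signal.split():
        if symbol not in BITS:
            # "00" and "11" have no mid-symbol transition, so they are invalid.
            raise ValueError(f"invalid bit symbol: {symbol!r}")
        decoded.append(BITS[symbol])
    return decoded


print(manchester_encode([1, 1, 1]))      # 01 01 01
print(manchester_encode([1, 0, 1, 0]))   # 01 10 01 10
print(manchester_decode("01 10 01 01"))  # [1, 0, 1, 1]
```

A real receiver would also have to recover the sender's clock from the mid-symbol transitions; the decoder here sidesteps that by assuming the symbol boundaries are already known.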
This brief description of the PHY layer for Ethernet 10BASE-T covers most of the important details of the full duplex mode of operation, in which each of the two twisted wire pairs is only used by a single device to transmit data. This is the most commonly used operating mode. The specification does also describe a half duplex mode of operation, which expands the PHY's responsibilities, but it's not used much anymore and I won't support it in my project.11
For more details on the waveform characteristics of data transmissions, link test pulses, TP_IDL signals etc. I highly recommend taking a look at this presentation from the UNH InterOperability Laboratory. I'll also go into more details on this aspect in my next post.
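As a quick sanity check on the 5 MHz to 10 MHz range quoted earlier for the Manchester-encoded signal, the sketch below computes the fundamental frequency of a repeating pattern of half-bit levels. The function name `fundamental_frequency_mhz` is hypothetical, and the sketch assumes an ideal waveform in which every half-bit lasts exactly 50 ns.

```python
HALF_BIT_NS = 50  # each half of a 100 ns bit symbol


def fundamental_frequency_mhz(half_bits):
    """Fundamental frequency of a waveform that repeats `half_bits` forever."""
    if len(set(half_bits)) < 2:
        raise ValueError("pattern needs at least one transition")
    for period in range(1, len(half_bits) + 1):
        # The shortest rotation that maps the pattern onto itself is its period.
        if half_bits == half_bits[period:] + half_bits[:period]:
            return 1000 / (period * HALF_BIT_NS)


print(fundamental_frequency_mhz("010101"))    # 10.0 (a run of 1 data bits)
print(fundamental_frequency_mhz("01100110"))  # 5.0 (alternating 1 and 0 data bits)
```

The first pattern repeats every two half-bits (100 ns) and the second every four (200 ns), which matches the 10 MHz and 5 MHz figures given above.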
PHY sublayers
It's also worth noting that the Ethernet 10BASE-T specification actually further separates the PHY layer into the following sublayers:
-
The Physical Signaling (PLS) sublayer,
-
The Medium Attachment Unit (MAU) sublayer, which itself consists of
-
the Physical Medium Attachment (PMA) sublayer, and
-
the Medium Dependent Interface (MDI) sublayer.
-
-
The Attachment Unit Interface (AUI), a cable which connects the PLS and MAU sublayers together.
In this model, the PLS layer handles Manchester encoding/decoding (point 4 above), while the MAU layer handles the actual physical connection and galvanic isolation (points 1 & 2 above), as well as the generation of the link test pulses and TP_IDL signals (points 3 & 5).
This model originates from a time when it was common for the PLS and MAU layers to truly be implemented as separate devices connected by a short cable (see Wikipedia), but the distinction is less useful these days as the whole PHY layer is now generally implemented in a single IC package. However, since the IEEE 802.3 specification references these sublayers in many places, it is useful to be aware of the sublayer terminology.
Medium access control layer
The medium access control (MAC) layer defines how data from higher layers (e.g. IPv4 or IPv6 packets) is packaged up into a MAC frame and packet, and passed onto the physical layer.12
-
Data from higher layers is wrapped in an Ethernet frame, which specifies:
-
the destination & source MAC addresses,
-
either the length of the payload or — more commonly used in practice — the type of packet contained in the frame's payload (e.g. 0x0800 to indicate IPv4),13
-
the payload itself, and
-
a frame check sequence (a CRC32 checksum covering all of the previous octets in the frame). The MAC layer is responsible for generating this checksum for outgoing transmissions, and for validating the checksum of incoming frames.
The frame size must be at least 64 octets and at most 1,518 octets long, resulting in minimum and maximum payload sizes of 46 octets and 1,500 octets. If the payload provided by the higher network layer is actually less than 46 octets long, it has to be padded to 46 octets by the MAC layer.
-
-
Before this frame is passed to the PHY for transmission, it is prefixed with a preamble consisting of 7 octets of data that, when Manchester-encoded, results in the signal `01 10 01 10 01 10 01 10` repeated 7 times, followed by a start frame delimiter (SFD), which is a single octet of data that when encoded looks like `01 10 01 10 01 10 01 01` (very similar to the preamble signal, but the last bit symbol in the octet is `01` instead of `10`). The Manchester-encoded frame data is transmitted after the SFD.
-
The preamble results in a line signal with a frequency of exactly 5 MHz.
-
It ensures that receivers have enough time (5.6 μs) before the SFD begins to prepare to receive an incoming frame, and it allows them to clock-align their receiver with the incoming signal (removing the need for a separate clock signal).
-
The SFD allows the receiver to determine where the actual frame transmission begins.
-
Just as the MAC layer is responsible for prepending the preamble and SFD, it is also responsible for stripping the preamble and SFD from incoming Ethernet packets handed to it by the PHY layer.
-
-
The combination of preamble, SFD, and the Ethernet frame data is referred to as an Ethernet packet.
-
Since the preamble and SFD make up 8 octets, Ethernet packets have minimum and maximum sizes of 72 octets and 1,526 octets.
-
Since each data bit takes 100 ns to transmit (the "bit time" or "bit period"), that means that an Ethernet packet transmission will take between 57.6 μs and 1,221 μs, 6.4 μs of which is spent on the preamble & SFD.
-
Note: The term "Ethernet packet" is not to be confused with the term "IP packet" from the higher network layer. The IP packet would actually be the payload within the Ethernet frame (i.e. an Ethernet packet wraps an Ethernet frame, which in turn wraps an IP packet).
-
-
After every Ethernet packet transmission, the MAC also has to enforce an interpacket gap (IPG) of at least 96 bit times (9.6 μs) during which the line is left idle (returning to a 0 V differential voltage).14 This gives receivers enough time to prepare for receipt of the next Ethernet packet.
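The MAC steps listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: all function names are hypothetical, short payloads are padded with zero octets (the padding value is an assumption), and the frame check sequence is computed with `zlib.crc32` and appended least significant octet first, which is how the Ethernet FCS is commonly computed in software. The preamble and SFD octet values (`0x55` and `0xD5`) produce the line signals shown earlier when each octet is sent least significant bit first.

```python
import struct
import zlib

BIT_TIME_NS = 100                # one data bit at 10 Mbps
PREAMBLE = bytes([0x55] * 7)     # sent LSB first: 1, 0, 1, 0, ...
SFD = bytes([0xD5])              # sent LSB first: 1, 0, 1, 0, 1, 0, 1, 1
MIN_PAYLOAD, MAX_PAYLOAD = 46, 1500


def build_frame(dst, src, ethertype, payload):
    """Assemble a MAC frame: addresses, type field, padded payload, FCS."""
    if len(dst) != 6 or len(src) != 6:
        raise ValueError("MAC addresses must be 6 octets long")
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds 1,500 octets")
    padded = payload.ljust(MIN_PAYLOAD, b"\x00")  # assumption: pad with zeros
    body = dst + src + struct.pack("!H", ethertype) + padded
    # Assumption: zlib's CRC-32, appended least significant octet first.
    fcs = struct.pack("<I", zlib.crc32(body))
    return body + fcs


def build_packet(frame):
    """Prefix a frame with the preamble and SFD to form an Ethernet packet."""
    return PREAMBLE + SFD + frame


def line_symbols(octet):
    """Manchester bit symbols for one octet, least significant bit first."""
    return " ".join("01" if (octet >> i) & 1 else "10" for i in range(8))


def transmit_time_us(packet):
    """Time on the wire in microseconds, excluding the interpacket gap."""
    return len(packet) * 8 * BIT_TIME_NS / 1000


frame = build_frame(b"\xff" * 6, bytes.fromhex("020000000001"), 0x0800, b"hi")
packet = build_packet(frame)
print(len(frame), len(packet))   # 64 72
print(transmit_time_us(packet))  # 57.6
print(line_symbols(SFD[0]))      # 01 10 01 10 01 10 01 01
```

The two-octet payload is padded up to 46 octets, giving the minimum 64-octet frame and 72-octet packet, and the resulting 57.6 μs transmit time matches the lower bound quoted above.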
Conclusion
In a real Ethernet implementation most or all of the functions described in the overview above are implemented in hardware. In most cases an operating system would then layer a software stack on top of this hardware-implemented MAC layer, e.g. a TCP/IP stack to provide actual internet connectivity. The bit banged implementation in my Niccle project will need to implement most of these functions (except for the electrical circuit itself) in software instead.
In my next post I'll discuss the electrical circuit design I chose for connecting the microcontroller's GPIO pins to the CAT-5 cable, and in subsequent posts I'll cover how we can implement the PHY and MAC layer functionality in software.
Footnotes
1. The actual standard can be downloaded for individual use for free via the IEEE GET program. 10BASE-T-specific protocol details are specified in clause 14, while clauses 3, 4 and 7 cover other relevant but non-10BASE-T-specific details as well (such as the MAC frame format etc.).
2. 10BASE-T1S and 10BASE-T1L, introduced somewhat recently in the 802.3cg-2019 standard, are interesting exceptions to this trend. They can support 10 Mbps transfers over a single twisted pair, using 12.5 MHz and 3.75 MHz of bandwidth respectively. They were developed to support automotive and low-power IoT use cases. It'd probably be feasible to bit bang these with many microcontrollers as well, but there aren't yet many devices to interface with that support these versions of the protocol.
3. The standard actually also describes a third layer, the logical link control (LLC) layer, so it's technically more accurate to say that the MAC and LLC layers together make up the OSI data link layer. For our purposes we can mostly ignore the LLC layer, however.
4. IEEE 802.3, Clause 14.1.1.3 Twisted pair media.
5. IEEE 802.3, Clause 14.3.1.2.1 Differential output voltage. Note that this clause discusses a "Twisted Pair Model" (TPM), which is an electrical circuit that is used to simulate the effects that 100 m of twisted pair cabling can have on a transmitted signal (e.g. signal attenuation).
6. IEEE 802.3, Clause 14.3.1.3.1 Receiver differential input signals.
7. IEEE 802.3, Clause 14.3.1.1 Isolation requirement.
8. IEEE 802.3, Clause 14.2.1.1 Transmit function requirements.
9. IEEE 802.3, Clause 7.3.1.1 Data encoding.
10. This initially tripped me up a bit: we send a bit symbol consisting of two halves for each data bit, with each half taking 50 ns, and hence we'll see up to 20,000,000 signal transitions per second. As a result, a number of other websites mention 10BASE-T using "up to 20 MHz of bandwidth", but this is incorrect. The formal definition of spectral bandwidth is based on the fundamental frequency of the signal being transmitted, which is defined as $f = \frac{1}{T}$ where $T$ is the period of the repeating signal. The period is the time it takes to complete one cycle of transitions (e.g. going from 0 to 1 and back to 0). As we showed above, for Ethernet 10BASE-T the fastest possible signal period is 100 ns and hence it uses 10 MHz of bandwidth, even though there's a signal transition every 50 ns in that case.
11. In half duplex mode multiple devices may transmit data on the same twisted wire pair, meaning that collisions can occur and need to be detected and handled. In this mode of operation the PHY layer has to implement CSMA/CD: before starting a transmission it has to sense whether the line is free to transmit data, and during transmission it has to sense whether a collision occurred. While half duplex mode is still commonly supported, it's not commonly used, and implementing it correctly is more complex than full duplex Ethernet. I will only focus on the full duplex mode of operation in my project.
12. IEEE 802.3, Clause 3.2 Elements of the MAC frame and packet.
13. The history behind why this field can hold either the length or the type of the payload is described in the Wikipedia page about Ethernet frames. In practice most implementations will use the field to store the payload type, and hence in a bit banged implementation we can limit ourselves to supporting only that case.
14. IEEE 802.3, Clause 4.4.2 MAC parameters.