An overview of Ethernet 10BASE-T

Published on Oct 04, 2023 • Updated on Dec 21, 2023 • 16 min read • Tags: ethernet, niccle

In this post I provide a high-level overview of how Ethernet works, focusing on the 10BASE⁠-⁠T variant in particular. I'll discuss Ethernet's electrical characteristics, and I'll describe how the Ethernet spec is divided into two major layers: the physical (PHY) and the medium access control (MAC) layers, each of which focuses on a different set of responsibilities. This post is part of a series of posts relating to Niccle, my Ethernet 10BASE⁠-⁠T bit banging project.

Introduction

Ethernet is a family of networking technologies defined in the IEEE 802.3 standard.¹ The standard describes the electrical characteristics of the network, as well as the protocol that is carried over the physical network. It describes a number of versions of the technology, ranging from the first version, 10BASE5, which ran over thick coax cables at a speed of up to 10 Mbps, to 10BASE-T, the first widely adopted version to run over twisted pair cabling, to much more recent versions like 10GBASE-T, which supports speeds of up to 10 Gbps.

Almost all Ethernet implementations commonly found in devices today run over twisted pair cabling like CAT-5 and support, at the very least, the 10BASE⁠-⁠T standard with 10 ⁠Mbps speeds. Devices that support 10BASE⁠-⁠T as well as higher-speed versions of Ethernet can generally still be interconnected with devices that only support 10BASE⁠-⁠T.

This makes 10BASE⁠-⁠T Ethernet an appealing target for creating bit banged implementations: it is fairly simple to understand and implement, it uses signal frequencies of at most 10 ⁠MHz (within the capabilities of a number of commonly available microcontrollers), and there are many devices available to interface with. Most of the newer versions of Ethernet are a bit harder to bit bang, often requiring signal frequencies of 31.25 ⁠MHz or higher, and they're also more difficult to implement due to their use of more complex signal encoding techniques to achieve ever increasing data rates.²

Ethernet can be considered to consist of two major layers: the physical (PHY) layer and the medium access control (MAC) layer. The PHY layer maps to the physical layer of the OSI model, while the MAC layer for the most part corresponds to the data link layer in the OSI model.³ While in the past it was common to have separate discrete components handling the PHY and MAC layers, these days it's much more common for a single component to handle both layers' functions in a single package.

In the next few sections I'll describe the PHY and MAC layers' responsibilities in more detail. Note that many of these details, especially for the physical layer, are specific to the 10BASE-T version of Ethernet. Later versions operate in significantly different ways (such as using different signaling voltage levels, signal frequencies, and line codes), which I won't discuss here. If you're interested in how those other versions differ, the Wikipedia page on Ethernet over twisted pair is a great place to start.

Physical layer

The physical (PHY) layer defines how signals are actually transmitted over the physical medium (the cable). It can be described in broad strokes as follows (with footnotes referencing the sections of the IEEE 802.3 standard, for some of the key details):

The signal is transmitted over twisted pair wiring like CAT-3 ("telephone cables") or CAT-5 ("Ethernet cables"), which can be up to 100 ⁠m in length.⁴ It only uses two of the four pairs available in a CAT-5 cable, one for transmitting outgoing data (TX) and one for receiving incoming data (RX). There's nominally a convention that specifies which pair serves as TX and which as RX for each type of device (see "straight" vs "crossover" cables), but in practice all modern devices support a feature called Auto MDI-X which means that the TX/RX wires are interchangeable and automatically agreed upon when two devices are first connected.
Each twisted pair carries a differential signal, and during data transmissions the signals are driven to two nominal voltage levels: one high/positive level between +2.2 ⁠V and +2.8 ⁠V, and one low/negative level between -2.2 ⁠V and -2.8 ⁠V.⁵ Do note that by the time the signals reach the receiver on the other end of the cable they are allowed to have a much smaller voltage (the peak voltages could be as small as +0.585 ⁠V and -0.585 ⁠V), to account for signal attenuation across the cable.⁶
- Side note: An energy-efficient version of the 10BASE-T standard called 10BASE-Te takes advantage of this fact by allowing lower nominal output voltages of between 1.54 V and 1.96 V, as long as a CAT-5 cable is used (i.e. it disallows CAT-3 cabling, which has fewer twists per unit of distance and therefore higher signal attenuation).⁴ A 10BASE-Te compliant transmitter is still considered backward compatible with a 10BASE-T receiver, since the signal levels on the other end of the cable will still cross the 0.585 V minimum voltage threshold expected by such a receiver.
In a standard-compliant implementation, the cable must also be galvanically isolated from the host device.⁷
When the line is not being used to transmit any actual data, devices are required to emit periodic link test pulses (LTPs).⁸ Later versions of Ethernet also call these Normal Link Pulses or NLPs. These are short pulses of a positive differential signal between 0.585 ⁠V and 3.1 ⁠V that last at least 60 ⁠ns, and which are repeated every 16 ⁠ms +/- 8 ⁠ms. These let the device on the other side know that the cable has been connected. It's these pulses that will make the lights on a router/switch light up the moment you connect to it, or will pop up the "A cable has been connected" notification on your computer, even before any IP configuration has been provided.
Data is transmitted over the wire using Manchester line code. In this scheme, each 0 or 1 data bit gets encoded into a "bit symbol" which consists of two halves, with a transition in the middle. A 0 data bit gets encoded as a transition from a high to low signal (or 10 for short), and a 1 data bit gets encoded as a low to high transition (or 01 for short).⁹ Each bit symbol takes 100 ⁠ns to transmit (hence the data transfer rate is 10 ⁠Mbps), with the signal transition happening at the 50 ⁠ns mark.
- With this encoding, the fastest-changing signals that are ever transmitted on the wire are 01 01 01 ... or 10 10 10 ..., corresponding to a string of encoded 1 data bits, or a string of 0 data bits. In those cases, the signal period (the time it takes to complete one cycle from 0 to 1 and back to 0 or vice versa) is 100 ⁠ns, and therefore the signal's highest possible fundamental frequency, or its bandwidth, is 10 ⁠MHz.¹⁰
- The slowest-changing signal would be 01 10 01 10 ..., corresponding to a string of alternating 1 and 0 data bits. The signal in this case has a frequency of exactly 5 MHz. Hence for Ethernet 10BASE-T the signal frequencies will always range between 5 MHz and 10 MHz. See the image below for an illustration of these encoded waveforms.
After the PHY transmits a packet, it transmits a "start of idle" signal (also called TP_IDL) consisting of at least 250 ns of a positive signal between 0.585 V and 3.1 V.⁸ This indicates the end of the transmission, and the start of an idle period, to the receiver.
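To get a feel for what these timings demand of a bit banged implementation, the sketch below converts the durations from the list above into CPU cycle counts. The 160 MHz clock is purely an assumed example value, and the helper names are made up for illustration.

```python
# Durations quoted in the list above, in nanoseconds.
BIT_TIME_NS = 100                  # one Manchester bit symbol
HALF_BIT_NS = BIT_TIME_NS // 2     # from the start of a symbol to its mid-bit transition
LTP_MIN_WIDTH_NS = 60              # minimum link test pulse width
LTP_MIN_INTERVAL_NS = 8_000_000    # 16 ms - 8 ms
LTP_MAX_INTERVAL_NS = 24_000_000   # 16 ms + 8 ms
TP_IDL_MIN_NS = 250                # minimum "start of idle" duration

CPU_HZ = 160_000_000               # ASSUMPTION: hypothetical microcontroller clock


def cycles_at_least(duration_ns, cpu_hz=CPU_HZ):
    """Smallest whole number of CPU cycles that lasts at least duration_ns."""
    return -(-duration_ns * cpu_hz // 1_000_000_000)  # ceiling division


def ltp_interval_ok(interval_ns):
    """True if the gap between two link test pulses is within 16 ms +/- 8 ms."""
    return LTP_MIN_INTERVAL_NS <= interval_ns <= LTP_MAX_INTERVAL_NS


for name, ns in [
    ("half bit", HALF_BIT_NS),
    ("link test pulse (min width)", LTP_MIN_WIDTH_NS),
    ("TP_IDL (min length)", TP_IDL_MIN_NS),
]:
    print(f"{name}: {ns} ns = {cycles_at_least(ns)} cycles")
```

At the assumed clock speed each half bit lasts only 8 cycles, which gives a sense of how tight the timing budget is when toggling a pin from software.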

[Figure: three examples of Manchester-encoded waveforms.] An example of the Manchester coding scheme showing how a repeating "0" data signal, a repeating "1" data signal, and an alternating "1", "0" data signal are encoded on the wire. Since one "bit time" equals 100 ns, the first two waveforms have signal periods of 100 ns and thus frequencies of 10 MHz, while the third waveform has a signal period of 200 ns and thus a frequency of 5 MHz.
(Adapted from Figure 7–10 of the 802.3 standard document.)
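The encoding rule and the two extreme waveforms can be reproduced in a few lines of Python. This is only a minimal sketch: the function names are invented for this example and are not taken from the Niccle code, and line levels are modeled as plain 0 (low) and 1 (high) values, one per 50 ns half bit.

```python
HALF_BIT_NS = 50  # each 100 ns bit symbol consists of two 50 ns halves


def manchester_encode(bits):
    """Map each data bit to two half-bit line levels (0 = low, 1 = high)."""
    levels = []
    for bit in bits:
        # A 1 data bit is sent as low-then-high (01), a 0 as high-then-low (10).
        levels.extend((0, 1) if bit else (1, 0))
    return levels


def manchester_decode(levels):
    """Recover data bits; the second half of each symbol equals the data bit."""
    if len(levels) % 2:
        raise ValueError("expected an even number of half-bit levels")
    bits = []
    for first, second in zip(levels[0::2], levels[1::2]):
        if first == second:
            raise ValueError("missing mid-bit transition")
        bits.append(second)
    return bits


def fundamental_period_ns(levels):
    """Duration of the shortest repeating unit of the level sequence."""
    if not levels:
        raise ValueError("empty level sequence")
    n = len(levels)
    for p in range(1, n + 1):
        if n % p == 0 and levels == levels[:p] * (n // p):
            return p * HALF_BIT_NS


ones = manchester_encode([1, 1, 1, 1])         # 01 01 01 01
alternating = manchester_encode([1, 0, 1, 0])  # 01 10 01 10
print(1e3 / fundamental_period_ns(ones))         # 10.0 (MHz)
print(1e3 / fundamental_period_ns(alternating))  # 5.0 (MHz)
```

The period check simply looks for the shortest chunk that repeats to form the whole sequence, which matches the 100 ns and 200 ns periods described in the caption.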

This brief description of the PHY layer for Ethernet 10BASE-T covers most of the important details of the full duplex mode of operation, in which each of the two twisted wire pairs is only used by a single device to transmit data. This is the most commonly used operating mode. The specification also describes a half duplex mode of operation, which expands the PHY's responsibilities, but it's not used much anymore and I won't support it in my project.¹¹

For more details on the waveform characteristics of data transmissions, link test pulses, TP_IDL signals, etc., I highly recommend taking a look at this presentation from the UNH InterOperability Laboratory. I'll also go into more detail on this aspect in my next post.
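The voltage levels quoted in the physical layer list also imply a rough attenuation budget for the cable. The back-of-the-envelope sketch below compares the lowest allowed transmit level with the 0.585 V receiver minimum and expresses the ratio in decibels. It deliberately ignores the exact test conditions the standard attaches to each figure, so treat the results as an illustration rather than a compliance calculation. The names are made up for this example.

```python
import math

RX_MIN_PEAK_V = 0.585  # smallest peak voltage a receiver must still accept
TX_MIN_PEAK_V = {"10BASE-T": 2.2, "10BASE-Te": 1.54}  # lowest nominal output levels


def attenuation_budget_db(tx_peak_v, rx_peak_v=RX_MIN_PEAK_V):
    """Voltage ratio between transmitter and receiver levels, in decibels."""
    return 20 * math.log10(tx_peak_v / rx_peak_v)


for name, tx_v in TX_MIN_PEAK_V.items():
    print(f"{name}: roughly {attenuation_budget_db(tx_v):.1f} dB of attenuation budget")
```

The smaller budget for 10BASE-Te is consistent with it requiring CAT-5 cabling, which attenuates the signal less than CAT-3.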

PHY sublayers

It's also worth noting that the Ethernet 10BASE⁠-⁠T specification actually further separates the PHY layer into the following sublayers:

The Physical Signaling (PLS) sublayer,
The Medium Attachment Unit (MAU) sublayer, which itself consists of
- the Physical Medium Attachment (PMA) sublayer, and
- the Medium Dependent Interface (MDI) sublayer.
The Attachment Unit Interface (AUI), a cable which connects the PLS and MAU sublayers together.

In this model, the PLS layer handles Manchester encoding/decoding, while the MAU layer handles the actual physical connection and galvanic isolation, as well as the generation of the link test pulses and TP_IDL signals.

This model originates from a time when it was common for the PLS and MAU layers to truly be implemented as separate devices connected by a short cable (see Wikipedia), but the distinction is less useful these days as the whole PHY layer is now generally implemented in a single IC package. However, since the IEEE 802.3 specification references these sublayers in many places, it is useful to be aware of the sublayer terminology.

Medium access control layer

The medium access control (MAC) layer defines how data from higher layers (e.g. IPv4 or IPv6 packets) is packaged up into a MAC frame and packet, and passed onto the physical layer.¹²

[Figure: the structure of a basic Ethernet MAC packet and frame.] The frame consists of addresses, length, data and a CRC. The packet consists of the frame prefixed with a preamble and Start-of-Frame delimiter.
(Adapted from Figure 3–1 of the IEEE 802.3 standard document.)

Data from higher layers is wrapped in an Ethernet frame, which specifies:
- the destination & source MAC addresses,
- either the length of the payload or — more commonly used in practice — the type of packet contained in the frame's payload (e.g. 0x0800 to indicate IPv4),¹³
- the payload itself, and
- a frame check sequence (a CRC32 checksum covering all of the previous octets in the frame). The MAC layer is responsible for generating this checksum for outgoing transmissions, and for validating the checksum of incoming frames.
The frame size must be at least 64 octets and at most 1,518 octets long, resulting in minimum and maximum payload sizes of 46 octets and 1,500 octets. If the payload provided by the higher network layer is actually less than 46 octets long, it has to be padded to 46 octets by the MAC layer.
Before this frame is passed to the PHY for transmission, it is prefixed with a preamble consisting of 7 octets of data that when Manchester-encoded results in the signal 01 10 01 10 01 10 01 10 repeated 7 times, followed by a start frame delimiter (SFD) which is a single octet of data that when encoded looks like 01 10 01 10 01 10 01 01 (very similar to the preamble signal, but the last bit symbol in the octet is 01 instead of 10). The Manchester-encoded frame data is transmitted after the SFD.
- The preamble results in a line signal with a frequency of exactly 5 ⁠MHz.
- It ensures that receivers have enough time (5.6 ⁠μs) before the SFD begins to prepare to receive an incoming frame, and it allows them to clock-align their receiver with the incoming signal (removing the need for a separate clock signal).
- The SFD allows the receiver to determine where the actual frame transmission begins.
- Just as the MAC layer is responsible for prepending the preamble and SFD, it is also responsible for stripping the preamble and SFD from incoming Ethernet packets handed to it by the PHY layer.
The combination of preamble, SFD, and the Ethernet frame data is referred to as an Ethernet packet.
- Since the preamble and SFD make up 8 octets, Ethernet packets have minimum and maximum sizes of 72 octets and 1,526 octets.
- Since each data bit takes 100 ⁠ns to transmit (the "bit time" or "bit period"), that means that an Ethernet packet transmission will take between 57.6 ⁠μs and 1,221 ⁠μs, 6.4 ⁠μs of which is spent on the preamble & SFD.
- Note: The term "Ethernet packet" is not to be confused with the term "IP packet" from the higher network layer. The IP packet would actually be the payload within the Ethernet frame (i.e. an Ethernet packet wraps an Ethernet frame, which in turn wraps an IP packet).
After every Ethernet packet transmission, the MAC also has to enforce an interpacket gap (IPG) of at least 96 bit times (9.6 ⁠μs) during which the line is left idle (returning to a 0 ⁠V differential voltage).¹⁴ This gives receivers enough time to prepare for receipt of the next Ethernet packet.
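The steps above can be sketched in Python as follows. This is a minimal illustration under a few stated assumptions, not the approach used in Niccle: the helper names, MAC addresses and payload are made up, and the octet values 0x55 and 0xD5 as well as the least-significant-bit-first transmission order come from the IEEE 802.3 standard rather than from the text above.

```python
import struct
import zlib

# Each octet goes out least significant bit first, so 0x55 yields 1,0,1,0,1,0,1,0
# and 0xD5 yields 1,0,1,0,1,0,1,1 on the wire.
PREAMBLE = bytes([0x55] * 7)
SFD = bytes([0xD5])
MIN_PAYLOAD = 46
MAX_PAYLOAD = 1500
BIT_TIME_NS = 100


def build_frame(dst, src, ethertype, payload):
    """Assemble addresses, type field, padded payload and FCS into a frame."""
    if len(dst) != 6 or len(src) != 6:
        raise ValueError("MAC addresses must be 6 octets long")
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds 1,500 octets")
    padded = payload.ljust(MIN_PAYLOAD, b"\x00")  # pad short payloads to 46 octets
    body = dst + src + struct.pack("!H", ethertype) + padded
    # zlib.crc32 computes the same CRC-32 that Ethernet uses; the four FCS
    # octets are appended in little-endian order.
    return body + struct.pack("<I", zlib.crc32(body))


def fcs_is_valid(frame):
    """Recompute the CRC over everything but the last 4 octets and compare."""
    return zlib.crc32(frame[:-4]) == struct.unpack("<I", frame[-4:])[0]


def build_packet(frame):
    """Prefix the frame with the preamble and start frame delimiter."""
    return PREAMBLE + SFD + frame


def wire_bits(data):
    """Data bits in transmission order, before Manchester encoding."""
    return [(octet >> i) & 1 for octet in data for i in range(8)]


def duration_ns(packet):
    """Time on the wire at one data bit per 100 ns."""
    return len(packet) * 8 * BIT_TIME_NS


# Hypothetical addresses and a placeholder payload (not a real IPv4 packet).
frame = build_frame(b"\xff" * 6, bytes.fromhex("020000000001"), 0x0800, b"hello")
packet = build_packet(frame)
print(len(frame), len(packet), duration_ns(packet) / 1000)  # 64 72 57.6
```

A 5-octet payload gets padded to 46 octets, which yields the minimum 64-octet frame, the minimum 72-octet packet and the 57.6 μs transmission time mentioned above. The interpacket gap is not modeled here, since it is idle time rather than data.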

Conclusion

In a real Ethernet implementation most or all of the functions described in the overview above are implemented in hardware. In most cases an operating system would then layer a software stack on top of this hardware-implemented MAC layer, e.g. a TCP/IP stack to provide actual internet connectivity. The bit banged implementation in my Niccle project will need to implement most of these functions (except for the electrical circuit itself) in software instead.

In my next post I'll discuss the electrical circuit design I chose for connecting the microcontroller's GPIO pins to the CAT-5 cable, and in subsequent posts I'll cover how we can implement the PHY and MAC layer functionality in software.

Footnotes

¹

The actual standard can be downloaded for individual use for free via the IEEE GET program. 10BASE⁠-⁠T-specific protocol details are specified in clause 14, while clauses 3, 4 and 7 cover other relevant but non-10BASE⁠-⁠T-specific details as well (such as the MAC frame format etc.).

²

10BASE⁠-⁠T1S and 10BASE⁠-⁠T1L, introduced somewhat recently in the 802.3cg-2019 standard, are interesting exceptions to this trend. They can support 10 ⁠Mbps transfers over a single twisted pair, using 12.5 ⁠MHz and 3.75 ⁠MHz of bandwidth respectively. They were developed to support automotive and low-power IoT use cases. It'd probably be feasible to bit bang these with many microcontrollers as well, but there aren't yet many devices to interface with that support these versions of the protocol.

³

The standard actually also describes a third layer, the logical link control (LLC) layer, so it's technically more accurate to say that the MAC and LLC layers together make up the OSI data link layer. For our purposes we can mostly ignore the LLC layer, however.

⁴

IEEE 802.3, Clause 14.1.1.3 Twisted pair media.

⁵

IEEE 802.3, Clause 14.3.1.2.1 Differential output voltage. Note that this clause discusses a "Twisted Pair Model" (TPM), which is an electrical circuit that is used to simulate the effects that 100 ⁠m of twisted pair cabling can have on a transmitted signal (e.g. signal attenuation).

⁶

IEEE 802.3, Clause 14.3.1.3.1 Receiver differential input signals.

⁷

IEEE 802.3, Clause 14.3.1.1 Isolation requirement.

⁸

IEEE 802.3, Clause 14.2.1.1 Transmit function requirements.

⁹

IEEE 802.3, Clause 7.3.1.1 Data encoding.

¹⁰

This initially tripped me up a bit: we send a bit symbol consisting of two halves for each data bit, with each half taking 50 ⁠ns, and hence we'll see up to 20,000,000 signal transitions per second. As a result, a number of other websites mention 10BASE⁠-⁠T using "up to 20 ⁠MHz of bandwidth", but this is incorrect. The formal definition of spectral bandwidth is based on the fundamental frequency of the signal being transmitted, which is defined as $f = \frac{1}{T}$ where $T$ is the period of the repeating signal. The period is the time it takes to complete one cycle of transitions (e.g. going from 0 to 1 and back to 0). As we showed above, for Ethernet 10BASE⁠-⁠T the fastest possible signal period is 100 ⁠ns and hence it uses 10 ⁠MHz of bandwidth, even though there's a signal transition every 50 ⁠ns in that case.

¹¹

In half duplex mode multiple devices may transmit data on the same twisted wire pair, meaning that collisions can occur and need to be detected and handled. In this mode of operation the PHY layer has to implement CSMA/CD: before starting a transmission it has to sense whether the line is free to transmit data, and during transmission it has to sense whether a collision occurred. While half duplex mode is still commonly supported, it's not commonly used, and implementing it correctly is more complex than full duplex Ethernet. I will only focus on the full duplex mode of operation in my project.

¹²

IEEE 802.3, Clause 3.2 Elements of the MAC frame and packet.

¹³

The history behind why this field can hold either the length or the type of the payload is described in the Wikipedia page about Ethernet frames. In practice most implementations will use the field to store the payload type, and hence in a bit banged implementation we can limit ourselves to supporting only that case.

¹⁴

IEEE 802.3, Clause 4.4.2 MAC parameters.
