How do Wireless Headphones Work?
Script: Teddy Tablante
Modelling Animation: Victor Duarte
To most of us, it’s a complete mystery how wireless earbuds work. With wired headphones, it makes sense that the electricity flows from your smartphone, through the plug, up the tangle of wires, and to the headphones. But with wireless headphones, how does the audio of your favorite music or podcast get transmitted from your smartphone, through the air, and into these wireless earbuds? Well, in this episode we’re going to answer a segment of that question. Wireless headphones are incredibly complex, and inside these tiny plastic earbuds there are 9 distinct technologies that we’re going to explore. These technologies are the speaker, audio codecs, Bluetooth, the System on a Chip or SoC, the Printed Circuit Board or PCB, accelerometers, the Lithium Ion Battery, MEMS microphones, and noise cancellation. Each of these topics is rather involved, and thus an episode is dedicated to each technology, rather than fitting all 9 technologies into a single, feature-length movie. Here, as you may have figured from this video’s title and thumbnail, we’re going to focus on the audio codecs, and we’ll explore how sound waves can be represented digitally using just a bunch of numbers. This video will be broken up into 4 parts.
First, we’ll break open these Apple AirPods 2 and explore the different components and where they’re located. Second, we’ll give you a conceptual overview of how these earbuds work. Then we’ll explore the basics of how the audio, or the sound waves of your favorite song, can be represented as digital information using numbers alone. Finally, we’ll provide you with a set of increasingly complicated details which will fill out this explanation, along with a brief discussion of audio file formats. And by the way, this video is sponsored by PCBWay, a provider of all kinds of printed circuit boards. So, let’s begin. There are many different types of wireless headphones and earbuds, but they all use the same basic principles and technology. Here’s a pair of Apple AirPods 2 we decided to tear open and use as our example. One thing to note is that opening up these AirPods is much more difficult and destructive than we show in these animations. There’s a lot of glue inside, and if you try this at home with your earbuds, they probably won’t work afterwards. We did it so you don’t have to. Now, let’s take a look inside. Underneath the outer plastic earpiece and mesh dust cover we have a rubber protective shell, along with an optical sensor. Below these sets of covers is the speaker that generates the sound, with its 4 key components: the diaphragm, the suspension or spider, the voice coil, and the magnet. Behind the speaker, we find some insanely complex circuitry folded into a teeny tiny package. Let’s pull out this circuitry and see what makes the earbuds tick. This circuitry is essentially three separate printed circuit boards neatly folded into the earbud, with flexible wires connecting them to make a single PCB. On this top board, which is glued to the backside of the speaker, we find two points where the speaker connects, along with a larger microchip, which handles Bluetooth connectivity and decodes the compressed audio stream sent from the smartphone.
We also have a set of accelerometers, along with a programmable SoC. This circuit board is connected to and folded on top of a second circuit board which contains a low power stereo audio processing chip or audio codec. Also connected to this circuitry is a flat cable for the antenna which lies adjacent to the battery, a microphone situated at the back of the earbud which is used for noise reduction, and a third circuit board which is connected to the battery.
Down below the battery is an additional small circuit board that holds the main MEMS microphone which is about the size of a grain of rice, and below that is a mesh dust cover, along with the contacts for charging the earbuds. It’s truly an impressive amount of engineering that goes into making these earbuds so lightweight and so incredibly small. Now that we’ve seen everything that’s packed inside these earbuds, let’s talk about how they work. When you turn on your already paired wireless headphones near your smartphone, a Bluetooth communication channel is established in order to send information back and forth. As soon as you start playing music or a podcast, your smartphone grabs the audio data from its flash storage chip, decompresses the audio, and stores it in your phone’s working memory. This audio is represented digitally as a long set of numbers, and in order to send it to your earbuds, your smartphone compresses and divides the information into packets according to Bluetooth specifications. Next, your smartphone converts these packets into electromagnetic waves or photons and sends the data to the earbuds over the Bluetooth connection. The earbuds receive the data and disassemble and decompress the packets back into long sets of values. These values are then sent to the audio codec, which converts the digital values into an analog electrical waveform.
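The "divide into packets, then reassemble" step described here can be illustrated with a small Python sketch. To be clear, this is not the real Bluetooth packet format and it skips the compression step entirely: the 3-byte header, the `SAMPLES_PER_PACKET` value, and the function names `to_packets` and `from_packets` are all hypothetical, chosen only to show the idea of splitting a long list of values into numbered chunks and putting them back in order.

```python
import struct

SAMPLES_PER_PACKET = 4  # hypothetical size, not taken from any Bluetooth specification


def to_packets(samples, samples_per_packet=SAMPLES_PER_PACKET):
    """Split signed 16-bit samples into byte packets tagged with a sequence number."""
    packets = []
    for seq, start in enumerate(range(0, len(samples), samples_per_packet)):
        chunk = samples[start:start + samples_per_packet]
        header = struct.pack("<HB", seq, len(chunk))      # made-up header: sequence, count
        payload = struct.pack(f"<{len(chunk)}h", *chunk)  # little-endian 16-bit values
        packets.append(header + payload)
    return packets


def from_packets(packets):
    """Reassemble the original list of samples, even if packets arrive out of order."""
    samples = []
    for packet in sorted(packets, key=lambda p: struct.unpack_from("<H", p)[0]):
        _seq, count = struct.unpack_from("<HB", packet)
        samples.extend(struct.unpack_from(f"<{count}h", packet, 3))
    return samples


audio = [0, 1200, -1200, 32767, -32768, 15, -15, 7, 42]
packets = to_packets(audio)
restored = from_packets(list(reversed(packets)))  # simulate out-of-order arrival
print(len(packets), restored == audio)
```

The sequence number is what lets the receiving side rebuild the same long list of values regardless of arrival order.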
This waveform is then sent to the voice coil, which is attached to the back of the diaphragm. The voice coil moves back and forth depending on the given waveform, thus moving the diaphragm which in turn creates pressure waves in the air. These pressure waves are sensed by your ear and interpreted as sound by your brain. But wait, what’s a codec? Well, codec stands for coding and decoding, and in general, it’s either a piece of software or hardware that converts data or information from one format into another format thereby compressing or decompressing data. In the scenario we just talked about, the audio codec converts the music or podcast data from a set of digital values or numbers, into an analog waveform. This is the process of decoding the audio file. In this scenario, the codec performs a digital to analog conversion or DAC. However, audio codecs can also do the reverse by encoding the analog signal from the microphone, into a digital set of values, which is an analog to digital conversion or ADC. Codecs are used in every piece of technology we use in order to convert certain types of data into other types while compressing or decompressing the data. In fact, this video you’re watching is downloaded as a compressed video file, and a video codec is actively decompressing this video’s data as you watch it. Now that we have a conceptual overview, let’s take a look at how an analog sound waveform can be turned into and represented by 1’s and 0’s. Here’s an analog audio waveform. It has some peaks and troughs, it’s incredibly detailed, and, depending on the length of the audio file, it can get pretty long in duration. So, how do you turn this audio waveform into a digital long list of numbers? You might think that it involves some crazy mathematics with a ton of sines and cosines, along with multivariable equations, but it’s actually a lot less complicated than that. 
Rather, the analog audio waveform is placed on a graph, and all the values that it passes through at a set time interval are placed into a list of values. That’s it, the digital version of the audio waveform is just a long list of values or points that the waveform passes through, and in this scenario, we’re going to have 23 microseconds between each data point. When the digital information, or the long list of numbers, is fed into the audio codec, the audio codec plots all the points on the graph, connects the dots and smooths the line between the points, and sends the analog waveform to the speaker which generates sound. In fact, if you were to open up an audio file in some audio editing software and zoom in, you would see all the points that constitute the audio. A music file or podcast’s data isn’t an analog waveform like this, but rather it’s just a long set of points equally spaced apart with their associated values. The process of turning an analog waveform into a set of numbers is called digitization, or analog to digital conversion. That’s the basic concept. Next, we’ll add on a few more details, and then further, more complex details. But before that, let’s briefly talk about this video’s sponsor, PCBWay.
These earbuds contain a Rigid-Flex printed circuit board, where certain segments are flexible and able to bend, while others are rigid and able to support rather complex circuitry. PCBWay offers all kinds of PCB prototyping and manufacturing services including these Rigid-Flex PCBs, aluminum PCBs, PCBs with dozens of layers, and PCBs designed by your kid that are used with push buttons to light up an LED, or this LED cube. PCBWay also offers Turnkey PCB Assembly services to assist you in populating your PCBs. Check them out at their website linked below. Let’s move on and dive into the details of how sound is represented digitally. There are two key aspects that need to be addressed regarding this system. First, on this graph, the X axis is time, and as we mentioned, every data point, or sample of the analog audio waveform, is 23 microseconds apart. If we wanted, the spacing could be smaller, at, let’s say, 1 microsecond between values, or samples, which would yield a million samples every second, or a million hertz sampling rate, and would result in audio files that would be over a hundred megabytes for 60 seconds of audio. So, why 23 microseconds? Well, the short answer is that the spacing between each data point depends directly on the average human ear’s ability to perceive sound. The human ear can hear sounds up to around 20 kilohertz, or one wave every 50 microseconds. If the waves were closer together like this, the sound would have a higher frequency which humans wouldn’t be able to hear. Scientists and engineers decided to not really concern themselves with frequencies that humans can’t hear. So, in order to capture a waveform with a maximum 20 kilohertz frequency, at least two data points are required per full wave, and thus they use one data point every 23 microseconds, which is a rate of 44.1 kilohertz, or 44 thousand 1 hundred samples every second.
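To make the sampling numbers above concrete, here is a minimal Python sketch that reads off the values of a waveform at a fixed time interval. The 1 kilohertz test tone and the names `sample_sine` and `SAMPLE_INTERVAL_S` are our own assumptions for illustration; real audio would come from a microphone rather than a formula.

```python
import math

SAMPLE_RATE_HZ = 44_100                  # sampling rate quoted in the text
SAMPLE_INTERVAL_S = 1 / SAMPLE_RATE_HZ   # time between neighboring samples


def sample_sine(frequency_hz, duration_s, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return the values a sine wave passes through at each sample instant."""
    count = round(duration_s * sample_rate_hz)
    return [math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
            for n in range(count)]


samples = sample_sine(1_000, 0.001)  # 1 millisecond of a 1 kHz test tone

print(f"{SAMPLE_INTERVAL_S * 1e6:.2f} microseconds between samples")   # 22.68
print(f"{len(samples)} samples in 1 millisecond")                      # 44
print(f"{SAMPLE_RATE_HZ / 20_000:.3f} samples per cycle at 20 kHz")    # 2.205
```

The last line shows why 44.1 kilohertz is enough: even the highest audible frequency, around 20 kilohertz, still gets slightly more than two samples per full wave.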
Note that this sampling rate or sampling frequency of 44.1 kilohertz is the most common rate for recorded audio such as music and podcasts, and 48 kilohertz is the second most common sampling rate. But why is the number slightly more than double the frequency of human hearing? Well, you can find those specifics along with details as to why music played over the telephone always sounds terrible, and finer points on the Nyquist Theorem and aliasing in the creator’s comments. Also, we would greatly appreciate it if you could take a quick second to tell us what you think about this video in the comments below. Knowing whether you find a section confusing, boring, really interesting, whether it has great graphics, or whatever, is extremely useful, and it helps us to improve on future videos. So, thank you. Let’s get back to our long list of digital values that represent the analog waveform. Our next question is, how do we represent these values in binary? Or, essentially, how many 1’s or 0’s are we going to use for each sample? Let’s try representing each sample by using a single bit, either 1, or 0. To do that, let’s take the original waveform, assign each sample either a 1 or a 0, and here we have the resulting digital data. But how accurate is our analog to digital conversion? To check, we reassemble the graph using these values, 1’s up here, 0’s down here, and smooth a line between the points. Now we have an analog waveform created from the digital data, which was created from the original audio waveform, and… this recreation looks nothing like the original audio, and thus 1 bit isn’t good enough. So, let’s say we use two bits, which means each sample could be one of 4 different values. Let’s take the original audio, round each value to the closest 2 bits equivalent, and here we have the long set of values. 
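The 1-bit and 2-bit experiments described here can be tried out in a few lines of Python. This is only a sketch under our own assumptions: the waveform is a single sine cycle scaled to the range -1.0 to 1.0, the levels are evenly spaced, and the function name `quantize` is made up for this example.

```python
import math


def quantize(value, bits):
    """Round a value in [-1.0, 1.0] to the nearest of 2**bits evenly spaced levels.

    Returns the level index (what would be stored as bits) and the level's value.
    """
    levels = 2 ** bits
    step = 2.0 / (levels - 1)               # spacing between adjacent levels
    index = round((value + 1.0) / step)     # nearest level index
    index = max(0, min(levels - 1, index))  # keep the index inside the valid range
    return index, -1.0 + index * step


# One cycle of a test sine wave, sampled at 16 points (an arbitrary choice).
wave = [math.sin(2 * math.pi * n / 16) for n in range(16)]


def worst_error(bits):
    """Largest gap between an original sample and its quantized value."""
    return max(abs(v - quantize(v, bits)[1]) for v in wave)


for bits in (1, 2, 16):
    print(f"{bits:>2} bits: {2 ** bits:>5} levels, worst error {worst_error(bits):.5f}")
```

With 1 or 2 bits the worst-case error is a large fraction of the waveform's height, which matches how poorly the recreated waveform follows the original; each extra bit doubles the number of available levels and shrinks that error.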
When recreating the audio, we would again plot all the points on the graph, smooth out the line but… it still looks pretty bad as it doesn’t really match the original waveform. Really, the question is, what’s the optimal number of values in the vertical axis needed to accurately represent the original audio waveform? And the answer is that it varies, but an audio CD, for example, uses 16 bits for every single sample. With 16 bits that means that there are 2 to the 16 or 65,536 different values along the Y axis, or using technical jargon, we say our audio file has a bit depth of 16 bits. The process of rounding each sample to one of this fixed set of values is called quantization, and encoding each of those values as bits is called pulse code modulation. Furthermore, an audio bit depth of 16 bits is pretty common; however, higher-quality audio files use 24 or 32 bits per sample or higher. Okay, so if you came to this video wondering about MP3, AAC, WAV, FLAC, or other audio file formats, we’ll briefly talk about them here. Digital audio data, which is this long list of 16-bit values at 44.1 kilohertz, is uncompressed and takes up a lot of space at around 10 and a half megabytes per 60 seconds of 2 channels of audio. MP3 files at 320 kbps stereo reduce the file size to around 2.4 megabytes by processing every millisecond of audio, and finding elements in the uncompressed audio that humans aren’t good at hearing. The psychoacoustic algorithm finds elements in the audio that have exceptionally low volume sounds, very high-pitched sounds, or sounds very close together which the MP3 compression discards, thereby saving space and making it a lossy compression format. Lossless compression formats such as ALAC or FLAC don’t discard any data but rather compress the data by finding patterns, or redundant data, and representing those patterns more efficiently than when the audio is uncompressed. But file compression is a complex topic and we’re planning entire videos dedicated to just that topic.
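The file sizes quoted above follow from simple arithmetic, which the short Python sketch below works through. The function names `pcm_bytes` and `bitrate_bytes` are our own, and a megabyte is taken to be one million bytes.

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of uncompressed audio: samples per second x bytes per sample x channels."""
    return sample_rate_hz * (bit_depth // 8) * channels * seconds


def bitrate_bytes(kilobits_per_second, seconds):
    """Size of a stream stored at a constant bitrate."""
    return kilobits_per_second * 1000 // 8 * seconds


uncompressed = pcm_bytes(44_100, 16, 2, 60)  # 16-bit stereo at 44.1 kHz, 60 seconds
mp3 = bitrate_bytes(320, 60)                 # 320 kbps MP3, 60 seconds

print(2 ** 16)                               # 65536 levels at a 16-bit depth
print(uncompressed / 1_000_000)              # 10.584 megabytes
print(mp3 / 1_000_000)                       # 2.4 megabytes
print(round(uncompressed / mp3, 2))          # 4.41 times smaller
```

So one minute of uncompressed stereo audio is about 10.6 megabytes, and a 320 kbps MP3 of the same minute is roughly a quarter of that size.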
As mentioned in the intro, there’s a lot of technology that goes into these wireless earbuds. Thus far we have videos made on these topics, and we’re working on videos to explain these other topics. So if you’re interested in getting a complete understanding of how wireless headphones work, check out those videos on our channel page, and subscribe and hit the bell so that you’ll be notified when we release future videos on this topic. As usual there are even more details in the creator’s comments which can be found in the English Canada subtitles. Thanks again to PCBWay for sponsoring this video as well as our Patreon and YouTube Membership Sponsors for helping us to produce these videos. Thanks for watching, and finally, always remember to consider the conceptual simplicity, yet structural complexity in the world around you.