
2 years, 2 months ago
Digital and analog sound in simple words

This is a text transcript of the tenth episode (22.05.2014) of the podcast "Sound". In it, Dmitry Kabanov talks about digital and analog sound with Anatoly Dmitriyevich Arsyonov, a physicist by training, an expert in IT and digital sound, and an engineer at the F-Lab company.



Dmitry Kabanov: We continue talking to the experts and engineers of "Audiomaniya". Today we will try to dig deeper and look at the nature of digital and analog sound, and we should probably begin with the question of what sound is in principle. In basic terms, in simple words, how does analog sound differ from digital sound, or the analog representation of sound from its digital representation?

Anatoly Arsyonov: To answer this question, I think it is appropriate to use simple models that any educated person will probably remember from a school course. Strangely enough, the history of both digital and analog sound begins long ago, even before digital devices appeared. Everyone is familiar with the transmission of a human voice over an ordinary landline telephone. That is a real example of transmitting analog sound over a distance. The speaker holds a handset containing a microphone, whose membrane vibrates in step with the speaker's voice. At the opposite end the reverse happens: the membrane of the earpiece at the listener's ear vibrates in the same way.

What is transmitted over the cable? An alternating voltage signal: without going into detail, the current in the cable changes the way the person's speech does. And what is digital sound? A similar example can be given from the same era: signal transmission over a cable using Morse code. Here the sender has some text in front of him, but he has to know Morse code. Who encodes the text? A person who knows how to transmit the letter "A", the letter "B", and so on. What goes down the signal line? Dots and dashes, much as sound is encoded today with zeros and ones: two signals, two states.

What does the recipient on the other side have to do to understand and receive this message? He has to know Morse code too. He receives the dots and dashes and, knowing what they mean, works out what is being said. That is essentially the whole difference. In the first case the signal is an electrical model of the human voice; in the second we transmit symbols encoded by some convention, in this case dots and dashes. Many years later, today, we still have these two types of signal transmission, although they have moved very far from that old story.
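The Morse example can be sketched in a few lines of Python. This is only an illustration of the idea that both sides must share the same code table; the four-letter table and the function names `encode` and `decode` are assumptions made for the sketch, not anything taken from the podcast.

```python
# Illustrative, deliberately partial Morse table (only four letters).
MORSE = {"S": "...", "O": "---", "A": ".-", "B": "-..."}
# The receiver needs the same table, read in the opposite direction.
REVERSE = {code: letter for letter, code in MORSE.items()}

def encode(text):
    """Sender: turn letters into dot/dash groups separated by spaces."""
    return " ".join(MORSE[ch] for ch in text.upper())

def decode(signal):
    """Receiver: recover the letters from the dot/dash groups."""
    return "".join(REVERSE[group] for group in signal.split())

signal = encode("SOS")
print(signal)          # ... --- ...
print(decode(signal))  # SOS
```

What travels down the line is only the sequence of two symbols; the text itself is reconstructed at the far end from the shared table.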

Dmitry: So digital sound, or the digital representation of sound, can be understood as a kind of compromise that we accept when we take analog sound and convert it to digital.

Anatoly: Well, is it a compromise or not... A compromise with what? With the capabilities of the equipment? Yes, it is. With the demand of modern technology to transmit more information per unit of time over longer distances, with high quality and the possibility of subsequent correction? Yes, it is. Of course, to transmit analog sound over a long distance with high quality, the equipment needs the corresponding capacity, and I would not say that it will be cheap. It will always be material-intensive.

At a certain stage in the development of technology it became more productive to transmit signals not in explicit form, as analog equipment does, but as a kind of model, a table of numbers. Here I can give an example from a rather different area that is probably also familiar to everyone. Suppose you have a map. How can you tell a friend how to get from one point to another? You can take the map, pencil in the line showing the way you went or are going to go, and send him the map. There you are: we transmit the information in explicit form.

You can also do it another way: knowing that your friend has exactly the same map, you send a table with the coordinates of the points. What is transmitted in this case? A sheet of paper with a table on it: latitude, longitude, latitude, longitude, latitude, longitude, and so on. It is just a table of numbers. Your friend receives the table, takes his map, marks the points by their coordinates, and sees at once how to go. So what did we transmit in this case? A map with a route, or a table, a kind of code?

The same thing happens in digital equipment. An indispensable element of digital equipment is the encoder, or the decoder as it used to be called; in digital equipment it is customary to call this digital-to-analog conversion.

Dmitry: An excellent example, I think. Should we touch on the subject of storage here? Formats, understanding formats and how they differ, because there are many myths about the formats we have: lossy, lossless, compressing the file in different ways, and so on.

Anatoly: As the examples show, a digital format is a conventional form of signal transmission; in mathematical language, it is a system of formalization. The signal is transmitted in the conventional form of a mathematical model. Going deeper still, it is a matrix containing numbers that characterize the signal at each point in time.

As for sound, what do the numbers convey? They convey the range of the signal, its amplitude, its loudness. The frequencies of the signal, high and low, and how these frequencies relate to one another in timbre, and so on: this is the spectral characteristic, converted into numerical form and transmitted to the device.

In the early days of computing, the capabilities of personal computers were not very broad. Even simple tasks required sufficient memory and CPU performance, and that did not allow a digital format to represent recorded sound in detail. A simple example: if you put a sound card in a fifteen-year-old computer, connected a microphone and digitized a voice, I do not think many people would like the result, namely the quality of the recorded voice.

Objectively, why? The signal from the microphone was fed to the input of the sound card. The frequency characteristics of the digital path were rather modest at the time, so converting an analog signal, that is, sound, into a form that lets computers represent it digitally was a difficult process. Naturally, the manufacturers and developers of devices of that time, trying to save memory and processor performance, created simple schemes for encoding sound in a form that could be stored in a computer.

What did that lead to? To losses, first of all in sound quality. As computer performance, CPU speed and memory sizes grew, this problem gradually came off the agenda, but the approaches created at that time still left their mark on the development of digital equipment. At one point, in 1994 if my memory does not fail me, the Fraunhofer Institute carried out its work on creating the MP3 format. That format is still very popular today for storing music and other audio data on portable equipment, smartphones in particular.

Dmitry: Let's give a short wiki reference. MP3 (more precisely, MPEG-1/2/2.5 Layer 3, but not MPEG-3) is a third-layer codec developed by the MPEG group, a licensed file format for storing audio information. MP3 was developed by a working group of the Fraunhofer Institute led by Karlheinz Brandenburg of the University of Erlangen-Nuremberg, in cooperation with AT&T Bell Labs and Thomson.

The experimental ASPEC codec (Adaptive Spectral Perceptual Entropy Coding) formed the basis for the development of MP3. L3Enc, released in the summer of 1994, was the first MP3 encoder. A year later the first software MP3 player, Winplay3, appeared. While the algorithm was being developed, tests were run on quite specific popular songs. The main one was Suzanne Vega's song Tom's Diner. Hence the joke that "MP3 was created only so that Brandenburg could comfortably listen to his favourite song", and Vega came to be called "the mother of MP3".

Anatoly: What characterizes it? How does it differ from sound that differs from the analog signal in no way except for its conversion to digital form (we used to call such files wave files)? For those familiar with Apple computers, such files there had a format called AIFF, as far as I remember.

Dmitry: Yes, indeed.

Anatoly: Both of these file formats are simply a digital representation of analog sound. But on computers of that time such files took up a great deal of space, and only a few of them could be stored on a computer. How was MP3 different?

The mathematicians of the Fraunhofer Institute, approaching this problem, decided to simplify the mathematical model, that is, to remove from the digital model of a real sound whatever the listener would not perceive anyway. What was processed mathematically first of all? Fundamental laws of acoustics were used. One of them says, in particular: if some signal sounds, say a bell is struck or someone plays a chord on a grand piano, and at the same moment there is some quiet sound whose level differs from the first by more than 90 dB (the decibel being the unit used to measure sound pressure level), then however that sound was produced, no person, even one with wonderful ears, will hear it.

Dmitry: So that information can be thrown away.

Anatoly: Nobody will hear that sound. If the difference between the loudest and the quietest sound at a given moment is more than 90 dB, those quiet sounds can safely be deleted, cut from the recording. That is one of the methods. Specialists call what happens here the masking of a low-level signal by a higher-level signal.
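To get a feel for what 90 dB means, the rule as stated in the conversation can be written down as a small Python sketch. It is a deliberately crude illustration, not the psychoacoustic model of a real MP3 encoder; the function names and the sample amplitudes are assumptions. It relies only on the standard definition of a level difference between two amplitudes, 20·log10(a1/a2).

```python
import math

def level_difference_db(loud_amplitude, quiet_amplitude):
    """Level difference between two amplitudes, in decibels."""
    return 20.0 * math.log10(loud_amplitude / quiet_amplitude)

def drop_masked(amplitudes, threshold_db=90.0):
    """Keep only components within threshold_db of the loudest one."""
    loudest = max(amplitudes)
    return [a for a in amplitudes
            if a > 0 and level_difference_db(loudest, a) <= threshold_db]

# A 90 dB difference is an amplitude ratio of about 31,623 to 1.
print(round(10 ** (90 / 20)))          # 31623

# Hypothetical components present at the same moment.
components = [1.0, 0.5, 0.00001]       # the last one is 100 dB below the first
print(drop_masked(components))         # [1.0, 0.5]
```

The third component is a hundred thousand times weaker than the loudest one, which is why, under this rule, it can be discarded.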

Another method: as a rule, Hi-Fi equipment can record signals within a certain frequency range, to speak of frequencies precisely rather than use concepts such as high, low and mid. Signals with frequencies from 20 Hz to 20,000 Hz are the band that the equipment can reproduce. But will a person hear this whole range? If we look at it from the standpoint of human perception and introduce the term psychoacoustics, some further simplifications of the signal become possible.

[Photo: Audiomaniya installers at work.] For those who want to test their hearing and compare how different audio systems sound, Audiomaniya offers in-home equipment auditions.

Most adults, those past their teenage years, as a rule do not hear frequencies above 16 kHz. So the range above 16 kHz can also be reduced mathematically in some way, and that information can be removed from the file recorded by the digital microphone, since the listener will not adequately perceive it either. The same happens in the low range: those who study human physiology know that any person, provided he is healthy and has no pathology, does not perceive low-frequency signals below 16 Hz with the ear. He perceives such signals either by touch or through the organs of the body.

So all these sounds can also be deleted without serious consequences, without losing the essential quality of the audio signal, if it was, for example, a piece of music. In principle, quite a lot of such methods exist today: the schemes used in digital sound and in MP3 formats, the masking of a pure tone by noise, and so on.

To briefly illustrate what this means: after the digital model of an analog sound, which we see in the WAV or AIFF formats, has been converted to MP3, that is, after masking and the removal of sounds that a person cannot perceive, the sound at the intermediate stage is not very comfortable to listen to. It bears the imprint of compression, and a listener's ear, especially a musician's, can feel discomfort. So, to hide the flaws, a low-amplitude noise signal is "mixed in" at the last stage.

This is done by a special algorithm. It can be illustrated with an example: if you are in a room and someone is talking in the next room and it disturbs you, turn on the vacuum cleaner. The noise of a vacuum cleaner is a lower-frequency signal than human speech, and low-frequency signals always mask high-frequency ones, but not the other way round. You will stop hearing the intrusive talkers. Roughly the same thing happens in digital formats: at the last stage, after digitization, a noise signal of a certain amplitude and spectral content is mixed in, possibly a kind of white noise.

Dmitry: Well, then let's try to talk about the cases where we can say that we do lose something by using MP3. It is not always ideal, it does not always fit, and some classes of equipment let us afford something more.

Anatoly: Quite right. MP3, as a format for compact storage of audio data on computers and one of the oldest formats, has gradually begun to lose popularity. Why? First of all, computers have gained performance and memory, which means the need to compress audio data has disappeared. There is no such pressure any more: modern computers have enough memory and processor performance, so we can listen to uncompressed digital sound.

What steps were taken, in their time, to move away from compact storage of music? First of all, competing formats for compressed audio storage appeared. Those who use Apple computers, tablets, smartphones and iPhones know the format in which music is sold in the Apple store [iTunes]. If I am not mistaken, it is MP4, isn't it?

Dmitry: Yes.

Anatoly: Someone will say that this is digital sound too, that it is compressed too, and that it also has shortcomings. Fine. But it appeared later than MP3: work on this format began somewhere around 1997, that is, some 3-4 years after the creation of MP3. So the developers of this system for encoding compressed sound took into account the problems and shortcomings of the earlier formats and improved the product.

Why do I give these examples? Digital sound, having arisen at a certain stage with the emergence of computer devices, has gone through a certain evolution. Both formats for uncompressed storage of audio data and formats for storing compressed sound have evolved. The modern method of encoding sound in MP3 or similar formats is fairly refined.

Having gained popularity at a certain stage, the format has now effectively settled on a certain group of devices: above all portable and mobile equipment such as smartphones, phones and players. Given their small size, low power and the limited capabilities of the loudspeakers built into smartphones, it fitted into this niche naturally. If we talk about serious equipment for home listening, Hi-Fi equipment in particular, then of course not every demanding listener will agree that digital formats for storing audio data in compressed form are good.

[Photo: a fragment of an installation by Audiomaniya.] For those who do not accept digital formats that store data in compressed form, Audiomaniya has analog solutions.



To continue our conversation, it is probably appropriate to start with the characteristics of the audio interface of a modern computer, which is the basis of modern digital sound. As the conversation goes on, it will become clear how this relates to our subject, to high-end audio equipment for example. So, the sound card of a modern desktop or notebook computer has several characteristics that fully describe the computer's capabilities for storing or reproducing digital sound. What do I mean? The frequencies at which the sound card works and its bit depth. Figures such as 16 bits and 44 kHz are probably familiar to the user.

Dmitry: Of course.

Anatoly: These are the basic characteristics of the audio path of any modern computer, desktop or portable. Standard CD players have the same characteristics (that is, the same bit depth of their processors). Without going into detail, it should be said that this standard appeared long ago. Manufacturers of consumer audio equipment that are very popular here, Philips, Sony and Toshiba, developed this standard for storing audio data (16 bits and 44 kHz). As computer technology developed, sound cards acquired additional capabilities. In particular, the set of frequencies at which a sound card can work grew (48 kHz, 96 kHz, 192 kHz), and the bit depth of the processor installed on the sound card grew too: 16 bits, 24 bits...

Dmitry: 32 …

Anatoly: And now 32. In plain language, 44 kHz is the frequency needed to preserve the waveform of an audio signal, for example a piece of music or a human voice. Where did this number come from, and why does the sound card have to work at this frequency? There was a mathematician, Kotelnikov, who proved a theorem establishing the limit that lets an engineering device digitize a signal with reasonably high quality.
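The Kotelnikov theorem (known in English-language literature as the Nyquist-Shannon sampling theorem) says that a signal can be reconstructed from its samples if the sampling rate is at least twice the highest frequency it contains. The Python sketch below illustrates the consequence; it assumes the exact CD rate of 44,100 Hz, and the helper names and test frequencies are made up for the example.

```python
import math

SAMPLE_RATE = 44_100            # Hz; the CD rate, "44 kHz" in the conversation
NYQUIST = SAMPLE_RATE / 2       # highest representable frequency: 22,050 Hz

def alias_frequency(freq_hz, sample_rate=SAMPLE_RATE):
    """Frequency that a sampled tone of freq_hz cannot be told apart from."""
    folded = freq_hz % sample_rate
    return min(folded, sample_rate - folded)

def sample_cosine(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Samples of a cosine tone taken at the given sampling rate."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

print(alias_frequency(20_000))   # 20000: below the limit, preserved
print(alias_frequency(30_000))   # 14100: above the limit, folds down

# The samples of a 30 kHz tone coincide with those of a 14.1 kHz tone.
too_high = sample_cosine(30_000, 1000)
impostor = sample_cosine(14_100, 1000)
print(max(abs(a - b) for a, b in zip(too_high, impostor)) < 1e-6)  # True
```

Since the audible band ends at about 20 kHz, a rate a little above 40 kHz is enough to cover it, which is where figures like 44 kHz come from.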

It is appropriate to give an example: the simplest sound, say the sound of a pipe or a child's whistle. The shape of its audio signal resembles a sinusoid, so to speak. What is 44 kHz? It is the operating frequency of the sound card. Such a signal, on entering the sound card, is instantly cut into 44 thousand vertical strips per second. What do we get as a result of this cutting? We get the value of the signal's loudness at each point in time, every one forty-four-thousandth of a second.
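The "cutting into strips" can be sketched as follows. This is a minimal illustration that assumes the exact CD rate of 44,100 samples per second and a hypothetical 441 Hz test tone, chosen only because one period then spans exactly 100 samples.

```python
import math

SAMPLE_RATE = 44_100   # samples per second ("44 kHz" in the conversation)
TONE_HZ = 441          # hypothetical test tone: one period = 100 samples

def sample_sine(freq_hz, sample_rate, n_samples):
    """Read the value of a sine wave at evenly spaced moments in time."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

one_second = sample_sine(TONE_HZ, SAMPLE_RATE, SAMPLE_RATE)
print(len(one_second))            # 44100 "strips" for one second of sound
print(one_second[0])              # 0.0: the wave starts at zero
print(round(one_second[25], 6))   # 1.0: a quarter period later, the peak
```

Each element of the list is the height of one strip; at this point the heights are still real numbers and have not yet been rounded to a fixed number of bits.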

Dmitry: And now we need to encode all these strips.

Anatoly: Now we need to encode these strips and save them in the computer. How can we encode them? We can remember the value of the loudness in each strip. And this is where the other characteristic of the sound card comes into play: its bit depth. In particular, 16 bits. What is 16 bits? Programmers put it this way: two to the sixteenth power.

Dmitry: So.

Anatoly: What is this number, 65 thousand and change? To be precise, it means I can use one of 65,536 values, the numbers from zero to 65,535, to express the height of a strip. It will be some number: in one case 60 thousand, in another 30 thousand, and so on. So in this case we get a table containing 44 thousand numbers per second of time, each of them a number from zero to 65,535. This table is the uncompressed sound file.
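A minimal Python sketch of this table follows. It uses the unsigned range 0 to 65,535 from the conversation for simplicity (an assumption of the sketch; real 16-bit audio files usually store signed values from -32,768 to 32,767), and the 441 Hz tone and the name `quantize` are likewise illustrative.

```python
import math

SAMPLE_RATE = 44_100        # samples per second
TONE_HZ = 441               # hypothetical test tone
BITS = 16
LEVELS = 2 ** BITS          # 65536 distinct values
MAX_CODE = LEVELS - 1       # the largest code is 65535

def quantize(sample):
    """Map a sample in [-1.0, 1.0] to an integer code in 0..65535."""
    clipped = max(-1.0, min(1.0, sample))
    return round((clipped + 1.0) / 2.0 * MAX_CODE)

# One second of sound: 44,100 integer codes, one per "strip".
table = [quantize(math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE))
         for n in range(SAMPLE_RATE)]

bytes_per_second = len(table) * BITS // 8
print(LEVELS, len(table), bytes_per_second)   # 65536 44100 88200
print(min(table), max(table))                 # 0 65535
```

One second of one channel already takes 88,200 bytes, which gives a sense of why uncompressed files were a burden for early computers.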

Dmitry: Now we work with this table further …

Anatoly: What do we see here? If the sound card worked faster, we would probably get a much larger number of these values, which would describe our signal more precisely. Naturally, developers and manufacturers aspire to get closer to the true waveform. Hence the drive of equipment designers to raise the frequencies, from year to year, so to speak, from one class of devices to the next, and so on.

This development led to the frequencies gradually growing, starting from 44 kHz. "Gradually" is an unfortunate word, because the actual development was much more complicated and all sorts of frequencies were used: 32 kHz and 24 kHz as well. A listener or someone curious may ask: "And where are these frequencies used?", since it is clear that sound sampled below 44 kHz will be coarser. For example, in transmitting signals in telephone equipment, where there is no need to describe the signal very precisely. But for transmitting a complex musical signal, some concert piece, it turned out that 44 kHz does not satisfy a demanding ear. That is why the frequency characteristics of sound cards kept growing from generation to generation.
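The price of higher frequencies and bit depths is easy to estimate: for uncompressed sound, the data rate is the sampling rate times the bit depth times the number of channels. The sketch below assumes stereo (two channels) and decimal megabytes; the function names and the particular combinations of rate and depth are chosen only for illustration.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels=2):
    """Bits per second of uncompressed audio."""
    return sample_rate_hz * bits_per_sample * channels

def megabytes_per_minute(bit_rate):
    """Storage needed for one minute, in decimal megabytes."""
    return bit_rate / 8 * 60 / 1_000_000

formats = [
    ("44.1 kHz / 16 bit", 44_100, 16),
    ("48 kHz / 24 bit", 48_000, 24),
    ("192 kHz / 32 bit", 192_000, 32),
]
for name, rate, bits in formats:
    bit_rate = pcm_bit_rate(rate, bits)
    print(f"{name}: {bit_rate} bit/s, "
          f"{megabytes_per_minute(bit_rate):.1f} MB per minute")
# 44.1 kHz / 16 bit: 1411200 bit/s, 10.6 MB per minute
# 48 kHz / 24 bit: 2304000 bit/s, 17.3 MB per minute
# 192 kHz / 32 bit: 12288000 bit/s, 92.2 MB per minute
```

Going from 44.1 kHz at 16 bits to 192 kHz at 32 bits multiplies the amount of data by almost nine, so more precise description of the waveform is paid for in storage and processing.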

To finish with this subject without going into detail, it is perhaps worth giving one example: the birth of HD Audio. It was 2004, and in that year Intel developed the HD Audio specification, which comes down to two values: 32 bits and 192 kHz. So, once the HD Audio specification had been developed... what is HD, how do we expand it?

Dmitry: High definition. High resolution.

Anatoly: High resolution, that is, high-resolution audio. Such a standard can already serve as the basis for very high-quality audio equipment, for signal sources that will compete, I am not afraid to say it, with vinyl. How did the story of HD Audio's development end? Intel handed its work over to three companies that manufacture interfaces, and then, on the basis of those interfaces, the companies that make audio codecs for specific devices, from Realtek to Wolfson, developed codecs, each for its own digital signal processors.

This article is a translation of the original post at geektimes.ru/post/260806/