A tutorial for beginners in 3D audio. (Excerpt from the technical papers written during the development of Ambisonic Au...
Introduction to Ambisonics – Francesca Ortolani – Rev. 2015a
Introduction to Ambisonics A tutorial for beginners in 3D audio Francesca Ortolani
[email protected] Ironbridge Electronics (Excerpt from the technical papers written during the development of Ambisonic Auralizer)
1.1 Introduction to Surround and 3D audio techniques

Since the early years of the twentieth century, audio engineers have tried to reproduce recorded sources or live takes in a realistic manner, with the aim of giving spaciousness to sound creations. Research is divided into several branches. Even today most of the sound reproduction systems sold, whether consumer or professional, are based on two audio channels. The main reason is the high cost of amplifiers, signal processors and speakers, which has often limited the number of channels to 2 for musical productions, television, radio, etc. In cinemas and theaters the widespread use of multi-channel systems started earlier than at home, since it is easier to sell sound systems of medium-to-low quality with relatively low prices for home-theatre applications. However, only a few commercial post-production studios are suitable for multi-channel mixing. The vast majority of post-production control rooms are equipped with 2 (at most 3) speakers for stereo playback and a subwoofer. Multi-channel audio has spread mainly in film, theater and video games/virtual reality, whereas 98% of music production is still stereophonic. This is due not only to costs, but also to the spread of the audio formats among which the CD (2-ch, 44100 Hz, 16-bit), and earlier tapes and vinyl, prevailed. Engineers also tried, with very little success, to carry 4-channel information on stereo devices. Among these attempts a remarkable solution was the use of subcarriers on vinyl records. Despite the introduction of new formats such as DVD-Audio and Super Audio CD (SACD) or multichannel wave files (Wave Ex), music is still stereo almost in its entirety.

Another issue that has discouraged sound engineers from working on multichannel mixes is the need to keep the finalized audio compatible when the number of channels is scaled down.
For example, it is good practice to check how a stereo mix sounds when its channels are summed to mono. Compatibility with monaural listening is a crucial problem and it should not be underestimated. When the left and right channels are summed, phase cancellations may occur and undo countless hours of fine-tuning during mixing.
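The mono-compatibility problem can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the 1 kHz test tone, the 10 ms length and the function names are hypothetical choices, not taken from the original text. The same tone is placed on both channels with different phase offsets, and the level of the mono fold-down is measured.

```python
import math

SAMPLE_RATE = 44100  # Hz, the CD rate quoted in the text
TONE_HZ = 1000.0     # hypothetical test tone
NUM_SAMPLES = 441    # 10 ms, i.e. exactly 10 cycles of the tone


def tone(phase):
    """Sine tone with the given phase offset in radians."""
    return [math.sin(2 * math.pi * TONE_HZ * i / SAMPLE_RATE + phase)
            for i in range(NUM_SAMPLES)]


def mono_sum(left, right):
    """Fold a stereo pair down to mono by averaging the two channels."""
    return [(l + r) / 2 for l, r in zip(left, right)]


def rms(signal):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


left = tone(0.0)
mono_levels = {
    "in phase": rms(mono_sum(left, tone(0.0))),
    "90 degrees apart": rms(mono_sum(left, tone(math.pi / 2))),
    "opposite phase": rms(mono_sum(left, tone(math.pi))),
}

for label, level in mono_levels.items():
    print(f"right channel {label}: mono RMS = {level:.3f}")
```

With the channels in phase the mono sum keeps the level of the original tone; a 90 degree offset lowers it by about 3 dB, and opposite phase cancels the tone almost completely.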
Initially, for example, sound engineers were asked to preserve the sound quality when down-mixing from CD (stereo) to TV or radio (in the past these devices were mono only). However, the problem still exists today in the case of live music, where, because of the size of venues and stadiums, most listeners do not benefit from stereophonic listening. Obviously, passing from surround to stereophonic or monophonic systems is even more critical.
Figure 1.1 In order to have a correct perception of the stereophonic sound, the speaker pair and the listener must be located at the vertices of an equilateral triangle.
Over the past few decades, since the second half of the twentieth century, sound artists have tried to give more and more spatial dimension to sound and to their own artistic creations. This has led to the development of techniques aimed at rendering 3D sound, as alternatives to classic surround (according to the several standards imposed upon the market by Dolby) and stereo techniques. These include binaural audio, Wave Field Synthesis, OPSI and Ambisonics. In particular, Ambisonics can coexist with stereo and surround sound systems such as 5.1 or 7.1. This 3D audio technique was introduced in the 1970s by the team led by Gerzon, Fellgett and Barton, supported by the National Research Development Corporation and the British Technology Group. It is compatible with a wide variety of speaker array configurations (either regular-symmetrical or irregular-asymmetrical with various shapes). An in-depth explanation of the physical principles on which Ambisonics is founded is given later on; for the moment it is useful to know that this technique is in part an extension of the basic principles of the Mid-Side miking technique [1] patented by Blumlein. This technique uses sum and difference signals between a microphone (from the family of cardioids) with
Figure 1.2 Example of Mid-Side configuration - polar diagram (MID: cardioid microphone, SIDE: figure-of-8 microphone). LEFT = (MID+SIDE)/2, RIGHT = (MID−SIDE)/2 [14]
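The sum and difference relations in the caption can be sketched in a few lines of Python. The function names and the sample values are illustrative assumptions; only the two formulas LEFT = (MID+SIDE)/2 and RIGHT = (MID−SIDE)/2 come from the text.

```python
def ms_decode(mid, side):
    """Mid-Side to Left-Right: LEFT = (MID+SIDE)/2, RIGHT = (MID-SIDE)/2."""
    left = [(m + s) / 2 for m, s in zip(mid, side)]
    right = [(m - s) / 2 for m, s in zip(mid, side)]
    return left, right


def ms_encode(left, right):
    """Inverse of ms_decode: MID = LEFT+RIGHT, SIDE = LEFT-RIGHT."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


# Hypothetical sample values, for illustration only.
mid_signal = [1.0, 0.5, -0.25]
side_signal = [0.0, 0.5, 0.25]

left_signal, right_signal = ms_decode(mid_signal, side_signal)
print("LEFT: ", left_signal)
print("RIGHT:", right_signal)
print("round trip:", ms_encode(left_signal, right_signal))
```

When SIDE is zero both outputs are identical (a centred source), while a non-zero SIDE shifts the image toward one speaker.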
Figure 1.3 Quadraphonic system
Ambisonics should not be confused with "traditional" surround. First of all, Ambisonics allows height information to be included (classic surround techniques are 2D instead). The principles of acoustics on which this technique is based will be explained in detail in the next section. Furthermore, considering for example a classic quadraphonic system, while the phase difference between the signals received at the front speakers is processed quite effectively by the auditory system (at least for low frequencies), this is not the case for the rear speaker pair, so classic surround systems, quadraphonic or larger, do not allow good source localization. This is because sources in classic surround are recorded on "discrete" channels, that is, channels independent of each other, and the differences in level between channel pairs are used [2], [3]. Hence the layout of the loudspeakers relative to the listener becomes crucial: you can experience it even in a simple stereo system where, if the listener and the speakers are not placed exactly at the vertices of an equilateral triangle, the exact source localization is lost. Testing has shown that the quality of the ghost images between speaker pairs is poor if these are spaced by an angle greater than 60 degrees (i.e. the equilateral triangle mentioned above). In quadraphonic sound, for example, speakers are spaced by 90 degrees, causing a "hole in the middle" effect.

A homogeneous sound reproduction system is defined as a system in which no direction is treated with any particular preference. Typical cinema surround systems are not homogeneous: the sound coming from the front stage (screen) is usually controlled more accurately than the rear channels, since a solid match between sound and image is sought in order not to distract the audience.
We can say, however, that surround systems are coherent, within certain limits, in the sense that the sound image remains stable, that is, not subject to significant discontinuities, if the listener changes position [4]. The consistency of the front image is guaranteed by the presence of sounds uncorrelated from the rest of the system. This can be achieved, for example, by delaying and spreading the signals sent to the surround system. What we ideally want is that the reproduced sound
In Ambisonics, on the other hand, the signals sent to the speakers contain information from each microphone capsule used in the recording, combined in different proportions by a decoding matrix. The effect of spatialization here is much more robust than in traditional surround techniques, in that the sweet spot, i.e. the optimum listening position, is wider. Ambisonics is not limited to a precise number of speakers: the higher the number, the better the directional resolution you can get. The reason for that will be explained next, by introducing the concept of order in Ambisonics.
1.2 The Physics in Ambisonics
A comparison with other techniques

In sound field description, source characterization is one of the most important tasks of Auralization. Auralization involves creating audio files from simulated, measured or synthesized numerical data [5]. For example, it is possible to represent multipole or extended sources by summing a certain number of monopoles, i.e. point sources whose dimensions are much smaller than the wavelength of the incident sound wave, or by integrating over a distribution of monopoles or infinitesimally small surface elements. Each source contributes to the acoustic field in terms of sound pressure. Depending on the source distribution, a specific spatial radiation pattern is created, which depends on the position and the distance of the sources. In other words, this is expressed by Huygens' principle, which states that a wavefront can be considered as a distribution of secondary sources. For example, the 3D audio technique Wave Field Synthesis is based on this and works as the acoustic equivalent of holography. In practice, the sound field can be considered as emitted either by the original source or by secondary sources belonging to the wavefront. In mathematical terms, this is equivalent to saying that we can
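obtain the sound pressure on the area A, knowing the sound pressure $p_0$ and its gradient $\vec{\nabla} p_0$ on the boundary of A, by calculating the Kirchhoff-Helmholtz integral:

$$p(\vec{r}) = \frac{1}{4\pi} \iint_{\partial A} \left[ \vec{\nabla} p_0 \cdot \vec{n} \; - \; p_0 \, \frac{\vec{R} \cdot \vec{n}}{R} \, \frac{1 + jkR}{R} \right] \frac{e^{-jkR}}{R} \, dS_0 , \qquad \forall \vec{r} \in A \tag{1.1}$$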
where k is the wave number, $\vec{R}$ is the vector connecting the source with the listening point [6] and $\vec{n}$ is the unit vector normal to the boundary. A detailed analysis of integral (1.1) shows how each secondary source is composed of a monopole (relative to the pressure gradient signal) and a dipole (relative to the pressure signal). However, there are slight conceptual differences between the formulations by Kirchhoff-Helmholtz and Huygens. The former is more general: the shape of the boundary does not depend on the wavefront, and the Kirchhoff-Helmholtz integral itself carries information about both amplitude and phase of the acoustic signal, whereas Huygens' principle assumes that the secondary sources are located on equiphase surfaces. In practice, we can conclude that the Kirchhoff-Helmholtz integral generalizes
In practice, the Kirchhoff-Helmholtz integral is used as represented in Figure 1.4 (Wave Field Synthesis):
Figure 1.4 Application of the Kirchhoff-Helmholtz integral in holophony/WFS
The listening area is surrounded by pairs of transducers composed of a pressure microphone and a velocity/pressure gradient microphone. In section 1.8 some basics on these types of microphones are given. So, the recorded field is due to the sources external to the microphone array. Then, the Kirchhoff-Helmholtz integral can be interpreted considering that each secondary source can be split into two elemental sources:

- DIPOLE SOURCE: fed by a pressure signal $p_0$
- MONOPOLE SOURCE: fed by a pressure gradient signal $\vec{\nabla} p_0$
During playback the mirror-image operation is performed: speakers with the physical characteristics shown in Figure 1.4 are arranged in place of the microphones, that is, the pressure mics are replaced with acoustic dipole speakers (these speakers radiate both forwards and backwards) and the pressure gradient mics are replaced with monopole speakers (closed speakers radiating only forwards thus having a directional characteristic). The geometrical layout of the microphone array and the speaker array has to be the same. Each speaker is fed with the signal picked up by the respective microphone. Similarly, we can surround the source instead of the listener with the microphone array [6]. Such a system guarantees (ideally) the exact reproduction of the field within the listening area and,
Possibly, in that case, we encounter spatial aliasing. As in the time domain, spatial aliasing occurs when the signal is sampled in space taking an insufficient amount of points. Aliasing is revealed by the appearance of fictitious sources. The maximum frequency above which spatial aliasing occurs is calculated as (Nyquist Theorem):
$$f_{max} = \frac{c}{2\, d_{trans}} \tag{1.2}$$
where $d_{trans}$ is the distance between two transducers and c is the speed of sound. A signal of frequency above $f_{max}$ can produce a time difference of arrival at two transducers greater than half the signal period, while for a signal of frequency below $f_{max}$ the time difference stays within half the signal period, so the phase difference at the transducers allows an unambiguous evaluation of the time difference.

In practice, some simplifications of the Kirchhoff-Helmholtz integral and its use are made. We try to minimize the number of transducers, representing only the most important secondary sources, and we try not to use both the monopole and dipole transducers: what we normally do is send signals recorded with cardioid or figure-of-8 microphones to monopole speakers. Note that the superposition of a monopole and a dipole gives a cardioid polar characteristic. Finally, with the aim of limiting the number of spaced microphones used, it is preferred to build "virtual microphones" by processing the recorded signals, weighting the amplitudes and the delays in the time of arrival appropriately, in order to improve the resolution of the system.

A constraint on the techniques based on the theory of the Kirchhoff-Helmholtz integral (with any simplifications) is that primary sources should be absent within the listening area surrounded by the speaker array, i.e. the array is able to reproduce only external sources. This is a false problem, however, as it is possible to reverse the phase of the signals feeding the array relative to the secondary sources and in this way reproduce the internal sources, too. Therefore, we can create a concave wave front instead of a convex one [7].

Another way to describe a sound field, especially in the case of sources with spherical symmetry, is based on the decomposition of the sound field into spherical harmonics. Ambisonics is founded on this second descriptive approach.
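Returning briefly to the aliasing limit of Equation (1.2), the following Python sketch evaluates it for a few transducer spacings. The speed of sound (343 m/s, air at roughly 20 °C), the spacings and the function names are assumptions made for illustration; they do not appear in the original text.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def aliasing_frequency(d_trans, c=SPEED_OF_SOUND):
    """Frequency above which spatial aliasing occurs, Eq. (1.2): c / (2 d)."""
    if d_trans <= 0:
        raise ValueError("transducer spacing must be positive")
    return c / (2.0 * d_trans)


def max_spacing(f_max, c=SPEED_OF_SOUND):
    """Largest spacing that keeps the aliasing limit at or above f_max."""
    if f_max <= 0:
        raise ValueError("frequency must be positive")
    return c / (2.0 * f_max)


# Hypothetical spacings in metres.
for spacing in (0.05, 0.10, 0.20):
    limit = aliasing_frequency(spacing)
    print(f"d_trans = {spacing:.2f} m -> f_max = {limit:.0f} Hz")

print(f"spacing for a 20 kHz limit: {max_spacing(20000.0) * 1000:.2f} mm")
```

Halving the spacing doubles the usable bandwidth, which is why covering the whole audio band would require transducers only a few millimetres apart.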
Spherical harmonics are also used in quantum mechanics and in the study of gravitational fields, and can be found in 3D graphics applications and lighting engineering. The starting point is to express the acoustic wave equation in spherical coordinates $(r, \theta, \varphi)$, where r is the radius, θ is the azimuth and ϕ is the elevation.
The acoustic wave equation in the time domain is:

$$\nabla^2 p(r,\theta,\varphi,t) - \frac{1}{c^2} \frac{\partial^2 p(r,\theta,\varphi,t)}{\partial t^2} = 0 \tag{1.3}$$
The acoustic pressure field, due to external sources, can be developed into a Fourier-Bessel series, whose terms are weighted products of the directional functions $Y^{\sigma}_{mn}(\theta,\varphi)$ (spherical harmonics) with the radial functions $j_m(kr)$ (spherical Bessel functions of the first kind):

$$p(\vec{r}) = \sum_{m=0}^{\infty} (2m+1)\, j^m\, j_m(kr) \sum_{0 \le n \le m,\; \sigma = \pm 1} B^{\sigma}_{mn}\, Y^{\sigma}_{mn}(\theta,\varphi) \tag{1.4}$$

with m = degree and n = order (σ is the spin, whose meaning will be obvious looking at the pictures further on); k is the wavenumber, $k = \frac{2\pi f}{c}$. Equation (1.4) represents the solution of the Wave Equation in the special case of a plane wave. As shown later, the ambisonic signals in the transform domain [7] are represented by the coefficients $B^{\sigma}_{mn}$ and behave like Fourier coefficients in a Fourier series.
Note that, unlike WFS or Holophony, the sampling and the reconstruction of the sound field in
Figure 1.5 Spherical Bessel functions of the first kind.
Let's see the spherical harmonic functions in detail, analysing how ambisonic signals are obtained from these functions. Spherical harmonics are defined as:
$$Y^{\sigma}_{mn}(\theta,\varphi) = \sqrt{2m+1}\, \sqrt{(2-\delta_{0,n}) \frac{(m-n)!}{(m+n)!}}\; P_{mn}(\sin\varphi) \times \begin{cases} \cos n\theta & \text{if } \sigma = +1 \\ \sin n\theta & \text{if } \sigma = -1 \text{ (ignore if } n = 0\text{)} \end{cases} \tag{1.5}$$

where $P_{mn}(\xi)$ are the associated Legendre functions of degree m and order n, and $\delta_{pq}$ is the Kronecker delta, equal to 1 if p = q and to 0 otherwise. The associated Legendre function is defined as:
$$P_{mn}(\xi) = (1-\xi^2)^{\frac{n}{2}} \frac{d^n}{d\xi^n} P_m(\xi) = \frac{(-1)^m}{2^m\, m!} (1-\xi^2)^{\frac{n}{2}} \frac{d^{m+n}}{d\xi^{m+n}} (1-\xi^2)^m \tag{1.6}$$
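where $\xi = \sin\varphi$ when ϕ is the elevation, as in (1.5) (equivalently, ξ is the cosine of the angle measured from the vertical axis).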
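In Ambisonics some kind of normalization of Legendre functions often takes place [8]. Schmidt Semi-Normalization is defined by:

$$N_{mn} = \sqrt{(2-\delta_{0,n}) \frac{(m-n)!}{(m+n)!}} = \sqrt{e_n \frac{(m-n)!}{(m+n)!}}, \qquad e_0 = 1 \text{ if } n = 0, \quad e_n = 2 \text{ if } n \ge 1 \tag{1.7}$$

The harmonic functions can be rewritten in Schmidt semi-normalized form (SN3D) by using (1.7) as the normalization factor in (1.5), with $\tilde{P}_{mn} = N_{mn} P_{mn}$ (compared with (1.5), the factor $\sqrt{2m+1}$ is dropped):

$$Y^{\sigma}_{mn}(\theta,\varphi) = \tilde{P}_{mn}(\sin\varphi) \times \begin{cases} \cos n\theta & \text{if } \sigma = +1 \\ \sin n\theta & \text{if } \sigma = -1 \text{ (ignore if } n = 0\text{)} \end{cases} \tag{1.8}$$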
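The semi-normalized harmonics can be evaluated numerically with a short Python sketch. It is a sketch under stated assumptions rather than a reference implementation: the function names are illustrative, the Legendre functions follow the definition in (1.6) (no Condon-Shortley phase), and the normalization is taken to be the usual SN3D factor, i.e. the square root of e_n (m−n)!/(m+n)!.

```python
import math


def assoc_legendre(m, n, xi):
    """Associated Legendre function P_mn(xi) of degree m and order n (no Condon-Shortley phase)."""
    root = math.sqrt(max(0.0, 1.0 - xi * xi))
    p_nn = 1.0
    for i in range(1, n + 1):          # P_nn = (2n-1)!! (1 - xi^2)^(n/2)
        p_nn *= (2 * i - 1) * root
    if m == n:
        return p_nn
    p_prev, p_curr = p_nn, xi * (2 * n + 1) * p_nn   # degrees n and n+1
    for deg in range(n + 2, m + 1):    # upward recurrence in the degree
        p_prev, p_curr = p_curr, (
            xi * (2 * deg - 1) * p_curr - (deg + n - 1) * p_prev) / (deg - n)
    return p_curr


def sn3d_norm(m, n):
    """Schmidt semi-normalization factor (assumed SN3D convention)."""
    e_n = 1.0 if n == 0 else 2.0
    return math.sqrt(e_n * math.factorial(m - n) / math.factorial(m + n))


def sn3d_harmonic(m, n, sigma, azimuth, elevation):
    """Real spherical harmonic of degree m, order n, spin sigma (+1 or -1)."""
    if sigma not in (1, -1) or (n == 0 and sigma == -1):
        raise ValueError("sigma must be +1 or -1, and +1 when n = 0")
    radial = sn3d_norm(m, n) * assoc_legendre(m, n, math.sin(elevation))
    angular = math.cos(n * azimuth) if sigma == 1 else math.sin(n * azimuth)
    return radial * angular


# Hypothetical direction: 40 degrees azimuth, 20 degrees elevation.
az, el = math.radians(40.0), math.radians(20.0)
for m, n, sigma in [(0, 0, 1), (1, 1, 1), (1, 1, -1), (1, 0, 1), (2, 2, 1)]:
    print((m, n, sigma), round(sn3d_harmonic(m, n, sigma, az, el), 4))
```

For degree 1 the three functions reduce to cos θ cos ϕ, sin θ cos ϕ and sin ϕ, i.e. the direction cosines of the incoming sound.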
The set of spherical harmonics forms an orthonormal basis in the sense of the spherical scalar product, that is:
So, they can be linearly combined in order to define functions on the surface of a sphere.
Figure 1.6 High Order Ambisonics (up to 3rd order) - 3D view [© D. Courville]
In this situation, Equation (1.4) has to be truncated at a certain order M (for the sake of manageability), known as the order of Ambisonics. Writing it again for convenience:

$$p(\vec{r}) = \sum_{m=0}^{M} (2m+1)\, j^m\, j_m(kr) \sum_{0 \le n \le m,\; \sigma = \pm 1} B^{\sigma}_{mn}\, Y^{\sigma}_{mn}(\theta,\varphi) \tag{1.10}$$
We have seen that components Bmn are tied tightly to the acoustic pressure field and its higher-order derivatives about the origin. In a vector form we have:
The following example reveals how ambisonic signals can be obtained from the coefficients $B^{\sigma}_{mn}$. For the time being, we stop the Fourier-Bessel series at order M = 1, obtaining the signals called W, X, Y, Z, which we will define better in the next sections:

$$B_{M=1(3D)} = [\, W \;\; X \;\; Y \;\; Z \,]^T$$

- $B^{+1}_{00} = W$: relative to the PRESSURE signal
- $B^{+1}_{11} = X$, $B^{-1}_{11} = Y$, $B^{+1}_{10} = Z$: relative to PRESSURE GRADIENT signals (or to acoustic velocity)
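As a hedged illustration of the first-order signals, the Python sketch below encodes a mono sample coming from a given direction into W, X, Y, Z. It assumes that each component is the sample weighted by the corresponding degree 0 or degree 1 semi-normalized harmonic, so W carries unit gain; the W weighting of 1/√2 used by classic B-Format is deliberately not applied. The function name and the test directions are illustrative.

```python
import math


def encode_first_order(sample, azimuth, elevation):
    """Encode one mono sample into first-order components (W, X, Y, Z).

    Assumption: unit-gain W (pressure) and direction cosines for X, Y, Z
    (pressure gradient), with angles given in radians.
    """
    w = sample
    x = sample * math.cos(azimuth) * math.cos(elevation)
    y = sample * math.sin(azimuth) * math.cos(elevation)
    z = sample * math.sin(elevation)
    return w, x, y, z


# Hypothetical source directions (azimuth, elevation) in degrees.
directions = {"front": (0, 0), "left": (90, 0), "above": (0, 90), "rear": (180, 0)}
for name, (az_deg, el_deg) in directions.items():
    wxyz = encode_first_order(1.0, math.radians(az_deg), math.radians(el_deg))
    print(name, [round(v, 3) for v in wxyz])
```

W is the same for every direction, as expected for an omnidirectional pressure signal, while X, Y and Z change sign and size with the direction, like three figure-of-8 patterns along the coordinate axes.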
As can be immediately seen from the 3D illustration of the spherical harmonics (Figure 1.6), in order to achieve a higher directional resolution, the order of Ambisonics must increase. Attention! To avoid confusion, it should be noted that the order M of Ambisonics is different from the order n defined in Legendre functions: the ambisonic order corresponds rather to the degree m of the Legendre functions.

Ambisonics is not only a 3D audio technique. Sound field representation can be specialized to 2D environments. For this purpose, the sound field should be decomposed according to a system of cylindrical coordinates for a horizontal-only reproduction system. One has:
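$$\begin{aligned} p(r,\theta) &= B^{1}_{00}\, J_0(kr) + \sum_{m=1}^{\infty} \left( B^{1}_{mn} \sqrt{2} \cos m\theta + B^{-1}_{mn} \sqrt{2} \sin m\theta \right) J_m(kr) \\ &= B^{1}_{00}\, J_0(kr) + \sum_{m=1}^{\infty} \left( B^{1}_{mn}\, Y^{1(2D)}_{mn}(\theta,0) + B^{-1}_{mn}\, Y^{-1(2D)}_{mn}(\theta,0) \right) J_m(kr) \end{aligned} \tag{1.12}$$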
Also in this case we get an orthonormal basis as seen in the set of 3D equations. The functions denoted
The sound field represented by Equation (1.10) and unpacked in 2D form in Equation (1.12) can be rewritten in explicit form for a plane wave as (referring to Figure 1.7 and [9]):

$$\begin{aligned} p(\vec{r}) &= P_\theta\, e^{jkr\cos(\varphi-\theta)} = P_\theta\, J_0(kr) + 2 P_\theta \sum_{m=1}^{\infty} j^m J_m(kr) \cos m(\varphi-\theta) \\ &= P_\theta \left[ J_0(kr) + 2 \sum_{m=1}^{\infty} j^m J_m(kr) \cos(m\varphi) \cos(m\theta) + 2 \sum_{m=1}^{\infty} j^m J_m(kr) \sin(m\varphi) \sin(m\theta) \right] \end{aligned} \tag{1.14}$$
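The plane-wave expansion in (1.14) can be checked numerically. The Python sketch below compares the exact exponential with the series truncated at a given order, for a unit-amplitude wave. The power-series evaluation of the cylindrical Bessel functions, the values of kr and of the angles, and all function names are illustrative assumptions.

```python
import cmath
import math


def bessel_j(m, x, terms=40):
    """Cylindrical Bessel function J_m(x) from its power series (moderate x)."""
    term = (x / 2) ** m / math.factorial(m)
    total = term
    for k in range(1, terms):
        term *= -((x / 2) ** 2) / (k * (k + m))
        total += term
    return total


def plane_wave_exact(kr, phi, theta):
    """Unit-amplitude plane wave, left-hand side of Eq. (1.14)."""
    return cmath.exp(1j * kr * math.cos(phi - theta))


def plane_wave_truncated(kr, phi, theta, order):
    """Series of Eq. (1.14) stopped at the given order."""
    total = complex(bessel_j(0, kr))
    for m in range(1, order + 1):
        total += 2 * (1j ** m) * bessel_j(m, kr) * math.cos(m * (phi - theta))
    return total


KR, PHI, THETA = 2.0, 0.3, 0.0   # hypothetical values (angles in radians)
exact = plane_wave_exact(KR, PHI, THETA)
for order in (1, 2, 4, 8, 12):
    error = abs(plane_wave_truncated(KR, PHI, THETA, order) - exact)
    print(f"order {order:2d}: |error| = {error:.2e}")
```

The error shrinks quickly once the order exceeds kr, which is one way to see why a low ambisonic order reconstructs the field accurately only at low frequencies or close to the centre of the listening area.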
Figure 1.7 Sound wave impinging on listening position [9]
Equation (1.14) can be transformed into matrix form:

$$p(\vec{r}) = P_\theta\, B^T h \tag{1.15}$$
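where $B^T = [\, 1 \;\; \sqrt{2}\cos\theta \;\; \sqrt{2}\sin\theta \;\; \dots \;\; \sqrt{2}\cos m\theta \;\; \sqrt{2}\sin m\theta \;\; \dots \,]$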
On the other hand, the information about sound field variations relative to the listening point is included in vector h. Source directivity changes with the emitted frequency (directivity increases with increasing frequency). Furthermore, the use of microphones of the family of cardioids simplifies expressions (1.10) and (1.14) in such a way that, separating spatial dependence from frequency dependence, one has:

$$p(\theta) = \sum_{m=0}^{\infty} W_m(\omega) \sum_{0 \le n \le m,\; \sigma = \pm 1} B^{\sigma}_{mn}\, Y^{\sigma}_{mn}(\theta) \tag{1.16}$$
where $W_m(\omega)$ is the weighting factor:

$$W_m(\omega) = j^m \left( \alpha\, j_m(k r_{MIC}) - j\,(1-\alpha)\, j'_m(k r_{MIC}) \right) \tag{1.17}$$
Equation (1.17) highlights how the recorded field depends on the frequency (in fact, k is the wave number $k = \frac{2\pi}{\lambda}$, where λ is the wavelength $\lambda = \frac{c}{f}$, c is the speed of sound and f is the frequency).
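The frequency dependence of the weighting factor can be made concrete with a Python sketch. Several things are assumed here and are not taken from the original: Equation (1.17) is read with the bracketing W_m = j^m (α j_m − j (1−α) j'_m), the spherical Bessel functions are evaluated from their power series, the derivative uses the identity j'_m(x) = (m/x) j_m(x) − j_{m+1}(x), and the capsule radius of 5 cm, the speed of sound and the function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def spherical_jn(m, x, terms=40):
    """Spherical Bessel function j_m(x) from its power series (moderate x)."""
    term = x ** m
    for odd in range(1, 2 * m + 2, 2):   # divide by (2m+1)!!
        term /= odd
    total = term
    for k in range(1, terms):
        term *= -(x * x / 2) / (k * (2 * m + 2 * k + 1))
        total += term
    return total


def spherical_jn_prime(m, x):
    """Derivative j'_m(x) = (m / x) j_m(x) - j_{m+1}(x), valid for x > 0."""
    return (m / x) * spherical_jn(m, x) - spherical_jn(m + 1, x)


def weighting(m, freq_hz, r_mic, alpha, c=SPEED_OF_SOUND):
    """Weighting factor W_m for frequency freq_hz > 0 (assumed reading of Eq. 1.17)."""
    kr = 2 * math.pi * freq_hz / c * r_mic
    return (1j ** m) * (alpha * spherical_jn(m, kr)
                        - 1j * (1 - alpha) * spherical_jn_prime(m, kr))


R_MIC = 0.05   # m, hypothetical distance of the capsules from the centre
ALPHA = 0.5    # cardioid capsules

for freq in (100.0, 1000.0, 5000.0):
    magnitudes = [abs(weighting(m, freq, R_MIC, ALPHA)) for m in range(3)]
    print(f"{freq:6.0f} Hz:", ", ".join(f"|W_{m}| = {v:.4f}" for m, v in enumerate(magnitudes)))
```

At low frequency only the degree 0 and degree 1 weights are significant; the higher degrees become appreciable only as k r_MIC grows, so any equalization of the recorded components has to be frequency dependent.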
Basically, when miking a source, besides considering source directivity, we must consider the microphone polar characteristic with respect to the frequency. In light of the above, recorded ambisonic signals correspond to the coefficients $B^T$ weighted by the function $W_m(\omega)$, which depends on frequency. Equation (1.17) was obtained by weighting Equation (1.10) by a cardioid characteristic function of the kind $G(\theta) = \alpha + (1-\alpha)\cos\theta$. Remember that a cardioid microphone is generated by the superimposition of an omnidirectional microphone (responsive to
1.3 Ambisonic Formats

As we have seen, in Ambisonics, sound directional components are encoded vectorially in a set of spherical harmonics. This section shows how audio signals are recorded and processed in Ambisonics. Ambisonics is not limited to a particular number of channels: a greater number of channels provides a higher directional resolution. In Ambisonics several formats exist for microphone recording, broadcasting and reproduction of recorded signals.

- A-Format: suitable for miking with a specific microphone (e.g. Soundfield mic);
- B-Format: suitable for miking and processing with studio equipment;
- C-Format/UHJ: suitable for mono, stereo, 3-channel systems and broadcasting;
- D-Format: suitable for decoding and playback through an array of speakers;
- G-Format: like D, but a decoder is not required.
A-Format
A-Format is obtained by recording four signals with a microphone equipped with four sub-cardioid capsules mounted on the faces of a tetrahedron and oriented as shown in Figure 1.8a:
A sub-cardioid capsule is characterized by a polar diagram of the type shown in Figure 1.9 and is described by the equation:
$$\rho(\theta) = 0.7 + 0.3\cos\theta$$
where $\theta$ is the angle of incidence of the acoustic wave.
Figure 1.9 Polar diagram of a subcardioid capsule.
The summary table below includes microphones from the family of cardioids, their polar characteristics and the equations describing them:

POLAR DIAGRAM   TYPE OF MICROPHONE                       EQUATION
[figure]        Family of cardioids (general equation)   $\rho(\theta) = \alpha + (1 - \alpha)\cos\theta$
[figure]        OMNIDIRECTIONAL                          $\rho(\theta) = 1$
[figure]        SUB-CARDIOID                             $\rho(\theta) = 0.7 + 0.3\cos\theta$
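The general equation of the cardioid family, ρ(θ) = α + (1 − α) cos θ, can be evaluated numerically to compare patterns. The following is a minimal sketch: the function name `polar_gain` and the dictionary `PATTERNS` are illustrative names, not part of the original text, and only the two α values listed in the table (1 for omnidirectional, 0.7 for sub-cardioid) are used.

```python
import math


def polar_gain(alpha: float, theta_deg: float) -> float:
    """Gain of a cardioid-family capsule: rho = alpha + (1 - alpha) * cos(theta)."""
    theta = math.radians(theta_deg)
    return alpha + (1.0 - alpha) * math.cos(theta)


# alpha values taken from the table above (hypothetical container name)
PATTERNS = {"omnidirectional": 1.0, "sub-cardioid": 0.7}

for name, alpha in PATTERNS.items():
    # Sample the pattern on-axis, at the side and at the rear
    gains = [round(polar_gain(alpha, angle), 3) for angle in (0, 90, 180)]
    print(name, gains)
```

With α = 0.7 the rear gain is 0.7 − 0.3 = 0.4, so a sub-cardioid capsule still picks up sound arriving from behind, only attenuated.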
B-Format
B-Format consists of four signals called W, X, Y, Z. As already mentioned above, signal W is related to the pressure component of the sound field in all directions, while X and Y refer to the components of velocity on the horizontal plane and Z to the vertical component of velocity. Microphone takes in B-Format are obtained using three figure-of-8 microphones for signals X, Y, Z and an omnidirectional microphone for signal W. The 0° axis of microphone X points toward the source (it is equivalent to MID in MS); microphone Y is rotated by 90° with respect to X, so that its 0° axis points leftwards (it is equivalent to SIDE in MS). Microphone Z is oriented orthogonally to the plane described by the axes X and Y (its 0° axis points upwards). Figure 1.10 shows the microphone layout just described:
$$\begin{aligned} W &= S \\ X &= S\sqrt{2}\,\cos\theta\cos\phi \\ Y &= S\sqrt{2}\,\sin\theta\cos\phi \\ Z &= S\sqrt{2}\,\sin\phi \end{aligned} \qquad (1.18)$$

where S is the recorded source [1]. The reader is referred to Chapter 3 for further explanations about the factor $\sqrt{2}$.
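As an illustration of Equation (1.18), the sketch below encodes a single source sample into the four B-Format signals. It assumes the equation reads W = S, X = S√2 cos θ cos ϕ, Y = S√2 sin θ cos ϕ, Z = S√2 sin ϕ, with θ the azimuth (positive toward the left, in line with the orientation of microphone Y) and ϕ the elevation. The function name `encode_b_format` and the use of degrees for the arguments are choices made here for readability, not part of the original text.

```python
import math


def encode_b_format(s: float, azimuth_deg: float, elevation_deg: float):
    """Encode one source sample s into (W, X, Y, Z) following Equation (1.18)."""
    theta = math.radians(azimuth_deg)    # azimuth, positive toward the left
    phi = math.radians(elevation_deg)    # elevation above the horizontal plane
    root2 = math.sqrt(2.0)
    w = s                                            # pressure (omnidirectional) component
    x = s * root2 * math.cos(theta) * math.cos(phi)  # front-back component
    y = s * root2 * math.sin(theta) * math.cos(phi)  # left-right component
    z = s * root2 * math.sin(phi)                    # up-down component
    return w, x, y, z


# A source straight ahead, one to the left and one overhead
for az, el in ((0.0, 0.0), (90.0, 0.0), (0.0, 90.0)):
    print((az, el), [round(v, 3) for v in encode_b_format(1.0, az, el)])
```

Whatever the direction, X² + Y² + Z² equals 2S² under this convention, so the directional channels together carry twice the energy of W.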
B-Format can be derived from A-Format through the following transformation:

$$\begin{aligned} X &= 0.5\,[(LF - LB) + (RF - RB)] \\ Y &= 0.5\,[(LF - RB) - (RF - LB)] \\ Z &= 0.5\,[(LF - LB) + (RB - RF)] \\ W &= 0.5\,(LF + LB + RF + RB) \end{aligned} \qquad (1.19)$$
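The transformation of Equation (1.19) amounts to sums and differences of the four capsule signals. The sketch below assumes the equation reads X = 0.5[(LF − LB) + (RF − RB)], Y = 0.5[(LF − RB) − (RF − LB)], Z = 0.5[(LF − LB) + (RB − RF)], W = 0.5(LF + LB + RF + RB); the function name `a_to_b_format` is hypothetical, and the arguments may be single samples or any objects that support addition and scaling.

```python
def a_to_b_format(lf, rf, lb, rb):
    """Convert A-Format capsule signals (LF, RF, LB, RB) to B-Format (W, X, Y, Z)."""
    w = 0.5 * (lf + lb + rf + rb)        # sum of all capsules: omnidirectional
    x = 0.5 * ((lf - lb) + (rf - rb))    # front minus back
    y = 0.5 * ((lf - rb) - (rf - lb))    # left minus right
    z = 0.5 * ((lf - lb) + (rb - rf))    # up minus down
    return w, x, y, z


# Identical signals on all capsules leave only the omnidirectional component
print(a_to_b_format(1.0, 1.0, 1.0, 1.0))
# Signal on the two front capsules only
print(a_to_b_format(1.0, 1.0, 0.0, 0.0))
```

The first call shows that a perfectly diffuse, in-phase pickup contributes to W alone, while the second produces equal W and X with no Y or Z.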
Signal W, being omnidirectional, is given by the sum of the contributions from the four capsules. Moreover, the recorded signal W can be used to reinforce the lower frequencies, since other types of microphones do not have as full a low-frequency response as omnidirectional microphones do. Extensions of B-Format were introduced for high-definition TV: BF and BEF Formats, which include the additional channels E and F, redundant in content with respect to the channels W, X, Y, Z and used to bolster the stability of the front image and sharpen front/rear separation.
$$\begin{aligned} \Sigma &= 0.9397\,W + 0.1856\,X \\ \Delta &= j\,(-0.3420\,W + 0.5099\,X) + 0.6555\,Y \\ T &= j\,(-0.1432\,W + 0.6512\,X) - 0.7071\,Y \\ Q &= 0.9772\,Z \end{aligned} \qquad (1.20)$$

$$\begin{aligned} W &= 0.982\,\Sigma + 0.197\,j\,(0.828\,\Delta + 0.768\,T) \\ X &= 0.419\,\Sigma - j\,(0.828\,\Delta + 0.768\,T) \\ Y &= 0.187\,j\,\Sigma + 0.796\,\Delta - 0.676\,T \\ Z &= 1.023\,Q \end{aligned} \qquad (1.21)$$

where j denotes a 90-degree phase advance. As has been said, only signals L and R are exploited in stereo-compatible systems:

$$L = 0.5\,(\Sigma + \Delta), \qquad R = 0.5\,(\Sigma - \Delta)$$
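A quick way to check that the decoding coefficients undo the encoding ones is to work with single-frequency phasors, for which the 90-degree phase advance j is simply multiplication by the imaginary unit. The sketch below makes that simplifying assumption; on real audio the operator j would have to be realized with wide-band phase-shift filters, which are not shown here. The function names are hypothetical, and `stereo_pair` uses the standard UHJ relation R = 0.5(Σ − Δ) as the counterpart of L = 0.5(Σ + Δ).

```python
def uhj_encode(w, x, y, z):
    """B-Format phasors -> UHJ signals (Sigma, Delta, T, Q)."""
    sigma = 0.9397 * w + 0.1856 * x
    delta = 1j * (-0.3420 * w + 0.5099 * x) + 0.6555 * y
    t = 1j * (-0.1432 * w + 0.6512 * x) - 0.7071 * y
    q = 0.9772 * z
    return sigma, delta, t, q


def uhj_decode(sigma, delta, t, q):
    """UHJ signals (Sigma, Delta, T, Q) -> B-Format phasors."""
    shifted = 1j * (0.828 * delta + 0.768 * t)   # common phase-shifted term
    w = 0.982 * sigma + 0.197 * shifted
    x = 0.419 * sigma - shifted
    y = 0.187j * sigma + 0.796 * delta - 0.676 * t
    z = 1.023 * q
    return w, x, y, z


def stereo_pair(sigma, delta):
    """Two-channel (L, R) signals derived from Sigma and Delta."""
    return 0.5 * (sigma + delta), 0.5 * (sigma - delta)


b_in = (1.0, 0.5, -0.3, 0.2)                      # arbitrary test phasors
b_out = uhj_decode(*uhj_encode(*b_in))
for name, original, recovered in zip("WXYZ", b_in, b_out):
    print(name, original, abs(recovered - original))  # residual error per channel
```

Because the published coefficients are rounded to three or four digits, the recovered signals match the originals only approximately; the residuals printed by the loop stay well below one percent.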
2    No    Stereo    CD, Stereo Radio, 2-Ch systems    LR             -    -
1    No    Mono      Radio                             LR (summed)    -    -

Table 1.2 Summary table of the hierarchical UHJ system, C-Format [10].
D-Format
D-Format is the format that made Ambisonics compatible with common surround speaker systems, such as 5.1 and 7.1, but also with arrays of different sizes and geometries (either regular or irregular). Signals in D-Format can be derived from either B-Format or C-Format with the use of a decoder. The number of speakers is not limited in theory. The minimum requirement, however, is 4 speakers for adequate surround playback; 6 is better, and full periphony (and therefore the information relative to height) can be obtained with 8 speakers. For example, in a periphonic (i.e. 3D) system with B-Format input signals, the i-th loudspeaker will
1.4 Higher Order Ambisonics
B-Format stops at first order. The reproduction accuracy of the sound field increases with increasing order. Table 1.3 includes the Furse-Malham and Schmidt (SN3D) coefficients, used to encode the ambisonic channels of order higher than 1. Spherical harmonics are represented in this form [8]:

$$Y^{\sigma}_{mn}(\theta, \phi) = \tilde{P}_{mn}(\sin\phi) \times \begin{cases} \cos n\theta & \text{if } \sigma = 1 \\ \sin n\theta & \text{if } \sigma = -1 \text{ (ignore if } n = 0\text{)} \end{cases} \qquad (1.26)$$

where $\tilde{P}_{mn}$ are the semi-normalized Legendre functions of degree m and order n. This formulation is called SN3D encoding (SN2D in the 2-D modification) and it agrees with 1st order Ambisonics (see Paragraph 1.1), with the exception of the weight 0.707 applied to signal W. Daniel's modification, called MaxNormalization (MaxN), is followed by Furse-Malham (FuMa)
Order   m,n,σ    Channel   SN3D definition                                         FuMa weight
2       2,2,-1   V         $(\sqrt{3}/2)\,\sin(2\theta)\cos^2\phi$                 $2/\sqrt{3}$
3       3,0,1    K         $\sin\phi\,(5\sin^2\phi - 3)/2$                         $1$
3       3,1,1    L         $\sqrt{3/8}\,\cos\theta\cos\phi\,(5\sin^2\phi - 1)$     $\sqrt{45/32}$
3       3,1,-1   M         $\sqrt{3/8}\,\sin\theta\cos\phi\,(5\sin^2\phi - 1)$     $\sqrt{45/32}$
3       3,2,1    N         $(\sqrt{15}/2)\,\cos(2\theta)\sin\phi\cos^2\phi$        $3/\sqrt{5}$
3       3,2,-1   O         $(\sqrt{15}/2)\,\sin(2\theta)\sin\phi\cos^2\phi$        $3/\sqrt{5}$
3       3,3,1    P         $\sqrt{5/8}\,\cos(3\theta)\cos^3\phi$                   $\sqrt{8/5}$
3       3,3,-1   Q         $\sqrt{5/8}\,\sin(3\theta)\cos^3\phi$                   $\sqrt{8/5}$

Table 1.3 SN3D Definitions and FuMa Weights for ambisonic signals up to third order.

1.5 Near sources
$$F^{\sigma}_{mn}(\omega) = \sum_{n=0}^{m} \frac{(m+n)!}{(m-n)!\,n!} \left(\frac{-jc}{\omega R}\right)^{n} \qquad (1.29)$$

where c is the speed of sound and $\omega = 2\pi f$.
It is deduced that $F^{\sigma}_{mn}(\omega)$ has a gain that tends to infinity at low frequencies. One can plainly see that compensation through the weighting factors mentioned in Paragraph 1.4 also helps to solve this problem (the gain is no longer infinite). In this manner, with the use of this formulation, it is possible to reproduce sources which are internal to the speaker array, since it allows the reconstruction of concave, plane and convex wave fronts. It is important, however, to know the dimension of the array beforehand when coding.
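The low-frequency behaviour of Equation (1.29) can be seen by evaluating the sum numerically. The sketch below implements the series exactly as written, that is, the sum over n from 0 to m of (m+n)! / ((m−n)! n!) multiplied by (−jc / (ωR))ⁿ. The function name `near_field_gain`, the value of 343 m/s for the speed of sound and the 1 m value chosen for R are assumptions made for illustration only.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def near_field_gain(m: int, freq_hz: float, radius_m: float) -> complex:
    """Evaluate the series of Equation (1.29) for order m at one frequency."""
    omega = 2.0 * math.pi * freq_hz
    ratio = -1j * SPEED_OF_SOUND / (omega * radius_m)   # the term -jc / (omega R)
    total = 0j
    for n in range(m + 1):
        coeff = math.factorial(m + n) / (math.factorial(m - n) * math.factorial(n))
        total += coeff * ratio ** n
    return total


# Magnitude for orders 0..3 at three frequencies, with R = 1 m (illustrative)
for f in (20.0, 200.0, 2000.0):
    print(f, [round(abs(near_field_gain(m, f, 1.0)), 2) for m in range(4)])
```

Order 0 always returns unity, while for every higher order the magnitude grows as the frequency decreases, and grows faster the higher the order. This is the unbounded bass gain that the text says must be compensated.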
1.6 Pressure Microphones and Pressure Gradient Microphones
We conclude this section with some useful notes about the microphones that are commonly used in ambisonic arrays [11].
Pressure Microphones
These microphones expose only their front face to the sound field and respond in the same way to changes in the acoustic pressure for all the directions of the incident sound. In effect, pressure omnidirectional
Mic behaviour in the presence of plane waves
Figure 1.12 shows how, in the presence of a plane wave front, sound affects points A and B with the same strength but with a phase difference. With a constant sound pressure, the angle swept by the sound and the pressure gradient increase with frequency. In Figure 1.12b the acoustic wave has a frequency approximately twice that of Figure 1.12a and the same
Students at singing schools are taught that bringing the microphone closer to the mouth enhances low frequencies. This is called the proximity effect, and it is noticeable especially at low frequencies, for which the forces acting on the diaphragm are weaker because the phase shift is smaller than at high frequencies.
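The size of the proximity effect can be estimated with the textbook point-source model, which is an assumption brought in here rather than a formula from the text. For a spherical wave the pressure gradient exceeds that of a plane wave of equal pressure by the factor |1 + 1/(jkr)|, where k = 2πf/c is the wavenumber and r the distance from the source. The function name `proximity_boost_db` and the speed of sound value are illustrative.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def proximity_boost_db(freq_hz: float, distance_m: float) -> float:
    """Pressure-gradient level of a spherical wave relative to a plane wave, in dB."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND   # wavenumber
    kr = k * distance_m
    # 20*log10(|1 + 1/(j*kr)|) = 10*log10(1 + 1/kr^2)
    return 10.0 * math.log10(1.0 + 1.0 / (kr * kr))


# Compare a close (5 cm) and a distant (1 m) source at three frequencies
for distance in (0.05, 1.0):
    boosts = [round(proximity_boost_db(f, distance), 1) for f in (50.0, 500.0, 5000.0)]
    print(distance, boosts)
```

Under this model the boost is large only when kr is small, which happens at low frequencies and short distances, in agreement with the singer's experience described above.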
Bibliography
[1] F. Rumsey, Spatial Audio, Focal Press, 2001.
[2] M. A. Gerzon, "A year of surround sound," Hi-Fi News, August 1971.
[3] M. A. Gerzon, "A year of surround sound," Wireless World, December 1974.