Acoustic Contrast, Planarity and Robustness of Sound Zone Methods Using a Circular Loudspeaker Array a)

Since the mid 1990s, acoustics research has been undertaken relating to the sound zone problem (using loudspeakers to deliver a region of high sound pressure while simultaneously creating an area where the sound is suppressed) in order to facilitate independent listening within the same acoustic enclosure. The published solutions to the sound zone problem are derived from areas such as wave field synthesis and beamforming. However, the properties of such methods differ and performance tends to be compared against similar approaches. In this study, the suitability of energy focusing, energy cancellation, and synthesis approaches for sound zone reproduction is investigated. Anechoic simulations based on two zones surrounded by a circular array show each of the methods to have a characteristic performance, quantified in terms of acoustic contrast, array control effort and target sound field planarity. Regularization is shown to have a significant effect on the array effort and achieved acoustic contrast, particularly when mismatched conditions are considered between calculation of the source weights and their application to the system.


I. INTRODUCTION
In recent years, personal electronic devices such as laptop computers, tablet computers, portable music and video players, smartphones, navigation devices, and gaming consoles have become commonplace for consuming audiovisual content. Consequently, multiple conflicting sound streams are increasingly required to be auditioned in the same acoustic space. The presence of competing audio programs in this scenario has a detrimental effect on the listening experience of each listener. While headphones could be used to create isolated listening conditions, they impede communication among listeners sharing the space. In contrast to headphones, loudspeaker systems operating at moderate levels still allow normal conversation and relatively good audibility of any background sounds. It would therefore be ideal if each listener could have their own audio program delivered to them via loudspeakers, but in such a way that the interference between adjacent listening regions is minimized. In the first instance, the scenario is considered for the case of two listeners. In order to produce such sound zones, it is first necessary to use the loudspeaker array to create a region of high sound pressure (the bright zone) and a region of low sound pressure (the dark zone) in the enclosure. The reverse situation can then be engineered, and the total sound field is achieved by superposition.
Such sound zoning principles and motivation were first introduced by Druyvesteyn and Garas (1997), who proposed that through a combination of active noise control (at low frequencies), loudspeaker array processing (at mid frequencies) and directional sound (at high frequencies), a full-band sound zone solution could be achieved. While directional sound and active noise control techniques are well established, there are a number of approaches to array signal processing that warrant an investigation into which is the most appropriate for the critical mid-range band. This paper therefore focuses on such techniques.
Array signal processing techniques for sound zoning are derived from two approaches: Sound field synthesis, where the entire sound field controlled by the array can be specified, and beamforming, where the array instead focuses the sound energy in a target direction. Under the sound field synthesis approach, the desired reproduced field can in theory be arbitrarily chosen. Traditionally, this approach has been applied to create spatial effects, but it can be applied to the sound zone problem by attenuating the sound pressure amplitude over a particular region. Two main approaches have been used for sound zones. Wu and Abhayapala (2011) developed an analytical approach where the sound field coefficients of multiple zones are translated onto the global sound field. The global sound field representation allows source weights to be calculated by existing synthesis approaches such as least-squares mode matching or wave field synthesis, and the basis functions may be selected depending on the dimensionality of the problem and the source geometry. As an alternative, Kirkeby and Nelson (1993) directly minimized the error between the desired field specified at the microphone positions and the reproduced sound field. This concept was applied for multi-zone reproduction by Poletti (2008) and more recently by Radmanesh and Burnett (2013), who used an irregular source array selected by a prior optimization step. Such least-squares optimization can be based on measured transfer functions, which lifts many of the constraints on source positions imposed by the analytical approaches and is referred to herein as pressure matching (PM). PM represents a logical extension of binaural techniques such as crosstalk cancellation (e.g., Bai and Lee, 2006; Akeroyd et al., 2007) over a larger spatial region by allowing the definition of complex pressures, for instance, to define a plane wave propagating across the bright zone.
a) Portions of this work were presented in P. Coleman, P. J. B. Jackson, M. Olik, M. Olsen, M. Møller, and J. A. Pedersen, "The influence of regularization on anechoic performance and robustness of sound zone methods," in Proceedings of Meetings on Acoustics, Vol. 19, 2013. Presented at ICA 2013, Montreal, 2-7 June 2013.
b) Author to whom correspondence should be addressed. Electronic mail: p.d.coleman@surrey.ac.uk
Beamforming-based approaches have seen significant advances in recent years. From the classical analytical approach of delay and sum beamforming, super-directive approaches based on constrained optimization of sound pressure have emerged, utilizing the acoustic transfer functions between the loudspeaker array and control microphones. Choi and Kim (2002) proposed two constrained optimization cost functions pertaining to sound zones: an optimized beamformer, brightness control (BC), for focusing the energy in a particular direction, and acoustic contrast control (ACC), achieving suppression in so-called dark zones in addition to the sound focusing. The latter technique is noted here to be a cancellation method, distinct from beamforming, as it creates cancellation regions in addition to focusing the sound energy. ACC has been the foundation of much subsequent sound zone attention and has been applied to personal computers (Chang et al., 2009a; Chang et al., 2009b), aircraft seats (Elliott and Jones, 2006; Jones and Elliott, 2008), and hand held devices (Elliott et al., 2010). An alternative cancellation method known as acoustic energy difference maximization (AEDM) was proposed by Shin et al. (2010) with a modified cost function negating the need for matrix inversion and allowing for adjustment of the array control effort via a parameter in the cost function.
These approaches to sound field control have generally been evaluated with respect to other studies in the same domain, and in each domain, a primary metric has emerged. For the methods derived from sound field synthesis, this is the reproduction accuracy, and for the sound energy based methods it is the sound pressure level difference between the zones. Recent studies have begun to bridge the gap between the domains: Jacobsen et al. (2011) compared an analytical synthesis approach with ACC under anechoic conditions and under experimental conditions using pure tones. This study highlighted the difference in the spatial properties of ACC in the bright zone compared to the plane wave synthesis approach, although it was not quantified. ACC and PM approaches have also been compared for line arrays by Simón Gálvez et al. (2012). Hybrid methods have emerged, with Chang and Jacobsen (2012) using a weighted PM motivated by ACC and Møller et al. (2012) combining AEDM and PM, in each case attempting to find an appropriate balance between control of the bright zone sound field and cancellation between zones. Similarly, Betlehem and Teal (2011) devised a constrained optimization approach that minimized the reproduction error in the bright zone and the squared pressures in the dark zones. Nevertheless, a detailed comparison between approaches does not currently exist in the literature. In particular, it is not clear how they compare under common design constraints such as the number of loudspeakers, limitations on control effort or common zone size.
Alongside the selection of an appropriate cost function for sound zone optimization, a suitable regularization scheme must be used. Regularization has two key functions: To improve the condition number of the matrix for inversion (reducing the impact of numerical errors), and to constrain the effort required by the array to reproduce the specified sound field (reducing the overall sound energy in the enclosure and thereby the impact of reflections in a real room, limiting the drive of each loudspeaker resulting in more realizable filters, and reducing the influence of calibration/setup errors). If there is too little regularization, the conditioning of the matrix will remain poor, and the effort may be excessive. If there is too much, the effort will be well controlled, but significant errors in the solution will reduce the contrast performance. Furthermore, the condition number of the matrix is highly dependent on the system geometry and varies as a function of frequency (Takeuchi and Nelson, 2002).
Many methods for determining the value of a frequency-dependent regularization parameter have been proposed. Bai and Lee (2006) and Elliott et al. (2012) implemented a "hard" control effort constraint, adjusting the parameter until the effort fell below a threshold. This method has a well defined physical motivation, which can be set in relation to the system under consideration. Elliott et al. (2012) also considered the regularization effect in relation to the acoustic contrast and the control effort for energy cancellation methods applied to small sound zone systems with up to 3 sources. Kirkeby et al. (1996) maintained a certain ratio between the largest eigenvalue of the matrix to be inverted and the regularization parameter, citing a ratio of 1000-5000 as a rule of thumb. This method has the advantage of being simple and direct, although a judicious choice of the target ratio must be made ahead of time. Optimal trade-offs between effort and reproduction error such as the L-curve (Hansen, 1992) and Generalized Cross-Validation (Golub et al., 1979) (compared by Nelson, 2001; Kim and Nelson, 2004, for acoustic inverse problems) can also be used, although the relationship between the reproduction error and control effort is less clear for multiple-zone systems than for single zone ones. The effect of regularization is comparable with using a pseudo-inverse approach (based on a truncated singular value decomposition) and modifying the threshold for a singular value being discarded, but the modal control is more continuous using the regularization approach and it has a clearer physical definition.
In addition to these design problems, the challenge remains to implement a system that is as robust as possible. Sound zone implementations require robustness to many kinds of degradations, for example, scattering, measurement noise, and varying experimental conditions. The robustness of some techniques to errors has been considered in the literature. Chang et al. (2009b) studied the degradations due to scattering based on a realization of ACC. Park et al. (2013) studied the effects of transfer function errors on performance of BC and ACC, and Elliott et al. (2012) considered the robustness of their array to uncertainties in the environment and the movement of a single loudspeaker. However, system robustness, regularization, and the corresponding effort have not been compared among approaches under uniform conditions.
Here, motivated by the need for a comparison among approaches, we compare three representative methods which can all be formulated as optimization problems based on measured transfer functions: BC (beamforming), ACC (energy cancellation), and PM (sound field synthesis). Such a selection of multi-point control methods allows us to consider some of the current active research topics proposed by Spors et al. (2013); for instance, we use methods that may be applied to flexible source geometries, we study behavior at frequencies above the spatial aliasing limit of the arrays, and we consider optimal regularization for sound zone reproduction. This study therefore extends the scope of the current literature first by scrutinizing the strengths and weaknesses of each approach by assessing the performance when applied under consistent geometries and regularization conditions, second by using an ensemble of metrics to evaluate both the acoustic contrast between the zones and spatial aspects of reproduction in addition to array effort, third by presenting simulation results demonstrating the significance of regularization on sound zone performance, and fourth by presenting simulations whereby errors are introduced to the system to explore the relative robustness of each method under comparable error conditions.
In Sec. II, the sound zone problem is elaborated and evaluation metrics are defined. In Sec. III, the optimization cost functions are presented. The simulation conditions and comparative performance under ideal conditions are presented in Sec. IV, and the effect of regularization on performance and robustness to systematic errors is treated in Sec. V. Finally, there is a brief discussion, and the conclusions are drawn.

II. BACKGROUND
In this section the sound zone system considered is detailed, and the evaluation metrics are described.
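A. Sound zone problem definition
Figure 1 shows an example sound zone system layout. Two audio programs A and B are to be reproduced in zones A and B, respectively. The rest of the room is uncontrolled. The zones (defined by the control microphone positions) and loudspeakers may be placed arbitrarily in the room.
For each frequency, the source weights can be written in vector notation as $\mathbf{q} = [q_1, q_2, \ldots, q_L]^T$, where there are $L$ loudspeakers and $q_l$ is the complex source weight describing the amplitude and phase of the $l$th loudspeaker. Similarly, the complex pressures at the control microphone positions in zones A and B are written as $\mathbf{p}_A = [p_A^1, p_A^2, \ldots, p_A^{N_A}]^T$ and $\mathbf{p}_B = [p_B^1, p_B^2, \ldots, p_B^{N_B}]^T$, respectively, where there are $N_A$ control microphones in zone A and $N_B$ in zone B, and the complex pressures at the $n$th microphones in each zone are $p_A^n$ and $p_B^n$.
The control microphones used for calculating the sound zone filters (setup process) and the monitor microphones for assessing performance (playback process) are kept spatially distinct in order to reduce possible bias due to measurement of performance at the exact control positions. Thus, the evaluation metrics contain an inherent assessment of how well techniques calculated for discretized control points affect the sound field elsewhere in the vicinity of those positions. With fixed microphone positions, the independence of the control and monitor points increases with frequency. The observed pressures at the monitor points in each zone are denoted as $\mathbf{o}_A = [o_A^1, o_A^2, \ldots, o_A^{M_A}]^T$ and $\mathbf{o}_B = [o_B^1, o_B^2, \ldots, o_B^{M_B}]^T$, respectively, where there are $M_A$ monitor microphones in zone A and $M_B$ in zone B, and the complex pressures at the $m$th microphones in each zone are $o_A^m$ and $o_B^m$.
The plant matrices contain the transfer functions between each loudspeaker and microphone, and are considered with respect to the control and monitor microphones in each zone. For zone A they are defined as
$$\mathbf{G}_A = \begin{bmatrix} G_A^{11} & \cdots & G_A^{1L} \\ \vdots & \ddots & \vdots \\ G_A^{N_A 1} & \cdots & G_A^{N_A L} \end{bmatrix}, \qquad \mathbf{X}_A = \begin{bmatrix} X_A^{11} & \cdots & X_A^{1L} \\ \vdots & \ddots & \vdots \\ X_A^{M_A 1} & \cdots & X_A^{M_A L} \end{bmatrix}, \tag{1}$$
where $G_A^{nl}$ and $X_A^{ml}$ are the transfer functions between the $n$th control microphone and the $m$th monitor microphone in zone A, respectively, and the $l$th loudspeaker. The equivalent notation is used for $\mathbf{G}_B$ and $\mathbf{X}_B$. The pressures at the microphone positions may be written as $\mathbf{p}_A = \mathbf{G}_A\mathbf{q}$ and $\mathbf{p}_B = \mathbf{G}_B\mathbf{q}$ at the control microphones, and $\mathbf{o}_A = \mathbf{X}_A\mathbf{q}$ and $\mathbf{o}_B = \mathbf{X}_B\mathbf{q}$ at the monitor microphones.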

B. Evaluation measures
Three evaluation metrics are defined for the anechoic simulations, evaluating the separation between the zones, the physical cost of achieving such separation and the spatial properties of the sound field produced in the bright zone.

Acoustic contrast
Acoustic contrast is a summary measure for sound zone performance. It describes the attenuation achieved between the bright zone and the dark zone and is, therefore, of paramount importance for assessing sound zone algorithms. This metric is typically used in the energy cancellation literature and is adopted here because it is related to the relative loudness between programs, giving an indication of what a listener in the zone might experience. A large contrast score implies that the interfering program (that directed toward the other zone) will be less audible when the system is active. Various perceptual experiments indicate that the required contrast is between 10 and 40 dB, depending on the combination of program material and the task performed (Francombe et al., 2012; Baykaner et al., 2013). The acoustic contrast between bright zone A and dark zone B is the ratio of spatially averaged squared pressures in each zone due to the reproduction of program A, expressed in decibels,
$$\mathrm{Contrast} = 10\log_{10}\left(\frac{M_B\,\mathbf{o}_A^H\mathbf{o}_A}{M_A\,\mathbf{o}_B^H\mathbf{o}_B}\right), \tag{2}$$
where $\mathbf{o}_A$ and $\mathbf{o}_B$ are the vectors of observed pressures at the $M_A$ and $M_B$ monitor microphones in zones A and B, and $H$ denotes the Hermitian transpose.
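As a minimal sketch of how the contrast metric could be evaluated from monitor-microphone pressures, assuming it is the decibel ratio of the mean squared pressure magnitudes in the two zones. The function names and the pressure values below are hypothetical and chosen only for illustration.

```python
import math

def mean_square_pressure(pressures):
    """Spatially averaged squared pressure magnitude over one zone."""
    return sum(abs(p) ** 2 for p in pressures) / len(pressures)

def acoustic_contrast_db(o_bright, o_dark):
    """Contrast in dB between bright-zone and dark-zone monitor pressures."""
    ratio = mean_square_pressure(o_bright) / mean_square_pressure(o_dark)
    return 10.0 * math.log10(ratio)

# Hypothetical complex monitor pressures (Pa): the bright-zone amplitudes are
# 100 times those of the dark zone, which corresponds to 40 dB of contrast.
o_a = [0.1 + 0.0j, 0.0 + 0.1j, -0.1 + 0.0j]
o_b = [0.001 + 0.0j, 0.0 - 0.001j]
contrast = acoustic_contrast_db(o_a, o_b)
```

Averaging within each zone before taking the ratio means the two zones do not need the same number of monitor microphones.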

Control effort
The control effort is the energy that the loudspeaker array requires in order to achieve the reproduced sound field. Consequently, a high control effort implies poor acoustical efficiency, with high sound pressure levels emitted into the room. In a practical situation, an upper effort limit may be imposed by the ability of the loudspeaker array to physically reproduce the required signals, and the electrical requirements necessary for such reproduction. Control effort is defined as the total array energy relative to a single reference source $q_r$ producing the same pressure in the bright zone (Elliott et al., 2010) and expressed in decibels as
$$\mathrm{Effort} = 10\log_{10}\left(\frac{\mathbf{q}^H\mathbf{q}}{|q_r|^2}\right). \tag{3}$$
Using a reference source ensures that the effort performance is physically useful: A score of 0 dB means that the array requires the same energy as that source to reproduce the target sound pressure, with negative scores improving upon this.
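The effort definition can be sketched in a few lines, assuming it is the sum of squared source-weight magnitudes divided by the squared magnitude of the reference source weight, in decibels. The weights below are hypothetical.

```python
import math

def control_effort_db(q, q_ref):
    """Array effort in dB relative to a single reference source weight."""
    array_energy = sum(abs(w) ** 2 for w in q)
    return 10.0 * math.log10(array_energy / abs(q_ref) ** 2)

# Hypothetical weights: four sources, each with |w|^2 = 0.5, so the array
# energy is twice that of a unit reference source.
q = [0.5 + 0.5j] * 4
effort = control_effort_db(q, q_ref=1.0)  # 10*log10(2), roughly 3 dB
```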

Planarity
The planarity of the sound field, the extent to which the sound field in the bright zone resembles a plane wave, is a physical measure recently proposed by Jackson et al. (2013). The planarity metric is well suited to the situation considered here, where it is desirable to derive an objective measure of the sound field properties from the microphone array that is applicable even when a target sound field is not fully specified. While reproduction error could be readily evaluated for a synthesis approach, beamforming and energy cancellation approaches do not consider the phase of the sound field in their optimization. For these approaches, it is therefore unreasonable to evaluate them against a target complex sound pressure at each microphone. Adopting a pressure-magnitude-based reproduction error at each point in the bright zone, with reference to a target level, might give an indication of the homogeneity of the reproduced field but cannot indicate spatial properties beyond this. Yet, self-cancellation problems brought about by plane wave components impinging from various directions may significantly affect the spatial quality of the target audio and should be accounted for in evaluation. Furthermore, sound field optimization approaches that reproduce planar sound fields without specifying a precise target direction, overcoming self-cancellation problems without a precise sound field reproduction requirement, are conceivable. Finally, the direction of the principal component may be unimportant for sound zone performance, and the reproduction error may rate a highly planar sound field very poorly if the plane wave direction does not match that of the specified sound field.
In these cases, a metric is needed that is able to distinguish between the underlying properties of a sound field (the number of incoming plane wave components and their relative energy) without presupposing a plane wave direction.The planarity metric observes the energy due to plane wave components impinging from each direction with respect to the array, and calculates the proportion of the energy in the bright zone that can be attributed to the largest energy component.
The energy distribution at the microphone array (over incoming plane wave direction) is given by $w_i = \tfrac{1}{2}|\psi_i|^2$, where $\mathbf{w} = [w_1, w_2, \ldots, w_I]^T$ are the energy components at the $I$ angles and $\psi_i$ is the plane wave component at the $i$th angle. The steering matrix $\mathbf{H}_A$ of dimensions $I \times M_A$, which maps between the observed pressures at the microphones and the plane wave components, can then be defined such that
$$\boldsymbol{\psi} = \mathbf{H}_A\mathbf{o}_A, \tag{4}$$
where $\boldsymbol{\psi} = [\psi_1, \psi_2, \ldots, \psi_I]^T$. The elements of the steering matrix can be calculated using a spatial Fourier decomposition approach, or a beamforming approach. Here, as in Jackson et al. (2013), a super-directive (contrast control) beamformer is used to determine the steering matrix weights. Finally, the planarity metric can be defined (for the bright zone) as the ratio between the energy due to the largest plane wave component and the total energy flux of plane wave components,
$$\eta = \frac{\sum_i w_i\,\mathbf{u}_i \cdot \mathbf{u}_\alpha}{\sum_i w_i}, \tag{5}$$
where $\mathbf{u}_i$ is the unit vector associated with the $i$th component's direction, $\mathbf{u}_\alpha$ is the unit vector in the direction $\alpha = \arg\max_i w_i$, and $\cdot$ denotes the inner product.
Where a plane wave is reproduced, all of the energy in the zone can be attributed to the largest component and the score approaches 100%. Where a diffuse sound field is reproduced, or self-cancellation results in equal and opposite energy components in the zone, the score tends toward 0%. Therefore, evaluating the target sound field in terms of planarity allows the differences between control method performance characteristics in the bright zone to be quantified while being applicable for all approaches.
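The limiting cases described above can be illustrated with a small sketch. It assumes the energy components per direction are already available (the steering-matrix step is not reproduced here) and that the directions lie in a plane, so the inner product of two unit vectors reduces to the cosine of the angle between them. The function name and the eight-direction grid are hypothetical.

```python
import math

def planarity(energies, angles_rad):
    """Energy-weighted projection of each direction onto the dominant direction,
    normalized by the total energy (2-D directions assumed)."""
    peak = max(range(len(energies)), key=lambda i: energies[i])
    flux = sum(w * math.cos(a - angles_rad[peak])
               for w, a in zip(energies, angles_rad))
    return flux / sum(energies)

# Hypothetical grid of eight equally spaced arrival directions.
angles = [2.0 * math.pi * i / 8 for i in range(8)]

single_wave = planarity([0, 0, 1, 0, 0, 0, 0, 0], angles)  # one plane wave
opposed = planarity([1, 0, 0, 0, 1, 0, 0, 0], angles)      # equal and opposite components
diffuse = planarity([1] * 8, angles)                       # equal energy from all directions
```

In this sketch a single component scores 1 (100%), while the opposed and diffuse cases score approximately 0, matching the behavior stated in the text.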

III. THEORY
In the following, the optimization cost functions are introduced for the methods compared. In each case, zone A is designated as the bright zone and zone B as the dark zone. The optimizations may utilize constraints on the sum of squared pressures in zone A and the sum of squared source weights. The former can be expressed as $A = N_A|p_r|^2 \times 10^{T/10}$, where $T$ is the target spatially averaged level in decibels relative to the threshold of hearing $p_r = 20\ \mu$Pa, and the latter as $E = |q_r|^2 \times 10^{Q/10}$, where $Q$ is the control effort in decibels as per Eq. (3).
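The two constraint values can be computed directly from the decibel quantities. The sketch below uses the 156 control microphones, the 76 dB target level, and the 20 dB effort limit quoted in Sec. IV; the unit reference source weight is an assumption made only to produce a number.

```python
import math

P_REF = 20e-6  # threshold of hearing in Pa

def bright_zone_energy_target(n_mics, level_db, p_ref=P_REF):
    """Sum of squared pressures A for a target spatially averaged level in dB."""
    return n_mics * p_ref ** 2 * 10.0 ** (level_db / 10.0)

def effort_limit(q_ref, effort_db):
    """Sum of squared source weights E for an effort limit in dB."""
    return abs(q_ref) ** 2 * 10.0 ** (effort_db / 10.0)

A = bright_zone_energy_target(n_mics=156, level_db=76.0)
E = effort_limit(q_ref=1.0, effort_db=20.0)  # q_ref = 1 is a hypothetical value
rms_pressure = math.sqrt(A / 156)            # about 0.126 Pa at 76 dB
```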
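A. Brightness control
BC represents the optimal beamforming approach to producing sound zones, where constructive interference is sought but no cancellation is attempted. The cost function is written as a constrained optimization problem which maximizes the pressure in the bright zone, constrained to a fixed $E$ (Choi and Kim, 2002),
$$J = \mathbf{q}^H\mathbf{G}_A^H\mathbf{G}_A\mathbf{q} - \lambda(\mathbf{q}^H\mathbf{q} - E), \tag{6}$$
where $H$ denotes Hermitian matrix transpose and $\lambda$ is a Lagrange multiplier.
The point that maximizes $J$ can be found by taking its derivatives with respect to $\mathbf{q}$ and $\lambda$, respectively, and setting to zero,
$$\frac{\partial J}{\partial \mathbf{q}} = 2(\mathbf{G}_A^H\mathbf{G}_A\mathbf{q} - \lambda\mathbf{q}) = 0, \qquad \frac{\partial J}{\partial \lambda} = -(\mathbf{q}^H\mathbf{q} - E) = 0. \tag{7}$$
The derivative $\partial J/\partial\mathbf{q}$ describes an eigenvalue problem, and the optimal source weight vector $\mathbf{q}$ is proportional to the eigenvector $\hat{\mathbf{q}}$ corresponding to the maximum eigenvalue of $\mathbf{G}_A^H\mathbf{G}_A$. The derivative $\partial J/\partial\lambda$ is used to enforce the effort constraint $E$, and introducing a normalization constant $\alpha$, the Lagrange multiplier can be written as (Choi and Kim, 2002)
$$\lambda = \frac{\mathbf{q}^H\mathbf{G}_A^H\mathbf{G}_A\mathbf{q}}{\mathbf{q}^H\mathbf{q}} = \frac{\alpha^2\,\hat{\mathbf{q}}^H\mathbf{G}_A^H\mathbf{G}_A\hat{\mathbf{q}}}{E}, \tag{8}$$
where $\mathbf{q} = \alpha\hat{\mathbf{q}}$. Thus, BC maximizes the sound pressure level (SPL) in the bright zone for a certain input power. Adjusting $\alpha$, one can set either the effort or the brightness (i.e., the target SPL in the bright zone).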
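The BC solution can be sketched without a linear algebra library by using power iteration to approximate the dominant eigenvector of the bright-zone Gram matrix and then scaling it to the effort limit. Power iteration is a stand-in chosen for self-containedness, not necessarily the eigensolver used in the study, and the 3-microphone, 2-source plant matrix below is hypothetical.

```python
import math

def gram_matrix(G):
    """Return G^H G for a plant matrix G given as a list of rows (mics x sources)."""
    n_src = len(G[0])
    return [[sum(row[i].conjugate() * row[j] for row in G) for j in range(n_src)]
            for i in range(n_src)]

def dominant_eigenvector(R, iterations=500):
    """Power iteration: unit-norm eigenvector for the largest eigenvalue of Hermitian R."""
    n = len(R)
    v = [complex(1.0, 0.0)] * n
    for _ in range(iterations):
        v = [sum(R[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(abs(x) ** 2 for x in v))
        v = [x / norm for x in v]
    return v

def brightness_control(G_a, effort_limit):
    """Source weights maximizing bright-zone energy subject to q^H q = effort_limit."""
    q_hat = dominant_eigenvector(gram_matrix(G_a))
    alpha = math.sqrt(effort_limit)  # q_hat has unit norm, so q^H q = alpha^2
    return [alpha * x for x in q_hat]

def zone_energy(G, q):
    """Sum of squared pressure magnitudes produced by weights q."""
    return sum(abs(sum(g * w for g, w in zip(row, q))) ** 2 for row in G)

# Hypothetical plant matrix for zone A.
G_a = [[1.0 + 0.0j, 0.5j], [0.5 + 0.0j, 0.2 + 0.1j], [0.3j, 0.1 + 0.0j]]
q_bc = brightness_control(G_a, effort_limit=4.0)
```

For the same effort, driving either source alone (for example `[2.0, 0.0]`) produces less bright-zone energy than `q_bc`, which is the sense in which BC is optimal.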

B. Acoustic contrast control
ACC represents the energy cancellation approach, and the ratio of the spatially averaged sound pressure levels between the bright zone and the dark zone is maximized (Choi and Kim, 2002). Comparisons between ACC and AEDM can be found in the literature (Shin et al., 2010; Elliott et al., 2012), and the latter is excluded here because the solutions bound those achieved by BC and ACC, depending on the value of a constant which functions as a trade-off between contrast and control effort performance. Introducing the "indirect" formulation of Elliott et al. (2012), the ACC cost function is written as a minimization of the pressure in the dark zone, with constraints imposed on both $A$ and $E$,
$$J = \mathbf{p}_B^H\mathbf{p}_B + \mu(\mathbf{p}_A^H\mathbf{p}_A - A) + \lambda(\mathbf{q}^H\mathbf{q} - E). \tag{9}$$
The cost function may be minimized as above by taking the derivatives with respect to $\mathbf{q}$ and the Lagrange multipliers $\mu$ and $\lambda$, and setting to zero,
$$\frac{\partial J}{\partial \mathbf{q}} = 2(\mathbf{G}_B^H\mathbf{G}_B\mathbf{q} + \mu\mathbf{G}_A^H\mathbf{G}_A\mathbf{q} + \lambda\mathbf{q}) = 0, \qquad \frac{\partial J}{\partial \mu} = \mathbf{q}^H\mathbf{G}_A^H\mathbf{G}_A\mathbf{q} - A = 0, \qquad \frac{\partial J}{\partial \lambda} = \mathbf{q}^H\mathbf{q} - E = 0, \tag{10}$$
and $\mathbf{q}$ is proportional to the eigenvector $\hat{\mathbf{q}}$ corresponding to the maximum eigenvalue of $(\mathbf{G}_B^H\mathbf{G}_B + \lambda\mathbf{I})^{-1}(\mathbf{G}_A^H\mathbf{G}_A)$ (Elliott et al., 2012). The constraint that $A$ equals a certain fixed value is enforced by scaling $\hat{\mathbf{q}}$ with the normalization constant $\alpha$ as above, and the second Lagrange multiplier $\lambda$ must be chosen such that the effort constraint is satisfied. If $E > \mathbf{q}^H\mathbf{q}$ when $\lambda = 0$, the constraint is not active. When $\lambda > 0$, it acts as regularization by trading the control effort for increased bright zone energy and improving the numerical condition of the inversion of $\mathbf{G}_B^H\mathbf{G}_B$. In our implementation, $\lambda$ is determined numerically using a gradient descent search such that $E \geq \mathbf{q}^H\mathbf{q}$ when $A$ has been fixed.
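The step from the stationarity condition to the eigenvalue problem can be written out as follows. This is a sketch of the standard argument, writing $\mu$ for the multiplier on the bright-zone constraint and $\lambda$ for the multiplier on the effort constraint; the symbol $\gamma$ for the eigenvalue is introduced here for illustration and does not appear in the text.

```latex
\begin{align}
(\mathbf{G}_B^H\mathbf{G}_B + \lambda\mathbf{I})\,\mathbf{q}
  &= -\mu\,\mathbf{G}_A^H\mathbf{G}_A\,\mathbf{q}, \\
(\mathbf{G}_B^H\mathbf{G}_B + \lambda\mathbf{I})^{-1}\mathbf{G}_A^H\mathbf{G}_A\,\mathbf{q}
  &= \gamma\,\mathbf{q}, \qquad \gamma = -\frac{1}{\mu}, \\
\gamma
  &= \frac{\mathbf{q}^H\mathbf{G}_A^H\mathbf{G}_A\mathbf{q}}
          {\mathbf{q}^H\mathbf{G}_B^H\mathbf{G}_B\mathbf{q} + \lambda\,\mathbf{q}^H\mathbf{q}}.
\end{align}
```

The last line follows from premultiplying the first by $\mathbf{q}^H$. It shows that each eigenvalue is the ratio of bright-zone energy to effort-regularized dark-zone energy, so selecting the eigenvector of the largest eigenvalue maximizes that ratio. With $\lambda = 0$ the ratio reduces to the unregularized contrast between the control microphone sets, up to the constant factor $N_B/N_A$.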

C. Pressure matching
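PM represents the sound field synthesis approach whereby a sound field is specified for each zone. The desired plane wave sound field in zone A can be written as $d_A^n = D_A e^{jk\mathbf{r}_n \cdot \mathbf{u}_\varphi}$, for $n = 1, 2, \ldots, N_A$, where $D_A$ gives the pressure amplitude, $k$ is the wave number, $\mathbf{r}_n$ is the position of the $n$th control microphone in zone A, $\cdot$ denotes the inner product, and $\mathbf{u}_\varphi$ is the unit vector in the direction of the incoming plane wave. The desired zone B sound field is given by a vector of length $N_B$ populated with zeros, $\mathbf{d}_B = \mathbf{0}$. Although a plane wave is used here, the PM formulation is easily generalized to reproduce arbitrary sound fields. The optimization cost function minimizes the error between the sound pressures at the control microphones $\mathbf{p} = [\mathbf{p}_A^T, \mathbf{p}_B^T]^T$ and the desired sound field $\mathbf{d} = [\mathbf{d}_A^T, \mathbf{d}_B^T]^T$. Including a constraint to fix the effort to a certain $E$, the cost function can be written as
$$J = (\mathbf{p} - \mathbf{d})^H(\mathbf{p} - \mathbf{d}) + \lambda(\mathbf{q}^H\mathbf{q} - E), \tag{11}$$
where $\mathbf{p} = \mathbf{G}\mathbf{q}$ with $\mathbf{G} = [\mathbf{G}_A^T, \mathbf{G}_B^T]^T$. Using the method of Lagrange multipliers, the solution can be found by taking the derivatives with respect to $\mathbf{q}$ and $\lambda$,
$$\frac{\partial J}{\partial \mathbf{q}} = 2[\mathbf{G}^H(\mathbf{G}\mathbf{q} - \mathbf{d}) + \lambda\mathbf{q}] = 0, \qquad \frac{\partial J}{\partial \lambda} = \mathbf{q}^H\mathbf{q} - E = 0, \tag{12}$$
so that $\mathbf{q} = (\mathbf{G}^H\mathbf{G} + \lambda\mathbf{I})^{-1}\mathbf{G}^H\mathbf{d}$. The Lagrange multiplier $\lambda$ is numerically chosen to satisfy the control effort constraint, and it is assumed that the solution is appropriately scaled by setting $\mathbf{d}_A^H\mathbf{d}_A = A$. If $E > \mathbf{q}^H\mathbf{q}$ when $\lambda = 0$, the constraint is not active. Here, $\lambda$ acts as regularization both by converting the excess control effort to the reproduction error and by improving the numerical condition of the inversion of $\mathbf{G}^H\mathbf{G}$. The former effect creates a trade-off between the effort and the minimization of the reproduction error.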
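The trade-off between effort and reproduction error can be made explicit with a singular value decomposition. The sketch below assumes the regularized least-squares solution $\mathbf{q}(\lambda) = (\mathbf{G}^H\mathbf{G} + \lambda\mathbf{I})^{-1}\mathbf{G}^H\mathbf{d}$, where $\mathbf{G}$ is the control plant matrix for both zones and $\mathbf{d}$ the stacked desired field. The singular triplets $(\sigma_i, \mathbf{y}_i, \mathbf{v}_i)$ are notation introduced here for illustration.

```latex
\begin{align}
\mathbf{G} &= \sum_i \sigma_i\,\mathbf{y}_i\mathbf{v}_i^H, \qquad \sigma_i > 0, \\
\mathbf{q}(\lambda) &= \sum_i \frac{\sigma_i}{\sigma_i^2 + \lambda}\,
                      (\mathbf{y}_i^H\mathbf{d})\,\mathbf{v}_i, \\
\mathbf{q}(\lambda)^H\mathbf{q}(\lambda) &= \sum_i \frac{\sigma_i^2}{(\sigma_i^2 + \lambda)^2}\,
                      |\mathbf{y}_i^H\mathbf{d}|^2 .
\end{align}
```

Every term of the effort decreases monotonically as $\lambda$ grows, so a one-dimensional search for the $\lambda$ that meets a given effort limit is well posed. Modes with $\sigma_i^2 \ll \lambda$ are attenuated smoothly rather than discarded, which is the continuous counterpart of truncating small singular values in a pseudo-inverse.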

IV. SIMULATIONS
Simulations were designed and conducted to compare the methods' anechoic performance and robustness.In this section, the test methodology and experiments are motivated and described, and the corresponding results are introduced.

A. Method
The simulations were conducted in MATLAB, simulating a free-field lossless environment, with each source modeled as an ideal monopole. The free-field Green's function was used to populate the plant matrices,
$$G^{nl} = \frac{j\rho c k}{4\pi|\mathbf{x}_{nl}|}\,e^{-jk|\mathbf{x}_{nl}|},$$
where $\rho = 1.21$ kg/m$^3$, $c = 342$ m/s, $k$ is the wave number $\omega/c$, and $\mathbf{x}_{nl}$ is the relative position vector between the $n$th microphone and the $l$th loudspeaker.
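A minimal sketch of how one entry of a plant matrix could be populated, assuming the standard free-field monopole expression with an $e^{j\omega t}$ time convention (outgoing waves carry the phase $e^{-jkr}$). The exact scaling and sign convention used in the study are assumptions here, as are the function name and the positions.

```python
import cmath
import math

RHO = 1.21   # air density in kg/m^3, as quoted in the text
C = 342.0    # speed of sound in m/s, as quoted in the text

def monopole_transfer(freq_hz, mic_pos, src_pos, rho=RHO, c=C):
    """Free-field transfer function from a monopole source to a microphone."""
    k = 2.0 * math.pi * freq_hz / c      # wave number
    r = math.dist(mic_pos, src_pos)      # source-to-microphone distance in m
    return 1j * rho * c * k * cmath.exp(-1j * k * r) / (4.0 * math.pi * r)

# Hypothetical positions: doubling the distance halves the magnitude.
g_near = monopole_transfer(1000.0, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
g_far = monopole_transfer(1000.0, (0.0, 0.0, 0.0), (2.0, 0.0, 0.0))
```

Looping this function over all microphone and loudspeaker positions gives the rows and columns of a plant matrix.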
The frequency range considered is midrange, 100-4000 Hz, covering the telephony frequency range and corresponding to the upper band that Druyvesteyn and Garas (1997) suppose will provide an upper bound on array contributions to the sound zone problem.
Control and monitor microphones in the zones were spaced 2.1 cm apart, fulfilling the Nyquist spatial sampling criterion up to 8.5 kHz. In each case, there were 156 omnidirectional microphones in each zone, arranged to sample a circular zone of diameter 30 cm. Monitor microphones outside of the zones, used only to render visualizations of the sound field, were spaced at 5 cm.
For all methods, the target level was set at $T = 76$ dB SPL, which has been shown to be a comfortable listening level and has been used during listening tests based on the sound zone interference situation (Francombe et al., 2012). As described in Sec. III, the source weights were scaled based on predictions of the sound pressure at the control microphones to achieve this level in the bright zone. The effect of spatial mismatching between setup and playback is that the observed level may vary from this value by a small amount. Although it imposes an upper bound on contrast performance, limiting the lowest possible sound pressure to the human threshold of hearing is intuitively justified. Similarly, any level below the noise floor would not be recorded in practice.
To set the regularization conditions for ACC and PM, we set $E$ in Eqs. (9) and (11) to correspond to $Q = 20$ dB control effort relative to a single monopole positioned on $r_L$ and equidistant from both zones. While alternative values could be used, this value ensured that the solutions were not overly regularized under the simulation conditions. This approach to setting $E$, also used by Elliott et al. (2012) and Bai and Lee (2006), is beneficial in that it has a clear physical interpretation and is frequency dependent. However, as described in Sec. III, the effort constraints may be inactive. Consequently, no regularization would be applied to the potentially ill-conditioned matrix inversions calculated for ACC and PM. Consider the example of the ACC and PM solutions at 1 kHz, with a 20 dB effort constraint. For our simulation geometry, the condition number of $\mathbf{G}^H\mathbf{G}$ (where $\mathbf{G}$ stacks $\mathbf{G}_A$ and $\mathbf{G}_B$), inverted for PM, is $9.96 \times 10^{14}$, and the corresponding solution has control effort of 82 dB. In this case, the effort constraint would be active, and the inversion would be regularized. Conversely, the condition number of $\mathbf{G}_B^H\mathbf{G}_B$, inverted for ACC, is $2.32 \times 10^{19}$, yet the corresponding solution has only 5 dB effort. In this case, the effort constraint would be inactive and the matrix inversion prone to numerical errors. We therefore considered the condition number of the matrices to be inverted in our selection of the $\lambda$ values [Eqs. (9) and (11)] by initializing them such that the condition number of the matrix to be inverted did not exceed $10^{16}$ (corresponding to the numerical accuracy of our simulations in MATLAB). Then, the effort constraints were enforced, if necessary, via a gradient descent search to find $\lambda$ such that the control effort fell in the range 19-20 dB. In Sec. V, the effect of regularization is considered in detail.
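One possible way to initialize the regularization parameter so that a condition-number cap is respected is sketched below. It relies on the fact that adding a multiple of the identity to a Hermitian positive semidefinite matrix shifts every eigenvalue by the same amount. Whether the study used this closed form is not stated, so the function and its example values are assumptions.

```python
def min_regularization_for_condition(eig_max, eig_min, max_condition=1e16):
    """Smallest non-negative shift lam such that
    (eig_max + lam) / (eig_min + lam) <= max_condition."""
    lam = (eig_max - max_condition * eig_min) / (max_condition - 1.0)
    return max(lam, 0.0)

# Hypothetical extreme eigenvalues of a matrix to be inverted, with a cap of 1e6.
lam_needed = min_regularization_for_condition(4.0, 1e-20, max_condition=1e6)
cond_after = (4.0 + lam_needed) / (1e-20 + lam_needed)  # equals the cap

# A matrix that already satisfies the cap needs no initial regularization.
lam_none = min_regularization_for_condition(4.0, 1.0, max_condition=1e6)
```

After this initialization, a separate search would still be needed to increase the parameter wherever the effort limit is exceeded.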

B. Control method comparison
To facilitate the control method comparison, a 48 element circular loudspeaker array was chosen. Although line arrays have been used for personal sound implementations, circular geometries have been used extensively in sound field reproduction as they enclose the control region, and for the sound zone scenario, the sources may sometimes surround the zones. A diagram of the geometry is shown in Fig. 2, although adoption of this geometry does not restrict the discussion of method properties to the specific case considered. While a 48 loudspeaker array may be fairly large compared to existing sound reproduction systems (e.g., a 5.1 channel system in a domestic room), a sufficient number of sources are required to ensure that the sound field can be synthesized under the PM approach. The link between the number of elements in circular arrays and the corresponding upper frequency bound for accurate sound field synthesis is well documented (e.g., Ward and Abhayapala, 2001). Above this limit, also known as the spatial aliasing limit, the wavelength is too short in relation to the loudspeaker spacing for the array to properly reproduce the sound field. For a certain wave number $k$ and reproduction region with radius $r = 0.75$ m (including both zones), the minimum number of loudspeakers required for reproduction is $L = 2\lceil kr \rceil$. Therefore, the maximum frequency that can be reproduced by the array of $L$ loudspeakers is $f_{\max} = cL/(4\pi r)$. The spatial aliasing limit for this configuration is 1700 Hz, falling approximately half way along the range of frequencies considered for this system and allowing us to consider the response of the array on either side of the limit.
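The aliasing limit quoted above can be checked numerically. The sketch assumes the loudspeaker-count rule reads as twice the ceiling of $kr$, which is consistent with the stated $f_{\max} = cL/(4\pi r)$; the function names are hypothetical.

```python
import math

def spatial_aliasing_limit_hz(n_speakers, radius_m, c=342.0):
    """Upper frequency for accurate synthesis: f_max = c * L / (4 * pi * r)."""
    return c * n_speakers / (4.0 * math.pi * radius_m)

def min_loudspeakers(freq_hz, radius_m, c=342.0):
    """Loudspeakers needed at a given frequency, assuming L = 2 * ceil(k * r)."""
    k = 2.0 * math.pi * freq_hz / c
    return 2 * math.ceil(k * radius_m)

f_max = spatial_aliasing_limit_hz(48, 0.75)   # about 1742 Hz, i.e. roughly 1700 Hz
n_at_1khz = min_loudspeakers(1000.0, 0.75)    # 28 loudspeakers
```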
Figure 3 shows the performance of each method under the evaluation metrics of contrast, control effort, and planarity.The core properties of each method are demonstrated here: ACC produces the maximum contrast of 76 dB across the whole frequency range, requires the control effort constraint to be active at some (lower) frequencies and has a poor planarity score.These properties of ACC are not restricted to the circular array geometry; for a line array, multiple beams may still be formed across the zone as the target field is unspecified.PM on the other hand produces the best planarity score, along with a creditable contrast score of over 70 dB at points, but requires a consistently high control effort.While the planarity score falls away toward 80% at low frequencies, the score is affected by the resolution of the beamformer used to populate the planarity steering matrix [Eq.( 4)] which is related to the aperture of the sensor array and does not imply a large plane-wave reproduction error at this frequency.Finally, BC requires very little control effort and has a planarity that falls between the two cancellation methods; but, it has a low contrast score.
The sensitivity of PM to the circular array spatial aliasing limit is evident, particularly in terms of contrast, where the cancellation across frequency falls away rather rapidly after the limit. The target sound field continues to be fairly planar at higher frequencies, although the planarity score does falter around the limit itself. As frequency increases, the contrast fluctuates as the aliasing lobes pass through the dark zone. Furthermore, it is clear that the frequency range over which the effort constraint is active for PM is much larger than for ACC; in fact, for this configuration, reduction of the matrix condition number of $G_B^H G_B$ for ACC is adequate at all frequencies to ensure the control effort falls below 20 dB. Such properties of PM may be mitigated by careful specification of the desired sound field and may in general be outweighed by its ability to specify the spatial properties of the sound field, resulting in a considerable improvement in planarity over ACC. This both avoids problems with self-cancelation in the bright zone and allows potential adoption for spatial audio reproduction.
The circular geometry restricts the contrast performance of BC and the planarity performance of ACC in comparison with a less enveloping geometry. To quantify these differences, a 48 channel line array tangential to the lowest point on the loudspeaker circle in Fig. 2 was simulated, with interelement spacing of 9.8 cm (equivalent to the spacing around the reproduction radius for the circular array). Here, the maximum contrast achievable by BC increased to 40 dB, and the planarity score for ACC rose to 90% or above for frequencies above 580 Hz, reflecting the limited number of potential incident plane wave directions and the decreased potential for equal and opposite components leading to standing waves. In any case, the underlying characteristics among the methods, and their ranking with respect to the evaluation metrics, remain unchanged regardless of the loudspeaker geometry: ACC produces the greatest contrast, PM produces a planar sound field, and BC is the lowest effort solution.
Visualization of the sound fields reproduced by the three methods applied to the circular array clarifies the evaluation scores, particularly between the extreme cases. Figure 4 shows the sound pressure level and phase across the simulated room at 1 kHz, for each method. The effect of the control effort on the overall sound level is striking in the comparison between BC and PM; in the latter case, the introduction of a reflective surface at any boundary would have a large impact on the system. Similarly, the large size and depth of the cancellation region achieved by ACC with respect to the small region achieved by PM (and very little produced by BC) is remarkable. Yet, a standing wave can be observed running through the middle of the bright zone in the case of ACC. This demonstrates a risk of the cancellation approach that is not quantified in the contrast score: the spatial averaging of the sound pressures allows inhomogeneous sound pressure across the bright zone due to plane-wave components arriving from various directions. The opposite is true for PM, where there is only a single component. From the phase plots, the plane wave traveling south-north can be observed, and for ACC, the standing wave can be seen (the phase is different on each side of the zone, but without a sharp transition of $2\pi$), which gives rise to the very low planarity score.
The inability of PM to control the sound field above its spatial aliasing limit raises an issue of feasibility for broadband reproduction. For the Fig. 2 geometry, 111 loudspeakers are required for reproduction up to 4 kHz. In Fig. 5, the effect of varying the number of loudspeakers around $r_L$ (Fig. 2) is summarized. In our numerical results, ACC exhibited a roll-off where the maximum contrast was no longer reached, and PM exhibited a contrast degradation at its transition into the region of aliasing performance. We therefore compare the upper frequency of effective contrast performance by plotting the frequency 3 dB below the local maximum at the roll-off point. From Fig. 5, it is clear that the achievable bandwidth of effective contrast for ACC increases more steeply with additional sources than that of PM, in addition to the absolute contrast values being higher. The fit line plotted for ACC has the gradient $cL/(4\pi r_{\mathrm{zone}})$, corresponding to the spatial aliasing limit for dark zone control, and fits the roll-off points well for our circular array simulations. This follows from the ACC cost function [Eq. (9)], where only the dark zone pressures are considered as the primary minimization. The position of the line was then adjusted to have its x intercept at 8, being the minimum array order achieving the 76 dB maximum. The pattern of ACC having a broader contrast bandwidth than PM also holds for our line array simulations.
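The two lines referred to in Fig. 5 can be sketched numerically as follows. This is only an illustration under stated assumptions: $c = 343$ m/s is assumed, the radii are taken from the Fig. 2 caption, and the function names are hypothetical. The loudspeaker count is obtained by inverting $f_{\max} = cL/(4\pi r)$ without rounding, so the exact integer depends on the value of $c$ and on the rounding convention.

```python
import math

C = 343.0      # assumed speed of sound in m/s (not stated in this section)
R_REP = 0.75   # reproduction radius from the Fig. 2 caption, in m
R_ZONE = 0.3   # zone radius from the Fig. 2 caption, in m


def pm_aliasing_line(num_sources: float) -> float:
    """PM aliasing line, f = c * L / (4 * pi * r_rep), in Hz."""
    return C * num_sources / (4.0 * math.pi * R_REP)


def acc_fit_line(num_sources: float, x_intercept: float = 8.0) -> float:
    """ACC fit line with gradient c / (4 * pi * r_zone), shifted to cross zero at L = 8."""
    return C * (num_sources - x_intercept) / (4.0 * math.pi * R_ZONE)


def sources_needed(freq_hz: float) -> float:
    """Unrounded source count from inverting the PM aliasing line."""
    return 4.0 * math.pi * R_REP * freq_hz / C


for n in (16, 32, 48, 64):
    print(f"L = {n}: PM {pm_aliasing_line(n):.0f} Hz, ACC {acc_fit_line(n):.0f} Hz")
print(f"Sources needed for 4 kHz under PM: about {sources_needed(4000.0):.1f}")
```

With these assumed values the unrounded count for 4 kHz comes out near 110, which is in line with the figure of 111 quoted above. Because the zone radius is smaller than the reproduction radius, the ACC line is steeper than the PM line.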

V. REGULARIZATION AND ROBUSTNESS
The simulations presented thus far were regularized by initializing the $\lambda$ in Eqs. (9) and (11) based on a maximum matrix condition number of $10^{16}$ and, subsequently, enforcing the effort constraints if the effort exceeded 20 dB. However, the value of $\lambda$ selected has a significant effect on the control effort, performance, and robustness of the sound zone system. In this section, results are shown from simulations varying the regularization applied to the 48 element circular loudspeaker array. As discussed in Sec. I, there are many possible ways of determining a regularization parameter for a particular kind of problem. Here, we study the effect of regularization by directly adjusting $\lambda$, which in the following is referred to simply as the regularization parameter, as it regularizes the solution both in the sense of limiting the array effort and in terms of adding a constant to the diagonal of a matrix to be inverted, thereby improving its conditioning. First, the effect is considered under ideal conditions. Then, systematic errors are introduced in order to study the effect of regularization on the robustness of the control methods.
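The conditioning effect of adding a constant to the diagonal can be illustrated with a small sketch. The random matrix below is a hypothetical stand-in for a transfer function matrix, and `lambda_for_condition` is one possible way (an assumption, not necessarily the procedure used for the simulations) of choosing the smallest regularization parameter that meets a target condition number. A target of $10^6$ is used in place of $10^{16}$ so that the result can be checked in double precision.

```python
import numpy as np


def regularized_gram(G: np.ndarray, lam: float) -> np.ndarray:
    """Return G^H G + lam * I, the matrix to be inverted."""
    n = G.shape[1]
    return G.conj().T @ G + lam * np.eye(n)


def lambda_for_condition(G: np.ndarray, max_cond: float) -> float:
    """Smallest lam >= 0 such that cond(G^H G + lam * I) <= max_cond.

    For a Hermitian positive semidefinite matrix with extreme eigenvalues
    e_min and e_max, the condition number is (e_max + lam) / (e_min + lam).
    """
    eig = np.linalg.eigvalsh(G.conj().T @ G)      # ascending, real
    e_min, e_max = max(eig[0], 0.0), eig[-1]      # clip round-off negatives
    return max(0.0, (e_max - max_cond * e_min) / (max_cond - 1.0))


rng = np.random.default_rng(0)
# Hypothetical matrix: 20 microphones, 48 sources, so G^H G is rank deficient.
G = rng.standard_normal((20, 48)) + 1j * rng.standard_normal((20, 48))

for lam in (1e-10, 1e-4, 1e2):
    cond = np.linalg.cond(regularized_gram(G, lam))
    print(f"lambda = {lam:g}: condition number = {cond:.3g}")

lam_target = lambda_for_condition(G, 1e6)
print(f"lambda for cond <= 1e6: {lam_target:.3g}")
```

Larger values of the regularization parameter give a better conditioned matrix, at the cost of constraining the solution more heavily.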

A. Varying the regularization conditions
First, the regularization was tested under ideal conditions with assumed perfect estimates of the system's acoustic response. The regularization parameter was varied from $10^{-10}$ to $10^{10}$ at 1000 logarithmically spaced values. Figure 6 shows the effect of regularization on the contrast, effort and planarity reproduced by the array. The BC scores are plotted in terms of the $\lambda$ set to satisfy the effort constraint in Eq. (6). The regularization parameters used for the previous simulations in Sec. IV are marked for reference.
There are three regions of performance in relation to the effort. First, for very small regularization parameters, numerical errors in the matrix inversion cause an unstable effort response, which is most clearly visible at 500 Hz and can also be observed in, e.g., ACC planarity and PM contrast. In the second region, there is a monotonic relationship between increasing the regularization parameter and decreasing effort. Finally, the minimum possible effort is reached.
The asymptotic minimum effort values for very high regularization correspond to the BC effort values, showing this to be the least-effort approach, albeit with poor contrast. In fact, the BC scores correspond under each metric to the asymptotic scores for ACC and PM, demonstrating that such heavy regularization limits the freedom of the optimization to the extent that cancellation is impossible. Although the cost functions imply that the control effort limit could be set arbitrarily, it is evident that there is a lower bound beyond which the effort cannot be further reduced.
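The convergence of heavily regularized ACC toward BC can be reproduced in a small free-field sketch. Several assumptions are made here and are not taken from the paper: the ACC weights are taken as the principal eigenvector of $(G_B^H G_B + \lambda I)^{-1} G_A^H G_A$, the BC weights as the principal eigenvector of $G_A^H G_A$, the zone centres are placed at $(\pm 0.45, 0)$ m, the microphones are sampled randomly in each zone, and $c = 343$ m/s.

```python
import numpy as np

C = 343.0  # assumed speed of sound, m/s


def circle(n, radius):
    """n points equally spaced on a circle centred on the origin."""
    ang = 2.0 * np.pi * np.arange(n) / n
    return radius * np.column_stack((np.cos(ang), np.sin(ang)))


def disc_samples(centre, radius, n, rng):
    """n random microphone positions inside a disc (hypothetical sampling)."""
    rad = radius * np.sqrt(rng.random(n))
    ang = 2.0 * np.pi * rng.random(n)
    return np.asarray(centre) + np.column_stack((rad * np.cos(ang), rad * np.sin(ang)))


def transfer_matrix(mics, sources, freq_hz):
    """Free-field monopole transfer functions, shape (microphones, sources)."""
    k = 2.0 * np.pi * freq_hz / C
    dist = np.linalg.norm(mics[:, None, :] - sources[None, :, :], axis=2)
    return np.exp(-1j * k * dist) / (4.0 * np.pi * dist)


def acc_weights(G_A, G_B, lam):
    """Principal eigenvector of (G_B^H G_B + lam I)^-1 G_A^H G_A (assumed ACC form)."""
    n = G_A.shape[1]
    M = np.linalg.solve(G_B.conj().T @ G_B + lam * np.eye(n), G_A.conj().T @ G_A)
    vals, vecs = np.linalg.eig(M)
    return vecs[:, np.argmax(vals.real)]


def bc_weights(G_A):
    """Principal eigenvector of G_A^H G_A (assumed BC form)."""
    _, vecs = np.linalg.eigh(G_A.conj().T @ G_A)
    return vecs[:, -1]


def contrast_db(q, G_A, G_B):
    """Ratio of mean squared pressure in the bright and dark zones, in dB."""
    bright = np.mean(np.abs(G_A @ q) ** 2)
    dark = np.mean(np.abs(G_B @ q) ** 2)
    return 10.0 * np.log10(bright / dark)


def similarity(q1, q2):
    """Magnitude of the normalized inner product between two weight vectors."""
    return abs(np.vdot(q1, q2)) / (np.linalg.norm(q1) * np.linalg.norm(q2))


rng = np.random.default_rng(1)
sources = circle(48, 1.2)
G_A = transfer_matrix(disc_samples((-0.45, 0.0), 0.3, 60, rng), sources, 1000.0)
G_B = transfer_matrix(disc_samples((0.45, 0.0), 0.3, 60, rng), sources, 1000.0)

q_bc = bc_weights(G_A)
print(f"BC contrast: {contrast_db(q_bc, G_A, G_B):.1f} dB")
for lam in (1e-6, 1e-2, 1e2, 1e10):
    q = acc_weights(G_A, G_B, lam)
    print(f"lambda = {lam:g}: ACC contrast {contrast_db(q, G_A, G_B):.1f} dB, "
          f"similarity to BC {similarity(q, q_bc):.4f}")
```

In this sketch the similarity to the BC weights approaches one as the regularization parameter grows, because the regularized dark-zone matrix becomes dominated by its diagonal term and the problem reduces to maximizing bright-zone energy for a given effort.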
Some further comments can be made in relation to our specific simulation results. While an increased regularization parameter consistently reduces the effort for each method, the relationship with contrast varies. For ACC, the regularization has no discernible effect on the upper performance for a wide range of parameter values, and the contrast degrades from the maximum value as regularization becomes large. For PM, there are local maxima in the contrast, becoming increasingly significant with increasing frequency. From visualization of these results, we note that these peaks correspond to the situation where the source weights are constrained toward operating as a directive beamformer toward the bright zone.
FIG. 6. (Color online) Performance of ACC (left column) and PM (right column) as a function of the regularization parameter, in terms of the contrast achieved (top row), effort (middle row) and planarity (bottom row), at 100 Hz (thin), 500 Hz (thick), 1 kHz (thick, dot-dash), and 2 kHz (thin, dot-dash). The BC scores are indicated for each frequency (100 Hz ä; 500 Hz ◊; 1 kHz +; 2 kHz ×) in terms of the $\lambda$ set to satisfy the effort constraint in Eq. (6). The regularization parameter used in Sec. IV is marked (○) on each line.
Regularization has little bearing on the planarity scores once the matrix inversion has been stabilized. With heavy regularization, ACC planarity increases toward the BC score as the array effort is heavily constrained. At 2 kHz, the ACC self-cancellation patterns across the bright zone become more complex; increased regularization reduces the number of nulls in these patterns, and so the score oscillates. PM planarity begins to decrease as the increased regularization reduces the number of available array modes below that required for accurate reproduction (especially notable at 100 Hz). By 2 kHz, the PM planarity is unaffected by the regularization parameter, even when it is very large.
Considering the regularization approach used in Sec. IV, it is clear that the minimum regularization was required, as discussed above. Furthermore, the control effort constraint was active at several frequencies. Although at low frequencies our approach provides a simple trade-off between effort and contrast, it does not consider contrast performance and may under-regularize from the perspective of contrast. For PM at 1 kHz, for example, increased regularization would have improved the contrast performance while also reducing the effort with respect to the results in Sec. IV.

B. Robustness to mismatched setup and playback conditions
Even under simulated anechoic conditions, one can see the practical benefits of regularization in relation to the robustness of the system by introducing perturbations. A sound zone system should be robust to small changes in the reproduction atmosphere and allow some tolerance to the positioning of the equipment, which in practical scenarios will generally be restricted to loudspeaker placement once a set of room impulse measurements has been acquired. In the following, we present case studies where errors have been introduced by varying the sound propagation speed and applying random errors to each loudspeaker position. The performance is then evaluated with various regularization parameters. As the key metric across sound zone systems (assessing the fundamental ability to create sound separation), only the contrast is considered here. After calculating the source weights for a particular array and environment, the configuration was modified before application of the original source weights, thus introducing an error between setup and playback. Specifically, these experiments test the robustness of a certain set of filter weights to variations in the geometry post-calibration, as a function of the control method, frequency, and regularization parameter.

Mismatched sound propagation speed
First, robustness to sound propagation speed was investigated. This varies with temperature, air pressure and humidity in practical situations. The transfer functions were modified on playback by introducing a variation of up to 10 m/s (corresponding to a change in temperature of 17 °C) to the Green's function and recalculating the transfer function matrices $X_A$ and $X_B$ accordingly. Such a variation, applied consistently across each transfer function term, is analogous to a shift in frequency between setup and playback.
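The equivalence between a change in propagation speed and a frequency shift can be checked directly, since the free-field Green's function depends on frequency and speed only through the wave number $k = 2\pi f/c$. The sketch below assumes a monopole Green's function with an $e^{-jkr}$ sign convention, a setup speed of 343 m/s, and the common linear approximation $c \approx 331.3 + 0.606\,T$ m/s for air at $T$ °C; none of these is taken from the paper.

```python
import numpy as np


def green(dist_m, freq_hz, c):
    """Free-field monopole Green's function, exp(-j k r) / (4 pi r)."""
    k = 2.0 * np.pi * freq_hz / c
    return np.exp(-1j * k * dist_m) / (4.0 * np.pi * dist_m)


def temperature_change_for(delta_c):
    """Temperature change in deg C giving a speed change delta_c in m/s.

    Uses the linear approximation c = 331.3 + 0.606 * T (an assumption).
    """
    return delta_c / 0.606


dist = np.linspace(0.5, 2.5, 5)   # example source-to-microphone distances, m
c_setup, c_playback = 343.0, 353.0
freq = 1000.0

# Playback at the original frequency with the faster speed of sound ...
g_playback = green(dist, freq, c_playback)
# ... equals the setup condition evaluated at a lower frequency.
equivalent_freq = freq * c_setup / c_playback
g_shifted = green(dist, equivalent_freq, c_setup)

print(f"Equivalent frequency: {equivalent_freq:.1f} Hz")
print(f"Temperature change for 10 m/s: {temperature_change_for(10.0):.1f} deg C")
print("Transfer functions match:", np.allclose(g_playback, g_shifted))
```

Under the linear approximation, a 10 m/s change corresponds to roughly 16.5 °C, in line with the figure of 17 °C quoted above.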
Figure 7(a) shows the acoustic contrast achieved under the mismatched propagation speed conditions at 100 Hz and 1 kHz. It is clear that such error has the potential to seriously degrade the realizable contrast. The various regularization parameters can be seen to have a similar effect between the two methods of ACC and PM at 100 Hz but remarkably different outcomes at 1 kHz.
For the mismatched speed of sound at 100 Hz, a very small amount of regularization improves the performance of ACC (but the score is more sensitive to increased regularization overall), and the PM performance is slightly improved by increasing the regularization. At 1 kHz, the effect of the error on ACC is negligible for all regularization parameters. For PM, on the other hand, regularization has a significant effect on the contrast, and the performance degradation of 42 dB from the ideal case can be almost entirely removed, with optimal regularization giving 54 dB performance improvement from the unregularized case. The best robustness to error is noted to correspond to the point of optimal regularization in the ideal case.

Mismatched loudspeaker positions
The second mismatch introduced between the setup and playback of the source weights was a variation in the positioning of the loudspeakers. Each loudspeaker was moved independently in the x and y directions by a random amount drawn from a normal distribution. Unlike the systematic error in sound propagation speed, the error on the phase component of the transfer function is not the same for each path, and additionally an amplitude error is introduced. Here, the maximum error considered was with one standard deviation of the loudspeaker placement equal to 10 mm. The 95% confidence interval has a diameter in the x-y plane of 57 mm about the specified location, which might correspond to re-installation of a large sound zone system without precise positioning instruments. For a rigidly installed system (e.g., a sound system in a car), considerably smaller variation in loudspeaker locations would be expected.
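The position perturbation described above can be sketched as follows. The example draws independent normal offsets in x and y with a standard deviation of 10 mm and reports the resulting relative change in a free-field transfer function matrix at two frequencies. The microphone positions, the monopole Green's function, $c = 343$ m/s and all function names are assumptions for illustration; the sketch does not reproduce the contrast results of Fig. 7.

```python
import numpy as np

C = 343.0  # assumed speed of sound, m/s


def circle(n, radius):
    """n points equally spaced on a circle centred on the origin."""
    ang = 2.0 * np.pi * np.arange(n) / n
    return radius * np.column_stack((np.cos(ang), np.sin(ang)))


def transfer_matrix(mics, sources, freq_hz):
    """Free-field monopole transfer functions, shape (microphones, sources)."""
    k = 2.0 * np.pi * freq_hz / C
    dist = np.linalg.norm(mics[:, None, :] - sources[None, :, :], axis=2)
    return np.exp(-1j * k * dist) / (4.0 * np.pi * dist)


def perturb_positions(positions, sigma_m, rng):
    """Add independent zero-mean normal offsets in x and y to each position."""
    return positions + rng.normal(0.0, sigma_m, size=positions.shape)


def relative_error(mics, setup, playback, freq_hz):
    """Relative Frobenius-norm change in the transfer matrix after perturbation."""
    G_setup = transfer_matrix(mics, setup, freq_hz)
    G_playback = transfer_matrix(mics, playback, freq_hz)
    return np.linalg.norm(G_playback - G_setup) / np.linalg.norm(G_setup)


rng = np.random.default_rng(2)
setup_sources = circle(48, 1.2)                  # array radius from Fig. 2
playback_sources = perturb_positions(setup_sources, 0.010, rng)
mics = circle(12, 0.3) + np.array([0.45, 0.0])   # hypothetical microphone ring

offsets = playback_sources - setup_sources
print(f"Sample standard deviation of offsets: {1e3 * offsets.std():.1f} mm")
for f in (100.0, 1000.0):
    err = relative_error(mics, setup_sources, playback_sources, f)
    print(f"{f:.0f} Hz: relative transfer function error = {err:.3f}")
```

The phase error introduced by a fixed displacement grows with the wave number, so the relative change in the transfer functions is larger at 1 kHz than at 100 Hz in this sketch.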
Figure 7(b) shows the results in comparison with the ideal case. As with the case of the propagation speed error, the behavior between the methods at 100 Hz is fairly comparable, yet at 1 kHz it varies considerably. At 100 Hz, the degradation is seen to be generally more severe for PM than it is for ACC, as for the propagation speed results at this frequency. In this case, however, the performance error for PM is so severe that the contrast becomes negative for very light regularization. Increasing the regularization of ACC brings about a significant improvement of 40 dB in the contrast. PM is unable to control the sound field until suitable regularization is applied, and the optimal point coincides with the point where the curves with and without position error converge. The optimal ACC contrast is 60 dB, while the optimal PM contrast is 36 dB.
At 1 kHz, increasing the regularization for ACC (beyond ensuring satisfactory matrix conditioning) does not bring about any further benefit in contrast. However, PM behaves in a similar manner to the lower frequency case, where there is severe degradation for light regularization, and very large regularization parameters can improve performance. At the maximum point, the optimal contrast achieved by regularized PM becomes favorable over ACC by 4 dB. For ACC, there is little that can be done by regularization to improve the robustness to this kind of error once the matrix inversion has been adequately conditioned, but for PM the effect remains.

VI. DISCUSSION
Significant differences in the methods' performance under ideal conditions have been observed, presenting a design choice based on the requirements for a sound zone system. Where the target sound field properties are a priority, a sound field synthesis method should be considered; where the aim is purely to maximize acoustic contrast, energy control methods should be adopted; if the lowest possible array control effort is required, then beamforming approaches are the most efficient. Given the differences in methods, it is easy to understand why various authors have hybridized the methods, attempting to combine their desirable characteristics while minimizing the undesirable ones. Using the methods here in their basic form allows new strategies to be envisaged without necessarily constraining the designer toward an existing solution.
A further design choice must trade the frequency band of good separation performance against the required bright zone properties and the available number of loudspeakers. In having freedom to optimize the spatially averaged squared pressures rather than reproduction error, energy control methods are a more appropriate choice for achieving good separation performance across frequency, especially if a limited number of loudspeakers are available. In fact, the methods that use PM as part of a hybrid method have only been demonstrated at relatively low frequencies. In any case, Druyvesteyn's original suggestion of cancellation up to 4 kHz may be generous for sound field synthesis with a realistic number of loudspeakers, but realizable with energy control.
The simulations presented here highlight the importance of judicious selection of the regularization parameter for optimal sound zone performance, even under ideal anechoic conditions. The performance of PM can be significantly improved, in terms of acoustic contrast and control effort, by a well selected regularization parameter. Moreover, the acoustic contrast, in having maxima in relation to the regularization parameter, does not directly correspond to the reproduction error, which would increase monotonically with increased regularization according to Eq. (11).
Under mismatched setup and playback conditions, regularization can help recover good performance when the sound propagation speed is mismatched. In our simulations, the best robustness corresponded to the point of optimal regularization determined under ideal conditions. While ACC needed minimal regularization, PM needed much more regularization to become robust. When the loudspeaker positions are varied, the regularization can help to limit the degradation, although in our simulations a reasonable contrast score was recovered for ACC at low frequencies with a large regularization parameter. For increasing error, the optimal contrast scores between ACC and PM became very close, suggesting that the achievable contrast may be much closer in a real system than in anechoic simulations.

VII. SUMMARY
Three sound zoning methods were compared under identical conditions. ACC produced the best zone separation under ideal conditions, for moderate control effort, yet with no control over the phase of the reproduced sound field. BC produced low contrast, at the least control effort cost. PM created a planar bright zone and good contrast, but the control effort cost was high and the performance over frequency sensitive to the geometrical limits of the array. Therefore, a sound zone designer may select among the methods based on the most critical property for their application: BC for least power consumption, ACC for maximum contrast, and PM for precise control over the target sound field.
Regularization was shown to have a significant effect on the contrast, with the relationship between increasing regularization and decreasing performance not always monotonic. Degradations to both methods were observed under mismatched setup and playback conditions, with ACC generally providing significantly better contrast and increased regularization generally improving the robustness to error. Future work may consider the calculation of regularization parameters leading to optimal contrast performance, and alternative methods for creating low-effort, planar sound zones with high acoustic contrast. The findings may also be validated by physical measurements in an anechoic chamber.

FIG. 1. (Color online) Example of a sound zone system, with $L$ loudspeakers, zones A and B comprising $N_A$ and $N_B$ control microphones (black) and $M_A$ and $M_B$ monitor microphones (white), respectively.

FIG. 2. Simulation geometry with two zones surrounded by a circular loudspeaker array, showing the array radius $r_L = 1.2$ m, the reproduction radius $r_{\mathrm{rep}} = 0.75$ m and the zone radius $r_{\mathrm{zone}} = 0.3$ m.
FIG. 5. (Color online) Bandwidth of zone separation with increasing numbers of loudspeakers ($L$) in the array, showing the frequency where the contrast falls 3 dB below the local maximum at the point of contrast failure, for PM (+) and ACC (○). The PM aliasing line is $f_{\max} = cL/(4\pi r_{\mathrm{rep}})$. The gradient of the ACC fit line is $cL/(4\pi r_{\mathrm{zone}})$, with the x intercept adjusted to correspond to the minimum $L$ required to reproduce the maximum contrast.

FIG. 4. Sound pressure level (top) and phase (bottom) of reproduced sound field at 1 kHz using BC (left column), ACC (center column), and PM (right column), using the circular array of 48 monopoles.
FIG. 7. (Color online) Effect of the regularization parameter on acoustic contrast at 100 Hz (top) and 1000 Hz (bottom), with mismatched playback conditions due to (a) speed of sound error and (b) loudspeaker position error. The ideal cases for PM (thin) and ACC (thick) are compared with the error cases (thin, dot-dash; and thick, dot-dash; respectively). The BC scores are shown with (◊) and without (×) the same magnitude of error, in terms of the $\lambda$ set to satisfy the effort constraint in Eq. (6). The regularization parameter used in Sec. IV is marked (○).