I Introduction
The main impairments to a communication system, whether wireless or wireline, are due to multipath propagation through a physical medium and additive noise. Multipath propagation is commonly modeled as a linear convolution that, in the slow-fading scenario, can be characterized by a channel impulse response that is fixed over the duration of one codeword. In the well-known "uncorrelated Rayleigh/Ricean-fading" scenario, the (complex-baseband) channel "taps" are modeled as independent circular Gaussian random variables, and in the well-known "additive white Gaussian noise" (AWGN) scenario, the time-domain additive noise samples are modeled as independent circular Gaussian random variables [1].

I-A Motivation
In this work, we focus on applications where the uncorrelated-Rayleigh/Ricean-fading assumption holds but the AWGN assumption does not. Our work is motivated by extensive measurement campaigns of terrestrial wireless installations wherein the additive noise is impulsive, with peak noise amplitudes far above the thermal background noise level [2, 3, 4, 5, 6, 7]. The noise affecting power-line communications (PLC) has also been shown to be highly impulsive, as well as bursty [8, 9, 10].
We restrict our attention to systems employing (coded or uncoded) orthogonal frequency-division multiplexing (OFDM) [1], as used in many modern wireless standards (e.g., IEEE 802.11n and LTE) and PLC standards (e.g., PRIME and IEEE 1901). OFDM is advantageous in that it facilitates data communication across convolutive multipath channels with high spectral efficiency and low complexity.
The impulsivity of noise has particular consequences for OFDM systems. Recall that, in conventional OFDM receivers, the time-domain received signal is converted to the frequency domain through a discrete Fourier transform (DFT) [1], after which each subcarrier (or "tone") is demodulated independently. Such tone-by-tone demodulation is in fact optimal with AWGN and perfect channel estimates [1], and is highly desirable from a complexity standpoint, since it leaves the DFT as the primary source of receiver complexity, and thus requires only $O(N \log N)$ multiplies per symbol for $N$ tones when implemented via the FFT. When the time-domain noise is impulsive, however, the corresponding frequency-domain noise samples will be highly dependent, and tone-by-tone demodulation is no longer optimal. We are thus strongly motivated to find near-optimal demodulation strategies that preserve the complexity of classical OFDM. In this work, we propose one such solution that exploits recent breakthroughs in loopy belief propagation.

I-B Prior Work
I-B1 OFDM Reception in Impulsive Noise
One popular approach to OFDM reception in impulsive noise stems from the argument that the noiseless time-domain received OFDM samples can be modeled as i.i.d. Gaussian (according to the central limit theorem, with sufficiently many tones), in which case the noise impulses can be detected using a simple threshold test. This approach straightforwardly leads to a decoupled procedure for impulse mitigation and OFDM reception: the time-domain received signal is preprocessed via clipping or blanking techniques [11, 12] or (nonlinear) MMSE estimation [13], and the result is passed to a conventional DFT receiver for decoding. While agreeable from a complexity standpoint, these techniques perform relatively poorly, especially when the power of the impulsive noise is comparable to the power of the OFDM signal, or when higher-order modulations are used [13]. This loss of performance can be explained by the fact that the OFDM signal structure is not exploited for noise mitigation. In an attempt to improve performance, it has been suggested to iterate between such preprocessing and OFDM decoding, but the approaches suggested to date (e.g., [14, 15, 16, 17]) have shown limited success, mainly because the adaptation of the preprocessing with each iteration is challenging and often done in an ad hoc manner.

Another popular approach models the time-domain impulsive noise sequence as a sparse vector and then uses sparse-reconstruction techniques to estimate this sequence from the observed OFDM null and pilot (i.e., known) tones. The recovered impulse vector is then subtracted from the time-domain received signal, and the result is passed to a conventional DFT receiver for decoding. Algebraic techniques were proposed in [18, 19, 20], and sparse-reconstruction techniques based on compressive sensing were proposed in [21, 22]. With typical numbers of known tones, these techniques have been shown to work well for very sparse impulsive noise sequences (e.g., a single impulse per OFDM symbol) but not for practical sparsity rates [21, 23].

A more robust approach was proposed in [23], which performs joint symbol detection and impulse-noise estimation using sparse Bayesian learning (SBL). Still, because [23] decouples channel estimation from impulse-noise estimation and symbol detection, and because it integrates coding in an ad hoc manner, there is considerable room for improvement. In addition, it requires matrix inversions that are impractical for typical OFDM receivers with hundreds of tones.
I-B2 Factor-Graph Receivers
Factor-graph-based receivers [24] have been proposed as a computationally efficient means of tackling the difficult task of joint channel, symbol, and bit (JCSB) estimation. Here, messages (generally in the form of pdfs) are passed among the nodes of the factor graph according to belief-propagation strategies like the sum-product algorithm (SPA) [25]. Due to the loopy nature of the OFDM factor graph, however, exact implementation of the sum-product algorithm is infeasible, and so various approximations have been proposed [26, 27, 28, 29]. Notably, [29] merged the "generalized approximate message passing" (GAMP) algorithm [30] with a soft-input soft-output decoder in a "turbo" configuration to accomplish near-optimal¹ joint structured-sparse-channel estimation and decoding of bit-interleaved coded-modulation (BICM) OFDM with low complexity. To our knowledge, however, no factor-graph-based OFDM receivers have been proposed to tackle impulsive noise.

¹The approach was shown to be near-optimal in the sense of achieving [31] the pre-log factor of the sparse channel's noncoherent capacity [32].
I-C Contribution
In this paper, we propose a novel OFDM receiver that performs near-optimally in the presence of impulsive noise while maintaining the complexity order of the conventional OFDM receiver. Our approach is based on computing joint soft estimates of the channel taps, the impulse-noise samples, the finite-alphabet symbols, and the unknown bits. Moreover, all observed tones (i.e., pilot, null, and data tones) are exploited in this joint estimation. To do this, we leverage recent work on "generalized approximate message passing" (GAMP) [30], its "turbo" extension to larger factor graphs [33], and off-the-shelf soft-input/soft-output (SISO) decoding [34]. The receiver we propose can be categorized as an extension of the factor-graph-based receiver [29] that explicitly addresses the presence of impulsive noise. The resulting receiver provides a flexible performance-versus-complexity tradeoff and can be parallelized, making it suitable for FPGA implementations.
I-D Organization and Notation
In Section II, we describe our OFDM, channel, and noise models, and provide an illustrative example of impulsive noise. Then, in Section III, we detail our proposed approach, which we henceforth refer to as joint channel, impulse, symbol, and bit (JCISB) estimation. In Section IV we provide extensive numerical results, and in Section V we conclude.
Notation: Vectors and matrices are denoted by boldface lowercase (e.g., $\mathbf{x}$) and boldface uppercase (e.g., $\mathbf{X}$) notation, respectively. $\mathbf{X}_{\mathcal{I},\mathcal{J}}$ then represents the submatrix constructed from rows $\mathcal{I}$ and columns $\mathcal{J}$ of $\mathbf{X}$, where the simplified notation $\mathbf{X}_{\mathcal{I}}$ means $\mathbf{X}_{\mathcal{I},:}$ and ":" indicates all columns of $\mathbf{X}$. The notations $(\cdot)^{\mathsf{T}}$ and $(\cdot)^{\mathsf{H}}$ denote transpose and conjugate transpose, respectively. The probability density function (pdf) of a random variable (RV) $x$ is denoted by $p_x(\cdot)$, with the subscript omitted when clear from the context. Similarly, for discrete RVs, the probability mass function (pmf) is denoted by $\Pr\{\cdot\}$. For a circular Gaussian RV with mean $\mu$ and variance $\nu$, we write the pdf as $\mathcal{CN}(x; \mu, \nu)$. The expectation and variance of a RV are then given by $\mathrm{E}\{x\}$ and $\mathrm{var}\{x\}$, respectively. We use the sans-serif font to indicate frequency-domain variables like $\mathsf{x}$ and bold sans-serif to indicate frequency-domain vectors like $\boldsymbol{\mathsf{x}}$.

II System Model
II-A Coded OFDM Model
We consider an OFDM system with $N$ tones partitioned into pilot tones (indexed by the set $\mathcal{P}$), null tones (indexed by the set $\mathcal{N}$), and data tones (indexed by the set $\mathcal{D}$), each data tone modulated by a finite-alphabet symbol chosen from a $2^M$-ary constellation $\mathcal{S}$. The coded bits (which determine the data symbols) are generated by encoding the information bits using a rate-$R$ coder, interleaving them, and allocating the resulting bits among an integer number of OFDM symbols.

In the sequel, we use $s^{(j)}$ with $j \in \{1, \dots, 2^M\}$ to denote the $j$th element of $\mathcal{S}$, and $\mathbf{b}^{(j)} \in \{0,1\}^M$ to denote the corresponding bits as defined by the symbol mapping. Likewise, we use $\mathsf{x}_k[t]$ to denote the scalar symbol transmitted on the $k$th tone of the $t$th OFDM symbol. Based on the tone partition, we note that: $\mathsf{x}_k[t]$ is a known pilot symbol for all $k \in \mathcal{P}$; $\mathsf{x}_k[t] = 0$ for all $k \in \mathcal{N}$; and $\mathsf{x}_k[t] = s^{(j)}$ for some $j$ such that $\mathbf{b}^{(j)}$ equals the coded/interleaved bits corresponding to $\mathsf{x}_k[t]$, for all $k \in \mathcal{D}$. On the frame level, we use $\mathbf{c}[t]$ to denote the coded/interleaved bits allocated to the data tones of the $t$th OFDM symbol, and $\mathbf{c}$ to denote the entire codeword obtained from the information bits $\mathbf{b}$ by coding/interleaving. Similarly, we use $\boldsymbol{\mathsf{x}}[t]$ to denote the $t$th OFDM symbol's tone vector, including pilot, null, and data tones.
For OFDM modulation, the inverse of the unitary $N$-point discrete Fourier transform (DFT) matrix $\mathbf{F}$ is applied to the $t$th OFDM symbol's tone vector $\boldsymbol{\mathsf{x}}$, producing the time-domain sequence $\mathbf{F}^{\mathsf{H}}\boldsymbol{\mathsf{x}}$, to which a cyclic prefix is prepended. The resulting sequence propagates through an $L$-tap linear time-invariant channel with impulse response $\mathbf{h}$ before being corrupted by both AWGN and impulsive noise. Assuming a cyclic prefix of length at least $L-1$, inter-symbol interference is avoided by simply discarding the cyclic prefix at the receiver, after which the remaining $N$ samples are

$\mathbf{y} = \mathbf{H}\mathbf{F}^{\mathsf{H}}\boldsymbol{\mathsf{x}} + \mathbf{w}$,  (1)

where $\mathbf{w}$ is the time-domain noise vector and $\mathbf{H}$ is the circulant matrix formed by $\mathbf{h}$ [1]. Applying a DFT, the resulting frequency-domain received vector becomes

$\boldsymbol{\mathsf{y}} = \mathbf{F}\mathbf{y} = \boldsymbol{\mathsf{h}} \odot \boldsymbol{\mathsf{x}} + \boldsymbol{\mathsf{w}}$,  (2)

where $\boldsymbol{\mathsf{h}}$ is the frequency-domain channel vector, $\boldsymbol{\mathsf{w}} = \mathbf{F}\mathbf{w}$ is the frequency-domain noise vector, and $\odot$ denotes the Hadamard (i.e., element-wise) product. The second equality in (2) follows from the fact that a circulant matrix is diagonalized by the Fourier basis. In fact, (2) illustrates the principal advantage of OFDM: each transmitted tone experiences a flat scalar subchannel, since

$\mathsf{y}_k = \mathsf{h}_k \mathsf{x}_k + \mathsf{w}_k$.  (3)
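As a numerical sanity check (not from the paper; all parameters here are illustrative), the following numpy snippet verifies that a circulant channel matrix is diagonalized by the unitary DFT, so that each tone sees a flat scalar subchannel as in (3), and also shows why impulsive time-domain noise is troublesome: a single time-domain impulse spreads equal energy across all tones, making the frequency-domain noise samples strongly dependent.

```python
import numpy as np

N, L = 8, 3
rng = np.random.default_rng(1)

# Unitary DFT matrix F and random channel taps h.
F = np.fft.fft(np.eye(N)) / np.sqrt(N)
h = rng.standard_normal(L) + 1j * rng.standard_normal(L)

# Circulant channel matrix H whose first column is h zero-padded to length N.
c = np.concatenate([h, np.zeros(N - L)])
H = np.column_stack([np.roll(c, k) for k in range(N)])

# 1) F H F^H is diagonal, with the frequency-domain channel on its diagonal.
hf = np.fft.fft(c)                       # frequency-domain channel vector
assert np.allclose(F @ H @ F.conj().T, np.diag(hf))

# 2) Hence each tone sees a flat scalar subchannel: y_k = h_k * x_k.
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # tone vector
y_time = H @ (F.conj().T @ x)            # IDFT, then channel
assert np.allclose(F @ y_time, hf * x)

# 3) A single time-domain noise impulse spreads over ALL tones equally,
#    so the frequency-domain noise samples become highly dependent.
impulse = np.zeros(N, dtype=complex)
impulse[2] = 30.0
assert np.allclose(np.abs(F @ impulse), 30.0 / np.sqrt(N))
```

The DFT normalization matches the unitary convention assumed above; with a non-unitary FFT the diagonal entries would simply be rescaled.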
II-B Channel Modeling
We assume that the channel taps remain constant during the entire duration of one OFDM symbol, as required by (2). Since we make no assumptions on how the taps change across symbols, for simplicity we take $\mathbf{h}[t]$ and $\mathbf{h}[t']$ to be statistically independent for $t \neq t'$. Furthermore, we use the Rayleigh-fading uncorrelated-scattering model

$h_l \sim \mathcal{CN}(0, \sigma_l^2)$, independent across $l = 0, \dots, L-1$,  (4)

where $\{\sigma_l^2\}_{l=0}^{L-1}$ is the power delay profile. Extensions to sparse [31], structured-sparse [29], and time-varying sparse channels [35] are straightforward, but not covered here.
II-C Impulsive Noise Models
In many wireless and PLC systems, the additive noise is non-Gaussian (see the example in Fig. 1) and the result of random emission events from uncoordinated interferers (due to, e.g., aggressive spectrum reuse) or non-communicating electronic devices. In his pioneering work, Middleton modeled these random spatiotemporal emissions, or the "noise field," using Poisson point processes (PPPs), giving rise to the "Middleton class-A" and "Middleton class-B" noise models. (For a recent review, see [36].) Recently, his approach has been extended to modeling fields of interferers in wireless and PLC networks using spatial and temporal PPPs, and the resulting interference was shown to follow either the symmetric alpha-stable, the Middleton class-A (MCA), or the more general Gaussian mixture (GM) distribution, depending on the network architecture [37, 38, 9, 10]. Fig. 1 illustrates that a GM model provides a significantly better fit to a noise realization collected from a receiver embedded in a laptop than a Gaussian model does.
Since our factor-graph-based receiver is inherently Bayesian, these statistical models provide natural priors on the noise. Thus, we model the additive noise using a GM model, noting that—given the pdf parameters—there is no distinction between the MCA and GM models. In particular, we decompose² a given time-domain noise sample $w_n$ into a Gaussian background component $v_n \sim \mathcal{CN}(0, \nu_{\mathrm{bg}})$ and a sparse impulsive component $i_n$ with Bernoulli-GM pdf

$p(i_n) = \lambda_0\, \delta(i_n) + \sum_{k=1}^{K} \lambda_k\, \mathcal{CN}(i_n; 0, \nu_k)$,  (5)

where $\delta$ denotes the Dirac delta and $\sum_{k=0}^{K} \lambda_k = 1$. Equivalently, we can model the (hidden) mixture state of the impulsive component as a random variable $s_n \in \{0, 1, \dots, K\}$, giving rise to the hierarchical model (with $\nu_0 = 0$)

$\Pr\{s_n = k\} = \lambda_k$  (6a)

$p(i_n \mid s_n = k) = \mathcal{CN}(i_n; 0, \nu_k)$.  (6b)

²Our approach is equivalent to modeling the total noise $w_n = v_n + i_n$ itself by a GM pdf with suitably combined component variances.
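For concreteness, the hierarchical model (6) can be sampled as follows; all parameter values below are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 100_000

# Hypothetical parameters: state 0 = no impulse (variance 0), states 1..K active.
lam = np.array([0.95, 0.04, 0.01])    # state pmf, sums to 1         (6a)
nu = np.array([0.0, 10.0, 100.0])     # per-state impulse variances  (6b)
nu_bg = 1.0                           # background AWGN variance

def ccn(n, var):
    """i.i.d. circular (complex) Gaussian samples with the given variance."""
    return np.sqrt(var / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

s = rng.choice(len(lam), size=N, p=lam)    # hidden mixture states
i = np.sqrt(nu[s]) * ccn(N, 1.0)           # impulsive component
w = ccn(N, nu_bg) + i                      # total noise

# The empirical noise power should be near nu_bg + sum_k lam[k]*nu[k] = 2.4.
print(np.mean(np.abs(w) ** 2))
```

Plotting `np.abs(w)` over a short window reproduces the qualitative look of impulsive noise: rare samples tens of times larger than the Gaussian background.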
In many applications, such as PLC, the noise is not only impulsive but also bursty, and thus the noise samples are no longer statistically independent. Such burstiness can be captured via a Bernoulli-Gaussian hidden Markov model (BGHMM) on the impulse noise [8, 39] or, equivalently, a Markov model on the GM state in (6). For this, we model the sequence $\{s_n\}$ as a homogeneous (stationary) $(K{+}1)$-ary Markov chain with a state transition matrix $\mathbf{A}$ such that

$[\mathbf{A}]_{j,k} = \Pr\{s_{n+1} = k \mid s_n = j\}$.  (7)

In this case, the marginal pmf of the steady-state chain obeys $\boldsymbol{\lambda}^{\mathsf{T}} = \boldsymbol{\lambda}^{\mathsf{T}} \mathbf{A}$, and the mean duration of the $k$th state's event is $(1 - [\mathbf{A}]_{k,k})^{-1}$ [39].
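The steady-state pmf and mean burst duration can be checked numerically; the two-state transition matrix below is hypothetical (state 0 = background only, state 1 = impulsive burst).

```python
import numpy as np

# Hypothetical transition matrix A[j, k] = Pr{s_{n+1} = k | s_n = j}.
A = np.array([[0.99, 0.01],
              [0.10, 0.90]])

# Stationary pmf: left eigenvector of A associated with eigenvalue 1.
evals, evecs = np.linalg.eig(A.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi = pi / pi.sum()

# Mean sojourn (burst) duration in state k is 1 / (1 - A[k, k]).
mean_burst = 1.0 / (1.0 - A[1, 1])

print(pi)          # stationary pmf, here [10/11, 1/11]
print(mean_burst)  # mean burst length, here 10 samples
```

For this matrix, impulses are active about 9% of the time but arrive in bursts averaging 10 consecutive samples, rather than as isolated hits.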
As an illustrative example, Fig. 2 plots two realizations of the total noise with impulsive component generated by the hierarchical Bernoulli-GM model (6). Both realizations have identical marginal statistics: their impulsive components have two nontrivial emission states, with powers well above the background noise power, that occur a small fraction of the time. However, in one case the emission state was generated i.i.d., whereas in the other case it was generated Markov with state-transition matrix

(8)

The BGHMM realization clearly exhibits bursty behavior.
In practice, assuming the noise statistics are slowly varying, the noise parameters $\{\lambda_k, \nu_k\}$ and $\mathbf{A}$ can be estimated using the expectation-maximization (EM) algorithm [25] during quiet intervals when there is no signal transmission.

III Message-Passing Receiver Design
In this section, we design computationally efficient message-passing receivers that perform near-optimal bit decoding, which, as we shall see, involves jointly estimating the coded bits, finite-alphabet symbols, channel taps, and impulsive noise samples. In doing so, our receivers exploit knowledge of the statistical channel and noise models discussed above and of the OFDM signal structure (i.e., the pilot and null tones, the finite-alphabet symbol constellation, and the codebook).
III-A MAP Decoding of OFDM in Impulsive Noise
Maximum a posteriori (MAP) decoding, i.e.,

$\hat{b}_m = \arg\max_{b \in \{0,1\}} \Pr\{b_m = b \mid \boldsymbol{\mathsf{Y}}\}$,  (9)

is well known to be optimal in the sense of minimizing the bit-error rate. Here, $\boldsymbol{\mathsf{Y}}$ collects the received OFDM symbols of the corresponding frame. Using the law of total probability, we can write the posterior information-bit probability from (9) as

(11)

(12)

where "$\propto$" denotes equality up to a constant and the information bits are assumed to be independent and equally likely. Equation (12) shows that optimal decoding of $b_m$ involves marginalizing over the finite-alphabet symbols, coded bits, noise states, impulse-noise samples, channel taps, and the other information bits.
The probabilistic structure exposed by the factorization (12) is illustrated by the factor graph in Fig. 3. There and in the sequel, for brevity, we drop the OFDM-symbol index $t$ when doing so does not cause confusion.
Clearly, direct evaluation of the marginal posteriors in (12) is computationally intractable due to the high-dimensional integrals involved. Belief propagation (BP), and in particular the sum-product algorithm (SPA) [25] described below, offers a practical alternative to direct computation of marginal posteriors. In fact, when the factor graph has no loops, the SPA performs exact inference after only two rounds of message passing (i.e., forward and backward). On the other hand, when the factor graph is loopy, the computation of exact marginal posteriors is generally NP-hard [40], and thus the posteriors computed by BP are generally inexact. Nevertheless, loopy BP has been successfully applied to many important problems, such as multiuser detection [41, 42], turbo decoding [43], LDPC decoding [34], compressed sensing [44, 30], and others.
In fact, for certain large, densely loopy graphs that arise in the context of compressed sensing, SPA approximations such as the AMP [44] and GAMP [30] algorithms are known to obey a state evolution whose fixed points, when unique, yield exact posteriors [45, 46]. Looking at the factor graph in Fig. 3, we see densely loopy subgraphs between the observation factors and the time-domain noise samples and channel taps, which are due to the linear mixing of the Fourier matrix $\mathbf{F}$. It is these types of densely loopy graphs for which AMP and GAMP are designed.³ In the sequel, we will describe exactly how we combine the sum-product and GAMP algorithms for approximate computation of the bit posteriors in (12). First, however, we review the SPA.

³Although rigorous GAMP guarantees have been established only for generalized linear inference problems with i.i.d. sub-Gaussian transform matrices [46], equally good performance has been empirically observed over much wider classes of matrices [47].
III-B Belief Propagation Using the Sum-Product Algorithm
Belief propagation (BP) transforms a high-dimensional marginalization problem (like (12)) into a series of local low-dimensional marginalization problems by passing beliefs, or messages, which usually take the form of (possibly unnormalized) pdfs or pmfs, along the edges of a factor graph. The sum-product algorithm (SPA) [25] is probably the best-known approach to BP, and it operates according to the following rules:
III-B1 Messages from Factor Nodes to Variables

Suppose the pdf factor $f(x_1, \dots, x_n)$ depends on the variables $x_1, \dots, x_n$. Then the message passed from factor node $f$ to variable node $x_j$ is

$\mu_{f \to x_j}(x_j) = \int f(x_1, \dots, x_n) \prod_{j' \neq j} \mu_{x_{j'} \to f}(x_{j'}) \, \prod_{j' \neq j} dx_{j'}$,

representing the belief that node $f$ has about the variable $x_j$.
III-B2 Messages from Variables to Factor Nodes

Suppose the factors $f_1, \dots, f_m$ all involve the variable $x$. Then the message passed from variable node $x$ to factor node $f_i$ is

$\mu_{x \to f_i}(x) = \prod_{i' \neq i} \mu_{f_{i'} \to x}(x)$,

and represents the belief about the variable $x$ that node $x$ passes to node $f_i$.
III-B3 Marginal Beliefs

The SPA's approximation to the marginal posterior pdf on the variable $x$ is

$p(x) \approx \frac{1}{Z} \prod_{i=1}^{m} \mu_{f_i \to x}(x)$,

where $Z$ is a normalization constant.
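The three SPA rules above can be exercised on a toy two-variable factor graph (a tree, so the SPA is exact); the factor values below are arbitrary illustrative numbers.

```python
import numpy as np

# Toy model over binary x1, x2: p(x1, x2) ∝ g1(x1) g2(x2) f(x1, x2).
g1 = np.array([0.7, 0.3])          # unary factor on x1
g2 = np.array([0.4, 0.6])          # unary factor on x2
f = np.array([[0.9, 0.1],
              [0.2, 0.8]])         # pairwise factor f(x1, x2)

# SPA: factor-to-variable message is f marginalized against incoming beliefs.
msg_f_to_x1 = f @ g2               # sum over x2 of f(x1, x2) g2(x2)
belief_x1 = g1 * msg_f_to_x1       # multiply by x1's other incoming message
belief_x1 /= belief_x1.sum()       # normalize (the constant Z)

# Brute-force marginalization for comparison.
joint = g1[:, None] * g2[None, :] * f
brute = joint.sum(axis=1)
brute /= brute.sum()

print(np.allclose(belief_x1, brute))   # SPA is exact on this loop-free graph
```

On loopy graphs the same local rules are simply iterated, which is exactly the regime exploited in the remainder of this section.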
III-C Joint Channel/Impulse-Noise Estimation and Decoding
We now propose a strategy to approximate the bit posteriors in (12) by iterating (an approximation of) the SPA on the loopy factor graph in Fig. 3. To distinguish our approach from others in the literature, we will refer to it as "joint channel, impulse, symbol, and bit estimation" (JCISB).

Given the loopy nature of the factor graph, there exists considerable freedom in the message-passing schedule. In JCISB, we choose to repeatedly pass messages from right to left, and then left to right, as follows.

1) Beliefs about coded bits flow rightward through the symbol-mapping nodes, the finite-alphabet symbol nodes, and into the observation factor nodes.

2) GAMP-based messages are then passed repeatedly between the observation factor nodes and the channel-tap nodes until convergence.

3) GAMP-based messages are passed repeatedly between the observation factor nodes and the impulse-noise nodes until convergence, and then through the noise-state nodes using the forward-backward algorithm, alternating these two steps until convergence.

4) Finally, the messages are propagated leftward through the symbol nodes, the symbol-mapping nodes, the coded-bit nodes, and the coding/interleaving block—the last step via an off-the-shelf soft-input/soft-output (SISO) decoder.

In the sequel, we will refer to steps 1)–4) as a "turbo" iteration, and to the iterations within step 3) as "impulse iterations." We note that it is also possible to execute a parallel schedule if the hardware platform supports it. The details of these four message-passing steps are given below.
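Since step 3) relies on the forward-backward algorithm over the noise-state chain, a self-contained numerical sketch may help; the transition matrix, prior, and per-sample likelihoods below are purely illustrative, and the result is checked against brute-force enumeration.

```python
import numpy as np

# Hypothetical 2-state chain (0 = background, 1 = impulse) and per-sample
# likelihoods lik[n, k] = p(observation n | state k).
A = np.array([[0.95, 0.05],
              [0.20, 0.80]])
p0 = np.array([0.8, 0.2])
lik = np.array([[0.9, 0.1],
                [0.3, 0.7],
                [0.2, 0.8],
                [0.8, 0.2]])
N = lik.shape[0]

# Forward-backward (alpha-beta) recursions with per-step normalization.
alpha = np.zeros((N, 2))
beta = np.ones((N, 2))
alpha[0] = p0 * lik[0]
alpha[0] /= alpha[0].sum()
for n in range(1, N):
    alpha[n] = (alpha[n - 1] @ A) * lik[n]
    alpha[n] /= alpha[n].sum()
for n in range(N - 2, -1, -1):
    beta[n] = A @ (beta[n + 1] * lik[n + 1])
    beta[n] /= beta[n].sum()
post = alpha * beta
post /= post.sum(axis=1, keepdims=True)   # smoothed state posteriors

# Brute-force check: enumerate all 2^N state sequences.
brute = np.zeros((N, 2))
for seq in range(2 ** N):
    s = [(seq >> n) & 1 for n in range(N)]
    p = p0[s[0]] * lik[0, s[0]]
    for n in range(1, N):
        p *= A[s[n - 1], s[n]] * lik[n, s[n]]
    for n in range(N):
        brute[n, s[n]] += p
brute /= brute.sum(axis=1, keepdims=True)
print(np.allclose(post, brute))   # exact marginals on this chain
```

In the receiver, the likelihoods `lik` would come from the GAMP impulse messages rather than being fixed constants, but the recursion itself is unchanged.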
III-C1 Bits to Symbols
Beliefs about the coded bits (for each data tone $k \in \mathcal{D}$) are first passed through the symbol-mapping factor node. The SPA dictates that the outgoing symbol belief is the product of the incoming bit beliefs evaluated at the bit pattern of each constellation point, i.e.,

$\mu(\mathsf{x}_k = s^{(j)}) \propto \prod_{m=1}^{M} \Pr\{c_{k,m} = b^{(j)}_m\}$,  (13)

which follows from the deterministic symbol mapping between $s^{(j)}$ and $\mathbf{b}^{(j)}$. The resulting message is then copied forward through the symbol node, also according to the SPA. Note that, at the start of the first turbo iteration, we have no knowledge of the bits, and thus we take the symbol beliefs to be uniform across $\mathcal{S}$ for all $k \in \mathcal{D}$.
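The bits-to-symbols computation is a simple product of bit beliefs over each constellation point's bit pattern. A minimal sketch for a hypothetical Gray-mapped QPSK constellation (the mapping itself is illustrative, not the paper's):

```python
import numpy as np

# Hypothetical Gray-mapped QPSK: row j of bit_map holds the bits of symbol j.
constellation = np.array([1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]) / np.sqrt(2)
bit_map = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

def symbol_pmf(p_bit_is_1):
    """Symbol beliefs as the product of (independent) incoming bit beliefs."""
    p = np.asarray(p_bit_is_1)
    pmf = np.prod(np.where(bit_map == 1, p, 1 - p), axis=1)
    return pmf / pmf.sum()

# Uniform bit beliefs (first turbo iteration) -> uniform symbol beliefs.
print(symbol_pmf([0.5, 0.5]))          # [0.25 0.25 0.25 0.25]

# Confident bits -> mass concentrates on the matching constellation point.
pmf = symbol_pmf([0.9, 0.1])
best = constellation[np.argmax(pmf)]   # most likely symbol
print(pmf, best)
```

In a coded system the per-bit probabilities would be derived from the SISO decoder's extrinsic log-likelihood ratios rather than set by hand.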
III-C2 GAMP for Channel Estimation
The next step in our message-passing schedule is to pass messages between the observation factor nodes and the time-domain channel nodes. According to the SPA, the message passed from an observation factor node to a channel node is

(14)

Exact evaluation of (14) involves an integration of the same high dimensionality that made (12) intractable, with exponential complexity in the number of tones. Thus, we instead approximate the message passing between these nodes using the generalized approximate message passing (GAMP) algorithm [30], reviewed in Appendix A and summarized in Table I.
To do this, we temporarily treat the symbol and impulse-noise messages as fixed, allowing us to employ GAMP using the notation established in the caption of Table I. Definition (D1) [later used in steps (R3)–(R4)] requires us to specify the likelihood relating the transform output to the corresponding observed output. From Fig. 3, we see that there are two types of belief flowing into each observation factor node (apart from beliefs about the channel) that determine this likelihood: beliefs about the symbols, which we parameterize as $\beta_k^{(j)} \triangleq \Pr\{\mathsf{x}_k = s^{(j)}\}$, and beliefs about the frequency-domain impulsive noise, which GAMP approximates as $\mathcal{CN}(\mathsf{i}_k; \hat{\mathsf{i}}_k, \nu^{\mathsf{i}}_k)$, where the values $\hat{\mathsf{i}}_k$ and $\nu^{\mathsf{i}}_k$ were computed by GAMP in the previous turbo iteration.⁴ Here, $\boldsymbol{\mathsf{i}} = \mathbf{F}\mathbf{i}$ refers to the impulsive component of the frequency-domain noise $\boldsymbol{\mathsf{w}}$, with $\boldsymbol{\mathsf{v}} = \mathbf{F}\mathbf{v}$, so that [from (2)]

$\boldsymbol{\mathsf{y}} = \boldsymbol{\mathsf{h}} \odot \boldsymbol{\mathsf{x}} + \boldsymbol{\mathsf{i}} + \boldsymbol{\mathsf{v}}$.  (15)

⁴During the first turbo iteration, these means and variances are initialized from the prior.
From (3) and (15), the likelihood is

$p(\mathsf{y}_k \mid \mathsf{z}_k) = \sum_{j} \beta_k^{(j)}\, \mathcal{CN}\big(\mathsf{y}_k;\, s^{(j)}\mathsf{z}_k + \hat{\mathsf{i}}_k,\, \nu_{\mathrm{bg}} + \nu^{\mathsf{i}}_k\big)$,  (16)

where $\mathsf{z}_k$ plays the role of the frequency-domain channel coefficient $\mathsf{h}_k$, with the corresponding "output MMSE estimation functions," as used in steps (R3)–(R4), specified in Table II. (See Appendix B for derivations.)
TABLE II: Output MMSE estimation functions, listed by tone type (pilot and data tones for channel estimation; null, pilot, and data tones for impulse estimation).
GAMP also requires us to derive the "input MMSE estimation functions" for steps (R9)–(R10). Given the channel model specified in Section II-B and definition (D2), it is straightforward to show [30] that the input MMSE estimation functions are the linear (Wiener) expressions $\frac{\sigma_l^2}{\sigma_l^2 + q_l}\, r_l$ and $\frac{\sigma_l^2 q_l}{\sigma_l^2 + q_l}$, where $r_l$ and $q_l$ denote GAMP's pseudo-measurement of $h_l$ and its variance.
After GAMP is iterated to convergence, the outputs of steps (R4)–(R3) are close approximations to the marginal posterior means and variances, respectively, of the frequency-domain channel coefficients. These outputs will be used in the next step of the message-passing schedule, as described below. Similarly, the outputs of steps (R10)–(R9) are close approximations to the marginal posterior means and variances, respectively, of the time-domain channel taps.
III-C3 Turbo-GAMP for Noise Estimation
The next step in our schedule is to pass messages between the observation factor nodes, the time-domain impulse-noise nodes, and the noise-state nodes. According to the SPA, the message passed from an observation factor node to an impulse-noise node is

(17)

Although GAMP can help approximate the messages in (17), GAMP alone is insufficient due to the connections between the noise-state nodes, which are used to model the burstiness of the time-domain impulse noise. However, recognizing that the underlying problem is the estimation of a clustered-sparse sequence from compressed linear measurements, we can use the solution proposed in [33], which alternates (G)AMP with the forward-backward algorithm [25], as described below.
First, by temporarily treating the channel, symbol, and noise-state messages as fixed, we can apply GAMP under the likelihood model

(18)

implied by (3) and (15), and the coefficient prior

$p(i_n) = \sum_{k=0}^{K} \lambda_n(k)\, \mathcal{CN}(i_n; 0, \nu_k)$  (19)

implied by (5), where the $\nu_0 = 0$ term is interpreted as $\lambda_n(0)\, \delta(i_n)$.
In (18), $\beta_k^{(j)}$ are the symbol beliefs coming from the symbol nodes and $\hat{\mathsf{h}}_k$ are the frequency-domain channel estimates previously calculated by GAMP. Meanwhile, in (19), $\lambda_n(k)$ represents the pmf on the noise state $s_n$, which is set from the incoming noise-state messages. The resulting output MMSE estimation functions, derived in Appendix C, are listed in Table II, and the input MMSE estimation functions are

$\hat{i}_n = \sum_{k=0}^{K} \bar{\lambda}_n(k)\, \frac{\nu_k}{\nu_k + q_n}\, r_n$  (20)

$\nu^{i}_n = \sum_{k=0}^{K} \bar{\lambda}_n(k) \left[ \frac{\nu_k q_n}{\nu_k + q_n} + \Big| \frac{\nu_k}{\nu_k + q_n}\, r_n \Big|^2 \right] - |\hat{i}_n|^2$,  (21)

where $r_n$ and $q_n$ denote GAMP's pseudo-measurement of $i_n$ and its variance. Here, $\bar{\lambda}_n(k) \propto \lambda_n(k)\, \mathcal{CN}(r_n; 0, \nu_k + q_n)$ is the posterior pmf for the noise state $s_n$.
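The scalar computations behind (20)–(21) reduce, in the two-state (Bernoulli-Gaussian) case, to a classic sparse denoiser: a posterior impulse probability followed by Wiener shrinkage. The sketch below uses hypothetical parameter values and is not the paper's exact implementation.

```python
import numpy as np

def bg_mmse(r, q, lam=0.1, nu=10.0):
    """Scalar MMSE denoiser for i ~ (1-lam) delta(i) + lam CN(0, nu),
    observed as r = i + e with e ~ CN(0, q). Returns (mean, var, post_lam)."""
    r = np.asarray(r, dtype=complex)
    # Likelihoods of r under the two mixture states (inactive / active).
    l0 = np.exp(-np.abs(r) ** 2 / q) / (np.pi * q)
    l1 = np.exp(-np.abs(r) ** 2 / (q + nu)) / (np.pi * (q + nu))
    post_lam = lam * l1 / (lam * l1 + (1 - lam) * l0)   # Pr{impulse | r}
    gain = nu / (nu + q)                                # Wiener gain if active
    mean = post_lam * gain * r
    # var = E|i|^2 - |mean|^2, with E|i|^2 = post_lam*(gain*q + |gain*r|^2).
    second = post_lam * (gain * q + np.abs(gain * r) ** 2)
    var = second - np.abs(mean) ** 2
    return mean, var, post_lam

# Small |r|: almost surely no impulse; large |r|: shrunk toward gain*r.
m_small, v_small, p_small = bg_mmse(0.1 + 0j, q=1.0)
m_big, v_big, p_big = bg_mmse(20.0 + 0j, q=1.0)
print(p_small, p_big)
```

The general GM case in (20)–(21) simply sums the same per-state quantities over all mixture states, weighted by the posterior state pmf.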