## DSP Processors, Embodiments and Alternatives: Part I

Until now, we have described digital signal processing in general terms, focusing on DSP fundamentals, systems, and application areas. Now, we narrow our focus to DSP processors. We begin with a high-level description of the features common to virtually all DSP processors. We then describe typical embodiments of DSP processors, and briefly discuss alternatives to DSP processors such as general-purpose microprocessors, microcontrollers (such as the TMS320C20, included for comparison purposes), and FPGAs. The next several blogs provide a detailed treatment of DSP processor architectures and features.

So, what are the “special” things that a DSP processor can do? Well, as the name says, DSP processors do signal processing very well. What does “signal processing” mean? Really, it is a set of algorithms for processing signals in the digital domain. There are analog equivalents to these algorithms, but processing them digitally has proven to be more efficient, and this has been the trend for many years. Signal processing algorithms are the basic building blocks for many applications in the world, from cell phones to MP3 players, digital still cameras, and so on. A summary of these algorithms is shown below:

• FIR Filter: $y(n)=\sum_{k=0}^{N}a_{k}x(n-k)$
• IIR Filter: $y(n)=\sum_{k=0}^{M}a_{k}x(n-k)+\sum_{k=1}^{N}b_{k}y(n-k)$
• Convolution: $y(n)=\sum_{k=0}^{N}x(k)h(n-k)$
• Discrete Fourier Transform: $X(k)=\sum_{n=0}^{N-1}x(n)\exp{[-j(2\pi/N)nk]}$
• Discrete Cosine Transform: $F(u)=\sum_{x=0}^{N-1}c(u)\,f(x)\cos{\left[\frac{\pi u(2x+1)}{2N}\right]}$
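To make the first of these concrete, the FIR filter equation can be implemented directly as a loop of multiply-accumulate operations. Here is a minimal Python sketch (the coefficient and signal values are made up purely for illustration):

```python
def fir_filter(x, a, n):
    """Compute y(n) = sum_{k=0}^{N} a_k * x(n-k) for one output sample.

    x: input sample sequence, a: filter coefficients.
    Samples with negative index are taken as zero (filter start-up).
    """
    acc = 0.0  # the accumulator a DSP would keep in a wide register
    for k in range(len(a)):  # the "for" loop a DSP implements in hardware
        if n - k >= 0:
            acc += a[k] * x[n - k]  # one multiply-accumulate (MAC)
    return acc

# A 3-tap moving-average filter (illustrative coefficients)
coeffs = [1/3, 1/3, 1/3]
signal = [3.0, 6.0, 9.0, 6.0, 3.0]
output = [fir_filter(signal, coeffs, n) for n in range(len(signal))]
```

Every output sample costs one pass through the inner loop, which is exactly the pattern DSP processors are optimized for.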

One or more of these algorithms are used in almost every signal processing application. FIR and IIR filters are fundamental to almost any DSP application; they remove unwanted noise from the signals being processed. Convolution algorithms are used to look for similarities in signals, discrete Fourier transforms are used to represent signals in formats that are easier to process, and discrete cosine transforms are used in image processing applications. We will discuss the details of some of these algorithms later, but there are things to notice about this entire list of algorithms. First, they all have a summing operation, the $\sum$ function. In the computer world, this is equivalent to an accumulation of a large number of elements, which is implemented using a “for” loop. DSP processors are designed to have large accumulators because of this characteristic; they are specialized in this way. DSPs also have special hardware to perform the “for” loop operation so that the programmer does not have to implement it in software, which would be much slower.

The algorithms above also have multiplication of two different operands. Logically, if we were to speed up this operation, we would design a processor to accommodate the multiplication and accumulation of two operands like this very quickly. In fact, this is what has been done with DSPs. They are designed to support the multiplication and accumulation of data sets like this very quickly; for most processors, in just one cycle. Since these algorithms are very common in most DSP applications, tremendous execution savings can be obtained by exploiting these processor optimizations.

There are also inherent structures in DSP algorithms that allow them to be separated and operated on in parallel. Just as in real life, if I can do more things in parallel, I can get more done in the same amount of time. As it turns out, signal processing algorithms have this characteristic as well. So, we can take advantage of this by putting multiple orthogonal (nondependent) execution units in our DSP processors and exploit this parallelism when implementing these algorithms.

DSP processors must also add some reality to the mix of the algorithms shown above. Take the IIR filter described above: you may be able to tell just by looking at this algorithm that there is a feedback component, which feeds previous outputs back into the calculation of the current output. Whenever you deal with feedback, there is always an inherent stability issue, and IIR filters can become unstable just like other feedback systems. Careless implementation of feedback systems like the IIR filter can cause the output to oscillate instead of asymptotically decaying to zero (the preferred behaviour). This problem is compounded in the digital world, where we must deal with finite word lengths, a key limitation of all digital systems. We can alleviate this with saturation checks in software, or use a specialized instruction to do it for us. DSP processors, because of the nature of signal processing algorithms, provide specialized saturation and underflow/overflow instructions to deal with these conditions efficiently.
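To see what saturation buys you, here is a minimal Python sketch of a 16-bit two's-complement add, both wrapping and saturating (the function names are mine, not any particular instruction set's):

```python
INT16_MAX = 32767
INT16_MIN = -32768

def wrapping_add(a, b):
    """Plain two's-complement add: extra bits are discarded (wrap-around)."""
    s = (a + b) & 0xFFFF                         # keep only 16 bits
    return s - 0x10000 if s > INT16_MAX else s   # reinterpret as signed

def saturating_add(a, b):
    """Saturating add, as a DSP's saturation hardware would perform it."""
    s = a + b
    if s > INT16_MAX:
        return INT16_MAX   # clamp on positive overflow
    if s < INT16_MIN:
        return INT16_MIN   # clamp on negative overflow
    return s

# 30000 + 10000 overflows a 16-bit word:
# wrapping produces a large *negative* number; saturation clamps to the maximum.
wrapped = wrapping_add(30000, 10000)     # -25536
clamped = saturating_add(30000, 10000)   # 32767
```

In an audio system, the wrapped result is an audible click or oscillation; the saturated result is merely a slightly clipped sample, which is why DSPs do this clamping in hardware.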

There is more I could say about this, but you get the point. Specialization is really what DSP processors are all about; these processors are specifically designed to do signal processing really well. DSP processors may not be as good as other processors when dealing with non-signal-processing-centric algorithms (that’s fine; I am not any good at medicine either). So, it’s important to understand your application and choose the right processor. (A previous blog about DACs and ADCs mentioned this issue.)

We describe below the common features of DSP processors.

Dozens of families of DSP processors are available on the market today. The salient features of some of the commonly used families of DSP processors are summarized in Table 1. Throughout this series, we will use these processors as examples to illustrate the architectures and features that can be found in commercial DSP processors.

Most DSP processors share some common features designed to support repetitive, numerically intensive tasks. The most important of these features are introduced briefly here. Each of these features and many others will be examined in greater detail in this blog article series.

Table 1.

| Vendor | Processor Family | Arithmetic Type | Data Width | Speed (MIPS or MHz) |
| --- | --- | --- | --- | --- |
| Analog Devices | ADSP 21xx | Fixed Point | 16 bit | 25 MIPS |
| Texas Instruments | TMS320C55x | Fixed Point | 16 bit | 50 MHz to 300 MHz |

Fast Multiply Accumulate

The most often cited feature of DSP processors is the ability to perform a multiply-accumulate operation (often called a MAC) in a single instruction cycle. The multiply-accumulate operation is useful in algorithms that involve computing a vector or matrix product, such as digital filters, correlation, convolution, and Fourier transforms. To achieve this functionality, DSP processors include a multiplier and accumulator integrated into the main arithmetic processing unit (called the data path) of the processor. In addition, to allow a series of multiply-accumulate operations to proceed without the possibility of arithmetic overflow, DSP processors generally provide extra bits in their accumulator registers to accommodate growth of the accumulated result. DSP processor data paths will be discussed in detail in a later blog in this series.
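The need for those extra accumulator bits can be quantified: summing N products of two W-bit operands can grow up to ceil(log2(N)) bits beyond the 2W-bit product width. A back-of-the-envelope sketch (the helper function is my own, not a vendor formula):

```python
import math

def accumulator_bits(word_width, num_taps):
    """Bits needed to sum num_taps products of two word_width-bit operands
    without overflow: 2*W bits for each product, plus ceil(log2(N)) guard
    bits to absorb growth of the running sum."""
    product_bits = 2 * word_width
    guard_bits = math.ceil(math.log2(num_taps))
    return product_bits + guard_bits

# A 256-tap FIR filter on 16-bit data needs a 40-bit running sum,
# which is why many 16-bit DSPs provide 40-bit accumulators
# (32 product bits + 8 guard bits).
bits = accumulator_bits(16, 256)
```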

Multiple Access Memory Architecture.

A second feature shared by most DSP processors is the ability to complete several accesses to memory in a single instruction cycle. This allows the processor to fetch an instruction while simultaneously fetching operands for the instruction or storing the result of the previous instruction to memory. High bandwidth between the processor and memory is essential for good performance if repetitive, data-intensive operations are required in an algorithm, as is common in many DSP applications.

In many processors, single-cycle multiple memory accesses are subject to restrictions. Typically, all but one of the memory locations must reside on-chip, and multiple memory accesses can take place only with certain instructions. To support simultaneous access of multiple memory locations, DSP processors provide multiple on-chip buses, multiported on-chip memories, and in some cases multiple independent memory banks. DSP memory structures are quite distinct from those of general-purpose processors and microcontrollers. DSP processor memory architectures will be investigated in detail later.

Dedicated Address Generation Units

To allow arithmetic processing to proceed at maximum speed and to allow specification of multiple operands in a small instruction word, DSP processors incorporate dedicated address generation units. Once the appropriate addressing registers have been configured, the address generation units operate in the background, forming the addresses required for operand accesses in parallel with the execution of arithmetic instructions. Address generation units typically support a selection of addressing modes tailored to DSP applications. The most common of these is register-indirect addressing with post-increment, which is used in situations where a repetitive computation is performed on a series of data stored sequentially in memory. Special addressing modes (called circular or modulo addressing) are often supported to simplify the use of data buffers. Some processors support bit-reversed addressing, which eases the task of interpreting the results of the FFT algorithm. Addressing modes will be described in detail later.
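A software sketch of the two special addressing modes may help. A real address generation unit performs these updates in hardware, for free, in parallel with the arithmetic; the Python below just shows the address arithmetic itself:

```python
def circular_increment(index, buffer_length):
    """Post-increment with modulo wrap, as in circular (modulo) addressing
    for a delay-line data buffer."""
    return (index + 1) % buffer_length

def bit_reverse(index, num_bits):
    """Reverse the low num_bits bits of index, as in bit-reversed
    addressing used to reorder the outputs of an FFT."""
    result = 0
    for _ in range(num_bits):
        result = (result << 1) | (index & 1)  # shift in the lowest bit
        index >>= 1
    return result

# Circular: index 7 in an 8-entry delay line wraps back to 0.
wrapped_index = circular_increment(7, 8)           # 0
# Bit-reversed: for an 8-point FFT (3 address bits), 001 -> 100, i.e. 1 -> 4.
reordered = [bit_reverse(i, 3) for i in range(8)]  # [0, 4, 2, 6, 1, 5, 3, 7]
```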

Specialized Execution Control.

Because many DSP algorithms involve performing repetitive computations, most DSP processors provide special support for efficient looping. Often, a special loop or repeat instruction is provided that allows the programmer to implement a for-next loop without expending any instruction cycles for updating and testing the loop counter or for jumping back to the top of the loop.

Some DSP processors provide other execution control features to improve performance, such as context switching and low-latency, low overhead interrupts for fast input/output.

Hardware looping and interrupts will also be discussed later in this blog series.

Peripheral and Input/Output Interfaces

To allow low-cost, high performance input and output (I/O), most DSP processors incorporate one or more serial or parallel I/O interfaces, and specialized I/O handling mechanisms such as Direct Memory Access (DMA). DSP processor peripheral interfaces are often designed to interface directly with common peripheral devices like analog-to-digital and digital-to-analog converters.

As integrated circuit manufacturing techniques have improved in terms of density and flexibility, DSP processor vendors have included not just peripheral interfaces, but complete peripheral devices on-chip. Examples of this are chips designed for cellular telephone applications, several of which incorporate a DAC and ADC on chip.

Various features of DSP processor peripherals will be described later in this blog series.

In the next blog, we will discuss DSP Processor embodiments.

More later,

Auf Wiedersehen,

Nalin Pithwa

## Digital Signal Processing and DSP Systems

We can look upon a Digital Signal Processing (DSP) system to be any electronic system making use of digital signal processing. Our informal definition of digital signal processing is the application of mathematical operations to digitally represented signals. Signals are represented digitally as sequences of samples. Often, these samples are obtained from physical signals (for example, audio signals) through the use of transducers (such as microphones) and analog-to-digital converters. After mathematical processing, digital signals may be converted back to physical signals via digital-to-analog converters.

In some systems, the use of DSP is central to the operation of the system. For example, modems and digital cellular phones or the so-called smartphones rely very heavily on DSP technology. In other products, the use of DSP is less central, but often offers important competitive advantages in terms of features, power/performance and price. For example, manufacturers of primarily analog consumer electronics devices like audio amplifiers are employing DSP technology to provide new features.

In this little article, a high level overview of digital signal processing is presented. We first discuss the advantages of DSP over analog systems. We then describe some salient features and characteristics of DSP systems in general. We conclude with a brief look at some important classes of DSP applications.

Digital signal processing enjoys several advantages over analog signal processing. The most significant of these is that DSP systems are able to accomplish inexpensively tasks that would be difficult or even impossible using analog electronics. Examples of such applications include speech synthesis, speech recognition, and high-speed modems involving error-correction coding. All of these tasks involve a combination of signal processing and control (for example, making decisions regarding received bits or received speech) that is extremely difficult to implement using analog techniques.

• Insensitivity to environment: Digital systems, by their very nature, are considerably less sensitive to environmental conditions than analog systems. For example, an analog circuit’s behaviour depends on its temperature. In contrast, barring catastrophic failures, a DSP system’s operation does not depend on its environment — whether in the snow or in the desert, a DSP system delivers the same response.
• Insensitivity to component tolerances: Analog components are manufactured to particular tolerances — a resistor, for example, might be guaranteed to have a resistance within one percent of its nominal value. The overall response of an analog system depends on the actual values of all of the analog components used. As a result, two analog systems of exactly the same design will have slightly different responses due to slight variations in their components. In contrast, correctly functioning digital components always produce the same outputs given the same inputs.

• Predictable, repeatable behaviour. Because a DSP system’s output does not vary due to environmental factors or component variations, it is possible to design systems having exact, known responses that do not vary.

Finally, some DSP systems may also have two other advantages over analog systems.

• Reprogrammability. If a DSP system is based on programmable processors, it can be reprogrammed — even in the field — to perform other tasks. In contrast, analog systems require physically different components to perform different tasks.
• Size: The size of analog components varies with their values, for example, a 100 microFarad capacitor used in an analog filter is physically larger than a 10 picoFarad capacitor used in a different analog filter. In contrast, DSP implementations of both filters might well be the same size — indeed, might even use the same hardware, differing only in their filter coefficients — and might be smaller than either of the two analog implementations.

These advantages, coupled with the fact that DSP can take advantage of increasingly dense VLSI manufacturing processes, make DSP the solution of choice for more and more signal processing tasks.

1.2) Characteristics of DSP Systems:

In this section, we describe a number of characteristics common to all DSP systems, such as algorithms, sample rate, clock rate and arithmetic type.

Algorithms

DSP systems are often characterized by their algorithms. The algorithm specifies the arithmetic operations to be performed, but does not specify how that arithmetic is to be implemented: it might be implemented in software on an ordinary microprocessor or a programmable signal processor, or it might be implemented in a custom IC. The selection of an implementation technology is determined in part by the required speed and arithmetic precision. The table below lists some common types of DSP algorithms and some applications in which they are typically used.

Table 1-1: Common DSP algorithms and typical applications

| DSP Algorithm | System Application |
| --- | --- |
| Speech coding and decoding | Cell phones, personal comm systems, secure comm |
| Speech encryption and decryption | Cell phones, personal comm systems, secure comm |
| Speech recognition | Multimedia workstations, robotics, automotive applications |
| Speech synthesis | Multimedia PCs, advanced user interfaces, robotics |
| Speaker identification | Security, multimedia workstations, advanced user interfaces |
| Hi-fi audio encoding and decoding | Consumer audio, consumer video, digital audio broadcast, professional audio, multimedia computers |
| Modem algorithms | Digital cellular phones, GPS, data/fax modems, secure comm |
| Noise cancellation | Professional audio, advanced vehicular audio, industrial applications |
| Audio equalization | Consumer audio, professional audio, advanced vehicular audio, music |
| Ambient acoustics emulation | Consumer audio, professional audio, advanced vehicular audio, music |
| Audio mixing and editing | Professional audio, music, multimedia computers |
| Sound synthesis | Professional audio, music, multimedia PCs, advanced user interfaces |
| Vision | Security, multimedia PCs, instrumentation, robotics, navigation |
| Image compression and decompression | Digital photos, digital video, video-over-voice |
| Image compositing | Multimedia computers, consumer video, navigation |
| Beamforming | Navigation, medical imaging, radar, sonar, signals intelligence |
| Echo cancellation | Speakerphones, modems, telephone switches |
| Spectral estimation | Signals intelligence, radar, sonar, professional audio, music |

Sample Rates

A key characteristic of a DSP system is its sample rate: the rate at which samples are consumed, produced, or processed. Combined with the complexity of the algorithms, the sample rate determines the required speed of the implementation technology. A familiar example is the digital audio compact disc (CD) player, which produces samples at a rate of 44.1 kHz on two channels.

Of course, a DSP system may use more than one sample rate; such systems are said to be multirate DSP systems. An example is a converter from the CD sample rate of 44.1 kHz to the digital audio tape (DAT) rate of 48 kHz. Because of the awkward ratio between these sample rates, the conversion is usually done in stages, typically with at least two intermediate sample rates. Another example of a multirate algorithm is a filter bank, used in applications such as speech, audio, and video encoding and some signal analysis algorithms. Filter banks typically consist of stages that divide the signal into high- and low-frequency portions. These new signals are then downsampled and divided again. In multirate applications, the ratio between the highest and the lowest sample rates in the system can become quite large, sometimes exceeding 100,000.
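The “awkward ratio” between 44.1 kHz and 48 kHz becomes obvious when it is reduced to lowest terms, which a few lines of Python can do:

```python
from math import gcd

cd_rate, dat_rate = 44100, 48000
g = gcd(cd_rate, dat_rate)          # greatest common divisor: 300
up, down = dat_rate // g, cd_rate // g
# 48000/44100 reduces to 160/147: a single-stage converter would have to
# interpolate by 160 and decimate by 147. This is why the conversion is
# usually factored into stages (for example, 160 = 2*4*4*5, 147 = 3*7*7).
```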

The range of sample rates encountered in signal processing systems is huge. Roughly speaking, sample rates for applications range over 12 orders of magnitude. Only at the very top  of that range is digital implementation rare. This is because the cost and difficulty of implementing a given algorithm digitally increases with the sample rate. DSP algorithms used at higher sample rates tend to be simpler than those at lower sample rates.

Many DSP systems must meet extremely rigorous speed goals, since they operate on lengthy segments of real-world signals in real time. Where other kinds of systems (like databases) may be required to meet performance goals only on average, real-time DSP systems often must meet such goals in every instance. In such systems, failure to maintain the necessary processing rates is considered a serious malfunction. Such systems are often said to be subject to hard real-time constraints. For example, suppose that the compact disc to digital audio tape sample rate converter discussed above is to be implemented as a real-time system, accepting digital signals at the CD sample rate of 44.1 kHz and producing digital signals at the DAT sample rate of 48 kHz. The converter must be ready to accept a new sample from the CD source every 22.6 microseconds and must produce a new output sample for the DAT device every 20.8 microseconds. If the system ever fails to accept or produce a sample on this schedule, data are lost and the resulting output signal is corrupted. The need to meet such hard real-time constraints creates special challenges in the design and debugging of real-time DSP systems.
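The deadline figures quoted above follow directly from the sample rates; a quick sanity check:

```python
# Period between samples = 1 / sample rate, expressed in microseconds.
cd_period_us = 1e6 / 44100    # ~22.68 us between CD input samples
dat_period_us = 1e6 / 48000   # ~20.83 us between DAT output samples
# Every processing step must fit inside these budgets, every single time;
# missing even one deadline corrupts the output stream.
```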

Clock Rates

Digital electronic systems are often characterized by their clock rates. The clock rate usually refers to the rate at which the system performs its most basic unit of work. In mass-produced commercial products, clock rates of up to 100 MHz are common, with faster rates found in some high-performance products. For DSP systems, the ratio of system clock rate to sample rate is one of the most important characteristics used to determine how the system will be implemented. The relationship between the clock rate and the sample rate partially determines the amount of hardware needed to implement an algorithm with a given complexity in real time. As the ratio of sample rate to clock rate increases, so does the amount and complexity of hardware required to implement the algorithm.
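One handy way to read the clock-rate-to-sample-rate ratio is as an instruction budget per sample. A back-of-the-envelope sketch (the 100 MHz and 48 kHz figures are just illustrative, taken from the rates discussed in this article):

```python
clock_rate_hz = 100e6    # a common system clock rate
sample_rate_hz = 48e3    # an audio sample rate
cycles_per_sample = clock_rate_hz / sample_rate_hz
# Roughly 2083 cycles are available per sample: ample for a software
# implementation of an audio filter. If the sample rate approached the
# clock rate, the ratio would fall toward 1, forcing the work into
# parallel dedicated hardware instead.
```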

Numeric Representations

Arithmetic operations such as addition and multiplication are at the heart of DSP algorithms and systems. As a result, the numeric representations and type of arithmetic used can have a profound influence on the behaviour and performance of a DSP system. The most important choice for the designer is between fixed point and floating point arithmetic. Fixed point arithmetic represents numbers in a fixed range (for example, -1.0 to +1.0) with a finite number of bits of precision (called the word width). For example, an eight bit fixed point number provides a resolution of $1/256$ of the range over which the number is allowed to vary. Numbers outside of the specified range cannot be represented; arithmetic operations that would result in a number outside this range either saturate (that is, are limited to the largest positive or negative representable value) or wrap around (that is, the extra bits from the arithmetic operation are discarded).
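A sketch of the eight-bit example from the text (the quantizer helper and the choice of rounding are my own illustration, not a standard API):

```python
WIDTH = 8
LEVELS = 2 ** WIDTH                           # 256 representable codes
resolution = (1.0 - (-1.0)) / LEVELS          # 1/256 of the 2.0-wide range

def quantize(value):
    """Map a real value in [-1, 1) to the nearest 8-bit fixed-point code,
    saturating values that fall outside the representable range."""
    code = round(value / resolution)
    return max(-LEVELS // 2, min(LEVELS // 2 - 1, code))

# 0.5 maps to code 64; 1.5 is out of range and saturates to the maximum, 127.
half = quantize(0.5)      # 64
too_big = quantize(1.5)   # 127
```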

Floating point arithmetic greatly expands the representable range of values. Floating point arithmetic represents every number in two parts: a mantissa and an exponent. The mantissa is, in effect, forced to lie between -1.0 and +1.0, while the exponent keeps track of the amount by which the mantissa must be scaled (in terms of powers of two) in order to create the actual value represented. That is,

$\text{value} = \text{mantissa} \times 2^{\text{exponent}}$

Floating point arithmetic provides much greater dynamic range (that is, the ratio between the largest and smallest values that can be represented) than fixed point arithmetic. Because it reduces the probability of overflow and the necessity of scaling, it can considerably simplify algorithm and software design. Unfortunately, floating point arithmetic is generally slower and more expensive than fixed point arithmetic, and is more complicated to implement in hardware.
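Python’s standard library exposes exactly this mantissa/exponent decomposition via math.frexp (note that frexp normalizes the mantissa to [0.5, 1.0) rather than the [-1.0, +1.0] convention used in the text, but the value = mantissa × 2^exponent relationship is the same):

```python
import math

value = 6.0
mantissa, exponent = math.frexp(value)     # decompose: 6.0 = 0.75 * 2**3
reconstructed = mantissa * 2 ** exponent   # rebuild the original value
# The exponent tracks the scale, so a fixed number of mantissa bits
# covers an enormous dynamic range.
```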

Classes of DSP Applications.

Digital signal processing, in general, and DSP processors in particular, are used in an extremely diverse range of applications, from radar systems to consumer electronics. Naturally, no one processor can meet the needs of all, or even most, applications. Therefore, the first task for the designer selecting a DSP processor is to weigh the relative importance of performance, price, power consumption, integration, ease of development, and other factors for the application at hand. Here, we briefly touch on the needs of just a few categories of DSP applications.

Low cost embedded systems.

The largest applications (in terms of dollar volume) for digital signal processors are inexpensive, high-volume embedded systems, such as cellular telephones, disk drives (where DSPs are used for servo control), and modems. In these applications, cost and integration considerations are paramount. For portable, battery-operated products, power consumption is also critical. In these high-volume embedded applications, performance and ease of development considerations are often given less weight, even though these applications usually involve the development of software to run on the DSP and custom hardware that interfaces with the DSP.

High Performance Applications

Another important class of applications involves processing large volumes of data with complex algorithms for specialized needs. This includes uses like sonar and seismic exploration, where production volumes are lower, algorithms are more demanding, and product designs are larger and more complex. As a result, designers favour processors with maximum performance, ease of use, and support for multiprocessor configurations. In some cases, rather than designing their own hardware and software from scratch, designers of these systems assemble systems using standard development boards and ease their software development tasks by using existing software libraries.

Personal Computer Based Multimedia.

Another class of applications is PC-based multimedia functions. Increasingly, PCs are incorporating DSP processors to provide such varied capabilities as voice mail, data and fax modems, music and speech synthesis, and image compression; Software Defined Radios (SDRs) and smartphones are further examples. As with other high-volume embedded applications, PC multimedia demands high performance, since a DSP processor in a multimedia PC may be called on to perform multiple functions simultaneously. In addition to performing each function efficiently, the DSP processor must have the ability to switch efficiently between functions. Memory capacity may also be an issue in these applications, because many multimedia applications require the ability to manipulate large amounts of data.

There has also been a trend to incorporate DSP-like functions into general-purpose processors to better handle signal processing tasks. In some cases, such augmented microprocessors may be able to handle certain tasks without the need for a separate DSP processor. However, I think the combination of a general-purpose processor (like an ARM, used for control purposes in a smartphone) and a DSP processor has much to offer and will continue to be the dominant implementation of PC, SDR, or smartphone based multimedia for some time to come.

More later,

Nalin Pithwa