The processing power now available from DSP processors has reduced the amount of analog processing required in ultrasound applications. These processors have also reached performance levels that allow additional features to be implemented without a significant increase in cost.

Visual and auditory feedback is critical to the placement of the probe, which in turn is extremely important to the quality of the test results. To assist in this difficult task, the ultrasound system must process the data it collects and render an image as quickly as possible, so that the technician receives essential visual feedback on probe placement without significant lag. This imposes a tight latency requirement on the processing of ultrasound data.

A typical ultrasound machine might acquire 12-bit data at 20 million samples per second on two channels. This rate allows operation at carrier frequencies of 10 MHz or less, which is typical for ultrasound systems.
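As a rough check of these figures, the short Python sketch below computes the highest carrier frequency the sample rate can represent (half the sample rate) and the raw data rate. It assumes the 20 million samples per second applies to each of the two channels, which the text does not state explicitly; the function names are illustrative only.

```python
SAMPLE_RATE_HZ = 20_000_000  # 20 million samples per second (from the text)
BITS_PER_SAMPLE = 12         # from the text
CHANNELS = 2                 # from the text; the rate is assumed to be per channel


def nyquist_limit_hz(sample_rate_hz):
    """Highest frequency that can be represented at this sample rate."""
    return sample_rate_hz / 2


def raw_data_rate_bits_per_s(sample_rate_hz, bits_per_sample, channels):
    """Unpacked data rate coming out of the converters."""
    return sample_rate_hz * bits_per_sample * channels


nyquist_mhz = nyquist_limit_hz(SAMPLE_RATE_HZ) / 1e6
rate_mbit_s = raw_data_rate_bits_per_s(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS) / 1e6
print(nyquist_mhz, "MHz carrier limit")
print(rate_mbit_s, "Mbit/s raw data")
```

Under that assumption the converters deliver 480 Mbit/s, which is the stream the demodulator has to keep up with.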

Several processing steps must be performed on this high sample rate data to reliably detect the signals generated by the sound returns. Two components of the signal are of interest – the amplitude of the reflection and its frequency. The amplitude of the return is used to detect density of tissue, while the frequency is used to detect motion via the Doppler effect.

The emitted sound beam is generated by a phased array emitter, which is pulsed at a carrier frequency of 2 to 10 MHz. The phased array emitter allows the sound beam to be positioned electrically ("beam steering") and focused into a small area ("beam forming"). The reflected sound is received by the emitter and passed to an A-D converter for conversion to digital form. The A-D converter processes the return signal and the original emitted signal, generating two streams of digital data: a signal channel and a quadrature channel.

The digital data is processed in two ways. First, the amplitude of the signal return is demodulated from the high-frequency carrier with a synchronous AM detector. Second, the phase of the return is compared to the originally emitted carrier with a synchronous quadrature detector. This phase measurement is combined with other phase measurements to detect the Doppler shift in the return signal. In addition to these demodulation steps, the signals are gated (that is, selected by time and angle) to report data from a region selected by the operator. Additional processing can be performed to remove artifacts such as echo returns, variable sound velocities, and geometric artifacts (e.g. off-angle Doppler measurements).
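The relationship between the two demodulated channels and the quantities of interest can be sketched as follows. This is a minimal illustration, not the method used in the system described: it assumes the in-phase and quadrature values are already low-pass filtered, and it estimates the Doppler shift from the phase change between two measurements taken a known interval apart. The text does not say how the phase measurements are combined, so that part is an assumption, as are the function names.

```python
import math


def amplitude_and_phase(i_sample, q_sample):
    """Amplitude and phase (radians) of one in-phase/quadrature pair."""
    amplitude = math.hypot(i_sample, q_sample)
    phase = math.atan2(q_sample, i_sample)
    return amplitude, phase


def doppler_shift_hz(phase_a, phase_b, interval_s):
    """Frequency offset implied by the phase change between two measurements.

    The difference is wrapped into (-pi, pi], so the shift is only
    unambiguous while it stays below 1 / (2 * interval_s).
    """
    delta = math.atan2(math.sin(phase_b - phase_a), math.cos(phase_b - phase_a))
    return delta / (2.0 * math.pi * interval_s)


amp, ph = amplitude_and_phase(3.0, 4.0)
shift = doppler_shift_hz(0.0, math.pi / 2, 1 / 25_000)
print(amp, ph, shift)
```

A quarter-turn phase advance over one 25 kHz sample interval corresponds to a shift of one quarter of 25 kHz.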

The quadrature phase detector is implemented in software on the processor array. The pair of synchronous mixers and the low-pass filters operate at the 20 MHz sample rate. The output of the low-pass filters is sample-rate converted down to 25 kHz for further processing (see diagram above).
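The mixer pair can be illustrated with a small Python sketch. It is not the production implementation: the 5 MHz carrier, the test signal, and the plain average used in place of the low-pass filters are all assumptions chosen so that a quarter carrier period is exactly one sample at 20 MHz. The 90 degree shift is done by indexing the reference one quarter period earlier rather than by computing anything.

```python
import math


def quadrature_mix(rx, ref, quarter_period):
    """Multiply rx by ref (I) and by ref delayed a quarter period (Q).

    The delay is only an index offset into ref.
    """
    indices = range(quarter_period, len(rx))
    i_out = [rx[n] * ref[n] for n in indices]
    q_out = [rx[n] * ref[n - quarter_period] for n in indices]
    return i_out, q_out


def mean(values):
    """Crude stand-in for the low-pass filters: averages away the 2x carrier term."""
    return sum(values) / len(values)


FS_HZ = 20e6       # sample rate from the text
CARRIER_HZ = 5e6   # assumed carrier: a quarter period is exactly 1 sample
N = 401            # leaves 400 usable samples = 100 whole carrier periods

ref = [math.cos(2 * math.pi * CARRIER_HZ * n / FS_HZ) for n in range(N)]
rx = [0.8 * math.cos(2 * math.pi * CARRIER_HZ * n / FS_HZ + 0.3) for n in range(N)]

i_out, q_out = quadrature_mix(rx, ref, quarter_period=1)
i_dc, q_dc = mean(i_out), mean(q_out)
amplitude = 2 * math.hypot(i_dc, q_dc)   # recovers the 0.8 test amplitude
phase = math.atan2(-q_dc, i_dc)          # recovers the 0.3 rad test phase
print(amplitude, phase)
```

The sign flip on the quadrature term follows from delaying a cosine reference by 90 degrees, which turns it into a sine.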

To implement the high sample rate processing, the following is required: a multiply in each synchronous mixer, a delay for the 90 degree phase shift, and a multistage rate-converting low-pass filter. This filter is implemented as a three-tap filter followed by a rate conversion down to 1 MHz, then another three-tap filter and a rate conversion down to 100 kHz, then a 17-tap filter and a rate conversion down to 25 kHz. The rate conversions reduce the computational load and allow a sharp-cutoff filter implementation; a non-rate-converting filter would require significantly greater processing bandwidth. Each filter tap involves a multiply and an add. The phase shifts are essentially free, as they are implemented by modifying the address from which the data is taken. Totaling the computational requirement over both channels gives 8 multiplies and 6 adds at 20 MHz (one multiply per mixer plus three taps per channel), then 6 multiplies and 6 adds at 1 MHz, then 34 multiplies and 34 adds at 100 kHz. The required performance is 280 + 12 + 6.8 = 298.8, or about 300 Mflops. This is well within the reach of DSP processors on the market today.
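The operation count above can be reproduced with a few lines of Python. The stage figures are taken directly from the text; the stage labels and the `stage_mops` helper are illustrative names only.

```python
# (label, multiplies, adds, rate in Hz) for both channels together, as stated in the text
STAGES = [
    ("mixers + first 3-tap filter", 8, 6, 20e6),
    ("second 3-tap filter", 6, 6, 1e6),
    ("17-tap filter", 34, 34, 100e3),
]


def stage_mops(multiplies, adds, rate_hz):
    """Millions of operations per second for one stage."""
    return (multiplies + adds) * rate_hz / 1e6


per_stage = [stage_mops(m, a, rate) for _, m, a, rate in STAGES]
total_mops = sum(per_stage)

for (label, _, _, _), mops in zip(STAGES, per_stage):
    print(f"{label}: {mops:.1f}")
print(f"total: {total_mops:.1f}")
```

The first stage dominates the budget because it is the only one running at the full 20 MHz rate, which is why the early rate conversions matter so much.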

After this processing, the two channels are combined to form the Doppler shift signal, which is gated and then processed with an FFT for display. The gating selects a window, determined by the operator, inside which signals are taken and outside of which the signal is ignored. This involves varying the data selection window, which translates into a start and stop time, for each beam position.
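One way to picture the gate is as a pair of sample indices derived from the depth range the operator selects. The sketch below is hypothetical: the text does not give the speed of sound or the rate at which gating is applied, so the 1540 m/s soft-tissue figure and the 20 MHz rate used here are assumptions, as are the function names.

```python
SPEED_OF_SOUND_M_S = 1540.0  # assumed typical soft-tissue value, not from the text
SAMPLE_RATE_HZ = 20e6        # assumed: gate applied at the converter sample rate


def gate_sample_window(near_depth_m, far_depth_m,
                       sample_rate_hz=SAMPLE_RATE_HZ, c_m_s=SPEED_OF_SOUND_M_S):
    """Start and stop sample indices for echoes from between two depths.

    The factor of 2 accounts for the round trip out to the reflector and back.
    """
    start_time_s = 2.0 * near_depth_m / c_m_s
    stop_time_s = 2.0 * far_depth_m / c_m_s
    return int(round(start_time_s * sample_rate_hz)), int(round(stop_time_s * sample_rate_hz))


def apply_gate(samples, start, stop):
    """Keep only the samples inside the gate; everything else is ignored."""
    return samples[start:stop]


window = gate_sample_window(0.03, 0.04)   # a region 3 cm to 4 cm deep
gated = apply_gate(list(range(2000)), *window)
print(window, len(gated))
```

In a steered system the window would be recomputed for each beam position, since the same region sits at a different range along each beam.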

The demodulated amplitude signal is corrected by applying an increasing gain with time to correct for attenuation, and warped to correct for speed variations. The amplitude signal is plotted on the display at the locations dictated by the beam position. This plot creates the image of the area being studied.
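The time-varying gain can be sketched as below. This is an illustration under stated assumptions rather than the correction used in the system described: it assumes attenuation grows linearly with depth and carrier frequency, uses a rule-of-thumb coefficient of 0.5 dB per cm per MHz, and assumes a sound speed of 1540 m/s. None of these values come from the text.

```python
SPEED_OF_SOUND_M_S = 1540.0   # assumed soft-tissue value
ALPHA_DB_PER_CM_MHZ = 0.5     # assumed rule-of-thumb attenuation coefficient


def tgc_gain(time_s, carrier_mhz,
             alpha_db_per_cm_mhz=ALPHA_DB_PER_CM_MHZ, c_m_s=SPEED_OF_SOUND_M_S):
    """Linear amplitude gain that offsets the round-trip loss at a given echo time."""
    depth_cm = c_m_s * time_s / 2.0 * 100.0                     # one-way depth of the reflector
    loss_db = 2.0 * alpha_db_per_cm_mhz * carrier_mhz * depth_cm  # out and back
    return 10.0 ** (loss_db / 20.0)


def compensate(samples, sample_rate_hz, carrier_mhz):
    """Apply the increasing gain to a demodulated amplitude trace."""
    return [s * tgc_gain(n / sample_rate_hz, carrier_mhz) for n, s in enumerate(samples)]


one_cm_echo_time_s = 2 * 0.01 / SPEED_OF_SOUND_M_S
gain_at_1cm = tgc_gain(one_cm_echo_time_s, carrier_mhz=5.0)
corrected = compensate([1.0, 1.0, 1.0], sample_rate_hz=25e3, carrier_mhz=5.0)
print(gain_at_1cm, corrected)
```

With these assumed values, an echo from 1 cm deep at 5 MHz has lost 5 dB and needs a gain of about 1.78.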

Apart from the FFTs, these post-processing steps are insignificant when compared to the processing needed by the demodulator. The FFT processing is performed to display the spectrum of the Doppler return, which translates directly into velocity after geometry is taken into account. The character of the spectrum (noisy, smooth, wide band, narrow band) is an indication of the condition of the area being examined.
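The translation from spectrum to velocity can be sketched with the standard Doppler relation, v = c * fd / (2 * f0 * cos(theta)), where theta is the angle between the beam and the direction of motion. The sound speed of 1540 m/s and the function names below are assumptions, not values from the text.

```python
import math

SPEED_OF_SOUND_M_S = 1540.0  # assumed soft-tissue value, not from the text


def doppler_velocity_m_s(doppler_shift_hz, carrier_hz, beam_angle_deg=0.0,
                         c_m_s=SPEED_OF_SOUND_M_S):
    """Velocity along the direction of motion implied by a Doppler shift."""
    cos_theta = math.cos(math.radians(beam_angle_deg))
    return c_m_s * doppler_shift_hz / (2.0 * carrier_hz * cos_theta)


def bin_frequency_hz(bin_index, fft_size, sample_rate_hz):
    """Frequency of an FFT bin; the upper half of the bins are negative frequencies."""
    if bin_index >= fft_size // 2:
        bin_index -= fft_size
    return bin_index * sample_rate_hz / fft_size


v_on_axis = doppler_velocity_m_s(1000.0, 5e6)
v_at_60_deg = doppler_velocity_m_s(1000.0, 5e6, beam_angle_deg=60.0)
print(v_on_axis, v_at_60_deg, bin_frequency_hz(1, 1024, 25e3))
```

The angle term is the geometric consideration: the same 1 kHz shift implies twice the velocity when the beam is 60 degrees off the direction of motion.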

The processing required by the FFTs depends on the rates selected by the operator. The 25 kHz Doppler signal can be rate converted up to give finer frequency (i.e. velocity) resolution, and the FFTs can be lapped (i.e. computed over overlapping sliding buffers) to improve the low-velocity sensitivity of the system. All of these effects translate into an equivalent number of 1K complex FFTs per second. The table and graph below show how well various processors perform the FFT processing.
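How the operator settings might map to an FFT rate can be sketched as follows. This is a simplified model, not the system's actual accounting: it assumes each FFT consumes a fresh hop of `fft_size * (1 - overlap)` samples, that upsampling multiplies the sample rate, and that the work is repeated for a hypothetical number of independent gates. The parameter names are illustrative.

```python
def ffts_per_second(sample_rate_hz, fft_size=1024, upsample=1, overlap=0.0, gates=1):
    """Equivalent FFTs per second for a given set of operator settings.

    overlap is the fraction of each buffer shared with the previous one
    (0.0 = no lapping, 0.75 = each FFT advances by a quarter buffer).
    gates is a hypothetical count of regions processed in parallel.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    hop = fft_size * (1.0 - overlap)
    return sample_rate_hz * upsample / hop * gates


baseline = ffts_per_second(25e3)
lapped = ffts_per_second(25e3, upsample=4, overlap=0.75)
print(baseline, lapped)
```

Under this model, 4x upsampling combined with 75% overlap raises the rate for a single gate from about 24 to about 391 FFTs per second, a sixteen-fold increase.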

Two considerations are significant to this processing: the rate at which data is passed to the process performing the FFTs, and the bus architecture used. As the number of FFTs per second increases, the computational demand on the processor increases. At the same time, the processor's demand for data increases as well.

This results in the following numbers of processors required at each FFT rate, assuming almost linear scaling. That assumption holds for most designs except the "native" PIII-450 case.

16-bit 1024-point CFFTs/sec   TM1300   ADSP21160   TMS320C6701   TMS320C6201   PIII-450
5000                          1        1           1             1             1
7000                          1        1           1             1             2
10000                         1        1           2             1             2
20000                         2        2           3             2             4
50000                         4        5           6             5             NP
70000                         6        6           8             7             NP
100000                        8        9           11            9             NP
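Under strictly linear scaling, the processor count is the required FFT rate divided by what one processor can sustain, rounded up. The sketch below illustrates this with a hypothetical per-processor capacity of 12,500 FFTs per second. That figure is not a benchmark from the text; it was chosen only because it happens to reproduce the TM1300 column.

```python
import math

FFT_RATES = [5000, 7000, 10000, 20000, 50000, 70000, 100000]  # rows of the table
HYPOTHETICAL_CAPACITY = 12_500  # FFTs/s per processor; illustrative, not measured


def processors_needed(ffts_per_s, capacity_per_processor):
    """Processor count under ideal linear scaling."""
    return math.ceil(ffts_per_s / capacity_per_processor)


counts = [processors_needed(rate, HYPOTHETICAL_CAPACITY) for rate in FFT_RATES]
print(counts)
```

Columns that grow faster than this model predicts would reflect the data-transfer and bus limits mentioned earlier rather than raw computation.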

DSP processors available today are fast enough to assume responsibility for the signal processing performed at the carrier frequency in ultrasound systems, eliminating the need for special-purpose demodulators implemented in hardware.
