SOLA Results

This section examines the quality of the SOLA method by raising and lowering the pitch of various signals. The results use shifts of +5 and –5 semitones, taken to be the largest pitch variation a guitarist would need, although larger shifts are achievable at reasonable quality. The effect of the correlation length and the overlap ratio is also investigated.
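For reference, a shift of n semitones corresponds to a frequency ratio of 2^(n/12) in twelve-tone equal temperament. The short Python sketch below shows the ratios implied by the shifts discussed in this section. The function name is illustrative and is not taken from the project code.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a pitch shift of `semitones` (equal temperament)."""
    return 2.0 ** (semitones / 12.0)


# Shifts used or mentioned in this section.
for n in (+5, -5, 12, 24):
    print(f"{n:+d} semitones -> frequency ratio {semitone_ratio(n):.4f}")
```

A shift of +5 semitones multiplies every frequency by roughly 1.335, and –5 semitones by roughly 0.749. Shifts of 12 and 24 semitones correspond to one and two octaves (ratios of 2 and 4).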

Scaling of Various Signals 

All results have been generated using GL = 4080, RO = 0.75 and FS = 24000. 
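To put these settings in perspective, the sketch below converts them into timings. It assumes that GL is the grain length in samples, RO the overlap ratio between successive grains, and FS the sampling frequency in Hz. This reading of the symbols, and the helper name, are assumptions made for illustration.

```python
def grain_timing(grain_len: int, overlap_ratio: float, fs: int) -> dict:
    """Derive grain duration, overlap and hop size from the stated settings."""
    overlap = round(grain_len * overlap_ratio)   # samples shared with the next grain
    hop = grain_len - overlap                    # samples between grain starts
    return {
        "grain_ms": 1000.0 * grain_len / fs,
        "overlap_samples": overlap,
        "hop_samples": hop,
    }


timing = grain_timing(grain_len=4080, overlap_ratio=0.75, fs=24000)
print(timing)
```

Under these assumptions each grain lasts 170 ms, overlaps its neighbour by 3060 samples, and a new grain starts every 1020 samples.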

Sine Waves

Figure 26 shows the effect of scaling at +5 and –5 semitones on a 500Hz sine wave.

Figure 26: Pitch scaling of a sine wave. Top is original 440Hz sine, middle is scaled by –5 semitones, bottom by +5 semitones 

This figure demonstrates that the envelope of the waveform has been largely maintained, although a slight modulation is discernible. No distortion is introduced that the ear can detect.

Scaling of Guitar Samples

The results for single notes are shown first, followed by chords. Figure 27 below shows the effect of both increasing and decreasing the pitch of an A note (high E string, 17th fret), which lies in the upper frequency range of the guitar. The spectra clearly show an undistorted scaling process.

Figure 27: (a) Original spectrum of note A, (b) scaled by +5 semitones, (c) scaled by -5 semitones

The ‘drop D’ note shown in Figure 28 is representative of the lowest frequencies generated by a guitar. Whilst increasing the pitch in (b) shows no distortion, decreasing it in (c) causes the peaks to spread. This happens because the correlation length is too short to cover several fundamental periods, so it cannot give a meaningful alignment between successive grains.

Figure 28: (a) Original spectrum of note D, (b) scaled by +5 semitones, (c) scaled by -5 semitones

The –5 semitone result was generated with a correlation length of XL = 50 samples. Increasing this to 200 samples removes the distortion, as can be seen in Figure 29.
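The relationship between correlation length and fundamental period can be made concrete with a little arithmetic. The sketch below assumes standard concert pitch, so that the low drop D is D2 (about 73.42 Hz) and the A at the 17th fret of the high E string is A5 (880 Hz), and uses the sampling frequency of 24000 Hz quoted earlier. The note frequencies and the function name are assumptions for illustration, not values taken from the measurements.

```python
FS = 24000  # sampling frequency in Hz, as quoted for these results


def periods_covered(corr_len: int, f0: float, fs: int = FS) -> float:
    """Number of fundamental periods spanned by `corr_len` samples."""
    return corr_len * f0 / fs


# Assumed note frequencies (standard tuning, A4 = 440 Hz).
NOTES = {
    "D2 (drop D, low string)": 73.42,
    "A5 (high E string, 17th fret)": 880.0,
}

for name, f0 in NOTES.items():
    for xl in (50, 200):
        print(f"{name}: XL = {xl:3d} samples covers "
              f"{periods_covered(xl, f0):.2f} periods")
```

Under these assumptions, 50 samples span about 0.15 of one period of the low D but about 1.8 periods of the high A, which is consistent with the low note being the one that suffers from poor alignment. Raising XL to 200 samples covers about 0.61 of a period of the low D.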

Figure 29: Reduction of distortion by increased correlation length

Figure 30 gives the results for pitch scaling a G chord. While some distortion is present, it is not audible.

Figure 30: (a) Original spectrum of chord G, (b) scaled by +5 semitones, (c) scaled by -5 semitones

In the processing of chord samples, if correlation is not used, or the correlation length is very short compared with the fundamental period, the individual strings that make up the chord sound out of tune. This happens because non-aligned grains are added together, which distorts the wave envelope and produces a beating effect similar to that heard when two notes of slightly different frequency are played together.
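The alignment step described above can be sketched as a search for the shift that maximises the correlation between the end of the output so far and the start of the next grain. The code below is a minimal illustration using NumPy and a normalised cross-correlation. The function name, its arguments and the choice of normalisation are assumptions, and it is not the implementation used to generate these results.

```python
import numpy as np


def best_alignment_shift(reference, candidate, corr_len, max_shift):
    """Find the shift of `candidate` that best matches `reference`.

    Compares the first `corr_len` samples of `reference` with
    `candidate[shift:shift + corr_len]` for shift = 0..max_shift and
    returns (best_shift, best_score), where the score is the normalised
    cross-correlation (1.0 means a perfect match in shape).
    """
    ref = np.asarray(reference[:corr_len], dtype=float)
    best_shift, best_score = 0, -np.inf
    for shift in range(max_shift + 1):
        seg = np.asarray(candidate[shift:shift + corr_len], dtype=float)
        if len(seg) < corr_len:      # ran out of samples to compare
            break
        denom = np.linalg.norm(ref) * np.linalg.norm(seg)
        score = float(np.dot(ref, seg) / denom) if denom > 0 else 0.0
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score


# Toy check: a sine with a period of 100 samples, with the candidate
# starting 70 samples into the cycle.
n = np.arange(1000)
tone = np.sin(2 * np.pi * n / 100.0)
shift, score = best_alignment_shift(tone, tone[70:], corr_len=200, max_shift=60)
print(shift, round(score, 3))
```

In this toy case the search returns a shift of 30 samples, which brings the candidate back to the start of a cycle so that the two segments add in phase. With a chord, several periodicities are present at once, which is one reason a longer correlation length helps.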

Analysis of Overlap Rates 

How closely the envelope of the scaled signal matches the original depends on the overlap ratio. Figure 31 shows that as the ratio increases, the envelope more closely resembles the original. The windowed grains are evident in (b), where each grain has a triangular envelope.

Figure 31: (a) Original envelope, (b) overlap of 2%, (c) overlap of 30%, (d) overlap of 75%

 

At 75% overlap, the envelope distortion is reduced to a level that is barely perceptible to the ear.
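The dependence of the envelope on the overlap ratio can be illustrated by overlap-adding the grain windows alone and measuring how much their sum fluctuates. The sketch below assumes a symmetric triangular window spanning the whole grain (suggested by the triangular grain envelopes visible at 2% overlap) and ignores the correlation-based shifts. The window definition and function names are assumptions for illustration only.

```python
import numpy as np


def triangular_window(n: int) -> np.ndarray:
    """Triangular window rising linearly from 0 to 1 and back (periodic form)."""
    k = np.arange(n)
    return 1.0 - np.abs(2.0 * k / n - 1.0)


def envelope_ripple(grain_len: int, overlap_ratio: float, n_grains: int = 12) -> float:
    """Relative ripple (max - min) / max of the summed grain windows."""
    hop = grain_len - round(grain_len * overlap_ratio)
    win = triangular_window(grain_len)
    total = np.zeros(hop * (n_grains - 1) + grain_len)
    for i in range(n_grains):
        total[i * hop:i * hop + grain_len] += win
    steady = total[grain_len:-grain_len]   # drop the start and end transients
    return float((steady.max() - steady.min()) / steady.max())


for ratio in (0.02, 0.30, 0.75):
    print(f"overlap {ratio:.0%}: ripple {envelope_ripple(4080, ratio):.2f}")
```

Under these assumptions the ripple is roughly 0.96 at 2% overlap, 0.40 at 30%, and effectively zero at 75%, where the four overlapping triangular windows sum to a constant. In a real run the correlation shifts move the grains slightly, so some residual envelope modulation would still be expected.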

Extended Scaling Using SOLA 

This algorithm can scale by much more than ±5 semitones. At around 12 semitones, the output starts to sound artificial, as if it were not generated by a guitar. At 24 semitones, the signal is either too high or too low in frequency to be particularly useful, and the output also sounds very hollow. The reason is that all frequency components are being scaled. Some components of a sound (known as formants) stay at the same frequency regardless of the pitch being played, and are usually generated by the body of the instrument. For fully accurate scaling, these formants must not be scaled.

The SOLA algorithm works well for electric guitar signals because the entire signal is generated by the strings (apart from any noise picked up from interfering sources). This removes the formants of the guitar body, although the strings still generate a small number of formants themselves. The effect of scaling these is most noticeable when the semitone deviation is large, such as 12 or 24.

This formant issue is why music and voice signals are not scaled very well by this version of SOLA. The larger bandwidth of these signals also means that a very long correlation length is required to align the grains successfully, so the grain length has to be increased to accommodate it. When the grains become too long, the signal within a grain is no longer stationary and length distortion becomes apparent. A possible solution is to split the signal into frequency bands and process each band individually, removing the need for a long correlation length.

hichord.mp3  hisingle.mp3  lochord.mp3  losingle.mp3 
