EzDevInfo.com

spectrum

The No Hassle JavaScript Colorpicker Spectrum - The No Hassle jQuery Colorpicker spectrum is a javascript colorpicker plugin using the jquery framework. it is highly customizable, but can also be used as a simple input type=color polyfill

Spectrogram C++ library

For my current project in C++ / Qt I need a library (LGPL is preferred) which can calculate a spectrogram from a signal ( basically an array of doubles ). I already use Qwt for the GUI part.

Any suggestions? Thanks.


Source: (StackOverflow)

Harmonics count in a music sample

To determine the richness of a sound, I would like to determine the number of harmonics in a sample of music. For that, I'm using Processing with the Minim library which gives me a full spectrum with a FFT. I'm wondering how to count all the peaks in the spectrum produced by the FFT, I'm not even interested in the fundamental frequency.


Source: (StackOverflow)

Advertisements

"Winamp style" spectrum analyzer

I have a program that plots the spectrum analysis (Amp/Freq) of a signal, which is preety much the DFT converted to polar. However, this is not exactly the sort of graph that, say, winamp (right at the top-left corner), or effectively any other audio software plots. I am not really sure what is this sort of graph called (if it has a distinct name at all), so I am not sure what to look for.

I am preety positive about the frequency axis being base two exponential, the amplitude axis puzzles me though.

Any pointers?


Source: (StackOverflow)

Export audio file volume dB levels in Linux

I would like to be able to generate some sort of file that will store the volume levels of an audio file. I am pretty flexible about this but an example would be a csv that shows the volume every second. I don't need this number to be that precise.

Do you have any suggestions on how to approach this? I would appreciate it.


Source: (StackOverflow)

How to generate the audio spectrum using fft in C++?

I want to generate an audio spectrum (as seen in this video) of a mp3 audio file. Basically this problem requires calculating the fft of the audio signal. How do I do I program this in C/C++?

I've looked at a couple of open source libraries such as FFTW and I really don't know how to use these for my problem. Any help would be greatly appreciated. Thanks in advance!


Source: (StackOverflow)

Generating a spectrogram for a sequence of 2D movie frames

I have some data that consists of a sequence of video frames which represent changes in luminance over time relative to a moving baseline. In these videos there are two kinds of 'event' that can occur - 'localised' events, which consist of luminance changes in small groups of clustered pixels, and contaminating 'diffuse' events, which affect most of the pixels in the frame:

enter image description here

I'd like to be able to isolate local changes in luminance from diffuse events. I'm planning on doing this by subtracting an appropriately low-pass filtered version of each frame. In order to design an optimal filter, I'd like to know which spatial frequencies of my frames are modulated during diffuse and local events, i.e. I'd like to generate a spectrogram of my movie over time.

I can find lots of information about generating spectrograms for 1D data (e.g. audio), but I haven't come across much on generating spectrograms for 2D data. What I've tried so far is to generate a 2D power spectrum from the Fourier transform of the frame, then perform a polar transformation about the DC component and then average across angles to get a 1D power spectrum:

enter image description here

I then apply this to every frame in my movie, and generate a raster plot of spectral power over time:

enter image description here

Does this seem like a sensible approach to take? Is there a more 'standard' approach to doing spectral analysis on 2D data?

Here's my code:

import numpy as np
# from pyfftw.interfaces.scipy_fftpack import fft2, fftshift, fftfreq
from scipy.fftpack import fft2, fftshift, fftfreq
from matplotlib import pyplot as pp
from matplotlib.colors import LogNorm
from scipy.signal import windows
from scipy.ndimage.interpolation import map_coordinates

def compute_2d_psd(img, doplot=True, winfun=windows.hamming, winfunargs={}):

    nr, nc = img.shape
    win = make2DWindow((nr, nc), winfun, **winfunargs)

    f2 = fftshift(fft2(img*win))
    psd = np.abs(f2*f2)
    pol_psd = polar_transform(psd, centre=(nr//2, nc//2))

    mpow = np.nanmean(pol_psd, 0)
    stdpow = np.nanstd(pol_psd, 0)

    freq_r = fftshift(fftfreq(nr))
    freq_c = fftshift(fftfreq(nc))
    pos_freq = np.linspace(0, np.hypot(freq_r[-1], freq_c[-1]), 
        pol_psd.shape[1])

    if doplot:
        fig,ax = pp.subplots(2,2)

        im0 = ax[0,0].imshow(img*win, cmap=pp.cm.gray)
        ax[0,0].set_axis_off()
        ax[0,0].set_title('Windowed image')

        lnorm = LogNorm(vmin=psd.min(), vmax=psd.max())
        ax[0,1].set_axis_bgcolor('k')
        im1 = ax[0,1].imshow(psd, extent=(freq_c[0], freq_c[-1], 
            freq_r[0], freq_r[-1]), aspect='auto', 
            cmap=pp.cm.hot, norm=lnorm)
        # cb1 = pp.colorbar(im1, ax=ax[0,1], use_gridspec=True)
        # cb1.set_label('Power (A.U.)')
        ax[0,1].set_title('2D power spectrum')

        ax[1,0].set_axis_bgcolor('k')
        im2 = ax[1,0].imshow(pol_psd, cmap=pp.cm.hot, norm=lnorm, 
            extent=(pos_freq[0],pos_freq[-1],0,360), 
            aspect='auto')
        ax[1,0].set_ylabel('Angle (deg)')
        ax[1,0].set_xlabel('Frequency (cycles/px)')
        # cb2 = pp.colorbar(im2, ax=(ax[0,1],ax[1,1]), use_gridspec=True)
        # cb2.set_label('Power (A.U.)')
        ax[1,0].set_title('Polar-transformed power spectrum')

        ax[1,1].hold(True)
        # ax[1,1].fill_between(pos_freq, mpow - stdpow, mpow + stdpow, 
        #   color='r', alpha=0.3)
        ax[1,1].axvline(0, c='k', ls='--', alpha=0.3)
        ax[1,1].plot(pos_freq, mpow, lw=3, c='r')
        ax[1,1].set_xlabel('Frequency (cycles/px)')
        ax[1,1].set_ylabel('Power (A.U.)')
        ax[1,1].set_yscale('log')
        ax[1,1].set_xlim(-0.05, None)
        ax[1,1].set_title('1D power spectrum')

        fig.tight_layout()

    return mpow, stdpow, pos_freq

def make2DWindow(shape,winfunc,*args,**kwargs):
    assert callable(winfunc)
    r,c = shape
    rvec = winfunc(r,*args,**kwargs)
    cvec = winfunc(c,*args,**kwargs)
    return np.outer(rvec,cvec)

def polar_transform(image, centre=(0,0), n_angles=None, n_radii=None):
    """
    Polar transformation of an image about the specified centre coordinate
    """
    shape = image.shape
    if n_angles is None:
        n_angles = shape[0]
    if n_radii is None:
        n_radii = shape[1]
    theta = -np.linspace(0, 2*np.pi, n_angles, endpoint=False).reshape(-1,1)
    d = np.hypot(shape[0]-centre[0], shape[1]-centre[1])
    radius = np.linspace(0, d, n_radii).reshape(1,-1)
    x = radius * np.sin(theta) + centre[0]
    y = radius * np.cos(theta) + centre[1]

    # nb: map_coordinates can give crazy negative values using higher order
    # interpolation, which introduce nans when you take the log later on
    output = map_coordinates(image, [x, y], order=1, cval=np.nan, 
        prefilter=True)
    return output

Source: (StackOverflow)

iOS - Smooth Color Change Transition/Animation

I want to have a smooth color transition that goes across the entire spectrum (i.e. red, blue, green, yellow, orange, etc.)

Also want to be able to have smooth transitions of colors in specific spectrum (i.e. all reds).

Are there any simple algorithms/recursive functions/formulas that can help simplify this process?


Source: (StackOverflow)

iOS FFT Draw spectrum

I've read these question:

Using the apple FFT and accelerate Framework

How do I set up a buffer when doing an FFT using the Accelerate framework?

iOS FFT Accerelate.framework draw spectrum during playback

They all describe how to setup fft with the accelerate framework. With their help I was able to setup fft and get a basic spectrum analyzer. Right now, I'm displaying all values I got from the fft. However, I only want to show 10-15, or a variable number, of bars respreseting certain frequencies. Just like the iTunes or WinAmp Level Meter. 1. Do I need to average magnitude values from a range of frequencies? Or do they just show you a magnitude for the specific frequency bar? 2. Also, do I need to convert my magnitude values to db? 3. How do I map my data to a certain range. Do I map against the max db range for my sounds bitdepth? Getting the max Value for a bin will lead to jumping max mapping values.

My RenderCallback:

static OSStatus PlaybackCallback(void *inRefCon,
                                 AudioUnitRenderActionFlags *ioActionFlags,
                                 const AudioTimeStamp *inTimeStamp,
                                 UInt32 inBusNumber,
                                 UInt32 inNumberFrames,
                                 AudioBufferList *ioData)
{
    UInt32 maxSamples = kAudioBufferNumFrames;

    UInt32 log2n = log2f(maxSamples); //bins
    UInt32 n = 1 << log2n;

    UInt32 stride = 1;
    UInt32 nOver2 = n/2;

    COMPLEX_SPLIT   A;
    float          *originalReal, *obtainedReal, *frequencyArray, *window, *in_real;

    in_real = (float *) malloc(maxSamples * sizeof(float));

    A.realp = (float *) malloc(nOver2 * sizeof(float));
    A.imagp = (float *) malloc(nOver2 * sizeof(float));
    memset(A.imagp, 0, nOver2 * sizeof(float));

    obtainedReal = (float *) malloc(n * sizeof(float));
    originalReal = (float *) malloc(n * sizeof(float));
    frequencyArray = (float *) malloc(n * sizeof(float));

    //-- window

    UInt32 windowSize = maxSamples;
    window = (float *) malloc(windowSize * sizeof(float));

    memset(window, 0, windowSize * sizeof(float));
    //    vDSP_hann_window(window, windowSize, vDSP_HANN_DENORM);

    vDSP_blkman_window(window, windowSize, 0);

    vDSP_vmul(ioBuffer, 1, window, 1, in_real, 1, maxSamples);

    //-- window

    vDSP_ctoz((COMPLEX*)in_real, 2, &A, 1, maxSamples/2);

    vDSP_fft_zrip(fftSetup, &A, stride, log2n, FFT_FORWARD);
    vDSP_fft_zrip(fftSetup, &A, stride, log2n, FFT_INVERSE);

    float scale = (float) 1.0 / (2 * n);

    vDSP_vsmul(A.realp, 1, &scale, A.realp, 1, nOver2);
    vDSP_vsmul(A.imagp, 1, &scale, A.imagp, 1, nOver2);

    vDSP_ztoc(&A, 1, (COMPLEX *) obtainedReal, 2, nOver2);
    vDSP_zvmags(&A, 1, obtainedReal, 1, nOver2);

    Float32 one = 1;
    vDSP_vdbcon(obtainedReal, 1, &one, obtainedReal, 1, nOver2, 0);

    for (int i = 0; i < nOver2; i++) {
        frequencyArray[i] = obtainedReal[i];
    }


    // Extract the maximum value
    double fftMax = 0.0;
    vDSP_maxmgvD((double *)obtainedReal, 1, &fftMax, nOver2);

    float max = sqrt(fftMax);
}

Playing some music, I get values from -96db to 0db. Plotting a point at:

CGPointMake(i, kMaxSpectrumHeight * (1 - frequencyArray[i]/-96.));

is giving my a rather rounded curve:

plot1

If I don't convert to db I can plot by multiplying my array value by 10000 and get nice peaks.

plot2

Am I doing something totally wrong? And how do I get to showing a variable number of bars?


Source: (StackOverflow)

Analyze audio using Fast Fourier Transform

I am trying to create a graphical spectrum analyzer in python.

I am currently reading 1024 bytes of a 16 bit dual channel 44,100 Hz sample rate audio stream and averaging the amplitude of the 2 channels together. So now I have an array of 256 signed shorts. I now want to preform a fft on that array, using a module like numpy, and use the result to create the graphical spectrum analyzer, which, to start will just be 32 bars.

I have read the wikipedia articles on Fast Fourier Transform and Discrete Fourier Transform but I am still unclear of what the resulting array represents. This is what the array looks like after I preform an fft on my array using numpy:

   [ -3.37260500e+05 +0.00000000e+00j   7.11787022e+05 +1.70667403e+04j
   4.10040193e+05 +3.28653370e+05j   9.90933073e+04 +1.60555003e+05j
   2.28787050e+05 +3.24141951e+05j   2.09781047e+04 +2.31063376e+05j
  -2.15941453e+05 +1.63773851e+05j  -7.07833051e+04 +1.52467334e+05j
  -1.37440802e+05 +6.28107674e+04j  -7.07536614e+03 +5.55634993e+03j
  -4.31009964e+04 -1.74891657e+05j   1.39384348e+05 +1.95956947e+04j
   1.73613033e+05 +1.16883207e+05j   1.15610357e+05 -2.62619884e+04j
  -2.05469722e+05 +1.71343186e+05j  -1.56779748e+04 +1.51258101e+05j
  -2.08639913e+05 +6.07372799e+04j  -2.90623668e+05 -2.79550838e+05j
  -1.68112214e+05 +4.47877871e+04j  -1.21289916e+03 +1.18397979e+05j
  -1.55779104e+05 +5.06852464e+04j   1.95309737e+05 +1.93876325e+04j
  -2.80400414e+05 +6.90079265e+04j   1.25892113e+04 -1.39293422e+05j
   3.10709174e+04 -1.35248953e+05j   1.31003438e+05 +1.90799303e+05j...

I am wondering what exactly these numbers represent and how I would convert these numbers into a percentage of a height for each of the 32 bars. Also, should I be averaging the 2 channels together?


Source: (StackOverflow)

Units of a Fourier Transform (FFT) when doing Spectral Analysis of a Signal

My question has to do with the physical meaning of the results of doing a spectral analysis of a signal, or of throwing the signal into an FFT and interpreting what comes out using a suitable numerical package,

Specifically:

  • take a signal, say a time-varying voltage v(t)
  • throw it into an FFT (you get back a sequence of complex numbers)
  • now take the modulus (abs) and square the result, i.e. |fft(v)|^2.

So you now have real numbers on the y axis -- shall I call these spectral coefficients?

  • using the sampling resolution, you follow a cookbook recipe and associate the spectral coefficients to frequencies.
  • AT THIS POINT, you have a frequency spectrum g(w) with frequency on the x axis, but WHAT PHYSICAL UNITS on the y axis?

My understanding is that this frequency spectrum shows how much of the various frequencies are present in the voltage signal -- they are spectral coefficients in the sense that they are the coefficients of the sines and cosines of the various frequencies required to reconstitute the original signal.

So the first question is, what are the UNITS of these spectral coefficients?

The reason this matters is that spectral coefficients can be tiny and enormous, so I want to use a dB scale to represent them.

But to do that, I have to make a choice:

  • Either I use the 20log10 dB conversion, corresponding to a field measurement, like voltage.
  • Or I use the 10log10 dB conversion, corresponding to an energy measurement, like power.

Which scaling I use depends on what the units are.

Any light shed on this would be greatly appreciated!


Source: (StackOverflow)

Sorting a list of RGB triplets into a spectrum

I have a list of RGB triplets, and I'd like to plot them in such a way that they form something like a spectrum.

I've converted them to HSV, which people seem to recommend.

from PIL import Image, ImageDraw
import colorsys

def make_rainbow_rgb(colors, width, height):
    """colors is an array of RGB tuples, with values between 0 and 255"""

    img = Image.new("RGBA", (width, height))
    canvas = ImageDraw.Draw(img)

    def hsl(x):
        to_float = lambda x : x / 255.0
        (r, g, b) = map(to_float, x)
        h, s, l = colorsys.rgb_to_hsv(r,g,b)
        h = h if 0 < h else 1 # 0 -> 1
        return h, s, l

    rainbow = sorted(colors, key=hsl)

    dx = width / float(len(colors)) 
    x = 0
    y = height / 2.0
    for rgb in rainbow:
        canvas.line((x, y, x + dx, y), width=height, fill=rgb)
        x += dx
    img.show()

However, the result doesn't look very much like a nice rainbow-y spectrum. I suspect I need to either convert to a different color space or handle the HSL triplet differently.

this doesn't look like a spectrum

Does anyone know what I need to do to make this data look roughly like a rainbow?

Update:

I was playing around with Hilbert curves and revisited this problem. Sorting the RGB values (same colors in both images) by their position along a Hilbert curve yields an interesting (if still not entirely satisfying) result:

RGB values sorted along a Hilbert curve.


Source: (StackOverflow)

Clojure/Java: Java libraries for spectrum analysis of sound?

I am looking for a library that can accept a chunk of audio data and return the average amplitude over time within a given frequency band.

I've already asked this question over at comp.dsp, but it's clear to me that acquiring the know-how to build this on my own using a basic FFT library is going to require more time and energy than I have at present. Here is my original question with more detai: http://groups.google.com/group/comp.dsp/browse_thread/thread/e04f78d439e9e2bd

I've found lots of nice libraries for playing with sound (I used JSyn in the past), but none of these seem to be set up to return quick and dirty spectral information about a sound sample.

Any pointers would be much appreciated.


Source: (StackOverflow)

Rhythm detection through analyzing the audio spectrum

I'm building a rhythm-based game, and facing a lot of problems with rhythm-detection. I receive the current spectrum of a playing song. It looks like a float array with 512 floats. 256 for left and right channel representation. FFT is also available. But I have no idea how to work with that data, I've made some experiments with visualizing, but it gave me very few information. I've googled for some ready algorithms, but there is nothing. Please, can someone help me with, maybe, some references, materials, articles connected with rhythm detection, working with audio spectrum. Code will also be very helpful. Thanks.


Source: (StackOverflow)

iOS FFT Accerelate.framework draw spectrum during playback

I did a lot of research and learned a lot about FFT and the Accelerate Framework. But after days of experiments I'm kind of frustrated.

I want to display the frequency spectrum of an audio file during playback in a diagram. For every time interval it should show the magnitude in db on the Y-axis (displayed by a red bar) for every frequency (in my case 512 values) calculated by a FFT on the X-Axis.

The output should look like this: enter image description here

I fill a buffer with 1024 samples extracting only the left channel for the beginning. Then I do all this FFT stuff.

Here is my code so far:

Setting up some variables

- (void)setupVars  
{  
    maxSamples = 1024;

    log2n = log2f(maxSamples);  
    n = 1 << log2n;  

    stride = 1;  
    nOver2 = maxSamples/2;  

    A.realp = (float *) malloc(nOver2 * sizeof(float));  
    A.imagp = (float *) malloc(nOver2 * sizeof(float));  
    memset(A.imagp, 0, nOver2 * sizeof(float));

    obtainedReal = (float *) malloc(n * sizeof(float));  
    originalReal = (float *) malloc(n * sizeof(float));

    setupReal = vDSP_create_fftsetup(log2n, FFT_RADIX2);  
}

Doing the FFT. FrequencyArray is just a data structure that holds 512 float values.

- (FrequencyArry)performFastFourierTransformForSampleData:(SInt16*)sampleData andSampleRate:(UInt16)sampleRate   
{  
    NSLog(@"log2n %i n %i,  nOver2 %i", log2n, n, nOver2);

    // n = 1024
    // log2n 10
    // nOver2 = 512

    for (int i = 0; i < n; i++) {
        originalReal[i] = (float) sampleData[i];
    }

    vDSP_ctoz((COMPLEX *) originalReal, 2, &A, 1, nOver2);

    vDSP_fft_zrip(setupReal, &A, stride, log2n, FFT_FORWARD);

    float scale = (float) 1.0 / (2 * n);

    vDSP_vsmul(A.realp, 1, &scale, A.realp, 1, nOver2);
    vDSP_vsmul(A.imagp, 1, &scale, A.imagp, 1, nOver2);

    vDSP_ztoc(&A, 1, (COMPLEX *) obtainedReal, 2, nOver2);

    FrequencyArry frequencyArray;

    for (int i = 0; i < nOver2; i++) {
        frequencyArray.frequency[i] = log10f(obtainedReal[i]); // Magnitude in db???
    }

    return frequencyArray;  
}

The output looks always kind of weird although it some how seems to move according to the music.

I'm happy that I came so far thanks to some very good posts here like this: Using the apple FFT and accelerate Framework

But now I don't know what to do. What am I missing?


Source: (StackOverflow)

Why do I need to apply a window function to samples when building a power spectrum of an audio signal?

I have found for several times the following guidelines for getting the power spectrum of an audio signal:

  • collect N samples, where N is a power of 2
  • apply a suitable window function to the samples, e.g. Hanning
  • pass the windowed samples to an FFT routine - ideally you want a real-to-complex FFT but if all you have a is complex-to-complex FFT then pass 0 for all the imaginary input parts
  • calculate the squared magnitude of your FFT output bins (re * re + im * im)
  • (optional) calculate 10 * log10 of each magnitude squared output bin to get a magnitude value in dB
  • Now that you have your power spectrum you just need to identify the peak(s), which should be pretty straightforward if you have a reasonable S/N ratio. Note that frequency resolution improves with larger N. For the above example of 44.1 kHz sample rate and N = 32768 the frequency resolution of each bin is 44100 / 32768 = 1.35 Hz.

But... why do I need to apply a window function to the samples? What does that really means?

What about the power spectrum, is it the power of each frequency in the range of sample rate? (example: windows media player visualizer of sound?)


Source: (StackOverflow)