Vowshape

Vowshape

Overview

A set of functions intended for empirically shaping vocal tract shapes for tract. The aim of these function is to make it easier to programatically target vowel sounds.

Tangled Files

vowshape.h and vowshape.c.

<<vowshape.c>>=
#include <math.h>
#include "vowshape.h"
<<funcs>>

<<vowshape.h>>=
#ifndef SK_VOWSHAPE_H
#define SK_VOWSHAPE_H

#ifndef SKFLT
#define SKFLT float
#endif

<<funcdefs>>
#endif

Constant

Sets a region to be a constant value val, at the starting point start with a length len.

<<funcdefs>>=
void sk_vowshape_constant(SKFLT *a,
                          int start, int len, SKFLT val);

<<funcs>>=
void sk_vowshape_constant(SKFLT *a,
                          int start, int len, SKFLT val)
{
    int i;

    for (i = 0; i < len; i++) {
        a[start + i] = val;
    }
}

Pyramid

Creates a triangle shape around the center point center, with values that go within range beg and end. The number of steps a side has is determined with nsteps.

<<funcdefs>>=
void sk_vowshape_pyramid(SKFLT *a,
                         int center, int nsteps,
                         SKFLT beg, SKFLT end);

<<funcs>>=
void sk_vowshape_pyramid(SKFLT *a,
                         int center, int nsteps,
                         SKFLT beg, SKFLT end)
{
    int i;

    for (i = 0; i < nsteps; i++) {
        SKFLT y;
        y = (1.0 - ((SKFLT)i / nsteps));
        a[center + i] = (1 - y)*beg + y*end;
    }

    for (i = 0; i < nsteps; i++) {
        SKFLT y;
        y = (1.0 - ((SKFLT)i / nsteps));
        a[center - i] = (1 - y)*beg + y*end;
    }
}

Parabola

Draws a parabola around a center position center with a width at each side being nsteps. The value is scaled to between beg and end.

<<funcdefs>>=
void sk_vowshape_parabola(SKFLT *a,
                          int center, int nsteps,
                          SKFLT beg, SKFLT end);

<<funcs>>=
void sk_vowshape_parabola(SKFLT *a,
                          int center, int nsteps,
                          SKFLT beg, SKFLT end)
{
    int i;
    double dt;

    dt = 1.0/nsteps;

    for (i = 0; i < nsteps; i++) {
        SKFLT y;
        y = (-((i * dt)*(i * dt)) + 1.0);
        a[center + i] = (1 - y)*beg + y * end;
    }

    for (i = 0; i < nsteps; i++) {
        SKFLT y;
        y = (-((-i * dt)*(-i * dt)) + 1.0);
        a[center - i] = (1 - y)*beg + y * end;
    }
}

Exponential

Draws an exponential curve with a slope slope, starting at position start and being nsteps long.

<<funcdefs>>=
void sk_vowshape_exponential(SKFLT *a,
                             SKFLT slope,
                             int start,
                             int nsteps,
                             SKFLT beg, SKFLT end);

<<funcs>>=
void sk_vowshape_exponential(SKFLT *a,
                             SKFLT slope,
                             int start,
                             int nsteps,
                             SKFLT beg, SKFLT end)
{
    double dt;
    int i;

    if (nsteps == 1) dt = 0;
    else dt = 1.0/(nsteps - 1);

    for (i = 0; i < nsteps; i++) {
        SKFLT x, y;
        x = i * dt;
        y = (1.0 - exp(x*slope)) / (1 - exp(slope));
        a[start + i] = (1 - y)*beg + y * end;
    }
}

Distinctive Region Model

The Distinctive Region Model is a technique for articulatory synthesis models that subdivide the vocal tract into 8 regions. Manipulating these regions in various ways will change the levels of the first 3 formants.

The regions of the tract are of different sizes. It has been difficult to find what the precise sizes are. The core papers describing this technique in detail (Mrayati 1988, Carre and Mrayati 1992) are behind paywalls. So, the best I have write now are some loose instructions and a chart.

Many thanks to "The Tube Resonance Model Speech Synthesizer" by Leonard Manzara for hints and charts on this, as well as providing a clear overview for the underlying techniques for tube models used in articulatory synthesis.

So, hints and charts tell me the regions are roughly proportional to one another.

R2 and R7 are the smallest regions, and seem to be equal in size.

R4 and R5 are the largest regions, and seem to be equal in size.

R1 seems to be about 2x the size of R2, maybe less. R8 is about the same size as R1.

R2 + R3 lengths are about R4.

So a decent enough ratio would be 2:1:2:4:4:2:1:2. According to the hints in the paper, this can be approximate (the paper distributes this using only 10 cylindrical areas. This model uses up to 44).

To do this dynamically, divide the total array size into 18 parts. This size is the base unit size. Apply the ratios above, and then on the last one, fill the remaining samples.

<<funcdefs>>=
void sk_vowshape_drm(SKFLT *a, int len, SKFLT *vals);

<<funcs>>=
void sk_vowshape_drm(SKFLT *a, int len, SKFLT *vals)
{
    int pos;
    int i;
    SKFLT unit;
    int nsmps;

    unit = len / 18.0;

    pos = 0;

    /* R1: ~2 units */

    nsmps = floor(unit * 2);

    for (i = 0; i < nsmps; i++) {
        a[pos] = vals[0];
        pos++;
    }

    /* R2: ~1 unit */

    nsmps = floor(unit * 1);

    for (i = 0; i < nsmps; i++) {
        a[pos] = vals[1];
        pos++;
    }

    /* R3: ~2 units */

    nsmps = floor(unit * 2);

    for (i = 0; i < nsmps; i++) {
        a[pos] = vals[2];
        pos++;
    }

    /* R4: ~4 units */

    nsmps = floor(unit * 4);

    for (i = 0; i < nsmps; i++) {
        a[pos] = vals[3];
        pos++;
    }

    /* R5: ~4 units */

    nsmps = floor(unit * 4);

    for (i = 0; i < nsmps; i++) {
        a[pos] = vals[4];
        pos++;
    }

    /* R6: ~2 units */

    nsmps = floor(unit * 2);

    for (i = 0; i < nsmps; i++) {
        a[pos] = vals[5];
        pos++;
    }

    /* R7: ~1 units */

    nsmps = floor(unit * 1);

    for (i = 0; i < nsmps; i++) {
        a[pos] = vals[6];
        pos++;
    }

    /* R8: ~2 units. Finish it off */

    for (i = pos; i < len; i++) {
        a[i] = vals[7];
    }
}