Skip to content

Latest commit

 

History

History
 
 

festvox

                  "Building Voices in Festival"
   Alan W Black ([email protected]) and Kevin Lenzo ([email protected])
                      http://www.festvox.org

The included documentation, scripts and examples should be sufficient
for an interested person to build their own synthetic voices in
currently supported languages or new languages in the University of
Edinburgh's Festival Speech Synthesis System.  The quality of the
result depends much on the time and skill of the builder.  For English
it may be possible to build a new voice in a couple of days work, a
new language may take months or years to build.  It should be noted
that even the best voices in Festival (or any other speech synthesis
system for that matter) are still nowhere near perfect quality.

This distribution includes
    Support for designing, recording and autolabelling diphone databases
    Support for designing, recording and autolabelling unit selection dbs
    Support for designing, recording and autolabelling statistical parametric
        synthesis voices
    Building simple limited domain synthesis engines
    Support for building rule driven and data driven prosody models
       (duration, intonation and phrasing)
    Support for building rule driven and data driven text analysis
    Lexicon and building Letter to Sound rule support
    Predefined scripts for building new US (and UK) English voices
    Scripts for designing and selecting prompts to record in
       arbitrary languages

New in 2.7

    Random forest models building for spectrum and duration in clustergen
    Grapheme based synthesizers (with specific support for large number
      of unicode writing systems)
    Clustergen state and stop value optimization
    Wavesurfer label support
    SPAM F0 support
    Phrase break support
    Support for SPTK's mgc parameterization

New in 2.3
    
    Support for cygwin tools under Windows
    Substantially improved CLUSTERGEN support with mlpg and mlsb

WARNING

This is not a pointy/clicky plug and play program to build new voices.
It is instructions with discussion on the problems and an attempt to
document the expertise we have gained in building other voices.
Although we have tried to automate the task as much as possible this
is no substitute for careful correction and understanding of the
processes involved.  There are significant pointers into the
literature throughout the document that allow for more detailed study
and further reading.

REQUIREMENTS

A Unix Machine
    although there is nothing inheritantly Unix about the scripts, no
    attempt has yet been made about porting this to other platforms
Festival and Speech Tools
    This uses speech tools programs and festival itself at various
    stages in builidng voices as well as (of course) for the final
    voices.  Festival and the Edinburgh Speech Tools are available from
       http://www.cstr.ed.ac.uk/projects/festival/
    or
       http://www.festvox.org/festival

    It is recommended that you compile your own versions of these
    as you will need the libraries and include files to build some
    programs in this festvox.
Wavesurfer
    TO display wavesforms, spectragrams and phoneme labels.
Patience and understanding
    Building a new voice is a lot of work, and something will probably
    go wrong which may require the repetition of some long boring and
    tedious process.  Even with lots of care a new voice still might 
    just not work.  In distributing this document we hope to increase the
    basic knowledge of synthesis out there and hopefully find people 
    who can improve on this making the processing easier and more reliable
    in the future.

INSTALLATION

You must have the Edinburgh Speech Tools and Festival instllation
before you can build the tools in the festvox distribution.

Unpack festvox-2.7-release.tar.gz

    tar zxvf festvox-2.7-release.tar.gz
    cd festvox
    ./configure
    make

The configuration basically tries to find your version of 
the Edinburgh Speech Tools and uses its configuration to set
compiler type etc.  So you must have that installed.  If configure
fails try expliciting setting your ESTDIR environment variable
to point ot your compiled version of the Speech Tools.

A pre-generated version of the document in html and postscript are
distributed in the html/ directory

If you need to build the document itself, you will need a working
version of the docbook tools, which may (or may not) already
be installed on your system
    
To build the documenation
   
    cd docbook
    make doc

Note that even if the documentation build fails you can still use all
the scripts and programs.  

To use the scripts and programs in the festvox distribution each
user is expected to have the environment variables ESTDIR and
FESTVOXDIR set for example as (if you use bash, zsh, ksh or sh)

   export ESTDIR=/home/awb/projects/speech_tools
   export FESTVOXDIR=/home/awb/projects/festvox

Or if you use csh or tcsh

   setenv ESTDIR /home/awb/projects/speech_tools
   setenv FESTVOXDIR /home/awb/projects/festvox

Remember to set these to where *your* installations are, not *ours*.