Mac OSX Text to Speech Batch

06
2014-04
  • pommy

    I have 300 English text files that I want to make into mp3 files to listen to as and when.

    Is there a free program or Automator script I could use to batch-convert the files to mp3 with text to speech, rotating through the free voices available on Mac OSX?

  • Answers
  • Lauri Ranta

    You can use a shell command like this:

    for f in *.txt;do say -f "$f" -o "${f%txt}aif";done
    

    Random English voice:

    IFS=$'\n';a=($(say -v\?|sed -E $'s/ {2,}/\t/'|awk -F$'\t' '$2~/^en_/{print $1}'));for f in *.txt;do say -v "${a[$((RANDOM%${#a[@]}))]}" -f "$f" -o "${f%txt}aif";done

    Random voice from a list:

    IFS=, read -a a<<<'Daniel,Fiona,Moira,Emily,Serena,Tessa';for f in *.txt;do say -v "${a[$((RANDOM%${#a[@]}))]}" -f "$f" -o "${f%txt}aif";done
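
    The question asks for a rotating voice rather than a random one. Here is a minimal sketch of that, assuming the six voices from the list above are installed. The names pick_voice and speak_all are made up for this example and are not part of say:

```shell
#!/bin/sh
# Voice names taken from the list above; all are single words, so a
# space-separated string is enough here.
VOICES="Daniel Fiona Moira Emily Serena Tessa"

# pick_voice N: print the voice for the N-th file (0-based), wrapping around.
pick_voice() {
  idx=$1
  set -- $VOICES            # split the list into $1..$n
  shift $((idx % $#))       # skip (N mod n) names
  printf '%s\n' "$1"
}

# speak_all: hypothetical wrapper around the say loop, one voice per file in turn.
speak_all() {
  n=0
  for f in *.txt; do
    [ -e "$f" ] || continue   # no .txt files: skip the unexpanded pattern
    say -v "$(pick_voice "$n")" -f "$f" -o "${f%txt}aif"
    n=$((n + 1))
  done
}
```

    Calling speak_all in the folder with the text files should give the first file Daniel, the second Fiona, and so on, starting over after Tessa.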

    You can use ffmpeg to convert the files to mp3:

    for f in *.aif;do ffmpeg -i "$f" -aq 2 "${f%aif}mp3";done
    

    -aq 2 corresponds to -V2 in lame. You can install ffmpeg with brew install ffmpeg after installing Homebrew.
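
    With 300 files it may be worth deleting each AIFF once its mp3 exists and skipping files that are already converted. This is a rough sketch, not a tested workflow. The function names to_mp3 and convert_all are hypothetical, and the sketch assumes ffmpeg is on the PATH:

```shell
#!/bin/sh
# to_mp3 FILE.aif: convert one file; delete the AIFF only if ffmpeg succeeded.
to_mp3() {
  out="${1%aif}mp3"
  [ -e "$out" ] && return 0          # already converted, skip
  if ffmpeg -nostdin -loglevel error -i "$1" -aq 2 "$out"; then
    rm -- "$1"
  else
    rm -f -- "$out"                  # drop a partial mp3 so a rerun retries it
    return 1
  fi
}

# convert_all: run to_mp3 over every .aif file in the current folder.
convert_all() {
  command -v ffmpeg >/dev/null 2>&1 || {
    echo "ffmpeg not found; try: brew install ffmpeg" >&2
    return 1
  }
  for f in *.aif; do
    [ -e "$f" ] || continue          # no .aif files: skip the unexpanded pattern
    to_mp3 "$f"
  done
}
```

    Run convert_all after the say loop has finished. If a conversion fails, the AIFF stays in place so you can retry it.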


  • Related Question

    Good Text-to-Speech solution for Windows
  • Jim McKeeth

    I am running Windows 7 and I know it has the ability to read me text in my applications, but I am looking for a good utility to save chunks of text as a wav file or mp3. It may already be built into the OS, but cleverly disguised. I know I can write a program to call the API, which is my next step if there isn't a good solution already.

    I really like the quality of the AT&T system, but it has some pretty steep restrictions on using the produced MP3. I'd like to use them in my podcast.

    Web-based is OK too, as long as it easily produces a fairly unencumbered (public domain or Creative Commons) WAV, MP3 or some other standard audio file. Naturally I prefer free or open source over commercial, but that isn't a requirement.


  • Related Answers
  • User

    I've tried espeak, festival, and MaryTTS. They all generate understandable voices for the most part, but they are not very natural. Even with additional voice downloads for these systems (e.g. Mbrola, CMU Arctic) the voices are not that great.

    IVONA voices are the best I've heard so far. They give you a 30-day free demo, which is enough if you have a one-off task to do. After that they cost about $45 per voice. Amazon just bought the company so you know it's solid (http://www.ivona.com/us/news/amazoncom-announces-acquisition-of-ivona-software/).

    They work with Microsoft's SAPI interface, which means the voices are available to any program that supports it (e.g. Adobe Reader). I've been using them with the Text To Wav program, which is nice for bulk conversion of text files into wave files.

    Edit

    Actually, I just re-read your question, and I think the IVONA price is probably a lot higher for non-personal use (e.g. podcasts). In that case I'd say check out MaryTTS.

  • John T

    eSpeak is free & open source and offers everything you need.

    It can run as a command line program to speak text from a file or from stdin.
    A shared library version is also available.
    
    * Includes different Voices, whose characteristics can be altered.
    * Can produce speech output as a WAV file.
    * SSML (Speech Synthesis Markup Language) is supported (not complete),
      and also HTML.
    * Compact size. The program and its data, including many languages,
      totals about 1 Mbytes.
    * Can translate text to phoneme codes, so it could be adapted as a front
      end for another speech synthesis engine.
    * Potential for other languages. Several are included in varying stages
      of progress. Help from native speakers for these or other languages is
      welcomed.
    * Development tools available for producing and tuning phoneme data.
    * Written in C++.
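
    As a rough illustration of the command line use described above, here is a sketch that turns every text file in a folder into a WAV file. It assumes espeak is on the PATH and a POSIX shell is available (on Windows that would mean something like Cygwin). The function name espeak_all is made up for this example:

```shell
#!/bin/sh
# espeak_all [VOICE]: hypothetical helper; VOICE defaults to "en".
espeak_all() {
  voice=${1:-en}
  for f in *.txt; do
    [ -e "$f" ] || continue          # no .txt files: skip the unexpanded pattern
    # -f reads text from a file, -w writes speech to a WAV file instead of playing it
    espeak -v "$voice" -f "$f" -w "${f%txt}wav"
  done
}
```

    For example, espeak_all en-us would write one.wav next to one.txt using the US English voice.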