Audio & Video:

Speech synthesis (making WPF talk)

In the System.Speech assembly, Microsoft has added something really cool: Speech Synthesis, the ability to transform text into spoken words, and Speech Recognition, the ability to translate spoken words into text. We'll focus on speech synthesis in this article and get into speech recognition in the next one.

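To transform text into spoken words, we'll be using the SpeechSynthesizer class. This class resides in the System.Speech assembly, which we need to reference in our project before we can use it. In Visual Studio, this is typically done through the Add Reference dialog of the project, where System.Speech can be selected from the list of framework assemblies; the exact look of the dialog depends on which version of Visual Studio you use.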

With the appropriate assembly added, we can now use the SpeechSynthesizer class from the System.Speech.Synthesis namespace. With that in place, we'll kick off with yet another very simple "Hello, world!" inspired example, this time in spoken words:

<Window x:Class="WpfTutorialSamples.Audio_and_Video.SpeechSynthesisSample"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="SpeechSynthesisSample" Height="150" Width="150">
    <Grid>
        <Button Name="btnSayIt" Click="btnSayHello_Click" VerticalAlignment="Center" HorizontalAlignment="Center">Say hello!</Button>
    </Grid>
</Window>

using System;
using System.Speech.Synthesis;
using System.Windows;

namespace WpfTutorialSamples.Audio_and_Video
{
	public partial class SpeechSynthesisSample : Window
	{
		public SpeechSynthesisSample()
		{
			InitializeComponent();
		}

		private void btnSayHello_Click(object sender, RoutedEventArgs e)
		{
			SpeechSynthesizer speechSynthesizer = new SpeechSynthesizer();
			speechSynthesizer.Speak("Hello, world!");
		}
	}
}

This is pretty much as simple as it gets, and since the screenshot really doesn't help a lot in demonstrating speech synthesis, I suggest that you try building the example yourself, to experience it.
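One thing worth knowing about the example above: Speak() is synchronous, so the click handler doesn't return until the sentence has been spoken. The following is a minimal sketch of a variation on the same code-behind, assuming you would rather keep the window responsive while the text is read aloud. It uses the SpeakAsync() and SpeakAsyncCancelAll() methods of SpeechSynthesizer; keeping the synthesizer in a field and disposing it in OnClosed() are choices made for this sketch, not something the original sample does.

```csharp
using System;
using System.Speech.Synthesis;
using System.Windows;

namespace WpfTutorialSamples.Audio_and_Video
{
	public partial class SpeechSynthesisSample : Window
	{
		// Kept as a field (an assumption of this sketch) so that the
		// synthesizer stays alive after the click handler has returned.
		private readonly SpeechSynthesizer speechSynthesizer = new SpeechSynthesizer();

		public SpeechSynthesisSample()
		{
			InitializeComponent();
		}

		private void btnSayHello_Click(object sender, RoutedEventArgs e)
		{
			// Drop anything still queued from an earlier click,
			// then start speaking without blocking the UI thread.
			speechSynthesizer.SpeakAsyncCancelAll();
			speechSynthesizer.SpeakAsync("Hello, world!");
		}

		protected override void OnClosed(EventArgs e)
		{
			// Release the speech engine when the window goes away.
			speechSynthesizer.Dispose();
			base.OnClosed(e);
		}
	}
}
```

With this variation, clicking the button repeatedly restarts the greeting instead of freezing the window for the duration of each sentence.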

Controlling pronunciation

The SpeechSynthesizer can do more than that though. Through the use of the PromptBuilder class, we can get much more control of how a sentence is spoken. This next example, which is an extension of the first example, will illustrate that:

<Window x:Class="WpfTutorialSamples.Audio_and_Video.SpeechSynthesisPromptBuilderSample"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="SpeechSynthesisPromptBuilderSample" Height="150" Width="150">
    <Grid>
        <Button Name="btnSayIt" Click="btnSayHello_Click" VerticalAlignment="Center" HorizontalAlignment="Center">Say hello!</Button>
    </Grid>
</Window>

using System;
using System.Speech.Synthesis;
using System.Windows;

namespace WpfTutorialSamples.Audio_and_Video
{
	public partial class SpeechSynthesisPromptBuilderSample : Window
	{
		public SpeechSynthesisPromptBuilderSample()
		{
			InitializeComponent();
		}

		private void btnSayHello_Click(object sender, RoutedEventArgs e)
		{
			PromptBuilder promptBuilder = new PromptBuilder();
			promptBuilder.AppendText("Hello world");

			PromptStyle promptStyle = new PromptStyle();
			promptStyle.Volume = PromptVolume.Soft;
			promptStyle.Rate = PromptRate.Slow;
			promptBuilder.StartStyle(promptStyle);
			promptBuilder.AppendText("and hello to the universe too.");
			promptBuilder.EndStyle();

			promptBuilder.AppendText("On this day, ");
			promptBuilder.AppendTextWithHint(DateTime.Now.ToShortDateString(), SayAs.Date);

			promptBuilder.AppendText(", we're gathered here to learn");
			promptBuilder.AppendText("all", PromptEmphasis.Strong);
			promptBuilder.AppendText("about");
			promptBuilder.AppendTextWithHint("WPF", SayAs.SpellOut);

			SpeechSynthesizer speechSynthesizer = new SpeechSynthesizer();
			speechSynthesizer.Speak(promptBuilder);
		}
	}
}

This is where it gets interesting. Try running the example and see how nicely this works. By supplying the SpeechSynthesizer with something more than just a text string, we get a lot of control over how the various parts of the sentence are spoken. In this case, the application will say the following:

Hello world and hello to the universe too. On this day, <today's date>, we're gathered here to learn all about WPF.

Now try sending that directly to the SpeechSynthesizer as a plain string and you'll probably giggle a bit at the result. What we do instead is guide the Speak() method on how the various parts of the sentence should be spoken. First of all, we ask the SpeechSynthesizer to speak the "and hello to the universe too" part at a lower volume and a slower rate, as if it were whispered.

The next part that doesn't just use default pronunciation is the date. We use the special SayAs enumeration to specify that the date should be read out as an actual date and not just a set of numbers, spaces and special characters.
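Date is only one of the members of the SayAs enumeration. As a rough sketch of how a few of the others can be used, the console program below passes a date, a time and a number with the MonthDayYear, Time24 and NumberCardinal hints. The class name, the sentences and the explicit format strings are assumptions made for this illustration, and how closely a hint is followed may depend on the voice installed on your machine.

```csharp
using System;
using System.Globalization;
using System.Speech.Synthesis;

class SayAsHintsDemo
{
	static void Main()
	{
		DateTime now = DateTime.Now;
		PromptBuilder promptBuilder = new PromptBuilder();

		// Format the date explicitly so that it matches the order
		// promised by the hint, regardless of the current culture.
		promptBuilder.AppendText("Today is ");
		promptBuilder.AppendTextWithHint(
			now.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture), SayAs.MonthDayYear);

		// 24-hour clock, hours and minutes only.
		promptBuilder.AppendText(". The time is ");
		promptBuilder.AppendTextWithHint(
			now.ToString("HH:mm", CultureInfo.InvariantCulture), SayAs.Time24);

		// Read the digits as a number rather than one by one.
		promptBuilder.AppendText(". You have ");
		promptBuilder.AppendTextWithHint("42", SayAs.NumberCardinal);
		promptBuilder.AppendText(" new messages.");

		using (SpeechSynthesizer speechSynthesizer = new SpeechSynthesizer())
		{
			speechSynthesizer.Speak(promptBuilder);
		}
	}
}
```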

We also ask that the word "all" be spoken with a stronger emphasis, to make the sentence more dynamic, and finally, we ask that the word "WPF" be spelled out (W-P-F) instead of being pronounced as an actual word.

All in all, this allows us to make the SpeechSynthesizer a lot easier to understand!
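If you're curious about what the PromptBuilder actually hands over to the speech engine, you can ask it for the markup it has generated. The sketch below is a small console program, not part of the original sample, that rebuilds the last part of the sentence and prints the result of the ToXml() method, which returns the prompt as SSML. The class name is made up for this illustration.

```csharp
using System;
using System.Speech.Synthesis;

class PromptBuilderSsmlDemo
{
	static void Main()
	{
		PromptBuilder promptBuilder = new PromptBuilder();
		promptBuilder.AppendText("We're gathered here to learn ");
		promptBuilder.AppendText("all", PromptEmphasis.Strong);
		promptBuilder.AppendText(" about ");
		promptBuilder.AppendTextWithHint("WPF", SayAs.SpellOut);

		// Print the generated SSML instead of speaking it, which makes it
		// easy to see how the emphasis and the say-as hint are represented.
		Console.WriteLine(promptBuilder.ToXml());
	}
}
```

Printing the markup like this can be handy when a prompt doesn't sound the way you expected, since it shows exactly which parts of the text each style or hint was applied to.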

Summary

Making your WPF application speak is very easy, and by using the PromptBuilder class, you can even get a lot of control over how your words are spoken. This is a very powerful feature, but it might not be relevant to a lot of today's applications. It's still very cool though!

