For the WordTutor application to work, we need to be able to read words (and letters) out loud to our student. To power the speech synthesis, we’re going to integrate Azure Cognitive Services into the application.
Azure Speech API
Setting up access to the Speech API within Azure Cognitive Services is relatively straightforward. Rather than repeating the details here, I’ll just point you to the quickstart. I’m using the free tier (“F0”), allowing for up to 5 hours of speech rendering per month.
When you’re finished creating your service, take note of the region you used (mine was australiaeast) and one of the API keys generated for you.
Secret Management
We don’t want to embed any secrets directly in the application, nor commit them to our git repository.
Fortunately, we can easily move those secrets out of the application, using a couple of NuGet packages:
Microsoft.Extensions.Configuration provides basic infrastructure; and
Microsoft.Extensions.Configuration.UserSecrets allows us to keep our secrets outside the project directory during development.
There are other packages available as well, allowing configuration to be stored in other places.
With those packages installed, we can use the dotnet command to store our secrets.
For the entry-point project of our application we need a one-time initialization, done by running dotnet user-secrets init from that project’s folder.
If you’re doing this for yourself, don’t repeat my mistake of initializing the wrong project. Secrets configuration needs to be done for the entry-point project of the application, so I had to redo the above step for the WordTutor.Desktop project.
Once completed, you’ll find a <UserSecretsId> element has been added to the .csproj file of your project, in the first <PropertyGroup>; its value is a generated GUID that identifies your secrets store.
To store our two secrets, we again use the dotnet command, running dotnet user-secrets set once for each secret with its key and its value.
These secrets are stored in your user profile on this PC. Navigate to the folder %USERPROFILE%\AppData\Roaming\Microsoft\UserSecrets to find a folder with a name matching the <UserSecretsId> from above; inside it is secrets.json, containing your secrets. It’s worth emphasizing that there’s no encryption here; the goal is to keep the secrets out of your git repo, not to hide them from you.
Speech service
We need to declare a service interface for our application, representing the speech service in a technology-agnostic way:
Using this interface will allow us to wire up a fake service for testing purposes, allowing us to verify correct behaviour without actually calling into the Azure implementation and incurring costs.
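As a minimal sketch of the idea, here is one possible shape for the interface together with a fake for tests. The member name SayAsync and the FakeSpeechService class are assumptions made for illustration, not necessarily the declarations used in WordTutor.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Technology-agnostic contract; the member name SayAsync is an assumption.
public interface ISpeechService
{
    // Speak the supplied text aloud; the task completes when speaking is done.
    Task SayAsync(string content);
}

// Hypothetical test double: records each request instead of calling Azure.
public sealed class FakeSpeechService : ISpeechService
{
    private readonly List<string> _spoken = new List<string>();

    // Everything the application asked us to say, in order.
    public IReadOnlyList<string> Spoken => _spoken;

    public Task SayAsync(string content)
    {
        _spoken.Add(content);
        return Task.CompletedTask;
    }
}
```

A test can then hand FakeSpeechService to the code under test and assert on Spoken, without any network call or cost.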
Our primary implementation of ISpeechService will be AzureSpeechService. To keep our dependencies properly isolated, this lives in a new project, WordTutor.Azure. Anything else Azure related we choose to add in the future will live here too.
The IConfigurationRoot parameter for the constructor is how we retrieve the user secrets we stashed away earlier. In full, the constructor looks like this:
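A sketch of how such a constructor might look, assuming the Microsoft.CognitiveServices.Speech SDK is used for synthesis. The configuration key names ("WordTutor:SpeechApiKey" and "WordTutor:SpeechRegion"), the SayAsync member, and the Require helper are hypothetical; substitute whatever names you stored your secrets under.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.Extensions.Configuration;

// Assumed shape of the interface, repeated so the sketch is self-contained.
public interface ISpeechService
{
    Task SayAsync(string content);
}

public sealed class AzureSpeechService : ISpeechService, IDisposable
{
    private readonly SpeechSynthesizer _synthesizer;

    public AzureSpeechService(IConfigurationRoot configuration)
    {
        if (configuration is null) throw new ArgumentNullException(nameof(configuration));

        // Hypothetical key names; they must match the keys given to user-secrets.
        string apiKey = Require(configuration, "WordTutor:SpeechApiKey");
        string region = Require(configuration, "WordTutor:SpeechRegion");

        // Build the SDK configuration from the key and region noted earlier.
        var speechConfig = SpeechConfig.FromSubscription(apiKey, region);
        _synthesizer = new SpeechSynthesizer(speechConfig);
    }

    // Render the text and play it through the default audio output.
    public async Task SayAsync(string content)
    {
        await _synthesizer.SpeakTextAsync(content);
    }

    public void Dispose() => _synthesizer.Dispose();

    // Fail early, with a clear message, when a secret has not been stored.
    private static string Require(IConfigurationRoot configuration, string key)
    {
        var value = configuration[key];
        if (string.IsNullOrWhiteSpace(value))
            throw new InvalidOperationException($"Missing configuration value '{key}'.");
        return value;
    }
}
```

Checking the values before touching the SDK means a forgotten secret shows up as a readable error at startup instead of a failure on the first attempt to speak.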
SimpleInjector
To make our ISpeechService available for consumption, we need to register it with our dependency injection container.
We also need a singleton registration for IConfigurationRoot. Before we can register it, we first need to build it:
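One way this wiring could look is sketched below. The CreateContainer method is a hypothetical helper, and Program stands in for any non-static class in the entry-point assembly (a static class cannot be used as a generic type argument).

```csharp
using Microsoft.Extensions.Configuration;
using SimpleInjector;

public class Program
{
    // Hypothetical helper that builds the container used by the application.
    public static Container CreateContainer()
    {
        var container = new Container();

        // Build the configuration, pulling in the user secrets stored earlier.
        IConfigurationRoot configuration = new ConfigurationBuilder()
            .AddUserSecrets<Program>()
            .Build();

        // Register the already-built instance, so it acts as a singleton.
        container.RegisterInstance<IConfigurationRoot>(configuration);

        // The speech service would be registered alongside, for example with
        // container.Register<ISpeechService, AzureSpeechService>(Lifestyle.Singleton);

        return container;
    }
}
```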
The generic parameter Program provided to the AddUserSecrets() call is used to identify the entry-point assembly. The <UserSecretsId> element from the csproj file turns into an assembly-level attribute, allowing the running application to find the secrets it needs.
Consumption
Finally, for demo purposes, we can inject ISpeechService into our main window, hook it up to a button and make everything work.
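As a rough illustration, assuming a WPF window built in code rather than XAML, the hookup could look something like this. DemoWindow and the SayAsync member are hypothetical names, not the actual WordTutor main window.

```csharp
using System.Threading.Tasks;
using System.Windows;
using System.Windows.Controls;

// Assumed shape of the interface, repeated so the sketch is self-contained.
public interface ISpeechService
{
    Task SayAsync(string content);
}

// Hypothetical demo window; the speech service arrives via constructor injection.
public class DemoWindow : Window
{
    private readonly ISpeechService _speechService;

    public DemoWindow(ISpeechService speechService)
    {
        _speechService = speechService;

        var button = new Button { Content = "Say hello" };

        // An async handler keeps the UI responsive while the audio is rendered.
        button.Click += async (sender, e) =>
            await _speechService.SayAsync("Hello World");

        Content = button;
    }
}
```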
Having (literally!) achieved “Hello World”, we need to make a number of improvements. Most notably, we currently have a non-trivial lag before speech begins - plus we’re re-rendering the same text as audio every time we want to speak. Some caching - and some pre-caching - seems to be in order.