Accuracy and Optimizing Speech Recognition
Choosing the best hardware to give you
the best accuracy and performance
These days you no
longer need a supercomputer for speech recognition to work
well. But you might get some appreciable benefits by
upgrading your current computer.
This is part of a series on
speech recognition software.
How much accuracy is necessary
to make speech recognition a practical consideration for you?
How fast do you need to be able
to speak to make speech recognition truly productive for you?
And, a related question, is
your present computer system powerful enough to give you good,
fast speech recognition? If it is not, what sort of computer
system should you now upgrade to?
Please read on for the answers
to these questions.
The Vital Importance of Small
Improvements In Accuracy
It is easy to look at two
different speech recognition systems, one offering perhaps a 97%
accuracy rate and the other offering perhaps a 98% accuracy
rate, and to think that the difference of 'only' 1%, especially
when both systems are scoring so high, is not worth paying much
extra money for.
Unfortunately, if you think
this, you are looking at the process from the wrong perspective.
The key concept to keep in mind is not how much the system gets
correct, but how much the system gets wrong.
And so, when expressed in
terms of errors, one system offers 3% errors and the other
system offers 2% errors. Maybe these two numbers also seem very
similar. But think about it this way: the system with 3% errors
is making mistakes 50% more often than the system with 2%
errors. Instead of having maybe six errors to correct in a given
piece of dictation, you will have nine. All of a sudden, a
small-seeming difference starts to be more realistically
appreciated as very big and significant.
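The arithmetic behind this comparison can be sketched in a few lines of Python. This is only an illustration: the function names are made up for the example, and the 300-word dictation length is an assumption chosen so that the results match the six-versus-nine example above.

```python
def error_rate(accuracy_pct):
    """Return the error rate (in percent) implied by an accuracy figure."""
    return 100.0 - accuracy_pct


def relative_error_increase(better_accuracy_pct, worse_accuracy_pct):
    """Return how many percent more errors the less accurate system makes."""
    better = error_rate(better_accuracy_pct)
    worse = error_rate(worse_accuracy_pct)
    return (worse - better) / better * 100.0


def expected_errors(accuracy_pct, word_count):
    """Return the expected number of misrecognized words in a dictation."""
    return word_count * error_rate(accuracy_pct) / 100.0


# Hypothetical 300-word dictation (an assumed length, not from the article).
WORDS = 300
for accuracy in (98, 97):
    print(accuracy, "% accurate:", expected_errors(accuracy, WORDS), "errors")
print("Extra errors at 97% vs 98%:", relative_error_increase(98, 97), "%")
```

The same functions can be used to compare any two accuracy figures you are quoted for competing products.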
Keep in mind that the major
hassle factor in speech recognition software is the correction
process. And so you would be very well advised to chase down the
most accurate system possible - that seemingly insignificant
difference of only 1% actually represents a 50% increase in
errors and correcting. And that is definitely worth
a reasonable amount of time, trouble, and investment.
How Much Accuracy Is Enough
Of course, the more accurate
the results you get from your speech recognition, the
better it will be.
There is no exact magic
number, below which speech recognition is inadequate, and above
which it is productive. Assessing these values depends in
part on how fast you can type, and what your other alternatives are.
But, to provide some general
guidance, it would be fair to say that if you are getting less
than 95% accuracy, you will be disappointed; and if you are
getting more than 98%, you will be pleased. If you reach 99%,
you may even be delighted. As you will recall from the
preceding section, a move from 98% to 99% is a profound
difference - it represents a halving of the number of
errors to be corrected.
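To see why there is no single magic number, here is a rough sketch of how correction time eats into dictation speed. Everything in it is an assumption made for illustration: the function name, the 120 words-per-minute speaking speed, the 10 seconds to fix each wrong word, and the 40 words-per-minute typing speed are all hypothetical, and the model simply treats every error as one word that costs a fixed time to correct.

```python
def effective_wpm(accuracy_pct, dictation_wpm, seconds_per_correction):
    """Words per minute once time spent fixing errors is included."""
    speaking_minutes_per_word = 1.0 / dictation_wpm
    error_fraction = (100.0 - accuracy_pct) / 100.0
    correcting_minutes_per_word = error_fraction * seconds_per_correction / 60.0
    return 1.0 / (speaking_minutes_per_word + correcting_minutes_per_word)


# Assumed, illustrative inputs: none of these figures come from the article.
DICTATION_WPM = 120           # hypothetical speaking speed
SECONDS_PER_CORRECTION = 10   # hypothetical time to fix one wrong word
TYPING_WPM = 40               # hypothetical typing speed to compare against

for accuracy in (95, 98, 99):
    wpm = effective_wpm(accuracy, DICTATION_WPM, SECONDS_PER_CORRECTION)
    verdict = "faster" if wpm > TYPING_WPM else "not faster"
    print(f"{accuracy}% accuracy: about {wpm:.0f} wpm, {verdict} than typing")
```

Under these assumed inputs the effective speed works out to roughly 60, 86 and 100 words per minute at 95%, 98% and 99% accuracy. Substitute your own typing speed and correction time to see where your personal break-even point might lie.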
Getting it Right First Time
If you are going to
implement a speech recognition system, you should try and get
your system optimized right from the very beginning. That way,
you save yourself the time and inconvenience of needing to
retrain your system if you make changes to it. It also means
that you'll be getting a more positive experience right from day
one, and so you will be more likely to benefit from and continue
to use speech recognition software.
(I write this very much in a
'Do as I say, not as I do' manner; I am on my second computer
and fourth microphone in not quite three weeks of successive -
and increasingly frustrating -
tweaks to my set up in the ongoing quest for better performance.)
So, be sensitive to the
vital importance of optimizing accuracy as far as is practicable
and be open to the probable need to invest money in better
equipment upfront; this will save you lots of time.
The two key things to get
right are the power of your computer and the quality of your
microphone.
How Much Computer Power Do You Need?
How high is up? The
short answer to this question is that a more powerful computer
is invariably better than a less powerful computer.
A more powerful computer
gives you two important benefits when using speech recognition
software. The first benefit is that it works more quickly,
and you can speak at a more natural speed and the computer will
more readily keep up with you.
The second benefit is that a
more powerful computer can "think harder" about what you are
saying. Instead of considering (shall we say) 10 different
alternatives for what it thinks you said, it can consider 20
different alternatives in the same amount of time and increase
the likelihood of determining the correct words.
Is your present computer powerful enough?
This is again a subjective
question with a similarly subjective answer. The actual speed at which a computer
works depends upon several different parts of the computer, and
also depends on the type of work it is doing.
In the case of speech
recognition, the most important parts of the computer are the
processor and its speed, the amount of cache on the processor itself,
and lastly the amount of and speed of memory (not disk but
memory). Speech processing software typically resides
completely in memory, and does most of its processing using the
processor's onboard cache.
Apart from its initial load,
the software should never need to go back out to the disk to get extra
information because that would be way too slow. It also is
not a graphically intense task, so is not materially dependent
on the speed of your graphics card.
Test and measure your computer
Here is a way to objectively
measure your computer's real world processing power - use
a free suite of three tests.
This testing suite will give
you results for three different tests, and then an aggregate
result averaging all three individual results. You should
look both at the scores and also the percentage CPU utilization
in the three tests. You want the scores to be high and the
CPU utilization to be low. Low scores mean an underpowered
computer, and high CPU utilization suggests that the CPU is the
weak link in the chain.
The least relevant of the
three tests is the last one, but they are all reasonably
sensible tests that relate reasonably directly to the
performance you can expect with speech recognition software.
I have tested two of my main
computers. The slower scored 532, the faster scored 905.
But, despite one machine
scoring about 70% higher than the other, there
was not really a profound difference between the two machines in
terms of performance. The slower machine worked adequately;
the faster machine works better. I was surprised at the
difference in score, and immediately discontinued using the
slower machine, so my comparison of the two is slightly imprecise.
There was one particularly
surprising part of the score. The slower machine (almost 4 years
old) was powered by a dual core Pentium processor, operating at
3.4 GHz. The faster machine (not yet six months old) was powered
by a dual core T9600 processor, operating at 2.8 GHz. Although
slower in gigahertz terms, everything else about the computer is
clearly very much faster. Moral of the story? You cannot
judge a computer merely by the speed of its processor.
These days, modern
processors do more things per cycle. It is probably valid
to compare one processor with another of the same generation and
design, and to conclude that the one with the higher rated speed
will be faster, but it is not so valid to compare processors of
two different generations and to try and directly compare their
productivity as if linked to their rated speeds.
I also tested a netbook
powered by a single core Intel Celeron 743 CPU, operating at
1.3 GHz. It scored a mere 253.
What these scores suggest,
overall, is that if your computer scores less than 500, it is
probably appreciably underpowered, and you should
consider upgrading. If it scores over 1000, it is a good
midrange computer judged by current standards of CPU power
(i.e. as of May 2010).
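As a small illustration, the two thresholds above can be written as a simple check. The function name and the wording of the verdicts are made up for this sketch; the article gives no firm verdict for scores between 500 and 1000, so the sketch does not invent one.

```python
def assess_score(score):
    """Map a benchmark score to the rough guidance given in this article."""
    if score < 500:
        return "probably underpowered; consider upgrading"
    if score > 1000:
        return "good midrange computer by May 2010 standards"
    # The article does not give a firm verdict for the middle band.
    return "between the two thresholds; no firm verdict"


# Scores reported in this article for the three machines tested.
tested = {
    "netbook (Celeron 743)": 253,
    "slower desktop (Pentium, 3.4 GHz)": 532,
    "faster machine (T9600, 2.8 GHz)": 905,
}
for name, score in tested.items():
    print(f"{name}: {score} -> {assess_score(score)}")
```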
Choosing a New Faster Computer
How do you know whether a new
computer will be sufficiently more powerful than your old
computer? Should you spend $500, $1000, $1500, or some
other sum on getting a new computer?
One possible way to get
a feel for the real world improvement offered by a new
computer would be to run the same test on the computer you are
considering buying. This of course assumes that you have access
to a computer in a store, and that the store salesman will allow
you to download software and run it on their demo
computer. You could point out to them that this is well written
software that does not touch the registry at all, and is very
easy to uninstall without leaving any mess behind.
Otherwise, although it is a
bit of a simplification, you can read reviews of similar
computers that have the same CPU to get a feeling for how they
perform relative to other computer models currently available.
Generally, our strategy has
always been to buy a fast state-of-the-art computer, but not the
very fastest. We will happily spend an extra $100 or $200 to get
a faster than 'normal' computer, with the expectation being that
spending an extra few hundred dollars up front may lengthen the
practical working life of the computer by possibly as much as an
extra year. We would much rather pay, for example, $1250 for a
computer that lasts us four years than $1000 for a computer that
lasts us three years - particularly when you consider the
appalling hassle involved in upgrading a computer, copying over
files, reinstalling software, etc.
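The reasoning in that example is easy to check with a quick cost-per-year calculation. The prices and lifespans are the ones used above; the function name is just for illustration, and the sketch ignores the hassle of upgrading, which only strengthens the case for the longer-lived machine.

```python
def cost_per_year(price, years_of_use):
    """Return the purchase price spread evenly over the years of use."""
    return price / years_of_use


faster = cost_per_year(1250, 4)   # the faster computer, kept four years
cheaper = cost_per_year(1000, 3)  # the cheaper computer, kept three years

print(f"Faster computer:  ${faster:.2f} per year")
print(f"Cheaper computer: ${cheaper:.2f} per year")
```

Spread over its working life, the more expensive computer comes to about $312.50 a year, against about $333.33 a year for the cheaper one.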
Lastly, to offer some advice
that will quickly become out of date, it is our feeling at
present (May 2010) that the 'sweet spot' of computer
price/performance is represented by a computer with
an Intel i7-930 CPU and 6 GB of triple channel DDR3 memory,
running Windows 7 in a 64-bit version.
This will probably cost $1250-$1400 depending on options and
where you buy it from. You can expect it to score about
1650 on the testing programs mentioned above.
Is There Such a Thing as a Too
Yes and no. Some of
the typical enhancements to make your computer more generally
powerful and faster have little relevance to the performance you
will get when using speech recognition software. The two
key areas that determine the performance you'll experience with
speech recognition software are the processor speed and the
amount of L2/L3 cache (this assumes that you have 4 GB or more
of reasonably fast memory).
Some benchmarking studies
have shown very little difference in observed user experience
between the ultimate top-of-the-line computers and those several
steps down from that. Indeed, my own experience showed
less difference between an underpowered computer and a
moderately powered computer than I might have expected (but
possibly there were some hidden advantages caused by the more
powerful computer taking more processing time to be more
accurate).
So this is good news - you
don't need to break the bank on getting an ultimate speed demon
top-of-the-line computer. But you also need to remember
that you will be using your computer not just for speech
recognition purposes. You will probably have other
programs running in the background, and so more general
computing speed overall will still help.
We return to our
suggestion in the preceding section. Our feeling is that a good
price/performance compromise, offering you the best value, is
currently represented by a computer based on an Intel i7-930
CPU.
Summary of Part 3 of this Series
Chasing down every last
partial percent of accuracy is a very valid undertaking.
There is no such thing as diminishing returns when it comes to
improving the accuracy of your speech recognition system.
In this article, we also
discuss the amount of processing power you need to get best use
from your speech recognition software.
Coming next week will be the
fourth part of our
series, where we talk about how to choose the best microphone.
This one choice can have more impact on your overall performance
and productivity than anything else.
7 May 2010, last update
02 Jul 2017
You may freely reproduce or distribute this article for noncommercial purposes as long as you give credit to me as original writer.