A new generation of voice recognition technology makes it possible

Every digital service provider has a telephony infrastructure to handle customer calls, billing systems with a wealth of subscriber information, and support equipment to maintain services. What if you could allow callers to truly take advantage of these systems and actually enhance customer care without agent assistance? Sounds too good to be true, right? Actually, it is now being accomplished at a number of cable providers by automating technical support with a new generation of Rich Phone Applications utilizing voice recognition. Over 30 percent of the millions of calls received by four MSOs in the United States are being handled entirely automatically, without any assistance from a contact center agent. Although speech technology has drastically improved during the past decade, it is not on its own the “secret sauce” that makes something as challenging as automating technical support a reality. It is the integration of speech technology, the voice user interface and backend support systems that makes it possible to provide an immersive caller experience. This deep level of engagement is achieved in part when an automated system demonstrates knowledge about the caller’s environment rather than relying exclusively on asking questions.

More than just dialogue
Hold on… Isn’t the essence of technical support, diagnosis and corrective action, the result of sleuthing for a root cause from descriptions that can be quite vague? How do you accomplish that without asking many questions? In fact, most providers already possess the tools to identify specific caller attributes. For example, computer telephony equipment delivers a telephone number, billing systems can associate that number with the make and model of the equipment in the home, and diagnostic systems will enumerate performance metrics and deliver corrective action.

Integrating these tools with Rich Phone Applications using voice recognition can add what callers perceive as a tremendous level of intelligence. The ability to deliver “personalized” prompts such as “I know you have a silver Scientific Atlanta and a black Motorola set-top box. Which one are you calling about?” increases effectiveness by reducing the chance that a caller will not immediately know specific details, and it gives callers a higher level of confidence.
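As a rough illustration of how such a prompt might be assembled, the sketch below looks up a caller's equipment by telephone number and builds the prompt from it. The record layout, the sample phone number and the function name are assumptions made for this example, not a description of any provider's billing system.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Device:
    color: str
    vendor: str
    kind: str


# Hypothetical billing data, keyed by the number that telephony delivers.
BILLING_RECORDS: Dict[str, List[Device]] = {
    "5550100": [
        Device("silver", "Scientific Atlanta", "set-top box"),
        Device("black", "Motorola", "set-top box"),
    ],
}


def build_prompt(caller_number: str) -> str:
    """Return a personalized prompt, or a generic one if nothing is known."""
    devices = BILLING_RECORDS.get(caller_number, [])
    if not devices:
        return "Which piece of equipment are you calling about?"
    if len(devices) == 1:
        d = devices[0]
        return (f"I know you have a {d.color} {d.vendor} {d.kind}. "
                "Is that what you are calling about?")
    # Simplifying assumption: all listed devices are of the same kind.
    names = [f"a {d.color} {d.vendor}" for d in devices]
    listed = ", ".join(names[:-1]) + " and " + names[-1]
    return (f"I know you have {listed} {devices[0].kind}. "
            "Which one are you calling about?")
```

When nothing is known about the caller, the sketch falls back to a generic question, which is the less effective case the personalized prompt is meant to avoid.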

Figure 1: Systems integration with capable dialogues.

Automated corrective action
Even more encouraging is fixing something without participation from a call center agent. Resetting an IP address or “hitting” a set-top is tangible evidence of effort converted to action. Voice prompts can also inform the caller of signal strength and a variety of other physical characteristics specific to the caller’s equipment.

Systems capable of being integrated with a Web services approach provide one of the most efficient means for leveraging existing subscriber information or a support infrastructure. In some cases, digital providers have deployed “home-grown” solutions requiring custom integration. Closer ties to billing and diagnostic systems enabled at a later date will drive higher automation rates, as Figure 2 indicates. Note that integration also identifies repeat callers so that special handling can be applied.
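Repeat-caller identification can be sketched in a few lines once call history is available through integration. The seven-day window and the function name below are assumptions chosen for illustration; a provider would set its own threshold.

```python
from datetime import datetime, timedelta
from typing import List

# Assumed window: a caller who phoned within the last 7 days is a repeat caller.
REPEAT_WINDOW = timedelta(days=7)


def is_repeat_caller(previous_calls: List[datetime], now: datetime,
                     window: timedelta = REPEAT_WINDOW) -> bool:
    """True if any earlier call falls inside the window ending at `now`."""
    return any(timedelta(0) <= now - call <= window for call in previous_calls)
```

A call flagged this way could, for example, skip steps already attempted on the earlier call or be routed to an agent sooner.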

Voice recognition 
Voice recognition solutions today are markedly different from those of only a few years ago. Improvements such as larger grammars to better match what callers are saying, robustness to noise, dynamic elements that help to personalize interactions and the advent of natural language have a direct impact on the caller experience.

Speech recognition, however, is not magic. Modern speech recognition relies on statistical models of language, from basic sounds to words and phrases. These models are derived from data collected from actual calls and analyzed using sophisticated algorithms. Collecting and managing speech data throughout the lifecycle of a deployed system helps improve its performance, thus increasing the level of automation.

Rich Phone Applications
Evolution of some key voice recognition components is now allowing call automation to move beyond informational (exchange of information as in package tracking) and transactional (the performance of actual transactions as in travel ticketing or banking) toward problem-solving. Problem-solving systems cannot be modeled with form filling. Problem-solving requires a deep understanding of the domain, a comprehensive mapping of symptoms, and the ability to backtrack to alternative resolution strategies when one fails. Natural language voice recognition helps by enabling automation to be applied in situations when it is not easy for callers to describe what they are experiencing. A good example is video troubleshooting. The service is not technically “out” so there is no binary on/off question to ask. Natural language recognition has the ability to understand what the caller means.
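The idea of mapping symptoms to resolution strategies and falling back to an alternative when one fails can be sketched as follows. The symptom names, action names and the `resolve` function are hypothetical, and each corrective action is represented only by a callable that reports success or failure.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical mapping of symptoms to corrective actions, in the order to try.
SYMPTOM_STRATEGIES: Dict[str, List[str]] = {
    "no_connection": ["reset_ip_address", "reboot_modem"],
    "no_picture": ["hit_set_top"],
}


def resolve(symptom: str,
            actions: Dict[str, Callable[[], bool]]) -> Optional[str]:
    """Try each strategy for the symptom; return the first that succeeds.

    Returns None when every strategy fails or the symptom is unknown,
    which is the point at which the call would go to an agent.
    """
    for name in SYMPTOM_STRATEGIES.get(symptom, []):
        if actions[name]():
            return name
    return None
```

A production system would need far richer state than an ordered list, but the shape is the same: a failed attempt leads to the next strategy rather than to a dead end.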

It is important to note the operative phrase “has the ability to” in the previous sentence. Developers must “train” natural language engines by transcribing large numbers of utterances extracted from actual calls, and analyzing the results for meaning. In the technical support space, this is especially challenging as there are many symptoms, with a virtually infinite number of descriptions.

Focusing on specific problem-solving areas can assist with increasingly better natural language recognition. For example, transcribing and classifying millions of calls over a period of years will yield far better results than simply doing the same for 1,000 calls. The difference will most readily be apparent in understanding a far greater range of expressions for the symptoms already covered.

Guiding the caller
The use of natural language is not the only answer. In fact, a combination of natural language and directed dialogue is probably a better choice. The doctor/patient analogy works well as a comparison. At the beginning of a visit, the doctor will let you describe a “symptom.” Once you have provided a description, the doctor will take over and direct the conversation by asking specific questions that require simple answers, most of them yes/no: “Does it hurt here?” A directed dialogue format helps the doctor further qualify the symptoms and drive toward a root cause.

The same is true for problem-solving speech applications. A caller might be instructed to initially “describe the problem in one short sentence.” After confirming the problem, the remainder of the interaction may very well be a directed dialogue where the caller will be instructed to say “yes” or “no,” or other simple phrases until the issue is resolved.
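The two-phase pattern can be sketched as an open question followed by a yes/no decision tree. In this sketch a simple keyword match stands in for a trained natural language engine, which is a deliberate simplification, and the symptoms, questions and outcomes are invented for illustration.

```python
from typing import Iterable, Optional

# Keyword matching is a stand-in for a statistical natural language engine.
SYMPTOM_KEYWORDS = {
    "no_connection": ("no internet", "offline", "can't connect"),
    "slow_connection": ("slow", "crawling", "takes forever"),
}

# Hypothetical directed dialogue: dicts are questions, strings are outcomes.
TREE = {
    "no_connection": {
        "question": "Are the lights on the modem lit?",
        "yes": {
            "question": "Is the online light blinking?",
            "yes": "re-provision the modem",
            "no": "reset the IP address",
        },
        "no": "ask the caller to check the power cable",
    },
    "slow_connection": {
        "question": "Is every device in the home slow?",
        "yes": "check modem signal levels",
        "no": "troubleshoot the single device",
    },
}


def classify(utterance: str) -> Optional[str]:
    """Phase 1: map the caller's one-sentence description to a symptom."""
    text = utterance.lower()
    for symptom, phrases in SYMPTOM_KEYWORDS.items():
        if any(phrase in text for phrase in phrases):
            return symptom
    return None


def run_dialogue(utterance: str, replies: Iterable[str]) -> str:
    """Phase 2: walk the yes/no tree until an outcome is reached."""
    symptom = classify(utterance)
    if symptom is None:
        return "transfer to agent"
    node = TREE[symptom]
    answers = iter(replies)
    while isinstance(node, dict):
        reply = next(answers, None)
        if reply not in ("yes", "no"):
            return "transfer to agent"
        node = node[reply]
    return node
```

Anything the sketch cannot classify, or any reply other than “yes” or “no,” results in a transfer, so the caller is never trapped in the dialogue.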

Engendering confidence
Keeping callers engaged with an automated system is not always easy. A number of techniques can be used to make callers feel that progress is being made, creating a partnership between the caller and the automated system.

One of the most important techniques is confirming what the system understands at critical points. Distinguishing among a large number of symptoms generally improves the caller experience and reduces the number of opt-outs. Callers will often feel that the automated system did not recognize what they said if the confirmation prompt is not a close match to their own words. Integration with a digital provider’s systems can contribute mightily to a good caller experience by injecting the feeling of personalization. For example, a voice prompt stating that the system knows the caller has a cable modem from a specific vendor demonstrates knowledge important to the caller. Another helpful technique is to encourage callers to remain with the system at certain points: “We are almost done; let me try a few more things.” Issuing a note of encouragement will help to keep callers engaged.

What about change?

Change has previously struck fear into the heart of any manager contemplating a voice recognition solution. Successful technical support is all about being up-to-date on any new procedures or equipment. The cable market in particular is facing an ever-growing list of equipment vendors that will complicate technical support due to the tru2way initiative, designed to provide a common foundation for those selling equipment such as cable set-tops.

A hosted or managed service speech recognition solution facilitates frequent updates for content as well as speech. For example, a new cable modem introduced to a subscriber population will require call center agent training. On the other hand, new modem details can more quickly be injected into a managed voice recognition system to immediately handle all new modem calls.

Figure 2: Caller satisfaction and automation improve with integration.

Where do agents fit in?
Callers must be given an opportunity to reach a call center agent at any time. The very nature of technical support also makes agent access important, as an automated system may run through all reasonable efforts, with a truck roll ultimately being the only answer.

Integration between automation and agents is essential for a good caller experience. Procedures should avoid asking callers the same question twice whenever possible. Also, agents should be updated with what the automated system has already accomplished. Callers should ideally experience a seamless transition between a self-help system and a contact center agent.
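One way to picture this handoff is a session record that travels with the call. The class and field names below are assumptions for this sketch, not part of any particular product.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CallSession:
    """What the automated system learned and tried, passed to the agent."""
    caller_number: str
    answers: Dict[str, str] = field(default_factory=dict)
    actions: List[str] = field(default_factory=list)

    def needs_asking(self, question: str) -> bool:
        # A question already answered should not be put to the caller again.
        return question not in self.answers

    def handoff_summary(self) -> str:
        # Plain-text summary an agent could read when the call arrives.
        lines = [f"Caller: {self.caller_number}"]
        lines += [f"Answered: {q} -> {a}" for q, a in self.answers.items()]
        lines += [f"Attempted: {a}" for a in self.actions]
        return "\n".join(lines)
```

With a record like this, the agent can pick up where the automated system stopped instead of starting the diagnosis over.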

Examples at work
SpeechCycle’s LevelOne Broadband Agent is currently fielding hundreds of thousands of calls per month. It is tuned to solve customer issues such as lost, slow, or intermittent Internet service and email problems, and to provide assistance with PC setup and password reset.

Integration with many of the same systems that contact center agents access enables SpeechCycle to determine important information such as the subscriber’s brand of cable modem, signal levels, and connectivity without asking the caller. Subscribers remain engaged with SpeechCycle’s Automated Agents for an average of six to eight minutes due to the specific questions and responses that are perceived by callers as personalized for their problem.

LevelOne Broadband also offers interactive assistance for troubleshooting that might include rebooting a network router or delving into the Windows Internet Connection Wizard. Callers are always in control, with an ability to have instructions repeated or reach a call center agent at any time.

In another application, digital phone (VoIP) subscribers do not have to wait for a call center agent when they are provisioning new services such as conferencing or call forwarding. The LevelOne Digital Phone Agent provides speech-enabled, interactive guidance that goes beyond FAQs.

Customer satisfaction and cost savings
Rich Phone Applications are capable of interacting with callers in a way that feels personalized. Natural language speech, coupled with intelligence derived from systems integration, is a foundation for an excellent caller experience. Consistency helps as well: Call center agents do not all achieve the same level of training, whereas automated agents ensure the same level of support is provided to all callers. The ability to avoid hold time for agent availability is another caller benefit.

Cost savings are considerable when compared to a live agent pool. Other benefits, such as allowing agents to concentrate on higher value tasks or up-sell calls, can also be realized when automating technical support with a Rich Phone Application. The problem of scale with human agent availability is becoming a bigger factor. An ever-increasing number of complex devices and services makes it more difficult to cost-effectively attract, retain and continually train a pool of agents.

Digital providers already have the systems in place to automate technical support calls with a new generation of voice recognition. Customer care managers can leverage these resources to automate calls that most have previously considered only in the realm of a call center agent. Managers should explore how Rich Phone Applications can provide a unique combination of customer satisfaction and cost savings.