Out-of-vocabulary word determination and user interface for text input via reduced keypad keys. Patent Number: 7,385,591, from the United States Patent and Trademark Office (PTO)

Title: Out-of-vocabulary word determination and user interface for text input via reduced keypad keys

Abstract: Out-of-vocabulary (OOV) word determination corresponding to a key sequence entered by the user on a (typically numeric) keypad, and a user interface for the user to select one of the words, are disclosed. A word-determining logic determines letter sequences corresponding to the entered key sequence, and presents the sequences within the user interface in which the user can select one of the letter sequences as the intended word, or select the first letter of the intended word. When letters are selected, the word-determining logic determines new letter sequences, consistent with the key sequence and the selected letters, and presents the new letter sequences. The user again selects one of the letter sequences as the intended word, or selects the second letter of the intended word. This process is repeated until the user has selected the intended word.
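
The selection loop summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the keypad table, the toy training words, the add-one smoothed letter-bigram score, and the names `train_bigrams`, `score`, and `candidates` are all assumptions made for illustration. The sketch enumerates every letter sequence exhaustively, which is only practical for short key sequences.

```python
import math
from collections import Counter
from itertools import product

# Conventional letter assignment of a numeric keypad (assumed here).
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}

def train_bigrams(text):
    """Count letter bigrams in a toy corpus; '^' marks the start of a word."""
    counts = Counter()
    for word in text.lower().split():
        for prev, cur in zip("^" + word, word):
            counts[prev, cur] += 1
    return counts

def score(seq, counts):
    """Add-one smoothed log score of a letter sequence (hypothetical model)."""
    total = 0.0
    for prev, cur in zip("^" + seq, seq):
        context = sum(c for (p, _), c in counts.items() if p == prev)
        total += math.log((counts[prev, cur] + 1) / (context + 26))
    return total

def candidates(keys, accepted, counts):
    """For each letter allowed at the first unaccepted position, return the
    best-scoring sequence that is consistent with the key input and starts
    with the accepted letters followed by that letter."""
    pos = len(accepted)
    if pos >= len(keys):
        return [accepted]
    found = []
    for letter in KEYPAD[keys[pos]]:
        prefix = accepted + letter
        tails = product(*(KEYPAD[k] for k in keys[pos + 1:]))
        best = max((prefix + "".join(t) for t in tails),
                   key=lambda s: score(s, counts))
        found.append(best)
    return sorted(found, key=lambda s: score(s, counts), reverse=True)

# Toy training words, chosen only for illustration.
counts = train_bigrams("hello help held gold golf")
first_round = candidates("43556", "", counts)    # one sequence per G, H, I
second_round = candidates("43556", "h", counts)  # user accepted "h"
```

Each call presents one sequence per possible next letter. The user either picks a whole sequence as the intended word or accepts its next letter, in which case `candidates` is called again with a longer `accepted` string.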

Patent Number: 7,385,591 Issued on 06/10/2008 to Goodman


Inventors: Goodman; Joshua T. (Redmond, WA)
Assignee: Microsoft Corporation (Redmond, WA)
Appl. No.: 09/823,585
Filed: March 31, 2001


Current U.S. Class: 345/172 ; 341/22; 345/168; 345/169
Field of Search: 345/156-173 341/22-28


References Cited [Referenced By]

U.S. Patent Documents
5818437 October 1998 Grover et al.
5953541 September 1999 King et al.
6011544 January 2000 Sato
6782357 August 2004 Goodman et al.
2002/0180689 December 2002 Venolia
2002/0188448 December 2002 Goodman et al.
2003/0023420 January 2003 Goodman

Other References

F Jelinek, "Self Organized Language Modeling for Speech Recognition", Language Processing for Speech Recognition, pp. 450-503. cited by other .
K. Seymore et al., "Scalable Backoff Language Models", In Proc. ICSLP, vol. 1, pp. 232-235, Philadelphia, 1996. cited by other .
Stolcke, "Entropy-based Pruning of Backoff Language Models", Proc. DARPA News Transcription and Understanding Workshop, pp. 270-274, Lansdowne, VA. cited by other .
Chen et al., "An Empirical Study of Smoothing Techniques for Language Modeling", TR-10-98, Computer Science Group, pp. 1-64, Harvard University, 1998. cited by other .
Lehikoinen et al., "BinScroll: A Rapid Selection Technique for Alphanumeric Lists", CHI 2000, pp. 261-262, Apr. 1-6, 2000. cited by other .
Ahlberg et al., "The Alphaslider: A Compact and Rapid Selector", Proc. CHI 94, pp. 365-371. cited by other .

Primary Examiner: Lewis; David L.
Attorney, Agent or Firm: Westman, Champlin & Kelly, P.A.

Claims



I claim:

1. A method for selecting an intended word entered using a reduced keypad, where each of one or more keys of the reduced keypad is mapped to a plurality of letters, the method comprising: for an entered key input indicative of pressing one or more keys in the reduced keypad using a single-tap approach in which one of the keys mapped to a plurality of letters is pressed only once for each letter such that each key press is mapped to only one letter, determining one or more sequences of letters as the intended word based on a score for each of the one or more sequences of letters; and presenting the one or more sequences of letters as the intended word, where a user selects the intended word from the one or more sequences of letters, without resorting to a multiple-tap approach in which one of the keys mapped to a plurality of letters is pressed at least once for each letter, such that a number of times one of the keys is pressed indicates only one letter, and where the user can indicate, without resorting to the multiple-tap approach, an accepted one or more initial letters of the intended word from the one or more sequences of letters, the one or more initial letters having less letters than the intended word, to cause redetermination of the one or more sequences of letters presented as the intended word as a function of the accepted one or more initial letters.

2. The method of claim 1, wherein the reduced keypad is a numeric keypad.

3. The method of claim 1, wherein the sequences of letters each corresponds to a word not listed in a predetermined dictionary.

4. The method of claim 1, wherein the sequences of letters each corresponds to a pseudo-word.

5. The method of claim 1, further comprising receiving selection of the intended word from the user from the one or more sequences of letters.

6. The method of claim 1, further comprising: receiving indication of a first letter of the intended word from the user; and repeating the method such that the one or more sequences of letters are redetermined taking into account the first letter of the intended word indicated by the user.

7. The method of claim 6, further comprising: receiving indication of a second letter of the intended word from the user; and repeating the method such that the one or more sequences of letters are redetermined taking into account the first and the second letters of the intended word indicated by the user.

8. The method of claim 1, wherein the user has accepted a number of letters of the intended word, the number equal to zero or more, and determining the one or more sequences of letters comprises determining the one or more sequences of letters consistent with the entered key input and the number of letters accepted by the user.

9. The method of claim 8, wherein the one or more sequences of letters comprises a sequence of letters for each letter corresponding to a number within the entered key input immediately after a part of the entered key input corresponding to the number of letters accepted by the user.

10. The method of claim 9, wherein the sequences of letter for each letter corresponding to the number within the entered key input immediately after the part of the entered key input corresponding to the number of letters accepted by the user comprises a most likely sequence of letters for each letter corresponding to the number within the entered key input immediately after the part of the entered key input corresponding to the number of letters accepted by the user.

11. The method of claim 10, wherein the most likely sequence of letters for each letter corresponding to the number within the entered key input immediately after the part of the entered key input corresponding to the number of letters accepted by the user is determined by using a letter language model.

12. The method of claim 11, wherein using the letter language model comprises using an n-gram letter model.

13. The method of claim 1, wherein determining the one or more sequences of letters comprises using a letter language model.

14. The method of claim 13, wherein using the letter language model comprises using an n-gram model.

15. The method of claim 1, further comprising receiving the entered key input.

16. The method of claim 1, further comprising: determining a word corresponding to the entered key input as the intended word; determining whether the word determined is in a dictionary of words; and ending the method in response to determining that the word determined is in the dictionary of words.

17. The method of claim 1, wherein the method is performed by execution of a computer program by a processor from a computer-readable medium.

18. A computer-readable medium having instructions stored thereon for execution by a processor to perform a method for selecting an intended word entered using a reduced keypad, where each of one or more keys of the reduced keypad is mapped to a plurality of letters, the method comprising: repeating, for an entered key input, a user having accepted a number of letters of the intended word, the number equal to zero or more and less than a number of letters of the intended word, determining one or more sequences of letters as the intended word consistent with the entered key input and the number of letters accepted by the user; presenting the one or more sequences of letters as the intended word to the user; and receiving indication that an additional one of the letters of the intended word has been accepted by the user, such that the number of letters of the intended word accepted is increased by one, until indication has been received that the user has selected one of the one or more sequences of letters presented as the intended word.

19. The medium of claim 18, wherein the reduced keypad is a numeric keypad.

20. The medium of claim 18, where the sequences of letters each corresponds to one of a word not listed in a predetermined dictionary and a pseudo-word.

21. The medium of claim 18, wherein the one or more sequences of letters comprises at least one sequence of letters for each letter corresponding to a number within the entered key input immediately after a part of the entered key input corresponding to the number of letters accepted by the user.

22. The medium of claim 21, wherein the at least one sequence of letters for each letter corresponding to the number within the entered key input immediately after the part of the entered key input corresponding to the number of letters accepted by the user comprises a most likely sequence of letters for each letter corresponding to the number within the entered key input immediately after the part of the entered key input corresponding to the number of letters accepted by the user.

23. The medium of claim 18, wherein the one or more sequences of letters is determined by using a letter language model.

24. The medium of claim 23, wherein using the letter language model comprises using an n-gram letter model.

25. The medium of claim 18, the method further comprising receiving the entered key input.

26. The medium of claim 18, the method further comprising: determining a word corresponding to the entered key input as the intended word; determining whether the word determined is in a dictionary of words; and, ending the method in response to determining that the word determined is in the dictionary of words.

27. A method for selecting a word entered using a reduced keypad, where each of one or more keys of the reduced keypad is mapped to a plurality of letters, the method comprising: receiving key input corresponding to the word, the key input having a left context; for each word in a vocabulary that is consistent with the key input, determining a probability of the word given the left context, and adding the word and the probability of the word to an array of word-probability pairs; finding one or more potential words from a dictionary of words, where each potential word has a cost between the entered key input and a sequence corresponding to the potential word less than a maximum cost; determining a probability of each potential word given the left context and taking into account a probability that each letter of the potential word is misspelled, and adding the potential word and the probability of the word to the array; determining one or more sequences of letters consistent with the entered key input and a number of letters accepted by a user, the number equal to zero or more, the one or more sequences of letters including at least one sequence of letters for each letter corresponding to a number within the entered key input immediately after a part of the entered key input corresponding to the number of letters accepted by the user; determining a probability of each sequence of letters taking into account an out-of-vocabulary penalty and a first occurrence bonus, and adding the sequence of letters and the probability of the sequence of letters to the array; sorting the array of word-probability pairs in decreasing order of probability; and presenting a first number of words from the array of word-probability pairs to the user, where the user selects the word corresponding to the entered key input from the first number of words presented and where the user can indicate additional letters have been accepted to increase the number of letters accepted by the user, 
wherein the number of letters accepted is less than a number of letters in the word, and to cause redetermination of the one or more sequences of letters based on the letters accepted.

28. The method of claim 27, wherein the reduced keypad is a numeric keypad.

29. The method of claim 27, further initially comprising, for each word in a cache that is consistent with the key input, determining a probability of the word given the left context, and adding the word and the probability of the word to an array of word-probability pairs.

30. The method of claim 27, further comprising: for each word in the vocabulary that is consistent with the key input as an initial part of the word, determining a probability of the word given the left context, and, upon determining that the probability is greater than a greatest probability so far determined, setting the greatest probability to the probability and a greatest probability word associated with the greatest probability to the word; upon determining that the greatest probability is at least a number of times greater than a word of a first word-probability pair of the array of word probability-pairs, adding the greatest probability word associated with the greatest probability and the greatest probability a new first word-probability pair to the array.

31. The method of claim 27, further comprising: finding one or more additional potential words from the dictionary, where each additional potential word has a cost between the entered key input and a prefix of a sequence corresponding to the potential word less than a maximum cost; determining a probability of each potential additional word given the left context and taking into account a partial word penalty, and upon determining that the probability is greater than the greatest probability so far determined, setting the greatest probability to the probability of the potential additional word and the greatest probability word associated with the greatest probability to the potential additional word.

32. The method of claim 27, wherein the one or more sequences of letters are determined by using a letter language model.

33. The method of claim 31, wherein using the letter language model comprises using an n-gram letter model.

34. The method of claim 27, wherein the method is performed by execution of a computer program by a processor from a computer-readable medium.

35. An apparatus comprising: a plurality of keys of a reduced keypad, each of one or more of the keys mapped to a plurality of letters, the plurality of keys used to enter key input corresponding to a word using a single-tap approach in which one of the keys mapped to a plurality of letters is pressed only once for each letter, the key input having at least one of a left context and a right context; and, a word-determining logic designed to determine one or more sequences of letters as the word and to present the one or more sequences of letters, where a user selects the word corresponding to the key input from the one or more sequences of letters without resorting to a multiple-tap approach in which one of the keys mapped to a plurality of letters is pressed at least once for each letter, and where the user can indicate without resorting to the multiple-tap approach an accepted one or more initial letters of the word from the one or more sequences, the one or more initial letters having less letters than the word, to cause redetermination of the one or more sequences of letters presented as a function of the accepted one or more initial letters.

36. The apparatus of claim 35, wherein the reduced keypad is a numeric keypad.

37. The apparatus of claim 35, further comprising a spell-checking logic designed to provide potential alternative words for the word corresponding to the key input entered, where the word is misspelled, taking into account that the word was entered using the plurality of keys, as opposed to a keyboard having a unique key for each of the plurality of letters.

38. The apparatus of claim 37, wherein the spell-checking logic is further to determine one or more potential words to the word where the word is not found in a dictionary of words, by at least finding the one or more potential words from the dictionary, each potential word having a cost between the key input and a sequence corresponding to the potential word less than a maximum cost.

39. The apparatus of claim 35, wherein the word-determining logic is further designed to determine the word corresponding to the key input by using a machine learning approach based on one or more of the at least one of the left context and the right context of the key input.

40. The apparatus of claim 39, wherein the spell-checking logic is part of the word-determining logic.

41. The apparatus of claim 35, wherein the apparatus is a telephone.

42. The apparatus of claim 41, wherein the apparatus is a mobile telephone.

43. The apparatus of claim 41, wherein the apparatus is one of: a cellular telephone, a corded telephone, a cordless telephone, a digital telephone, and a radio telephone.

44. The apparatus of claim 35, wherein the apparatus is one of: a pager, a desktop computer, a laptop computer, a handheld device, a personal-digital assistance (PDA) device, and a remote control device.

45. The apparatus of claim 35, wherein the word-determining logic comprises a computer program stored on a computer-readable medium for execution by a processor.
Description



FIELD OF THE INVENTION

The invention relates generally to text input using a reduced keypad, such as a numeric keypad, and more particularly to determining out-of-vocabulary words, and presenting a user interface to allow the user to select one of the words, for text input entered using this keypad.

BACKGROUND OF THE INVENTION

Mobile phones, and other devices having only a limited set of input keys, have become increasingly popular. While the numeric keys of a mobile phone are adequate for entering phone numbers and other number sequences, they are difficult to use for entering text. A standard keyboard has keys for both letters and numbers, whereas the numeric keys of a mobile phone offer no intuitive way to enter text. Text may need to be entered on such devices, for example, to associate a name with a phone number in an address book. Since mobile phones and other such devices are becoming more popular for accessing the Internet, such as to browse web sites and send and receive email, this limitation will likely become increasingly acute in the future.

Currently, there are two common ways to achieve text input using numeric keys: a multiple-tap approach and a single-tap approach. With the multiple-tap approach, a user presses a numeric key a number of times to enter the desired letter, where most of the numeric keys are mapped to three or four letters of the alphabet. For example, the two key is usually mapped to the letters A, B, and C. If the user presses the two key once, the letter A is entered. If the user presses the two key twice, the letter B is entered, and if the user presses the two key three times, the letter C is entered. Pauses between entry of successive letters of a word are sometimes necessary so that the device knows when to advance the cursor to the next letter-entry position. For example, to enter the word "cab," the user presses the two key three times to enter the letter C, pauses, presses the two key once to enter the letter A, pauses again, and presses the two key twice to enter the letter B. To enter numbers or symbols, or to switch between upper- and lower-case letters, other keys that are present on numeric keypads, such as the pound ("#") and asterisk ("*") keys, among other keys, are typically mapped for these purposes.
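
The multiple-tap scheme just described can be sketched as follows. This is an illustrative sketch only: the keypad table and the function names `multitap_encode` and `multitap_decode` are assumptions, and the pauses between letters are modeled simply as boundaries between groups of key presses.

```python
# Conventional letter assignment of a numeric keypad (assumed here).
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}

def multitap_encode(word):
    """Return one group of repeated key presses per letter of the word."""
    groups = []
    for ch in word.lower():
        for digit, letters in KEYPAD.items():
            if ch in letters:
                # The position of the letter on its key gives the tap count.
                groups.append(digit * (letters.index(ch) + 1))
                break
        else:
            raise ValueError(f"no key is mapped to {ch!r}")
    return groups

def multitap_decode(groups):
    """Turn groups of repeated key presses back into letters."""
    return "".join(KEYPAD[g[0]][len(g) - 1] for g in groups)

presses = multitap_encode("cab")           # ['222', '2', '22']
total = sum(len(g) for g in presses)       # 6 key presses for 3 letters
word = multitap_decode(presses)            # 'cab'
```

For "cab" the sketch yields three groups totaling six presses of the two key, matching the example above.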

While the multiple-tap approach is usable in that users can enter any word using only the numeric keys, it is disadvantageous for quick and intuitive text entry. A word such as "cab" that only requires three key presses on a standard keyboard, one for each letter, requires six key presses on numeric keys using the multiple-tap approach. As compared to using a standard keyboard, using numeric keys with the multiple-tap approach to achieve text entry means that the user presses many keys even for short messages. Furthermore, errors can be frequent. For example, if the user intends to enter the letter B, but pauses too long between the first and the second presses of the two key, two letters A will be entered instead. The device in this case interprets the pause as the user having finished with the current letter entry, an A, and proceeds to the next letter-entry position, where it also enters an A.

Another approach to text entry using numeric keys is the single-tap-dictionary approach, an approach popularized by a company called Tegic. Under the single-tap approach, the user presses the numeric key associated with the desired letter once, even though the numeric key may be mapped to three or four different letters. When the user is finished entering a number sequence for a word, the device attempts to discern the word that the user intended to enter, based on the number sequence. Each number sequence is mapped to a common word that corresponds to the sequence. For example, the number sequence 43556 can potentially correspond to any five-letter word having a first letter G, H, or I, since the four key is usually mapped to these letters. Similarly, the sequence potentially corresponds to any five-letter word having a second letter D, E, or F, a third and fourth letter selected from the letters J, K, and L, and a fifth letter M, N, or O, since the three, five, and six keys are usually mapped to these respective letters. However, because the most common five-letter word corresponding to the number sequence 43556 is the word "hello," the single-tap approach may always enter this word when the user presses the four, three, five, five, and six keys in succession to input this number sequence.
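
The mapping from a number sequence to a single common word can be sketched as below. The word frequencies are invented for illustration, and the names `key_sequence` and `build_single_tap_table` are assumptions rather than anything taken from an actual product.

```python
# Conventional letter assignment of a numeric keypad (assumed here).
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_KEY = {letter: key
                 for key, letters in KEYPAD.items() for letter in letters}

def key_sequence(word):
    """Number sequence produced by pressing one key per letter."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())

def build_single_tap_table(freq):
    """Map each number sequence to its most frequent word only."""
    table = {}
    for word, count in freq.items():
        keys = key_sequence(word)
        if keys not in table or count > freq[table[keys]]:
            table[keys] = word
    return table

# Hypothetical frequency counts, for illustration only.
WORD_FREQ = {"hello": 120, "cab": 40, "seed": 30, "reed": 8}
TABLE = build_single_tap_table(WORD_FREQ)
entered = TABLE[key_sequence("hello")]   # looks up sequence 43556
```

Because the table keeps only one word per sequence, any less frequent word sharing a sequence with a more frequent one cannot be entered through the table alone.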

The single-tap approach has advantages over the multiple-tap approach, but presents new disadvantages. Advantageously, the single-tap approach ensures that the user only has to press the same number of keys as the number of letters in a desired word. For example, the multiple-tap approach requires the user to press the two key six times to enter the word "cab." Conversely, the single-tap approach potentially only requires the user to press the two key three times to enter this word, assuming that the number sequence 222 is mapped to the word "cab." Therefore, the single-tap approach is more key-efficient than the multiple-tap approach for text entry using numeric keys. It is as key-efficient as using a standard keyboard that has a single key for each letter.

The single-tap approach is disadvantageous in that the word mapped to a given number sequence may not be the word the user intended to enter by inputting the sequence. For example, the numeric key sequence 7333 corresponds to both the words "seed" and "reed." Because only one word is mapped to each numeric key sequence, the word "seed" may be entered when the user keys in the numeric key sequence 7333, whereas the user may have intended to enter the word "reed." The single-tap approach is primarily useful where there is only one unique word for a given numeric key sequence, or, if there are a number of words for a given sequence, when the user wishes to input the most common word associated with the sequence. For entry of uncommon words corresponding to number sequences to which words that are more common also correspond, the approach is less useful. The single-tap approach is also not useful for the entry of all but the most common proper names, and scientific, legal, medical, and other specialized terms, all of which will not usually be mapped to number sequences. Where the word mapped by the single-tap approach is not the intended word, text entry may revert back to the multiple-tap approach, or to an error-correction mode. Ultimate text entry of the intended word may then require more keystrokes than if the user had started with the multiple-tap approach.

The problem of a given number sequence mapping to multiple words is referred to as the ambiguity limitation of the single-tap approach. Some prior art approaches exist to overcome this limitation by attempting to disambiguate the intended word when the user enters a number sequence that corresponds to more than one word. One disambiguation approach is to show the user a number of different words that correspond to the entered number sequence, in order of decreasing frequency of use--that is, in decreasing order of how common the different words are. The user then selects a word from the list. This approach is described in detail in U.S. Pat. No. 5,953,541, issued on Sep. 14, 1999. The primary disadvantage to this disambiguation approach is that after the user has entered the number sequence, he or she is forced to expend additional effort reviewing the presented list of words, and selecting the desired word from the list. While this may be better than forcing the user back into a multiple-tap approach to reenter the intended word with additional keystrokes, it still can considerably delay text entry using numeric keys.

An improvement to this disambiguation approach is described in detail in U.S. Pat. No. 6,011,554, issued on Jan. 4, 2000, and which is a continuation-in-part of the U.S. patent application that issued as U.S. Pat. No. 5,818,437 on Oct. 6, 1998. Under the improved disambiguation approach, the word corresponding to the entered number sequence that has the highest frequency of use is automatically selected by default when the user begins to enter a new number sequence using the numeric keys. This is advantageous because, if the user's intended words are those having the highest frequency of use for the entered number sequences, the user does not have to select them from presented lists. However, at best occasionally, and at worst frequently, the user still has to select the desired word from a list, when the desired word is not the word with the highest frequency of use for the entered number sequence. This means that text entry delays are still inevitable even with this improved disambiguation approach.

Perhaps the primary disadvantage to either the original disambiguation approach, or the improved disambiguation approach, is that the order of words presented in the list intrinsically depends on only the current number sequence entered by the user. The described disambiguation approaches only consider the frequency of use of the words that correspond to the current number sequence in ordering the list of words from which the user can select a desired word. For a given number sequence entered, the list of words presented to the user is always the same. Therefore, using one of the previously described examples, when the user enters the number sequence 7333, if the word "seed," which corresponds to this number sequence, has a higher frequency of use than the word "reed," which also corresponds to the sequence, the former word is always displayed in the list ahead of the latter word. The list of words does not take into account that in some situations the word "reed" is a better choice than the word "seed." As an example, if the user is entering the sentence "The first reed is shorter than the second reed," the device will present the user with the word "seed" for both the first and the second time the user enters in the sequence 7333 for the intended word "reed." The device does not discern that if the user has most recently selected the word "reed" for the sequence 7333, the user more likely wishes to enter this word, and not "seed," when entering the sequence again.
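The context-free ordering described above can be illustrated with a minimal Python sketch. The frequency counts and the helper names `to_digits` and `candidates` are hypothetical and chosen only for illustration; they are not taken from the cited patents.

```python
# Letter groups of a typical telephone keypad.
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {ch: key for key, letters in KEY_LETTERS.items() for ch in letters}

# Hypothetical frequency-of-use counts, invented for this example.
WORD_FREQUENCY = {"seed": 900, "reed": 150, "seen": 2000}


def to_digits(word):
    """Return the single-tap number sequence for a word."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())


def candidates(digits):
    """List the vocabulary words for a number sequence, most frequent first."""
    matches = [w for w in WORD_FREQUENCY if to_digits(w) == digits]
    return sorted(matches, key=lambda w: WORD_FREQUENCY[w], reverse=True)


# The ordering depends only on the number sequence, so "seed" always
# precedes "reed", whatever the user selected previously.
print(candidates("7333"))
```

Because `candidates` takes nothing but the number sequence as input, it cannot prefer "reed" on the second entry of 7333, which is the limitation this paragraph describes.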

Another disadvantage of these approaches is that they do not take into account the user making a mistake when entering in a word using the numeric keys. For example, the user may have intended to enter the word "relief" using the single-tap approach. The user should have entered the number sequence 735433, but instead entered the number sequence 735343, which corresponds to the incorrect spelling "releif" of this word. When the current single-tap approaches encounter the number sequence 735343, they may map the number sequence to an actual word that has this number sequence. Because the single-tap approach is an ambiguous manner by which to enter words, the number sequence may correspond to other words, besides the incorrect spelling of the word "relief." For example, the number sequence 735343 corresponds to an alternative name Peleid for the Greek mythological hero Achilles. Even though it is more likely that the user had intended to enter the word "relief," and just misspelled the word, the single-tap approach, if the word "Peleid" is in its dictionary, is likely to propose this word as the word the user had intended to enter.

Furthermore, current spell checking approaches, such as those used in word processing programs, do not operate well in the ambiguous environment of text entry using numeric keys. These spell checking approaches operate on the letters of the word, and therefore most assume, at least implicitly, that the word has been entered using a standard keyboard having a unique key for each letter of the alphabet. As an example, sophisticated spell checking approaches may determine that when the user has entered the nonsensical word "xome," he or she really meant to enter the word "come." This is because the X key is next to the C key on a standard keyboard, such that the user may have accidentally, and easily, pressed the former key instead of the latter key.

These sophisticated spell checking approaches do not carry over very well to text input entered using numeric keys via the single-tap approach. For example, the non-word "xome" has the number sequence 9663, whereas the word "come" has the number sequence 2663. Determining that the user had intended to enter the word "come" instead of the word "xome" in this case is likely incorrect, since the 2 key is far away from the 9 key on most numeric keypads. Rather, the user is more likely to have intended to enter the number sequence 8663, corresponding to the word "tome." Furthermore, the single-tap approach in the first instance is likely to map the entered number sequence 9663 to the common word "wood," such that the spell checking approach would never even be given the opportunity to provide alternative words.
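The keypad-distance argument can be made concrete with a short Python sketch. The grid coordinates assume the usual three-column telephone layout, and the use of Manhattan distance as a measure of how easily one key is mistaken for another is an assumption made for illustration, not a measure stated in this description.

```python
# Row and column of each key on a typical 3x4 telephone keypad.
KEY_POSITION = {
    "1": (0, 0), "2": (0, 1), "3": (0, 2),
    "4": (1, 0), "5": (1, 1), "6": (1, 2),
    "7": (2, 0), "8": (2, 1), "9": (2, 2),
    "*": (3, 0), "0": (3, 1), "#": (3, 2),
}


def key_distance(a, b):
    """Manhattan distance between two keys (an illustrative assumption)."""
    (r1, c1), (r2, c2) = KEY_POSITION[a], KEY_POSITION[b]
    return abs(r1 - r2) + abs(c1 - c2)


def substitution_cost(entered, candidate):
    """Sum of key distances between two equal-length number sequences."""
    if len(entered) != len(candidate):
        raise ValueError("sequences must have the same length")
    return sum(key_distance(a, b) for a, b in zip(entered, candidate))


# 9663 ("xome") is one key away from 8663 ("tome"),
# but three keys away from 2663 ("come").
print(substitution_cost("9663", "8663"), substitution_cost("9663", "2663"))
```

Under this assumed measure, "tome" is the nearer correction for the sequence 9663, in line with the example above.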

As has been indicated, another disadvantage to the prior art single-tap approaches is that the user expends an inordinate effort to enter words not in the vocabulary being used. For example, the user may intend to enter the word "iolfen," and therefore enters the number sequence 465336 using the numeric keys. However, this number sequence may correspond to the word "golden" in the vocabulary being used. When presented with the word "golden," the user is likely to have to revert to a multiple-tap approach to enter the intended word "iolfen." As a result, the user expends considerable effort to enter the desired word. First, the user uses the single-tap approach, entering the number sequence 465336. When the desired word "iolfen" is not presented, then the user must re-enter the word using the multiple-tap approach. Ultimate text entry of the intended word may thus require more keystrokes than if the user had started with the multiple-tap approach.

Furthermore, the improved disambiguation approaches that have been described are not useful in situations where the intended word of the user is not in the vocabulary being used. Referring to the example of the previous paragraph, when the user enters the number sequence 465336, the improved disambiguation approaches may have in their vocabulary two words that map to this sequence, the word "golden," and the word "holden." The word "golden" may have a higher frequency of use than the word "holden," and therefore is selected by default. The user is then given the opportunity to alternatively select the word "holden." The intended word "iolfen," if not mapped to the number sequence 465336, will not be presented to the user as one of the choices. The improved disambiguation approaches, in other words, are not useful in situations where the user enters a number sequence for a particular intended word that is not mapped to the number sequence in the vocabulary. In this situation, the user again is likely to have to reenter the word using the multiple-tap approach. For these reasons, as well as other reasons, there is a need for the present invention.

SUMMARY OF THE INVENTION

The invention relates to determining out-of-vocabulary (OOV) words corresponding to a sequence on a reduced keypad, such as a number sequence entered by the user on a numeric keypad, as well as a user interface to enable the user to select one of the words. Most of the keys are mapped to three or four letters. For example, on a numeric keypad, the two key is usually mapped to the letters A, B, and C. The user uses a single-tap approach to enter a number sequence corresponding to an intended word. A word-determining logic determines letter sequences corresponding to the number sequence, and presents the sequences within a user interface in which the user can select one of the letter sequences as the intended word, or select the first letter of the intended word. When letters are selected, the word-determining logic determines new letter sequences, consistent with the number sequence and the selected letters, and presents the new letter sequences. The user is then afforded the opportunity to again select one of the letter sequences as the intended word, or select the second letter of the intended word. This process is repeated until the user has selected the intended word from the letter sequences presented.

For example, the user may intend to enter the word "iel," which has a number sequence 435. The word-determining logic determines and presents three letter sequences, "gel," "hek," and "ifk," which are all consistent with the number sequence 435. Because the intended word is not in this list of sequences, the user instead accepts the first letter of the sequence "ifk." The word-determining logic determines three new letter sequences, "iel," "ifj," and "idl." Each of these letter sequences is consistent with the first letter that has been accepted by the user, the letter I, as well as with the number sequence 435. Because the intended word is now in the list of sequences, the user selects the sequence "iel" as the intended word.
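The set of letter sequences from which such lists are drawn can be sketched in Python as follows. The function name `letter_sequences` is hypothetical, and the sketch simply enumerates every consistent sequence; it does not rank them, which is the role of the word-determining logic.

```python
from itertools import product

# Letter groups of a typical telephone keypad.
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def letter_sequences(digits, prefix=""):
    """Return every letter sequence consistent with digits that starts with prefix."""
    if len(prefix) > len(digits) or any(
        ch not in KEY_LETTERS[d] for ch, d in zip(prefix, digits)
    ):
        raise ValueError("prefix is not consistent with the number sequence")
    remaining = [KEY_LETTERS[d] for d in digits[len(prefix):]]
    return [prefix + "".join(tail) for tail in product(*remaining)]


all_sequences = letter_sequences("435")       # 27 sequences, including "gel"
after_i = letter_sequences("435", prefix="i")  # 9 sequences, including "iel"
print(len(all_sequences), len(after_i))
```

Accepting the letter I narrows the 27 sequences for 435 to the 9 that begin with that letter, among them "iel," "ifj," and "idl."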

The out-of-vocabulary word determination and user interface of the invention is advantageous in situations where text entry is accomplished via numeric keys. The user never has to resort to the multiple-tap approach to enter a word that is not in the vocabulary, or dictionary, being used. Rather, the user only has to repeatedly select letters of the intended word, from the letter sequences presented to the user, until the intended word is displayed. The result is that OOV words, such as specialized legal and medical terms, and proper names, are quickly entered using numeric keys.

Methods and devices of varying scope are encompassed by the invention. Other aspects, embodiments and advantages of the invention, beyond those described here, will become apparent by reading the detailed description and by referencing the drawings. The invention is substantially described with respect to a numeric keypad. However, the invention itself is applicable to any set of reduced keys, referred to generally as a reduced keypad. A reduced keypad is defined non-restrictively as a number of keys, where each of one or more of the keys is mapped to, or corresponds to, more than one letter. For example, a numeric keypad is a reduced keypad, because typically most of the number keys are mapped to three or four different letters.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram of an example device apparatus according to an embodiment of the invention.

FIG. 2 is a flowchart of a method summarizing the invention.

FIG. 3 is a flowchart of a method showing how one embodiment implements the out-of-vocabulary (OOV) word determination and user interface of FIG. 2 in more detail.

FIG. 4 is a flowchart of a method showing how one embodiment implements the OOV word determination of FIG. 3 in more detail.

FIGS. 5a and 5b are flowcharts of a method showing how one embodiment implements the OOV word determination of FIG. 4 in more detail.

FIG. 6 is a flowchart of a method showing how one embodiment integrates the OOV word determination and user interface with spell checking and contextual word determination.

FIGS. 7a and 7b are flowcharts of a method showing how one embodiment implements the word determination of FIG. 6 in more detail.

FIGS. 8 and 9 are flowcharts of methods showing how one embodiment implements the spell checking of FIGS. 7a and 7b in more detail.

FIG. 10 is a flowchart of a method showing how one embodiment implements the partial word determination of FIGS. 7a and 7b in more detail.

DETAILED DESCRIPTION OF THE INVENTION

In the following detailed description of exemplary embodiments of the invention, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration specific exemplary embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention. Other embodiments may be utilized, and logical, mechanical, electrical, and other changes may be made without departing from the spirit or scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined only by the appended claims.

Overview

FIG. 1 is a diagram 100 showing an example device 102 according to an embodiment of the invention. The device 102 can be a telephone, such as a mobile phone, a cordless phone, a corded phone, a radio phone, or another type of telephone.

The device 102 can also be a device other than a telephone. For example, the device 102 may be a computer, such as a desktop computer, a laptop computer, a handheld computer, or another type of computer. As another example, the device 102 may be a handheld device such as a personal digital assistant (PDA) device, a remote control, a pager, or another type of device.

The device 102 has at least a display 104, a (typically numeric) keypad 106, and a word-determining logic 108. The device 102 may have other components besides those shown in FIG. 1. The display 104 is used to convey information visually to the user. This information can include visual feedback regarding the entry the user is effecting by pressing keys on the numeric keypad 106. The display 104 is typically a small, flat display, such as a liquid crystal display (LCD). However, the display 104 can also be a larger display, such as a cathode-ray tube (CRT) display, or another type of larger display, such as a larger LCD or other flat-panel display (FPD).

The numeric keypad 106 includes a number of numeric keys, as well as other types of keys. In general, the numeric keypad 106 is distinguished from a standard keyboard in that it does not have a unique key for each letter. As such, the numeric keypad 106 is referred to as having a reduced or a limited set of keys. In particular, the numeric keypad 106 has the following number keys: a one key 110a, a two key 110b, a three key 110c, a four key 110d, a five key 110e, a six key 110f, a seven key 110g, an eight key 110h, a nine key 110i, and a zero key 110j. The numeric keypad 106 also has an asterisk key 110k, and a pound sign key 110l. The numeric keypad 106 may also have other specialized keys beyond those shown in FIG. 1, or fewer keys than those shown in FIG. 1. The layout of the keys of the numeric keypad 106 as shown in FIG. 1 is typical of that found on most telephones, such as mobile phones. The keys of the numeric keypad 106 may be real, physical keys, or virtual, soft keys displayed on the display 104, where the display 104 is a touch-sensitive screen.

All of the number keys of the numeric keypad 106, except for the one key 110a and the zero key 110j, correspond to three or four letters of the alphabet. The two key 110b corresponds to the letters A, B, and C. The three key 110c corresponds to the letters D, E, and F. The four key 110d corresponds to the letters G, H, and I. The five key 110e corresponds to the letters J, K, and L. The six key 110f corresponds to the letters M, N, and O. The seven key 110g corresponds to the letters P, Q, R, and S. The eight key 110h corresponds to the letters T, U, and V. Finally, the nine key 110i corresponds to the letters W, X, Y, and Z. That a given number key corresponds to three or four specific letters means that the number key is pressed one or more times by the user to signify input of any of the specific letters. Punctuation characters such as ~!@#$?*()_+-+\|; :'",./<>? may be included either on unused keys, such as the one key 110a, or may be included also on the other number keys, along with the letters.

In the context of the invention, the user uses the numeric keys of the numeric keypad 106 to enter a number sequence corresponding to a word using the single-tap approach. For each letter the user wishes to enter, the user presses the numeric key corresponding to the letter. For example, to enter the word "hello," the user presses the four key 110d, the three key 110c, the five key 110e twice, and the six key 110f, in succession. Because the number sequence entered, 43556, may correspond to other words than the word "hello," the intended word is ambiguous. The device 102 therefore employs a word-determining logic 108 to disambiguate the word. The logic 108 is designed to determine the word or words corresponding to numeric key input entered by the user on the numeric keypad 106.
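The key-to-letter correspondence and the single-tap encoding can be expressed as a small Python sketch. The names `KEY_LETTERS` and `number_sequence` are hypothetical; the letter groups are those listed for the keys 110b through 110i.

```python
# Letter groups for the two key 110b through the nine key 110i.
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {ch: key for key, letters in KEY_LETTERS.items() for ch in letters}


def number_sequence(word):
    """Single-tap encoding: one key press per letter of the word."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())


print(number_sequence("hello"))  # 43556
# Distinct letter sequences can share one number sequence, hence the ambiguity.
print(number_sequence("golden") == number_sequence("iolfen"))  # True
```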

The logic 108 can make this determination based on the context of the numeric key input. The logic 108 examines the words, or their corresponding number sequences, that have already been entered to the left and/or the right of the current numeric key input to assist in determining what word the user intended to enter with the current numeric key input. The logic 108 may display the results of its determination on the display 104. The logic 108 uses in general a machine learning approach to determine the word corresponding to the current numeric key input based on the context. The word-determining logic 108 can be implemented as a computer program stored on a computer or machine-readable medium such as a memory, and executed by a processor.

The word-determining logic 108 may also include a spell-checking logic. The spell-checking logic corrects spelling errors given the unique nature of reduced-key text input. For example, a given key combination may correspond to a common misspelling of a word, or the user may have mistyped one number key for another number key. This is a more difficult spelling error to detect than typical spelling errors entered using standard keyboards, because it is more difficult to discern what word the user had intended to enter. The spell-checking logic examines misspellings directly on the number sequence entered, instead of converting the number sequence to a letter sequence, and then examining misspellings. The spell-checking logic may be separate from the word-determining logic 108.

The word-determining logic 108 preferably includes an out-of-vocabulary (OOV) user interface. The OOV user interface enables the user to select words that are not present in the vocabulary used by the logic 108, without resorting to the multiple-tap approach. For example, where the user enters the numeric key input 465336, the word-determining logic 108 may determine that the user intends the word "golden" or the word "holden," depending on the context in which the user has entered the numeric key input. If, however, the user intends the word "iolfen," this word is likely not to ever be selected by the logic 108 if it is not in the vocabulary used by the logic 108. Therefore, the OOV user interface enables the user to select the word "iolfen" without resorting to the multiple-tap approach.

FIG. 2 shows a flowchart of a method 200 that presents the overall OOV user interface approach followed by one embodiment of the invention. In 202, numeric key input corresponding to a word is received. The input may have a context. In 204, the word is determined, for example, by the word-determining logic 108 of the device 102 of FIG. 1. If the word determined in 204 is in the dictionary, or vocabulary, used by the logic 108, then the method proceeds from 206 to 208, where the method is finished. This is because the OOV user interface logic may not be necessary in all embodiments of the invention where the word corresponding to the entered numeric key input can be unambiguously determined to some degree.

Alternatively, the OOV user interface may always be invoked, for example, to allow the user to select a word that is not in the dictionary, even where the numeric key input entered corresponds to a word that is in the dictionary. The word determination performed in 204 and the resulting checking in 206 may be optional, and the method 200 proceeds directly from 202 to 210. One alternative embodiment that always invokes the OOV user interface is described in a succeeding section of the detailed description.

In 210, if the numeric key input likely corresponds to an OOV word, one or more pseudo-words that may be the word intended by the user are determined. The pseudo-words are more generally sequences of letters that correspond to the numeric key input entered by the user, as received in 202. While these phrases are used somewhat interchangeably, the phrase sequences of letters is more accurate in that it includes sequences of letters that correspond to words, as well as sequences of letters that do not correspond to words. Conversely, the phrase pseudo-words may be interpreted as only the latter, or as sequences of letters that correspond to words that are not in the dictionary, or vocabulary.

The pseudo-words are presented to the user in 210, and the user is allowed to select the intended word from the sequences of letters presented. If the intended word is not present within the sequences of letters displayed to the user, the user also has the ability to select the first letter of the intended word. This causes the pseudo-words to be redetermined, where the new pseudo-words all have the first letter indicated by the user. If the intended word is still not in the pseudo-words presented to the user, the user repeats this process with the second and subsequent letters, until the intended word is presented to and selected by the user.

This OOV word determination and user interface approach is shown in more detail in the method 210 of the flowchart of FIG. 3. In 300, a position variable is initialized to zero. The position variable indicates the number of letters that the user has accepted. It is initialized to zero because at first the user has not selected any letters of the intended word. In 302, pseudo-words, which are more generally sequences of letters, are determined. The pseudo-words are consistent with the user-selected letters through the position variable. There is also preferably a pseudo-word for each letter corresponding to the number of the entered numeric key input at the position variable plus one. The pseudo-words determined in the first iteration through the method 210 therefore include a pseudo-word for each letter corresponding to the first number of the entered numeric key input. This is because the user has not yet selected any letters, such that the position variable is zero.

As an example, the user may have entered the numeric key input 6883435337 to correspond to the infrequent word "mutegelder." In the first iteration of 302, a pseudo-word is determined for each letter corresponding to the number 6 of the numeric key input. Because the number 6 maps to the letters M, N, and O, these pseudo-words may include the sequences of letters "mtvehelder," "outfielder," which is an actual word, and "nutehelder."

The pseudo-words determined in 302 are then presented to the user in 304. For example, there may be a single entry displaying the most likely pseudo-word corresponding to the entered numeric key input, where the user has the option of viewing the other pseudo-words as well. In 306, the user is given two options when examining the pseudo-words. First, he or she can select one of the pseudo-words as the intended word. If this is the case, the method proceeds from 308 to 310, where it is finished. Second, the user can select the next letter of the intended word. In the case of the first iteration through the method 210, the next letter is the first letter. In general, the next letter is the position within the word denoted by the position variable plus one. Where the user has indicated the next letter of the intended word, the method 210 proceeds from 308 to 312, where the position variable is incremented by one, and the method goes back to 302.

Continuing the previous example, the user, presented with the pseudo-words "mtvehelder," "outfielder," and "nutehelder," selects the letter M of the pseudo-word "mtvehelder," because the intended word is "mutegelder." This causes re-determination of the pseudo-words. The new pseudo-words are consistent with the letters accepted by the user, which in this case means that all the new pseudo-words start with the letter M. The new pseudo-words are also consistent with the entered numeric key input 6883435337. There is a new pseudo-word for each of the letters T, U, and V that correspond to the second number within the entered numeric key input, eight. It is noted that this is required so that if the pseudo-words presented to the user do not include the intended word, the user is able to select the next letter of the intended word.

The new pseudo-words presented to the user may be "mtvehelder," "mutehelder," and "mvudgelder." Because the user's intended word is "mutegelder," the user will in subsequent iterations of the method 210 select the letters U, T, and E of the pseudo-word "mutehelder." At this point, new pseudo-words are determined that are consistent with the accepted letters M, U, T, and E, in that order, and that are also consistent with the numeric key input 6883435337. There is a new pseudo-word for each of the letters G, H, and I that correspond to the fifth number within the entered numeric key input, four. This is because the user has accepted the first four letters, such that the position variable is equal to four, and the position variable plus one is equal to five. The new pseudo-words may include the words "mutehelder," "muteielder," and "mutegelder." Because the last pseudo-word is the intended word of the user, the user selects this word, ending the method 210.
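The iteration of FIG. 3 can be sketched in Python with a simulated user. This is a minimal sketch under stated assumptions: the function `propose` is a hypothetical stand-in that fills the unaccepted positions with the first letter on each key, whereas the description contemplates ranking by a letter language model, and `enter_word` simulates a user who always accepts the next letter of the intended word when that word is not shown.

```python
# Letter groups of a typical telephone keypad.
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def is_consistent(word, digits):
    """True if each letter of word lies on the corresponding key of digits."""
    return len(word) == len(digits) and all(
        ch in KEY_LETTERS.get(d, "") for ch, d in zip(word, digits)
    )


def propose(digits, accepted):
    """One pseudo-word per letter on the key just after the accepted letters.

    Toy stand-in for 302: the remaining positions are filled with the first
    letter on each key rather than with a language model's best guess.
    """
    if len(accepted) >= len(digits):
        return [accepted]
    tail = "".join(KEY_LETTERS[d][0] for d in digits[len(accepted) + 1:])
    return [accepted + letter + tail for letter in KEY_LETTERS[digits[len(accepted)]]]


def enter_word(digits, intended):
    """Simulate the loop of FIG. 3; return the word and the letters accepted."""
    if not is_consistent(intended, digits):
        raise ValueError("intended word does not match the number sequence")
    position = 0                                  # 300: no letters accepted yet
    while True:
        pseudo_words = propose(digits, intended[:position])  # 302, 304
        if intended in pseudo_words:              # 306, 308: word is selected
            return intended, position
        position += 1                             # 312: next letter is accepted


print(enter_word("435", "iel"))
```

With this deliberately weak stand-in the simulated user must accept most letters before the intended word appears; a better ranking in `propose` reduces the number of letters accepted, but the loop itself is unchanged.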

The user may navigate the pseudo-words presented and select letters or a pseudo-word by using special keys commonly found on the device 102 of FIG. 1. For example, there may be a down arrow key, a left arrow key, a right arrow key, an up arrow key, and a select key, which are not specifically shown in FIG. 1. The user uses the down and up arrow keys to navigate the pseudo-words presented to him or her. When a desired pseudo-word is displayed, the user uses the select key to select the pseudo-word as the intended word. The select key may be one of the unused keys of the numeric keypad, such as the 1, * or #. It may also be a special way of hitting a key, such as holding down the "2" key for a period of time. It may also be the same key used to enter the space character, such that optionally a space may be inserted after it is pressed. Alternatively still, it may be an additional key.

The user uses the right and left arrow keys to select and de-select, respectively, letters of the intended word. A cursor may be used to indicate the letters that the user has already selected. For example, when "|mtvehelder" is displayed, this means that the user has not selected any letters. Pressing the right key once accepts the letter M, such that the new pseudo-words may be presented as "m|tvehelder," "m|utehelder," and "m|vudgelder." The cursor position in this case indicates that the user has accepted the letter M. The user uses the up and down arrow keys to select one of the new pseudo-words, and presses the right key when the desired pseudo-word is indicated to accept the second letter of the indicated pseudo-word. A cursor is not the only way to indicate which letters have been accepted. The system may distinguish between accepted and unaccepted letters in any of a number of ways, including font, size, boldface, italics, underline, color, or background color, or inverse text, or symbols other than a |.

To determine the pseudo-words, a statistical letter language model can be used. Generally, a language model estimates the probability of a sequence of language units, which in this case are letters. For example, if l is a specified sequence of Q letters,

$$l = l_1, l_2, \ldots, l_Q \qquad (1)$$

then the language model estimates the probability p(l). This can be factored into conditional probabilities by

$$p(l) = \prod_{i=1}^{Q} p(l_i \mid l_1, l_2, \ldots, l_{i-1}) \qquad (2)$$

Next, the approximation that each letter depends only on the previous n-1 letters is made:

$$p(l_i \mid l_1, l_2, \ldots, l_{i-1}) \approx p(l_i \mid l_{i-n+1}, l_{i-n+2}, \ldots, l_{i-1}) \qquad (3)$$

Substituting equation (3) into equation (2),

$$p(l) \approx \prod_{i=1}^{Q} p(l_i \mid l_{i-n+1}, l_{i-n+2}, \ldots, l_{i-1}) \qquad (4)$$

which is known and referred to as an n-gram letter language model, where n is greater than or equal to 1. Note that Q is equal to the length of the entered numeric key input. For example, if the user has entered the numeric key input 5665, then Q is four. In general, the probabilities are evaluated by occurrence counting in any type of database, such as a database of magazine articles, books, newspapers, or another type of database. For large values of n, it is necessary to prune the language model, using, for instance, count cutoffs, or a relative entropy based technique, such as Stolcke pruning, as known in the language modeling art.
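Evaluating equation (4) by occurrence counting can be sketched in Python as follows. This is an unsmoothed, unpruned sketch: the three-word training corpus is invented for illustration, the function names `train` and `probability` are hypothetical, and padding each word with n-1 leading spaces is an assumption about how word starts are handled.

```python
from collections import Counter


def train(corpus, n):
    """Count each n-gram and its (n-1)-letter context over the words of corpus."""
    ngram_counts, context_counts = Counter(), Counter()
    for word in corpus.split():
        padded = " " * (n - 1) + word          # assumed word-start padding
        for i in range(n - 1, len(padded)):
            context = padded[i - n + 1:i]
            ngram_counts[context + padded[i]] += 1
            context_counts[context] += 1
    return ngram_counts, context_counts


def probability(letters, ngram_counts, context_counts, n):
    """Equation (4): product of p(l_i | previous n-1 letters), without smoothing."""
    padded = " " * (n - 1) + letters
    p = 1.0
    for i in range(n - 1, len(padded)):
        context = padded[i - n + 1:i]
        if context_counts[context] == 0:
            return 0.0                         # unseen context; smoothing would avoid this
        p *= ngram_counts[context + padded[i]] / context_counts[context]
    return p


# Toy corpus, invented for this example, with a bigram (n = 2) letter model.
ngrams, contexts = train("come tome cone", 2)
p_come = probability("come", ngrams, contexts, 2)   # 2/3 * 1 * 2/3 * 1
p_tome = probability("tome", ngrams, contexts, 2)   # 1/3 * 1 * 2/3 * 1
print(p_come, p_tome)
```

An unseen sequence such as "xome" receives probability zero here, which is one reason a practical model is smoothed.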

A language model can optionally also take into account letters to the left of the entered sequence. For instance, if $l_{-9}, l_{-8}, l_{-7}, l_{-6}, l_{-5}, l_{-4}, l_{-3}, l_{-2}, l_{-1}, l_{0}$ are the last ten letters entered by the user, and equation (4) is used, the probability will depend on the previous letters. Also optionally, it may be assumed that the next letter following the observed letters is "space". The letter probabilities are then multiplied by $p(\text{"space"} \mid l_{Q-n}, l_{Q-n+1}, \ldots, l_Q)$. This makes a sequence like "req" relatively less likely, which is reasonable, considering that "q" almost never ends a word, even though the sequence "req" is reasonably likely in contexts such as "request" or "require."

An n-gram letter language model can therefore be used to determine the pseudo-words presented to the user, or, more generally, to determine the sequences of letters presented to the user. That is, an n-gram letter model can be used to examine the previous n-1 letters to determine the current, nth, letter of the current number sequence. An n-gram letter model is generally constructed by examining a database, or training corpus. Typically, the model needs to be smoothed, as is known in the art of language modeling. The model can be improved over time by retraining the model with more complete databases, or by considering what the user has him or herself specifically entered in the past. The latter is referred to as using a cache model, where the last x letters entered are stored in a cache, with x being large, typically 100-10000.

Finally, it is noted that there is no need for the pseudo-words to be limited to letters. Just as real words can contain numbers and punctuation, so can pseudo-words. For instance, "$1", "1st", "Mr.", "under-achiever", "555-1212", and "http://www.research.microsoft.com/~joshuago" are all possible strings the user might wish to enter. Therefore, punctuation and digits may be included as "letters" as used herein. Similarly, a user may or may not wish to distinguish upper and lower case letters. For some applications, there may be a special shift key, which can be used to determine if a lowercase or uppercase letter is desired. Alternatively, lowercase and uppercase letters can be modeled as different letters, and the letter n-gram model can be used to decide between them.

Determining Sequences of Letters such as Pseudo-Words

FIG. 4 is a flowchart of a method 302 showing how one embodiment in particular determines sequences of letters that are consistent with the letters already accepted by the user, and are consistent with the entered numeric key input. The method 302 starts in 400 with a prefix equal to the letters already accepted by the user, and the entered numeric key input. If no letters have been accepted by the user yet, then the prefix is null. In 402, the letters consistent with the number sequence at the position corresponding to the prefix plus one are determined. For example, if the number sequence is 3287, and the prefix is equal to "e," then the letters determined would be the letters mapped to the number 2, or, A, B, and C. This is because the prefix corresponds to the first number of the number sequence, such that the number at the position corresponding to the prefix plus one within the number sequence 3287 is the number 2.

In 404, the method 302 starts with the first consistent letter as the current consistent letter. In the case of the example, this is the letter A. In 406, the most likely pseudo-word that is consistent with the number sequence, and that begins with the prefix followed by the current consistent letter, is determined. In the example, this is the most likely pseudo-word that is consistent with the number sequence 3287 and that begins with the letters "ea." The letter E is from the prefix, whereas the letter A is the current consistent letter. In 408, if there are more consistent letters, then the method proceeds to 410, where the next consistent letter becomes the current consistent letter, and the method 302 repeats 406. Otherwise, the method proceeds to 412, where the most likely pseudo-words are returned. There will be a pseudo-word returned for each letter that is consistent with the number sequence at the position within the number sequence after the prefix.
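The loop of FIG. 4 can be sketched in Python as follows. This is a sketch under stated assumptions: `most_likely` finds the best pseudo-word by brute-force enumeration rather than by the array-based method of FIGS. 5a and 5b, and `toy_score` with its bigram weights is a hypothetical scoring function invented for illustration in place of an n-gram letter model.

```python
from itertools import product

# Letter groups of a typical telephone keypad.
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# Hypothetical bigram weights, invented for this example.
BIGRAM_SCORE = {"ea": 3, "at": 3, "ts": 2, "eb": 1, "ec": 1,
                "bu": 2, "cu": 2, "ur": 3, "us": 2}


def toy_score(word):
    """Sum the weights of adjacent letter pairs (stand-in for a language model)."""
    return sum(BIGRAM_SCORE.get(a + b, 0) for a, b in zip(word, word[1:]))


def most_likely(digits, new_prefix, score):
    """406, by brute force: best full sequence that starts with new_prefix."""
    tails = product(*(KEY_LETTERS[d] for d in digits[len(new_prefix):]))
    return max((new_prefix + "".join(t) for t in tails), key=score)


def pseudo_words(digits, prefix, score):
    """402 through 412: one pseudo-word per letter consistent with the next number."""
    consistent_letters = KEY_LETTERS[digits[len(prefix)]]        # 402
    return [most_likely(digits, prefix + letter, score)          # 404-410
            for letter in consistent_letters]                    # 412


print(pseudo_words("3287", "e", toy_score))
```

For the number sequence 3287 and the prefix "e," one pseudo-word is returned for each of the letters A, B, and C, so the user can always accept the next letter of the intended word.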

FIGS. 5a and 5b are flowcharts of a method 406 that one embodiment follows to determine the most likely pseudo-word for a new prefix and a number sequence. The method 406 starts in 500 with the new prefix and the number sequence. The new prefix is the prefix of FIG. 4 plus the current consistent letter. In 502, a first array of word-probability pairs is initialized with the pseudo-word-probability pair (new prefix, probability of the new prefix). The probability of the new prefix is the probability that the new prefix is what the user intended to enter for the part of the number sequence that corresponds to the new prefix. The probability is determined based on an n-gram letter modeling approach.

Continuing the previous example, the user may have already accepted the letter e, and the current consistent letter is the letter a, such that the new prefix is "ea." The probability of this prefix is determined using a letter n-gram. In 502, an array with a single entry, the new prefix, "ea", is initialized, as well as its probability, as determined by a letter n-gram model. For simplicity of the example, it is assumed that the letter n-gram model ignores characters in previous words. Then, following equation (4),

$$p(\text{"ea"}) \approx p(\text{"e"} \mid \text{space}) \times p(\text{"a"} \mid \text{space "e"})$$

To determine p("e"|space), the number of times a space occurred in the training corpus is found, such as 200,000 times, and the number of times an "e" followed the space is found, such as 40,000 times. Then p("e"|space) can be estimated as 40,000/200,000=0.2. To determine p("a"|space "e"), the number of times the sequence space "e" occurred in the training corpus is found, such as 40,000 times, and the number of times an "a" followed space "e" is found, such as 5,000 times. This yields an estimate of 5,000/40,000=0.125. Multiplying these together, the probability of the sequence "ea" after a space is estimated to be 0.025. Thus, the array is initialized in 502 to have the single entry "ea", 0.025.
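The arithmetic of this example can be reproduced in a few lines. The counts are the illustrative figures given in the text; the function name conditional_estimate and the variable names are hypothetical, and the sketch uses plain relative-frequency estimates with no smoothing.

```python
def conditional_estimate(context_count, continuation_count):
    """Estimate p(letter | context) as a relative frequency of corpus counts."""
    return continuation_count / context_count


# Illustrative counts from the example in the text.
p_e_given_space = conditional_estimate(200_000, 40_000)   # p("e" | space)
p_a_given_space_e = conditional_estimate(40_000, 5_000)   # p("a" | space "e")

# Probability of the new prefix "ea" following a space.
p_ea = p_e_given_space * p_a_given_space_e

# The first array of 502: a single (pseudo-word, probability) pair.
first_array = [("ea", p_ea)]
print(first_array)
```

The two factors evaluate to 0.2 and 0.125, so the single entry pairs "ea" with a probability of approximately 0.025.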

In 504, the number n is set to the first number in the number

