Worksheet 3: Predictive Text Entry

MSc Software Workshop, Spring Term 2018-19

Designed by Seyyed Shah and Uday Reddy

Assigned: Thursday, 7th February, 2019

Intermediate Progress Review : parts 1 and 2, Thursday, 14th February, 6:30pm

Final Deadline : All Parts, Thursday, 21st February, 9:00pm.

As usual, include in your submission:

1.appropriate comments and JavaDoc.

2.thorough testing. (You may use JUnit wherever applicable.)

As well as data structures and algorithm complexity, this exercise assesses several concepts taught during the course. If you don’t understand any part of the exercise, please ask course instructors.

Start early. The questions get progressively harder. All work and progress on the exercise must be submitted using Canvas.

Contents

1 Prototypes and Design (25%) 3
2 Storing and Searching a Dictionary(25%) 5
3 More E  ciency (25%) 7
4 Pre x-matching (25%) 8

1

Introduction

In this worksheet, you will write the algorithms for a sample application using the Java Collection classes. In the next worksheet, you will attach a Graphical User Interface (GUI) to make it a full application. The sample application is that of predictive text.

Before the advent of touch screens, mobile telephones in English-speaking countries used a keypad like this one:

1 2 (abc) 3 (def)

4 (ghi) 5 (jkl) 6 (mno)

7 (pqrs) 8 (tuv) 9 (wxyz)

  • space #

As you notice, there are keys for digits 1{9, used for dialing phone numbers. But these keys were also used to enter letters a{z. When a text message needed to be entered, the keys corresponding to the letters would be used. However, since there are multiple letters on each key, the required letter needed to be disambiguated somehow.

In the basic system without predictive text, the user must press the appropriate key a number of times for a particular letter to be shown. Consider the word hello”. With this method, the user must press 4, 4, 3, 3, 5, 5, 5, then pause, then 5, 5, 5, 6, 6, 6.

To enter text more easily, the system of predictive text (also called T9″) was devised. The user presses each key only once and the mobile phone uses a dictionary to guess what word is being typed using a dictionary, and displays the possible matches. So the word hello” can be typed in 5 button presses 43556″ without pauses, instead of 13 in the standard system. The numeric string 43556″ is referred to as a signature” of the world hello”. If this is the only match, the user can press space and carry on. If there are multiple matches, the user might need to select one of them before proceeding.

A given numeric-signature may correspond to more than one word. Predictive text technology is possible by restricting available words to those in a dictionary. Entering the numeric signature 4663″ produces the words gone” and home” in many dictionaries.

In this exercise, you will design and develop a predictive text system. For simplicity, assume that the user does not need punctuation or numerals. You must also limit your solutions to producing only lower-case words.

The nal version of your programs should use the words dictionary available with the worksheet on canvas. However, during testing, it is better for you to create a small dictionary le of your own for which you know what outputs to expect.

All the classes in this worksheet should be placed in a package called predictive. Use the class/method names given in the questions.

2

1Prototypes and Design (25%)

This part deals with building a prototype” for the predictive text problem, which is not expected to be e cient, but it will be simple and allow you to compare it with the e cient implementation to be done in later parts.

Write the rst two methods in a class named PredictivePrototype inside the package predictive.

1.(5%) : Write a method wordToSignature with the type: public static String wordToSignature(String word)

The method takes a word and returns a numeric signature. For example, home” should return 4663″. If the word has any non-alphabetic characters, replace them with a ” (space) in the resulting signature. Accumulate the result character-by-character.

You should do this using the StringBuffer class rather than String. Explain, in your comments, why this will be more e cient.

  1. (10%): Write another method signatureToWords with the type:

public static Set signatureToWords(String signature)

It takes the given numeric signature (passed to it as a String) and returns a set of possible matching words from the dictionary (as a Set of Strings). The returned list must not have duplicates and each word should be in lower-case.

The method signatureToWords will need to use the dictionary to nd words that match the string signature and return all the matching words.

In this part of the exercise, you should not store the dictionary in your Java program.

Explain in the comments why this implementation will be ine  cient.

3.(10%): Create command-line p