Minimum Unique Word Abbreviation
Description
A string such as "word" contains the following abbreviations:
["word", "1ord", "w1rd", "wo1d", "wor1", "2rd", "w2d", "wo2", "1o1d", "1or1", "w1r1", "1o2", "2r1", "3d", "w3", "4"]
Given a target string and a set of strings in a dictionary, find an abbreviation of this target string with the smallest possible length such that it does not conflict with abbreviations of the strings in the dictionary.
Each number or letter in the abbreviation is considered length = 1. For example, the abbreviation "a32bc" has length = 4.
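As a quick illustration of this counting rule, here is a minimal sketch. The names AbbrLength and abbrLength are made up for this example and are not part of the solution further down; the sketch assumes an abbreviation contains only letters and decimal digits.

```java
final class AbbrLength {
    // Counts every letter as 1 and every maximal run of digits (one number) as 1.
    static int abbrLength(String abbr) {
        int length = 0;
        boolean inNumber = false; // true while scanning the digits of one number
        for (int i = 0; i < abbr.length(); i++) {
            boolean digit = Character.isDigit(abbr.charAt(i));
            if (!digit || !inNumber) length++; // a letter, or the first digit of a number
            inNumber = digit;
        }
        return length;
    }

    public static void main(String[] args) {
        System.out.println(abbrLength("a32bc")); // 4: a, 32, b, c
        System.out.println(abbrLength("a4"));    // 2: a, 4
    }
}
```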
Note: In the case of multiple answers, as shown in the second example below, you may return any one of them.
Assume length of target string = m, and dictionary size = n. You may assume that m ≤ 21, n ≤ 1000, and log2(n) + m ≤ 20.
Examples:
"apple", ["blade"] -> "a4" (because "5" or "4e" conflicts with "blade")
"apple", ["plain", "amber", "blade"] -> "1p3" (other valid answers include "ap3", "a3e", "2p2", "3le", "3l1").
Hint
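A useful building block before searching is a check of whether an abbreviation also fits another word, which is what "conflict" means in the examples above. The sketch below is one possible way to write that check; AbbrMatch and matches are hypothetical names, not taken from the solution code, and the input is assumed to contain only letters and decimal digits.

```java
final class AbbrMatch {
    // Returns true when abbr is a valid abbreviation of word.
    static boolean matches(String abbr, String word) {
        int i = 0; // position in abbr
        int j = 0; // position in word
        while (i < abbr.length()) {
            char ch = abbr.charAt(i);
            if (Character.isDigit(ch)) {
                // Read the whole number, then skip that many letters of word.
                int count = 0;
                while (i < abbr.length() && Character.isDigit(abbr.charAt(i))) {
                    count = count * 10 + (abbr.charAt(i) - '0');
                    i++;
                }
                j += count;
            } else {
                // A letter must line up with the same letter in word.
                if (j >= word.length() || word.charAt(j) != ch) return false;
                i++;
                j++;
            }
        }
        return j == word.length(); // the abbreviation must cover the word exactly
    }
}
```

With the first example, matches("5", "blade") and matches("4e", "blade") are both true, so those candidates conflict, while matches("a4", "blade") is false.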
Train of Thought
The idea is quite simple: try each abbreviation length from the minimum to the maximum, and return the first valid abbreviation found. Some optimizations:
- Use a char[] to hold the abbreviation built so far.
- Skip abbreviated runs of length 1: replacing a single letter with "1" gives the same length as keeping the letter, but leaves one fewer letter to distinguish the word.
- Preprocess an abbreviation into a pattern once, before checking it against all the words in the dictionary.
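For cross-checking results on small inputs, a brute-force sketch can help. Note that this is a different technique from the length-by-length search described above: it enumerates every subset of kept letters as a bitmask and keeps the shortest abbreviation that no dictionary word of the same length agrees with. All names here (BruteForceMinAbbr, build, length, conflicts) are hypothetical, and the sketch assumes the target is short enough for 1 << n to fit in an int.

```java
final class BruteForceMinAbbr {
    // Bit i of mask set: keep letter i. Clear bits are folded into counts.
    static String build(String word, int mask) {
        StringBuilder sb = new StringBuilder();
        int run = 0; // letters skipped since the last kept letter
        for (int i = 0; i < word.length(); i++) {
            if (((mask >> i) & 1) == 1) {
                if (run > 0) { sb.append(run); run = 0; }
                sb.append(word.charAt(i));
            } else {
                run++;
            }
        }
        if (run > 0) sb.append(run);
        return sb.toString();
    }

    // Abbreviation length for mask: kept letters plus maximal runs of skipped letters.
    static int length(int n, int mask) {
        int len = 0;
        boolean inRun = false;
        for (int i = 0; i < n; i++) {
            boolean kept = ((mask >> i) & 1) == 1;
            if (kept || !inRun) len++; // a kept letter, or the start of a new run
            inRun = !kept;
        }
        return len;
    }

    // A same-length word conflicts when it agrees with the target on every kept letter.
    static boolean conflicts(String target, int mask, String[] dictionary) {
        for (String w : dictionary) {
            if (w.length() != target.length()) continue;
            boolean agrees = true;
            for (int i = 0; i < w.length() && agrees; i++) {
                if (((mask >> i) & 1) == 1 && w.charAt(i) != target.charAt(i)) agrees = false;
            }
            if (agrees) return true;
        }
        return false;
    }

    static String minAbbreviation(String target, String[] dictionary) {
        int n = target.length(); // assumed small enough for 1 << n to fit in an int
        String best = null;
        int bestLen = Integer.MAX_VALUE;
        for (int mask = 0; mask < (1 << n); mask++) {
            int len = length(n, mask);
            if (len < bestLen && !conflicts(target, mask, dictionary)) {
                best = build(target, mask);
                bestLen = len;
            }
        }
        return best; // null only if every abbreviation conflicts
    }
}
```

On the two examples from the description, this sketch returns "a4" for "apple" with ["blade"] and "1p3" for "apple" with ["plain", "amber", "blade"]. Because ties are broken by mask order, another valid answer of the same length could be returned by a different enumeration.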
Code
public String minAbbreviation(String target, String[] dictionary) {
    char[] c = target.toCharArray();
    char[] tmp = new char[c.length]; // buffer for the abbreviation being built
    // try abbreviation lengths from shortest to longest
    for (int l = 1; l <= target.length(); l++) {
        String abbr = minAbbreviation(c, 0, tmp, 0, dictionary, l);
        if (abbr != null) return abbr;
    }
    return null; // no unique abbreviation exists, e.g. the dictionary contains the target itself
}
// c: target letters, p: next index in c; tmp: abbreviation buffer, t: next free index in tmp;
// l: number of abbreviation elements (letters or numbers) still to place
private String minAbbreviation(char[] c, int p, char[] tmp, int t, String[] dictionary, int l) {
    if (l == 0) { // the length budget has been used up
        if (p == c.length && !conflict(tmp, t, dictionary, c.length)) return new String(tmp, 0, t);
        else return null;
    }
    if (t == 0 || tmp[t - 1] > '9') { // a number may go here: the previous element is not a number
        // The letters left after c[end] must cover the remaining l - 1 elements:
        // c.length - 1 - (end + 1) + 1 >= l - 1  =>  c.length - end >= l
        // Runs of length 1 are not tried: "1" is as long as the letter it replaces
        // and leaves fewer letters to distinguish the word.
        for (int end = p + 1; end <= c.length - l; end++) {
            int s = end - p + 1; // number of letters replaced by the count
            if (s >= 10) { // two-digit count
                tmp[t] = (char) (s / 10 + '0');
                tmp[t + 1] = (char) (s % 10 + '0');
                String r = minAbbreviation(c, end + 1, tmp, t + 2, dictionary, l - 1);
                if (r != null) return r;
            } else {
                tmp[t] = (char) (s + '0');
                String r = minAbbreviation(c, end + 1, tmp, t + 1, dictionary, l - 1);
                if (r != null) return r;
            }
        }
    }
    // keep the original letter
    tmp[t] = c[p];
    return minAbbreviation(c, p + 1, tmp, t + 1, dictionary, l - 1);
}
// returns true if the abbreviation in abbr[0..t) also fits some dictionary word of length l
private boolean conflict(char[] abbr, int t, String[] dictionary, int l) {
    char[] pattern = new char[abbr.length];
    int p = 0; // pointer for pattern
    int count = 0;
    for (int i = 0; i < t; i++) {
        char c = abbr[i];
        if (c <= '9') count = count * 10 + c - '0';
        else {
            if (count != 0) {
                pattern[p++] = (char) count; // store the count in the pattern (a count is always less than 22)
                count = 0;
            }
            pattern[p++] = c;
        }
    }
    // a trailing count is not stored: only words of the same length are compared, so it always fits
    for (String s : dictionary) {
        if (s.length() != l) continue;
        int j = 0;
        boolean match = true;
        for (int i = 0; i < p; i++) {
            if (pattern[i] < 22) j += pattern[i]; // skip count characters
            else if (s.charAt(j) != pattern[i]) {
                match = false;
                break;
            }
            else j++; // match one character
        }
        if (match) return true;
    }
    return false;
}