使预测文本algorithm变得更快

我正在开发一个Windows手机拨号器应用程序,我已经在我的应用程序中实现了预测文本。 当用户在键盘上轻击时,会生成与input相匹配的联系人。 预测太慢,它也阻止我的主线程,这就是为什么我已经实现了BackGroundWorker但仍然有性能问题我的代码是:

private void dialer_TextChanged(object sender, TextChangedEventArgs e) { MainPage.DialerText = dialer.Text; if(!bw1.IsBusy) bw1.RunWorkerAsync(); } void bw1_DoWork(object sender, DoWorkEventArgs e) { try { var digitMap = new Dictionary<int, string>() { { 1, "" }, { 2, "[abcABC]" }, { 3, "[defDEF]" }, { 4, "[ghiGHI]" }, { 5, "[jklJKL]" }, { 6, "[mnoMNO]" }, { 7, "[pqrsPQRS]" }, { 8, "[tuvTUV]" }, { 9, "[wxyzWXYZ]" }, { 0, "" }, }; var enteredDigits = DialerText; var charsAsInts = enteredDigits.ToCharArray().Select(x => int.Parse(x.ToString())); var regexBuilder = new StringBuilder(); foreach (var val in charsAsInts) regexBuilder.Append(digitMap[val]); MainPage.pattern = regexBuilder.ToString(); MainPage.pattern = ".*" + MainPage.pattern + ".*"; } catch (Exception f) { // MessageBox.Show(f.Message); } } void bw1_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e) { SearchListbox.ItemsSource = listobj.FindAll(x => x.PhoneNumbers.Any(a=>a.Contains(MainPage.DialerText)) | Regex.IsMatch(x.FirstName, MainPage.pattern)); } 

BackGroundWorker也阻止我的主线程,因此当我点击键盘时有一个滞后input值被添加到文本框。 我想添加inputTextTox没有任何滞后,如何做到这一点? 谢谢。

你想快点走吗?

停止直接处理单词列表,并将其投入到更高效的数据结构中。

为了快速查找任何大小的单词列表(但在内存方面更昂贵),您应该构建一个包含整个单词列表的树结构。

根节点表示零拨号数字,并且连接到(多达)另外十个节点,其中连接节点的边代表按下0到9之一的可能数字。

然后,每个节点包含可能的词汇,这些词汇可以通过从根节点通过树的路径形成,路径代表被按下的数字。

这意味着搜索不再需要迭代整个单词列表,并且可以在极少的操作中完成。

这是我在网上找到的100000个单词列表中的概念。 搜索需要大约0.02ms在我的桌面上。 很好,很快。 内存似乎需要约50MB。

 void Main() { var rootNode = new Node(); //probably a bad idea, better to await in an async method LoadNode(rootNode).Wait(); //let's search a few times to get meaningful timings for(var i = 0; i < 5; ++i) { //"acres" in text-ese (specifically chosen for ambiguity) var searchTerm = "22737"; var sw = Stopwatch.StartNew(); var wordList = rootNode.Search(searchTerm); Console.WriteLine("Search complete in {0} ms", sw.Elapsed.TotalMilliseconds); Console.WriteLine("Search for {0}:", searchTerm); foreach(var word in wordList) { Console.WriteLine("Found {0}", word); } } GC.Collect(); var bytesAllocated = GC.GetTotalMemory(true); Console.WriteLine("Allocated {0} bytes", bytesAllocated); } async Task LoadNode(Node rootNode) { var wordListUrl = "http://www-01.sil.org/linguistics/wordlists/english/wordlist/wordsEn.txt"; Console.WriteLine("Loading words from {0}", wordListUrl); using(var httpClient = new HttpClient()) using(var stream = await httpClient.GetStreamAsync(wordListUrl)) using(var reader = new StreamReader(stream)) { var wordCount = 0; string word; while( (word = await reader.ReadLineAsync()) != null ) { word = word.ToLowerInvariant(); if(!Regex.IsMatch(word,@"^[az]+$")) { continue; } rootNode.Add(word); wordCount++; } Console.WriteLine("Loaded {0} words", wordCount); } } class Node { static Dictionary<int, string> digitMap = new Dictionary<int, string>() { { 1, "" }, { 2, "abcABC" }, { 3, "defDEF" }, { 4, "ghiGHI" }, { 5, "jklJKL" }, { 6, "mnoMNO" }, { 7, "pqrsPQRS" }, { 8, "tuvTUV" }, { 9, "wxyzWXYZ" }, { 0, "" }}; static Dictionary<char,int> letterMap; static Node() { letterMap = digitMap .SelectMany(m => m.Value.Select(c=>new {ch = c, num = m.Key})) .ToDictionary(x => x.ch, x => x.num); } List<string> words = new List<string>(); //the edges collection has exactly 10 //slots which represent the numbers [0-9] Node[] edges = new Node[10]; public IEnumerable<string> Words{get{ return words; }} public void Add(string word, int pos = 0) { if(pos == word.Length) { if(word.Length > 0) { words.Add(word); } return; } var currentChar = word[pos]; int edgeIndex = letterMap[currentChar]; if(edges[edgeIndex] == null) { edges[edgeIndex] = new Node(); } var nextNode = edges[edgeIndex]; nextNode.Add(word, pos+1); } public Node FindMostPopulatedNode() { Stack<Node> stk = new Stack<Node>(); stk.Push(this); Node biggest = null; while(stk.Any()) { var node = stk.Pop(); biggest = biggest == null ? node : (node.words.Count > biggest.words.Count ? node : biggest); foreach(var next in node.edges.Where(e=>e != null)) { stk.Push(next); } } return biggest; } public IEnumerable<string> Search(string numberSequenceString) { var numberSequence = numberSequenceString .Select(n => int.Parse(n.ToString())); return Search(numberSequence); } private IEnumerable<string> Search(IEnumerable<int> numberSequence) { if(!numberSequence.Any()) { return words; } var first = numberSequence.First(); var remaining = numberSequence.Skip(1); var nextNode = edges[first]; if(nextNode == null) { return Enumerable.Empty<string>(); } return nextNode.Search(remaining); } } 

您可以通过一些优化来提高速度:

  • .*前缀和后缀添加到您的正则表达式模式是没有必要的,因为IsMatch将在字符串中的任何位置检测匹配
  • 使用本地Dictionary<int,string>可以替换为static数组
  • 将数字转换为int s可以用减法来替换
  • foreach循环和追加可以被string.Join替换

这里是如何:

 private static string[] digitMap = new[] { "" , "", "[abcABC]", "[defDEF]" , "[ghiGHI]", "[jklJKL]", "[mnoMNO]" , "[pqrsPQRS]", "[tuvTUV]", "[wxyzWXYZ]" }; void bw1_DoWork(object sender, DoWorkEventArgs e) { try { MainPage.pattern = string.Join("", DialerText.Select(c => digitMap[c-'0'])); } catch (Exception f) { // MessageBox.Show(f.Message); } } 

我怀疑阻塞的原因是你的后台工作线程并没有真正做到。

我期望bw1_RunWorkerCompleted方法(在后台工作完成之后在主线程上运行)远远更长! 你可以把这些代码全部移到bw1_DoWork方法吗? 显然,“listobj”将不得不被后台线程访问 – 只是传递给你的委托,并通过DoWorkEventArgs.Argument属性来访问它…

 private void dialer_TextChanged(object sender, TextChangedEventArgs e) { MainPage.DialerText = dialer.Text; if(!bw1.IsBusy) bw1.RunWorkerAsync(listobj); } void bw1_DoWork(object sender, DoWorkEventArgs e) { try { var digitMap = new Dictionary<int, string>() { { 1, "" }, { 2, "[abcABC]" }, { 3, "[defDEF]" }, { 4, "[ghiGHI]" }, { 5, "[jklJKL]" }, { 6, "[mnoMNO]" }, { 7, "[pqrsPQRS]" }, { 8, "[tuvTUV]" }, { 9, "[wxyzWXYZ]" }, { 0, "" }, }; var enteredDigits = DialerText; var charsAsInts = enteredDigits.ToCharArray().Select(x => int.Parse(x.ToString())); var regexBuilder = new StringBuilder(); foreach (var val in charsAsInts) regexBuilder.Append(digitMap[val]); MainPage.pattern = regexBuilder.ToString(); MainPage.pattern = ".*" + MainPage.pattern + ".*"; var listobj = (ListObjectType)e.Argument; e.Result = listobj.FindAll(x => x.PhoneNumbers.Any(a=>a.Contains(MainPage.DialerText)) | Regex.IsMatch(x.FirstName, MainPage.pattern)); } catch (Exception f) { // MessageBox.Show(f.Message); } } void bw1_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e) { SearchListbox.ItemsSource = (IEnumerable<ListObjectItemType>)e.Result; } 

NB – 你显然需要用实际的类型替换ListObjectType和ListObjectItemType!