C# - Code to collapse duplicate and semi-duplicate records?
I have a list of models of this type:
    public class TourDude
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }
And here is the list:
    public IEnumerable<TourDude> GetAllGuides
    {
        get
        {
            List<TourDude> guides = new List<TourDude>();
            guides.Add(new TourDude() { Name = "dave et", Id = 1 });
            guides.Add(new TourDude() { Name = "dave eton", Id = 1 });
            guides.Add(new TourDude() { Name = "dave etz5", Id = 1 });
            guides.Add(new TourDude() { Name = "danial maze a", Id = 2 });
            guides.Add(new TourDude() { Name = "danial maze b", Id = 2 });
            guides.Add(new TourDude() { Name = "danial", Id = 3 });
            return guides;
        }
    }
I want to retrieve these records:
    { Name = "dave et", Id = 1 }
    { Name = "danial maze", Id = 2 }
    { Name = "danial", Id = 3 }
The goal is to collapse duplicates and near duplicates (confirmable by the Id), taking the shortest possible value (when compared) as the Name.
Where do I start? Is there a complete LINQ query that will do this for me? Do I need to code up an equality comparer?
Edit 1:
    var result = from x in GetAllGuides
                 group x.Name by x.Id into g
                 select new TourDude
                 {
                     Test = Exts.LongestCommonPrefix(g),
                     Id = g.Key,
                 };

    IEnumerable<IEnumerable<char>> test = result.First().Test;
    string str = test.First().ToString();
If you want to group the items by Id and find the longest common prefix of the Names within each group, you can do so as follows:
    var result = from x in guides
                 group x.Name by x.Id into g
                 select new TourDude
                 {
                     Name = LongestCommonPrefix(g),
                     Id = g.Key,
                 };
This uses the LongestCommonPrefix method given below to find the longest common prefix.
Result:
    { Name = "dave et", Id = 1 }
    { Name = "danial maze ", Id = 2 }
    { Name = "danial", Id = 3 }
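Note that the second name keeps the trailing space shared by "danial maze a" and "danial maze b", whereas the question asked for "danial maze". If that matters, one option is to trim the prefix. The following is only a minimal sketch, assuming trailing whitespace is never significant in a name. PrefixDemo, CommonPrefix and Collapse are hypothetical names introduced for illustration, and CommonPrefix uses a plain loop instead of the Transpose-based method from this answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TourDude
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class PrefixDemo
{
    // Hypothetical helper: longest prefix shared by every string in xs.
    public static string CommonPrefix(IEnumerable<string> xs)
    {
        string prefix = null;
        foreach (string s in xs)
        {
            if (prefix == null) { prefix = s; continue; }

            // Count how many leading characters the two strings share.
            int n = 0;
            while (n < prefix.Length && n < s.Length && prefix[n] == s[n]) n++;
            prefix = prefix.Substring(0, n);
        }
        return prefix ?? string.Empty;
    }

    // Groups by Id and keeps the trimmed common prefix of the names in each group.
    public static List<TourDude> Collapse(IEnumerable<TourDude> guides)
    {
        var result = from x in guides
                     group x.Name by x.Id into g
                     select new TourDude
                     {
                         // Assumption: trailing whitespace is noise, so drop it.
                         Name = CommonPrefix(g).TrimEnd(),
                         Id = g.Key,
                     };
        return result.ToList();
    }
}
```

With the six guides from the question, Collapse should yield "dave et", "danial maze" and "danial" for Ids 1, 2 and 3.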
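    static string LongestCommonPrefix(IEnumerable<string> xs)
    {
        // Transpose() is not a built-in LINQ operator; it is assumed to be an
        // extension method on IEnumerable<IEnumerable<T>> that is defined elsewhere.
        return new string(xs
            .Transpose()                             // i-th character of every string
            .TakeWhile(s => s.All(d => d == s.First())) // stop at the first mismatch
            .Select(s => s.First())
            .ToArray());
    }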
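The method above relies on a Transpose extension that is not part of System.Linq. The following is a sketch of one way it could be written, not necessarily the version the answer had in mind. It assumes Transpose should stop as soon as the shortest sequence runs out, which the prefix logic needs: otherwise a column holding only the extra characters of the longer strings could pass the All check. The class name Exts is borrowed from the asker's edit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Exts
{
    // Assumed behaviour: yields the i-th element of every sequence as one
    // "column", and stops as soon as any sequence is exhausted.
    public static IEnumerable<IEnumerable<T>> Transpose<T>(
        this IEnumerable<IEnumerable<T>> source)
    {
        var enumerators = source.Select(s => s.GetEnumerator()).ToList();
        try
        {
            // The Count check avoids looping forever on an empty source,
            // because All() is vacuously true for an empty list.
            while (enumerators.Count > 0 && enumerators.All(e => e.MoveNext()))
            {
                yield return enumerators.Select(e => e.Current).ToArray();
            }
        }
        finally
        {
            foreach (var e in enumerators) e.Dispose();
        }
    }

    public static string LongestCommonPrefix(IEnumerable<string> xs)
    {
        return new string(xs
            .Transpose()
            .TakeWhile(s => s.All(d => d == s.First()))
            .Select(s => s.First())
            .ToArray());
    }
}
```

Because string implements IEnumerable<char>, an IEnumerable<string> can be passed where IEnumerable<IEnumerable<char>> is expected, so xs.Transpose() binds with T inferred as char.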