C
C#2y ago
Shillelagh

❔ Troubles getting all Largest Common Subsequences of two strings.

For fun trying to write a system to take differential backups of strings. The backbone is this method I wrote, which should find all LCSs between two strings:
public static List<Tuple<int, int, int>> GetAllLCS(string _new, string _old)
{
List<Tuple<int, int, int>> _matches = new List<Tuple<int, int, int>>(); // Get all common substrings
for (int i = 0; i < _new.Length; i++)
{
for (int j = _new.Length; j > i; j--)
{
var o = _old.IndexOf(_new.Substring(i, j - i));
if (o != -1)
{
_matches.Add(Tuple.Create(i, j, o));
}
}
}

_matches.Sort(new Comparison<Tuple<int,int,int>>((x, y) => { // Sort them from largest to smallest substrings
return (x.Item2 - x.Item1)-(y.Item2 - y.Item1);
}));
var _outmatches = new List<Tuple<int, int, int>>();

foreach (var i in _matches) // Duplicate the list
{
_outmatches.Add(i);
}

foreach (var i in _matches) // Remove any substrings that overlap
{
foreach (var j in _matches)
{
if (i == j) continue;
if ((j.Item1 > i.Item1 && j.Item1 < i.Item2 - 1) || (j.Item2 > i.Item1 && j.Item2 < i.Item2 - 1)) _outmatches.Remove(j);
}
}

return _outmatches;
}
public static List<Tuple<int, int, int>> GetAllLCS(string _new, string _old)
{
List<Tuple<int, int, int>> _matches = new List<Tuple<int, int, int>>(); // Get all common substrings
for (int i = 0; i < _new.Length; i++)
{
for (int j = _new.Length; j > i; j--)
{
var o = _old.IndexOf(_new.Substring(i, j - i));
if (o != -1)
{
_matches.Add(Tuple.Create(i, j, o));
}
}
}

_matches.Sort(new Comparison<Tuple<int,int,int>>((x, y) => { // Sort them from largest to smallest substrings
return (x.Item2 - x.Item1)-(y.Item2 - y.Item1);
}));
var _outmatches = new List<Tuple<int, int, int>>();

foreach (var i in _matches) // Duplicate the list
{
_outmatches.Add(i);
}

foreach (var i in _matches) // Remove any substrings that overlap
{
foreach (var j in _matches)
{
if (i == j) continue;
if ((j.Item1 > i.Item1 && j.Item1 < i.Item2 - 1) || (j.Item2 > i.Item1 && j.Item2 < i.Item2 - 1)) _outmatches.Remove(j);
}
}

return _outmatches;
}
14 Replies
Shillelagh
ShillelaghOP2y ago
Which returns the start and end indices of where the substring is present in _new and the start index of where the substring is present in _old As a test I am running it with the following parameters: _new = "Old New Old" _old = "Old Old"` If you get the substrings in the ranges returned by the function the LCSs are: Old and Old (One with a space before, one after) I would like for it to only return Old and Old (Only one with the space, doesn't matter which.) I've been unable to tweak the overlap removal bit to get the desired result Desired output: |Old|, | Old| or |Old |, |Old| Current output: |Old |, | Old|`` (With vertical bars representing where the substrings end to make the spaces clear
Angius
Angius2y ago
FYI you can make the code clearer by using named tuples or records. Right now it's kinda hard to figure the code out because of parts that mean nothing, like o and .Item1 (also, _name is used for private fields, neither parameters nor local variables should start with an underscore)
Shillelagh
ShillelaghOP2y ago
Ah, nice. Never heard of a named tuple! Didn't want to use a struct for something I was going to use once.
Angius
Angius2y ago
You use it multiple times, though
Shillelagh
ShillelaghOP2y ago
Only did this because new is already a keyword lol
Angius
Angius2y ago
So it does warrant a struct or a record @new then
Shillelagh
ShillelaghOP2y ago
Ah Cool
MODiX
MODiX2y ago
Angius
REPL Result: Success
int @int = 0;
int @int = 0;
Compile: 370.253ms | Execution: 27.959ms | React with ❌ to remove this embed.
Angius
Angius2y ago
There And sorry for the ping, Int lol
Shillelagh
ShillelaghOP2y ago
Poor guy lmao
Unknown User
Unknown User2y ago
Message Not Public
Sign In & Join Server To View
MODiX
MODiX2y ago
@new _ = new(new(new()), new(new()), new(new(), new()));

public record @new(neW NeW, New New, nEw NEw);public record neW(NEw NEw);
public record NEw();public record New(NEW nEW);public record NEW();
public record nEw(nEW NEw, NeW NeW);public record nEW();public record NeW();
@new _ = new(new(new()), new(new()), new(new(), new()));

public record @new(neW NeW, New New, nEw NEw);public record neW(NEw NEw);
public record NEw();public record New(NEW nEW);public record NEW();
public record nEw(nEW NEw, NeW NeW);public record nEW();public record NeW();
Unknown User
Unknown User2y ago
Message Not Public
Sign In & Join Server To View
Accord
Accord2y ago
Was this issue resolved? If so, run /close - otherwise I will mark this as stale and this post will be archived until there is new activity.

Did you find this page helpful?