Deleting sublists from lists

ReplaceRepeated is fine for short lists but it will get very slow if the list is long, because it starts over from the beginning of the list after each replacement. A better approach is to start the next replacement after the point of the previous one. One implementation of that:

fn1 = # /.
      {a___, aa_Symbol, _Integer, bb_Symbol, b___} :> 
        Join[{a, aa}, fn1 @ {bb, b}] &;

With jjc385's original as:

jjc = # //. {a___, aa_Symbol, _Integer, bb_Symbol, b___} :> {a, aa, bb, b} &;

Because my function is recursive I will need to raise $RecursionLimit for this benchmark.

$RecursionLimit = 1*^4;
big = RandomChoice[{1, 2, a, b, c, d}, 10000];

AbsoluteTiming[r1 = jjc[big];]
AbsoluteTiming[r2 = fn1[big];]

r1 === r2
{60.2394, Null}

{0.328343, Null}

True

Another example of this method:

  • Replace "," in a list with "."

A different method that might be of interest is SequencePosition, though it proves to be slower than fn1:

fn2 =
  Delete[#, 
    SequencePosition[big, {_Symbol, _Integer, _Symbol}][[All, {1}]] + 1] &;

AbsoluteTiming[r3 = fn2[big];]

r1 === r3
{1.35279, Null}

True

Performance Race

jjc385 challenged back with a method an order of magnitude faster than my own fn1 proposal. In reply, for the sake of performance tuning I shall make an assumption: that the list is entirely composed of Symbol and Integer expressions.

fn3 =
  Pick[
    #, 
    Unitize @ Subtract[ListCorrelate[{4, 2, 1}, Boole[IntegerQ /@ #], 2, 1], 2],
    1
  ] &;

Test:

big = RandomChoice[{1, 2, a, b, c, d}, 50000];

jjc2[big]; // RepeatedTiming
fn3[big];  // RepeatedTiming

fn3[big] === jjc2[big]
{0.112, Null}

{0.0153, Null}

True

Try

testList //. {a___, aa_Symbol, _Integer, bb_Symbol, b___} :> {a, aa, bb, b}
{a, b, c, 4, 5, d, e, f, g, 4}
% == resultList
True

Note that I replaced __ (BlankSequence) with ___ (BlankNullSequence), so this will work for an integer which appears as the second element in the list.


Mr.Wizard inspired me to improve. While his recursive approach is elegant, the problem clearly can be done linearly. Indeed:

jjc2 = (
   Sow@First@#;
   BlockMap[ 
      If[ ! MatchQ[#, {_Symbol, _Integer, _Symbol}], Sow@#[[2]] ]; &,
      #, 3, 1 ];
   Sow@Last@big;
     // Reap 
    // Last@*Last
  ) &

(r1=jjc2@big); // AbsoluteTiming

{0.102811, Null}

Mr.Wizard's faster answer:

fn1 = # /.
      {a___, aa_Symbol, _Integer, bb_Symbol, b___} :> 
        Join[{a, aa}, fn1 @ {bb, b}] &;

Block[{$RecursionLimit = 1*^4}, (r2=fn1@big);] // AbsoluteTiming

{1.26777, Null}

r1 === r2
True

An order of magnitude? I'll take it! :)