Matlab Vectorization - none-zero matrix row indices to cell

At the bottom of this answer is some benchmarking code, since you clarified that you're interested in performance rather than arbitrarily avoiding for loops.

In fact, I think for loops are probably the most performant option here. Since the "new" (2015b) JIT engine was introduced (source) for loops are not inherently slow - in fact they are optimised internally.

You can see from the benchmark that the mat2cell option offered by ThomasIsCoding here is very slow...

Comparison 1

If we get rid of that line to make the scale clearer, then my splitapply method is fairly slow, obchardon's accumarray option is a bit better, but the fastest (and comparable) options are either using arrayfun (as also suggested by Thomas) or a for loop. Note that arrayfun is basically a for loop in disguise for most use-cases, so this isn't a surprising tie!

Comparison 2

~~I would recommend you use a for loop for increased code readability and the best performance.~~

Edit:

If we assume that looping is the fastest approach, we can make some optimisations around the find command.

Specifically

Make M logical. As the below plot shows, this can be faster for relatively small M, but slower with the trade-off of type conversion for large M.
Use a logical M to index an array 1:size(M,2) instead of using find. This avoids the slowest part of the loop (the find command) and outweighs the type conversion overhead, making it the quickest option.

Here is my recommendation for best performance:

Click to copy

function A = f_forlooplogicalindexing( M )
    M = logical(M);
    k = 1:size(M,2);
    N = size(M,1);
    A = cell(N,1);
    for r = 1:N
        A{r} = k(M(r,:));
    end
end

I've added this to the benchmark below, here is the comparison of loop-style approaches:

Comparison 3

Benchmarking code:

Click to copy

rng(904); % Gives OP example for randi([0,1],3)
p = 2:12; 
T = NaN( numel(p), 7 );
for ii = p
    N = 2^ii;
    M = randi([0,1],N);

    fprintf( 'N = 2^%.0f = %.0f\n', log2(N), N );

    f1 = @()f_arrayfun( M );
    f2 = @()f_mat2cell( M );
    f3 = @()f_accumarray( M );
    f4 = @()f_splitapply( M );
    f5 = @()f_forloop( M );
    f6 = @()f_forlooplogical( M );
    f7 = @()f_forlooplogicalindexing( M );

    T(ii, 1) = timeit( f1 ); 
    T(ii, 2) = timeit( f2 ); 
    T(ii, 3) = timeit( f3 ); 
    T(ii, 4) = timeit( f4 );  
    T(ii, 5) = timeit( f5 );
    T(ii, 6) = timeit( f6 );
    T(ii, 7) = timeit( f7 );
end

plot( (2.^p).', T(2:end,:) );
legend( {'arrayfun','mat2cell','accumarray','splitapply','for loop',...
         'for loop logical', 'for loop logical + indexing'} );
grid on;
xlabel( 'N, where M = random N*N matrix of 1 or 0' );
ylabel( 'Execution time (s)' );

disp( 'Done' );

function A = f_arrayfun( M )
    A = arrayfun(@(r) find(M(r,:)),1:size(M,1),'UniformOutput',false);
end
function A = f_mat2cell( M )
    [i,j] = find(M.');
    A = mat2cell(i,arrayfun(@(r) sum(j==r),min(j):max(j)));
end
function A = f_accumarray( M )
    [val,ind] = ind2sub(size(M),find(M.'));
    A = accumarray(ind,val,[],@(x) {x});
end
function A = f_splitapply( M )
    [r,c] = find(M);
    A = splitapply( @(x) {x}, c, r );
end
function A = f_forloop( M )
    N = size(M,1);
    A = cell(N,1);
    for r = 1:N
        A{r} = find(M(r,:));
    end
end
function A = f_forlooplogical( M )
    M = logical(M);
    N = size(M,1);
    A = cell(N,1);
    for r = 1:N
        A{r} = find(M(r,:));
    end
end
function A = f_forlooplogicalindexing( M )
    M = logical(M);
    k = 1:size(M,2);
    N = size(M,1);
    A = cell(N,1);
    for r = 1:N
        A{r} = k(M(r,:));
    end
end

You can try arrayfun like below, which sweep through rows of M

Click to copy

A = arrayfun(@(r) find(M(r,:)),1:size(M,1),'UniformOutput',false)

A =
{
  [1,1] =  2
  [1,2] =

     1   3

  [1,3] =

     1   2   3

}

or (a slower approach by mat2cell)

Click to copy

[i,j] = find(M.');
A = mat2cell(i,arrayfun(@(r) sum(j==r),min(j):max(j)))

A =
{
  [1,1] =  2
  [2,1] =

     1
     3

  [3,1] =

     1
     2
     3

}

Using accumarray:

Click to copy

M = [0 1 0
     1 0 1
     1 1 1];

[val,ind] = find(M.');

A = accumarray(ind,val,[],@(x) {x});

Matlab Vectorization - none-zero matrix row indices to cell

Tags:

Matlab

Vectorization

Related

Recent Posts