Replace values which obey certain criteria
Up front you have a choice between pattern-based and numeric manipulation of an array. Pattern-based is more general; numeric is usually fastest when applicable.
a = {{21, 95, 50}, {39, 32, 76}, {9, 12, 75}};
Examples of pattern-based methods:
a /. n_Integer /; n > 30 -> 30
a /. n_?NumericQ /; n > 30 -> 30
Replace[a, n_?(#>30&) -> 30, {2}]
Examples of numeric methods:
Clip[a, {-∞, 30}]
(a - 30) UnitStep[30 - a] + 30
Other, less desirable methods:
If[# > 30, 30, #, #] & //@ a
Map[#~Min~30 &, a, {-1}]
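As a quick sanity check, the following sketch confirms that the pattern-based and numeric approaches above agree on the sample matrix. The names byPattern, byClip, and byUnitStep are introduced here only for illustration.

```mathematica
a = {{21, 95, 50}, {39, 32, 76}, {9, 12, 75}};

(* pattern-based: replace every integer greater than 30 *)
byPattern = a /. n_Integer /; n > 30 -> 30;

(* numeric: clip from above at 30 *)
byClip = Clip[a, {-Infinity, 30}];

(* numeric: arithmetic mask built with UnitStep *)
byUnitStep = (a - 30) UnitStep[30 - a] + 30;

byPattern == byClip == byUnitStep
(* True; each result is {{21, 30, 30}, {30, 30, 30}, {9, 12, 30}} *)
```

The pattern-based version only touches integers, so it would leave reals alone; the numeric versions work on any numeric array.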
Fast numeric methods for the second example (the vector aa, defined below):
Clip[aa, {-∞, 100}, {0, 50.}]
(1 - #) 50. + aa # & @ UnitStep[100 - aa]
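To see why the UnitStep form works, it can help to look at the mask it builds. The sketch below uses a short hypothetical vector v (not from the original post): UnitStep[100 - v] is 1 where v <= 100 and 0 where v > 100, so the expression keeps the former values and substitutes 50. for the latter.

```mathematica
v = {300., 150., 100., 76.8};          (* hypothetical sample data *)

mask = UnitStep[100 - v]               (* {0, 0, 1, 1} *)

(1 - mask) 50. + mask v                (* {50., 50., 100., 76.8} *)

Clip[v, {-Infinity, 100}, {0, 50.}]    (* {50., 50., 100., 76.8} *)
```

Both forms leave the boundary value 100 untouched, which matches the strict test x > 100.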
aa = {300., 150., 100., 76.8421, 64.0909, 55.567, 49.1935, 44.2262,
40.2247, 36.9355, 34.1667, 31.8138, 29.7887, 28.0087, 26.4537,
25.0612, 23.8186, 22.7059, 21.6854, 20.7663, 19.9265, 19.1519,
18.4323, 17.7739, 17.1672, 16.6007, 16.0648, 15.5605, 15.098};
ReplacePart[aa, Position[aa, x_ /; x > 100] -> 50]
Testing results for different size vectors
Summary:
ReplacePart is the slowest. Clip is the fastest and also scales linearly with problem size (a vector 10 times as long takes 10 times as much CPU time), which is a good thing (TM). It was also faster than Matlab's find.
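One way to check the linear-scaling claim on your own machine is to time Clip over a range of sizes. This is only a sketch: the helper name clipTime and the chosen sizes are assumptions, and the numbers will vary with hardware and version.

```mathematica
(* time Clip on a random integer vector of length n;
   generating the data is excluded from the timing *)
clipTime[n_Integer] := Module[{data = RandomInteger[400, n]},
  First @ AbsoluteTiming[Clip[data, {-Infinity, 100}, {0, 50.}];]
]

sizes = {10^5, 10^6, 10^7};   (* assumed sizes; adjust to available memory *)

TableForm[
  {#, clipTime[#]} & /@ sizes,
  TableHeadings -> {None, {"size", "seconds"}}
]
```

If the scaling is linear, each row should take roughly 10 times as long as the one before it.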
Report
Since timings are being done (a good thing), I thought I would add Matlab's timing for this on the same PC, to compare. I used Matlab's tic and toc, which measure elapsed time. This corresponds to Mathematica's AbsoluteTiming:
EDU>> help tic
tic Start a stopwatch timer.
tic and TOC functions work together to measure elapsed time.
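The distinction matters because Mathematica also has Timing, which reports CPU time rather than elapsed time. A small illustration, using Pause so that the wall clock advances while almost no CPU is used:

```mathematica
(* both return a list {seconds, result}; the result here is Null *)
AbsoluteTiming[Pause[1];]   (* elapsed time: about 1 second *)
Timing[Pause[1];]           (* CPU time: close to 0 *)
```

For a fair comparison with tic and toc, AbsoluteTiming is therefore the closer match.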
These are the full results of all tests (thanks for input from others) for vector sizes of 1,000,000, 10,000,000, and 100,000,000 (I have lots of RAM and lots of coffee, so it is OK :)
All of this was done on Windows 7, using Mathematica 9.01 on a 64-bit OS and PC with 16 GB RAM. Matlab is version 2013a, 32-bit.
Here is the code used to generate all these tables:
ClearAll["Global`*"];
size = 1000000; (*change this to change the vector size *)
aa = RandomInteger[400, size];
examples = {
  HoldForm[ReplacePart[aa, Position[aa, x_ /; x > 100] -> 50.]; // AbsoluteTiming // First],
  HoldForm[If[# > 100, 50., #] & /@ aa; // AbsoluteTiming // First],
  HoldForm[Replace[aa, x_ /; x > 100 -> 50., 1]; // AbsoluteTiming // First],
  HoldForm[Clip[aa, {-100., 100.}, {0., 50.}]; // AbsoluteTiming // First],
  HoldForm[Clip[aa, {-\[Infinity], 100.}, {0., 50.}]; // AbsoluteTiming // First],
  HoldForm[Clip[aa, {-\[Infinity], 100}, {0, 50.}]; // AbsoluteTiming // First],
  HoldForm[With[{t = UnitStep[100 - aa]}, (1 - t) 50. + t aa]; // AbsoluteTiming // First]
};
res = {#, ReleaseHold[#]} & /@ examples;
Grid[AppendTo[
res, {" clear all; a=randi(400,1000000,1); tic; a(find(a>100))=50; \
toc", 0.0155}], Frame -> All, Alignment -> Left,
Spacings -> {0.5, 1}, FrameStyle -> LightGray]
1,000,000
EDU>> clear all; a=randi(400,1000000,1); tic; a(find(a>100))=50; toc
Elapsed time is 0.015441 seconds.
EDU>> clear all; a=randi(400,1000000,1); tic; a(find(a>100))=50; toc
Elapsed time is 0.018643 seconds.
EDU>> clear all; a=randi(400,1000000,1); tic; a(find(a>100))=50; toc
Elapsed time is 0.014538 seconds.
10,000,000
EDU>> clear all; a=randi(400,10000000,1); tic; a(find(a>100))=50; toc
Elapsed time is 0.149870 seconds.
EDU>> clear all; a=randi(400,10000000,1); tic; a(find(a>100))=50; toc
Elapsed time is 0.150574 seconds.
100,000,000
After a few hours, the full table was still not complete and all memory was in use. With no way to know how long it would take, I had to stop the computation so that I could use the PC. I removed the first test (ReplacePart), which turned out to be the cause of the problem. Now the table builds fast. Here it is:
EDU>> clear all; a=randi(400,100000000,1); tic; a(find(a>100))=50; toc
Elapsed time is 1.496174 seconds.
EDU>> clear all; a=randi(400,100000000,1); tic; a(find(a>100))=50; toc
Elapsed time is 1.496570 seconds.
EDU>> clear all; a=randi(400,100000000,1); tic; a(find(a>100))=50; toc
Elapsed time is 1.501019 seconds.
150,000,000
EDU>> clear all; a=randi(400,150000000,1); tic; a(find(a>100))=50; toc
Elapsed time is 2.240782 seconds.
EDU>> clear all; a=randi(400,150000000,1); tic; a(find(a>100))=50; toc
Elapsed time is 2.241474 seconds.
EDU>> clear all; a=randi(400,150000000,1); tic; a(find(a>100))=50; toc
Elapsed time is 2.244419 seconds.
Here are some options:
aa = {300., 150., 100., 76.8421, 64.0909, 55.567, 49.1935, 44.2262,
40.2247, 36.9355, 34.1667, 31.8138, 29.7887, 28.0087, 26.4537,
25.0612, 23.8186, 22.7059, 21.6854, 20.7663, 19.9265, 19.1519,
18.4323, 17.7739, 17.1672, 16.6007, 16.0648, 15.5605, 15.098};
If[# > 100, 50, #] & /@ aa
Replace[aa, x_ /; x > 100 -> 50, {1}]
Some performance tests:
aa=RandomInteger[400,1000000];
nasser01 = ReplacePart[aa,Position[aa,x_/;x>100]->50];//AbsoluteTiming//First
murta01 = If[#>100,50,#]&/@aa;//AbsoluteTiming//First
murta02 = Replace[aa,x_/;x>100->50,{1}];//AbsoluteTiming//First
Results (in seconds):
nasser01 = 2.636266
murta01 = 0.052457
murta02 = 0.537735