Searching a phrase in all *.nb files
Here is a way to search from within Mathematica:
notebooks = Quiet@FileNames["*.nb", NotebookDirectory[], 2];
Monitor[
  Select[
    Table[
      {nb,
       (* keep at most 5 matching lines per notebook, printing a note on each match *)
       StringJoin@Select[StringSplit[Import[nb, "Plaintext"], "\n"],
         ((If[#, Print["match on:", nb]]; #) &@
            StringMatchQ[#, "*NIntegrate*"]) &, 5]},
      {nb, notebooks}],
    (* drop notebooks without any matching line *)
    #[[2]] != "" &],
  {nb}] // Grid[#, Alignment -> {Left, Top}, Dividers -> All] &
This is painfully slow, but it searches and shows only the plain text of each notebook.
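One caveat with the code above: Import[nb, "Plaintext"] can return $Failed for some files, and StringSplit then emits errors. A minimal sketch of a guard, assuming it is acceptable to treat such notebooks as empty (safePlaintextLines is a made-up helper name, not a built-in):

```wolfram
(* safePlaintextLines is a hypothetical helper, not part of the original answer. *)
(* It returns the plain-text lines of a notebook, or {} if the import fails.    *)
safePlaintextLines[nb_String] :=
  Module[{txt = Quiet@Import[nb, "Plaintext"]},
    If[StringQ[txt], StringSplit[txt, "\n"], {}]
  ]
```

It could replace StringSplit[Import[nb, "Plaintext"], "\n"] in the Table above, so that a single unreadable notebook does not interrupt the whole search.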
Note: the following method isn't robust. See this answer of mine for a robust solution.
Here is an approach which does not rely on NBImport.exe (which actually performs the import of NB files as "Plaintext" under the hood) and performs all operations in the kernel only. Currently NBImport.exe contains a bug due to which it returns $Failed when it has to import an NB file with a non-ASCII file path.
The weak side of the following method is that it relies on the ability of MakeExpression to convert a low-level Notebook expression into a high-level DocumentNotebook, which it is not always able to do even for correct NB files (and this ability is not guaranteed by the developers in general). This conversion is necessary because ToString doesn't accept raw boxes as the low-level representation of a WL expression (even wrapping the raw boxes in RawBoxes is simply ignored).
The simple function presented below currently fails in many situations but demonstrates the idea.
Here is a function which Gets the contents of an NB file as a Notebook expression, then extracts all the Cells as actual WL expressions wrapped in HoldComplete, converts them into strings, and checks whether they contain the specified string pattern:
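findInNBFile[NBFilePath_String, stringPattern_] :=
  Module[{expr = MakeExpression[Get[NBFilePath], StandardForm], cellExprPos, foundPos},
    (* positions of the first argument (the content) of each ExpressionCell or TextCell *)
    cellExprPos = Replace[Position[expr, ExpressionCell | TextCell], 0 -> 1, {2}];
    (* indices of the cells whose string form contains the pattern; *)
    (* StringTake strips the leading "HoldComplete[" and the trailing "]" *)
    foundPos =
      Flatten@Position[
        StringFreeQ[
          StringTake[ToString /@ Extract[expr, cellExprPos, HoldComplete], {14, -2}],
          stringPattern], False];
    (* on success show a table of the matching cells, otherwise return {path, False} *)
    If[foundPos =!= {},
      Grid[Join[{{Row[{"Found \"", stringPattern, "\" in file \"", NBFilePath, "\""}],
          SpanFromLeft}, {"Cell #", "The Cell"}},
        Transpose[{foundPos, Extract[expr, Most /@ cellExprPos[[foundPos]], HoldForm]}]],
        Frame -> All], {NBFilePath, False}]
  ];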
It can be used as follows:
findInNBFile["ExampleData/document.nb", "abcde"]
A couple of additional solutions. The first, with FindList, is probably the simplest and quickest.
Using FindList
searchDir = "<NB dir>";
fnames = FileNames["*.nb", searchDir, 2];
Length@fnames
sres = {#, FindList[#, {"curve"}, WordSearch -> False]} & /@ fnames;
sres = Select[sres, Length[#[[2]]] > 0 &];
Grid[sres, Dividers -> All, Alignment -> {Left, Top}]
See the options of FindList.
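As a rough sketch of how those options can be used, the helper below wraps the FindList call above and adds IgnoreCase -> True, so that "Curve" and "curve" both match. The name searchNotebooks and its depth argument are made up for this illustration. Keep in mind that FindList scans the raw text of each .nb file, so the returned lines are lines of the file on disk, not rendered cell contents.

```wolfram
(* searchNotebooks is a hypothetical helper name, not a built-in function. *)
searchNotebooks[dir_String, phrase_String, depth_: 2] :=
  Module[{files, hits},
    (* all notebooks under dir, down to the given subdirectory depth *)
    files = FileNames["*.nb", dir, depth];
    (* FindList returns the records (lines by default) that contain the phrase *)
    hits = {#, FindList[#, phrase, IgnoreCase -> True, WordSearch -> False]} & /@ files;
    (* keep only the files with at least one matching line *)
    Select[hits, Length[Last[#]] > 0 &]
  ]
```

For example, searchNotebooks["<NB dir>", "curve"] returns a list of {file, matching lines} pairs that can be passed to Grid as above.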
Using CreateSearchIndex
searchDir = "<NB dir>";
index = CreateSearchIndex[searchDir]
sobjs = TextSearch[index, {"curve", "regression"}]
sres = MapThread[{#1,
StringCases[#2, "curve" ~~ (Except["\n"] ...) ~~ "regression",
IgnoreCase -> True]} &,
{Through[sobjs["Location"]], Through[sobjs["Plaintext"]]}];
Grid[sres, Dividers -> All, Alignment -> {Left, Top}]
See the signature of TextSearch; it allows complicated "and", "or", "except" searches.
This solution seems to be fairly slow.