What are the requirements for a well behaved indexed variable? Subscript, ToExpression, Downvalue?
General usage
Here is what I think
- Using strings and subsequently `ToString`-`ToExpression` just to generate variable names is pretty much unacceptable, or at the very least should be the last thing you try. I don't know of a single case where this couldn't be replaced with a better solution.
- Using subscripts is also pretty bad and should be avoided, except for purely presentational purposes, as you noted.
- For cases when you need to use many generated variables, indexed variables are usually the best way to go. They usually take the form `head[index]` and can be used in most places where ordinary variables can be used, particularly in equations or other expressions of a symbolic (inert) nature. Indexed variables need a bit more care than plain symbols: in particular, it is best to ensure that the `index` is either numeric or, if it is an expression, inert in the sense of evaluation (it always keeps the same value, or has no value).
- Sometimes you can also use the symbols generated by `Unique[...]`. Usually they serve as temporary anonymous placeholders in intermediate transformations, but then you have to make sure they are destroyed once you no longer need them.
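As a minimal sketch of indexed variables used as inert symbolic unknowns, the following solves a small linear system and builds a polynomial with indexed coefficients. The names `c`, `vars`, `eqs`, and `poly` are hypothetical and chosen only for illustration.

```mathematica
ClearAll[c, x];

(* c[1], c[2], c[3] act as ordinary unknowns *)
vars = Array[c, 3];
eqs = {c[1] + c[2] == 3, c[2] + c[3] == 5, c[1] + c[3] == 4};
sol = Solve[eqs, vars]
(* {{c[1] -> 1, c[2] -> 2, c[3] -> 3}} *)

(* indexed coefficients of a polynomial *)
poly = Sum[c[k] x^k, {k, 0, 2}];
Coefficient[poly, x, 2]
(* c[2] *)
```

Because the indices are plain integers and `c` carries no definitions, nothing here evaluates unexpectedly.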
Assignments and state
A very important aspect here is whether the variables are intended to be inert symbolic entities, or you plan to store some values in them. Here are a few things to keep in mind:
Values stored in variables will be stored in different types of rules for symbol variables and indexed variables:
- For symbol-based variables, these will be in `OwnValues`.
- For indexed variables, these will be in `DownValues`, or sometimes `SubValues`, if you use nested indices.
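A short sketch of where each kind of assignment ends up; the symbols `s`, `a`, and `b` are placeholders used only for this illustration.

```mathematica
ClearAll[s, a, b];

s = 1;         (* plain symbol: stored in OwnValues[s] *)
a[1] = 2;      (* indexed variable: stored in DownValues[a] *)
b[1][2] = 3;   (* nested index: stored in SubValues[b] *)

{Length[OwnValues[s]], Length[DownValues[a]], Length[SubValues[b]]}
(* {1, 1, 1} *)
```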
Only symbols allow part assignments. So, for example, you can do
a = Range[10]; a[[5]] = 100;
but you can't do
a[1] = Range[10];   (* OK by itself *)
a[1][[5]] = 100     (* Won't work *)
This can be a big deal for some applications.
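One possible workaround, offered as a sketch rather than a recommendation, is to rebuild the stored list with `ReplacePart` and assign it back. Note that this replaces the whole value instead of modifying it in place.

```mathematica
ClearAll[a];

a[1] = Range[10];
a[1] = ReplacePart[a[1], 5 -> 100];   (* reassign the whole list *)
a[1]
(* {1, 2, 3, 4, 100, 6, 7, 8, 9, 10} *)
```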
Only symbols can serve as local variables / constants in `Module`, `Block`, `With`, `Function`, `Pattern`, etc.
For the case of many variables, indexed variables may be easier to manage, since you have to clear only one symbol.
To selectively clear a given indexed variable, you have to use `Unset`, not `Clear`:
a[1] =.
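The difference can be seen in a small sketch (the symbol `a` and its values are illustrative only):

```mathematica
ClearAll[a];

a[1] = 10; a[2] = 20;

a[1] =.;                      (* Unset removes only the rule for a[1] *)
afterUnset = DownValues[a]    (* only the rule for a[2] remains *)

Clear[a];                     (* Clear removes every value attached to a *)
afterClear = DownValues[a]
(* {} *)
```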
Indexed variables cannot be used inside `Compile`, although it may appear that they can.
If you must do assignments to many (indexed) variables, I'd consider using an `Association` instead. This may make it easier from the resource management point of view, since you can store an association in a single variable. An additional bonus is that part assignments to particular indexed variables are then allowed:
assoc = <|a -> {1, 2, 3}, b -> {4, 5, 6}|>;
assoc[[Key[a], 2]] = 10;
assoc
(* <|a -> {1, 10, 3}, b -> {4, 5, 6}|> *)
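The same idea works with integer keys standing in for indices. This is a minimal sketch; `store` is a hypothetical name. `Key[1]` is written explicitly because a bare `1` inside `Part` would refer to the first position rather than to the key `1`.

```mathematica
store = <||>;

store[1] = Range[5];           (* plays the role of a[1] = Range[5] *)
store[2] = {10, 20};

store[[Key[1], 3]] = 100;      (* part assignment is allowed here *)
store[1]
(* {1, 2, 100, 4, 5} *)

store = KeyDrop[store, 2];     (* remove a single "indexed variable" *)
Keys[store]
(* {1} *)
```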
Notes
As far as I can recall, being `AtomQ` is not a requirement for most uses of variables. Being a plain `Symbol` is required in some cases, such as local variables in scoping constructs or part assignments, as I explained above.
In general, my experience is that most uses of indexed variables in a pure programming context are more or less equivalent to using a hash table. In the context of symbolic manipulation, indexed variables can be quite useful in many ways: they can represent, for example, the coefficients of the powers in a polynomial, and many other things.
For anything involving programming / transformations, I'd stay away from `Subscript`, ``Notation` ``, `Symbolize`, and all other things that can mix evaluation and presentation aspects. Using them in code is just an invitation for trouble. If you want to format an expression in some way, write special functions that do so as a separate stage.
Using `DownValues` enables you to format the display in the subscripted form without using `Notation` and `Symbolize`:
(Format[#[n_]] := Subscript[#, n]) & /@ {x, σ, a};
kvar[k_] := Through[{x, σ, a}[k]]
kvar[3]
kvar[n]
If you will never use a symbolic index, you can restrict the argument of `kvar` to `Integer`, as you did originally.
What are the requirements for well behaved variables?
Functions are not variables, although in most cases, the kernel treats undefined variables and functions identically. Sometimes it doesn't. After all, there are places in mathematics where the difference between a number and a function is important.
One extreme and undocumented example is `Dt[]`, the total derivative function. There, `f[1]` is very much different from `f1`. `f[1]` is a number, the value of `f` at 1, constant by definition of a mathematical function of one variable, while `f1` is not assumed to be constant unless an explicit declaration is made.
f[1]           f[1]
Head[f[1]]     f
AtomQ[f[1]]    False
Dt[f[x], x]    f'[x]
Dt[f[y], x]    Dt[y, x] f'[y]
Dt[f[1], x]    0
Dt[f1, x]      Dt[f1, x]
D[f[1], x]     0
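The "explicit declaration" for a plain symbol can be made in at least two ways that I am aware of: the `Constants` option of `Dt`, or the `Constant` attribute. A brief sketch, reusing the symbol `f1` from the table above:

```mathematica
ClearAll[f1, x];

r1 = Dt[f1, x]                        (* stays unevaluated: Dt[f1, x] *)

r2 = Dt[f1, x, Constants -> {f1}]     (* declared constant for this call only *)
(* 0 *)

SetAttributes[f1, Constant];          (* declared constant from now on *)
r3 = Dt[f1, x]
(* 0 *)
```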
What is the recommended and most elegant form of indexed variables?
A very simple method is to make this definition for each indexed variable:
x[i_Integer] := x[i] = With[{u = Unique[x]}, Format[u] = Subscript[x, i]; u]
Or,
defineIndexedVariable[x_Symbol] := (
x[i_Integer] := x[i] = With[{u = Unique[x]}, Format[u] = Subscript[x, i]; u]
)
It allows negative subscripts, doesn't use `Symbol`, and `Unique[x]` handles `Context` properly, but `InputForm[Array[x,5]]` prints (say) `{x$2136, x$2152, x$2163, x$2164, x$2165}`. There is no temptation to sometimes write `x[1]` and sometimes write `x$2136`.
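A quick check of this behavior, as a sketch: the symbol `y` is a hypothetical stand-in for whichever indexed variable you define, and the actual `y$nnn` names will differ from session to session.

```mathematica
ClearAll[defineIndexedVariable, y];

defineIndexedVariable[x_Symbol] := (
  x[i_Integer] := x[i] = With[{u = Unique[x]}, Format[u] = Subscript[x, i]; u]
);

defineIndexedVariable[y];

(* each index yields an atomic symbol, created once and then memoized *)
checks = {Head[y[1]], AtomQ[y[1]], y[1] === y[1], y[1] === y[2], Head[y[-2]]}
(* {Symbol, True, True, False, Symbol} *)
```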
Another, more complicated, method of constructing symbols on the fly is, well, not very elegant, and is an example of how complicated things can suddenly collapse into simplicity, but it does avoid repeated string operations and problems with `$Context`. It allows us to write `v[1]` and have it print as `Subscript[v, 1]` (which displays $\text{v}_1$), and have `{InputForm[v[1]], Head[v[1]], AtomQ[v[1]]}` evaluate to `{v⎵1, Symbol, True}`. `InputForm[Array[v,5]]` prints `{v⎵1, v⎵2, v⎵3, v⎵4, v⎵5}`, and we can sometimes write `v[1]` and sometimes `v⎵1`.
Simply paste this function into your notebook and evaluate it for each indexed symbol. Each nonnegative integer subscript will evaluate the symbol's function exactly once. The first version evaluates Context[FUN]<>SymbolName[FUN] every time a new subscript is encountered.
defineIndexedVariable[FUN_Symbol] :=
FUN[ix_Integer /; ix ≥ 0] := With[{
v=Symbol[Context[FUN]<>SymbolName[FUN]<>"⎵"<>ToString[ix]]
},
Format[v] = Subscript[FUN,ix];
FUN[ix] = v
]
For one symbol, optimize it by hand and specify context explicitly everywhere:
FUN`FUN[ix_Integer /; ix ≥ 0] := With[{
v=Symbol["FUN`FUN⎵"<>ToString[ix]]
},
Format[v] = Subscript[FUN`FUN,ix];
FUN`FUN[ix] = v
]
Or get the same optimization with this version:
defineIndexedVariable[FUN_Symbol] := With[{
prefix = Context[FUN] <> SymbolName[FUN] <> "⎵"
},
FUN[ix_Integer /; ix >= 0] := With[{
v = Symbol[prefix <> ToString[ix]]
},
Format[v] = Subscript[FUN, ix];
FUN[ix] = v
]
]
The `Context[FUN]` prefix ensures that all of the new symbols will be in the same context as the function definition. The `FUN[ix] =` memoizes the function so `Symbol` is called only once for each distinct index, and the `Format[v]` definition is made only once. The absence of a semicolon after the `FUN[ix] = v` is absolutely essential: a trailing semicolon would make the `With` body return `Null` instead of the new symbol.
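After `defineIndexedVariable[v]`, we get (`OutputForm` on a text terminal; the two-dimensional subscripts are written here in linear form as `Subscript[v, 1]`):
v[1]           Subscript[v, 1]
Head[v[1]]     Symbol
AtomQ[v[1]]    True
Dt[v[x], x]    v'[x]
Dt[v[y], x]    Dt[y, x] v'[y]
Dt[v[1], x]    Dt[Subscript[v, 1], x]
Dt[v1, x]      Dt[v1, x]
D[v[1], x]     0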
`Information["v"]` gives:
Global`v
v[1] = v⎵1
v[ix$_Integer/;ix$>=0]:=With[{v$=Symbol["Global`v⎵"<>ToString[ix$]]},Format[v$]=Subscript[v,ix$];v[ix$]=v$]
Using the standard front end, `??v` does not show `v⎵1`. But `InputForm[Array[v,5]]` prints `{v⎵1, v⎵2, v⎵3, v⎵4, v⎵5}`.
Code that printed the above tables:
Function[x,{Table[" ",{Length[x]}],HoldForm/@Unevaluated[x],x}, HoldAll][{v[1],InputForm[v[1]],Head[v[1]],AtomQ[v[1]],Dt[v[x],x],Dt[v[y],x],Dt[v[1],x],Dt[v1,x],D[v[1],x]}]//Transpose//TableForm//Print