How to disallow the application of functions outside a white list?

Here's a sample implementation of 2, although I think simply writing a generalizable list-like interface is probably the safer route:

whitelisted =
  Association@
   Thread[
    Thread@
      Hold@
       {
        Hold, HoldPattern, Print
        } ->
     Null];

discreteData::unknown = 
  "Result of applying function `` to discreteData object is unknown";
discreteData /:
 HoldPattern[
  func_?(Function[Null, ! KeyMemberQ[whitelisted, Hold[#]], 
      HoldAllComplete])[d_discreteData]
  ] :=
 (
  Message[discreteData::unknown, HoldForm[func]];
  d
  );
discreteData /:
  HoldPattern[
   func_?(Function[Null, ! KeyMemberQ[whitelisted, Hold[#]], 
       HoldAllComplete])[a___, d_discreteData, b___]
   ] :=
  (
   Message[discreteData::unknown, HoldForm[func]];
   HoldComplete[func[a, d, b]]
   );

Then:

In[727]:= List[discreteData[1]]

During evaluation of In[727]:= discreteData::unknown: Result of applying function List to discreteData object is unknown

Out[727]= discreteData[1]

But

In[728]:= Hold@discreteData[1]

Out[728]= Hold[discreteData[1]]

And including Simon Wood's example now:

In[734]:= discreteData[1] + 1

During evaluation of In[734]:= discreteData::unknown: Result of applying function Plus to discreteData object is unknown

Out[734]= HoldComplete[1 + discreteData[1]]
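
To extend the white list later without rebuilding the association, a small helper can add keys in the same Hold[...] form. This is only a minimal sketch: allowFunction is a hypothetical name, not part of the answer above, and it assumes whitelisted has already been defined as shown.

```mathematica
(* allowFunction is a hypothetical helper, not part of the original answer. *)
(* HoldAllComplete keeps the function name unevaluated, so the new key has
   the same Hold[...] form as the existing keys of whitelisted. *)
ClearAll[allowFunction];
SetAttributes[allowFunction, HoldAllComplete];
allowFunction[f_Symbol] := (AssociateTo[whitelisted, Hold[f] -> Null];)

allowFunction[List];
KeyMemberQ[whitelisted, Hold[List]]  (* True *)
```

Because the pattern test looks up whitelisted each time it runs, List[discreteData[1]] should no longer raise discreteData::unknown after this call.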

You may continue to use UpValues and construct a pattern that raises a Message when a function not in the white list is called.

First create a Message to raise.

discreteData::undefsym = "discreteData not defined for `1`.";

Then define an UpValue pattern that raises the message when a function outside the white list is used.

discreteData /:
 f_[___, d_discreteData, ___] /; 
  ! MemberQ[{Normal, Format, Length, Part, Set, If, List}, f] :=
 Message[discreteData::undefsym, f]

Functions on the white list work as expected; others do not.

Sqrt /@ dd

discreteData::undefsym: discreteData not defined for Map.

Note that the following work: discreteData sits deep enough inside the arguments of Map and Join that the pattern does not match them, and Normal, which does receive it directly, is on the white list.

Normal /@ {dd, dd}
Join[{dd}, {dd}]

The following do not work: Sqrt is not on the white list and receives discreteData directly as its argument, and in the second line the same is true of Join.

Sqrt /@ {dd, dd}
Join[dd, dd]
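
The level dependence can be reproduced on a self-contained toy object. This is a sketch only: toyData and its two-entry white list are hypothetical stand-ins for discreteData, chosen so the snippet runs in a fresh session.

```mathematica
(* toyData and its white list are hypothetical stand-ins for discreteData. *)
ClearAll[toyData];
toyData::undefsym = "toyData not defined for `1`.";

(* The rule only matches when a toyData object is a direct argument of f. *)
toyData /:
 f_[___, _toyData, ___] /; ! MemberQ[{List, Set}, f] :=
  Message[toyData::undefsym, f]

(* Length receives a List, so the rule does not fire. *)
Length[{toyData[1], toyData[2]}]

(* Length receives toyData directly and is not on the white list,
   so the message is issued instead. *)
Length[toyData[1]]
```

Set is included in the toy white list so that an assignment such as x = toyData[1] is not intercepted by the same rule.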

Hope this helps.


Since, to the best of my knowledge, it is not possible to program a custom interface that is integrated at the level of atomic objects such as a SparseArray, I think some compromise is unavoidable.

Edmund's method is the first one that came to mind, and I voted for it. It may still be your best option.

However, it is not robust in the manner I think you are seeking. Consider for example:

expr = Hold[{1, discreteData[x, y, {z1, z2, z3}], 3}];

Apply[foo, expr, {2}]
Apply[foo, expr, {3}]
Map[foo, expr, {3}]

Hold[{1, foo[x, y, {z1, z2, z3}], 3}]

Hold[{1, discreteData[x, y, foo[z1, z2, z3]], 3}]

Hold[{1, discreteData[foo[x], foo[y], foo[{z1, z2, z3}]], 3}]

I imagine that none of these are the result you would hope to see from each operation.

Perhaps it would be better for these functions to ignore your construct rather than mutating it errantly.


Proposal

One compromise to achieve this is to make a separate definition for each Symbol to which a data set is assigned.

  • To do this I shall use a public function setDiscreteData and a global variable $DD

  • All your UpSets should be given in the body of setDiscreteData

 

Attributes[setDiscreteData] = HoldFirst;

setDiscreteData[s_Symbol, rhs_] := With[{a := $DD[s]},

    a = rhs;

    s=. ^:= (ClearAll[s]; a=.);

    Set[s, new_] ^:= a = new;

    Length[s]    ^:= a // Query["tally", Tr, Last];

  ]

Now you would define a data set like this:

setDiscreteData[dd,
 <|"scale" -> 1.234, "bias" -> 5.678, 
  "tally" -> {{-5, 2}, {-4, 251}, {-3, 5941}, {-2, 60383}, {-1, 241185}, {0, 
     383613}, {1, 241644}, {2, 61035}, {3, 5686}, {4, 259}, {5, 1}}|>
];

The defined rules apply:

Length[dd]
1000000
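
To see where the rules and the data live, the symbol can be inspected after the call above. This is an exploratory sketch, assuming setDiscreteData and dd were defined exactly as shown; no names are introduced beyond those in the answer.

```mathematica
(* Rules attached to dd by setDiscreteData (for Unset, Set and Length). *)
UpValues[dd]

(* The association itself is stored under $DD, keyed by the symbol. *)
$DD[dd]

(* dd has no value of its own, which is why other functions
   treat it as a plain Symbol. *)
OwnValues[dd]
```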

Every other operation sees dd as an atomic Symbol and handles it as such:

expr = Hold[{1, dd, 3}];

Apply[foo, expr, {2}]
Apply[foo, expr, {3}]
Map[foo, expr, {3}]

Hold[{1, dd, 3}]

Hold[{1, dd, 3}]

Hold[{1, dd, 3}]

Replace[dd, x_ :> foo, 1]
dd + 8

dd

8 + dd

Once a data set Symbol (dd) has been assigned it can be reassigned with Set:

dd = 
  <|"scale" -> 1.234, "bias" -> 5.678, "tally" -> {{-1, 2}, {0, 7}, {1, 4}}|>;

Length[dd]
13

The Symbol may be reset to normal (all definitions cleared) using Unset:

dd=.

Incidentally, this method is related to the question "How to Set parts of indexed lists?"