Do you find you still need variables you can change, and if so why?

The hardest problem I've come across is shuffling a list. The Fisher-Yates algorithm (also known as the Knuth shuffle) involves iterating through the list, swapping each item with a randomly chosen item at or after its position. The algorithm is O(n), well known and long since proven correct (an important property in some applications). But it requires mutable arrays.

That isn't to say you can't do shuffling in a functional program. Oleg Kiselyov has written about this. But if I understand him correctly, functional shuffling is O(n log n) because it works by building a binary tree.
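To make the cost difference concrete, here is a minimal sketch of a purely functional shuffle. It is not Kiselyov's tree-based algorithm; it is Fisher-Yates with a Data.Map (from the containers package) standing in for the mutable array, so every read and write costs O(log n) and the whole shuffle O(n log n). The names pureShuffle and demo are made up for illustration, and the random package is assumed to be installed.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as M
import System.Random (RandomGen, mkStdGen, randomR)

-- | Fisher-Yates over a persistent map instead of a mutable array.
-- Hypothetical helper: each lookup/insert is O(log n), so the whole
-- shuffle is O(n log n) rather than O(n).
pureShuffle :: RandomGen g => g -> [a] -> [a]
pureShuffle gen xs = M.elems (fst (foldl' step (start, gen) [1 .. n]))
  where
    n = length xs
    -- Index the elements 1..n, mirroring the array bounds of an in-place version.
    start = M.fromList (zip [1 ..] xs)
    -- Swap position i with a random position r in [i, n], threading the
    -- generator through the fold by hand.
    step (m, g) i =
      let (r, g') = randomR (i, n) g
          vi = m M.! i
          vr = m M.! r
      in (M.insert r vi (M.insert i vr m), g')

-- | Example call with a fixed seed (the seed value is arbitrary).
demo :: [Int]
demo = pureShuffle (mkStdGen 42) [1 .. 10]
```

The result is always a permutation of the input, and the same seed always gives the same permutation. The price of avoiding mutation here is the extra log n factor and the explicit generator threading.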

Of course, if I needed to write the Fisher-Yates algorithm in Haskell I'd just put it in the ST monad, which lets you wrap up an algorithm involving mutable arrays inside a nice pure function, like this:

-- | Implementation of the random swap algorithm for shuffling.  Reads a list
-- into a mutable ST array, shuffles it in place, and reads out the result
-- as a list.

module Data.Shuffle (shuffle) where


import Control.Monad
import Control.Monad.ST
import Data.Array.ST
import Data.STRef
import System.Random

-- | Shuffle a list using the supplied random generator.
shuffle :: (RandomGen g) => g -> [a] -> [a]
shuffle _ [] = []
shuffle g xs = 
    runST $ do
      sg <- newSTRef g
      let n = length xs
      v <- newListArray (1, n) xs
      mapM_ (shuffle1 sg v) [1..n]
      getElems v

-- Internal function to swap element i with a random element at or above it.
shuffle1 :: (RandomGen g) => STRef s g -> STArray s Int a -> Int -> ST s ()
shuffle1 sg v i = do
  (_, n) <- getBounds v
  r <- getRnd sg $ randomR (i, n)
  when (r /= i) $ do
    vi <- readArray v i
    vr <- readArray v r
    writeArray v i vr
    writeArray v r vi


-- Internal function for using random numbers
getRnd :: (RandomGen g) => STRef s g -> (g -> (a, g)) -> ST s a
getRnd sg f = do
  g1 <- readSTRef sg
  let (v, g2) = f g1
  writeSTRef sg g2
  return v

If you want to make the academic argument, then of course it's not technically necessary to assign a variable more than once. The proof is that all code can be represented in SSA (Static Single Assignment) form. Indeed, that's the most useful form for many kinds of static and dynamic analysis.
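As a rough illustration of what that transformation looks like, the sketch below shows a small imperative fragment (in the comment) and a single-assignment equivalent. The function name scaleClamped and the fragment itself are invented for this example, and the if expression only approximates what a compiler's phi function does at a merge point.

```haskell
-- Imperative original, with one variable assigned up to three times:
--
--   x = a + b
--   if (x > limit) x = limit
--   x = x * 2
--
-- Single-assignment version: every name is bound exactly once. The `if`
-- expression roughly plays the role of SSA's phi function, selecting
-- which earlier value flows onward after the branch.
scaleClamped :: Int -> Int -> Int -> Int
scaleClamped limit a b =
  let x1 = a + b
      x2 = if x1 > limit then limit else x1
      x3 = x2 * 2
  in x3
```

Three names replace one variable, which is the trade-off under discussion: nothing is ever overwritten, but the reader has to track x1, x2 and x3 separately.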

At the same time, there are reasons we don't all write code in SSA form to begin with:

It usually takes more statements (or more lines of code) to write code this way. Brevity has value.
It's almost always less efficient. Yes, I know you're talking about higher-level languages -- a fair scoping -- but even in the world of Java and C#, far away from assembly, speed matters. There are few applications where speed is irrelevant.
It's not as easy to understand. Although SSA is "simpler" in a mathematical sense, it's further removed from common sense, which is what matters in real-world programming. If you have to be really smart to grok it, then it has no place in programming at large.

Even in your examples above, it's easy to poke holes. Take your case statement. What if there's an administrative option that determines whether '*' is allowed, and a separate one for whether '?' is allowed? Also, zero is not allowed for the integer case, unless the user has a system permission that allows it.

This is a more real-world example with branches and conditions. Could you write this as a single "statement"? If so, is your "statement" really different from many separate statements? If not, how many temporary write-only variables do you need? And is that situation significantly better than just having a single variable?
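For concreteness, here is one way the rules just described might look as a single expression. Everything in it is hypothetical: the Settings record, the Input type, and the assumption that any other symbol or number is accepted are invented for this sketch, since the original case statement isn't reproduced here.

```haskell
-- Hypothetical configuration: two administrative options plus a
-- per-user permission for zero.
data Settings = Settings
  { allowStar     :: Bool
  , allowQuestion :: Bool
  , allowZero     :: Bool
  }

-- Hypothetical input type covering the symbol and integer cases.
data Input = Symbol Char | Number Int

-- One expression, no reassignment. Assumes (for illustration only) that
-- every symbol other than '*' and '?' and every non-zero number is allowed.
isAllowed :: Settings -> Input -> Bool
isAllowed s input = case input of
  Symbol '*' -> allowStar s
  Symbol '?' -> allowQuestion s
  Symbol _   -> True
  Number 0   -> allowZero s
  Number _   -> True
```

Whether this counts as one "statement" or as five separate ones written in a different syntax is exactly the question being asked.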
