Monday, December 31, 2012

Evaluating probabilistic cellular automata is comonadic, too! (part II)

Part I is here. The complete code is here.

We have defined our CA datatype and a local update rule for it. Since our CA is probabilistic, the update rule has a monadic effect in the Rand monad.

To perform a "global update" of the CA, we pass the current CA state and the local update rule to the extend function of Comonad:

extend localRule ca :: EnvT Probs U (Rand StdGen Bool)

The returning type is almost, but not quite, what we need. We now have a CA in which each cell is a monadic random effect. What we would like is to have the whole CA inside the Rand monad, so that a single call to runRand or evalRand gives us a value of EnvT Probs U Bool. We need a way to aggregate the monadic effects of each singular cell into a global monadic effect for the whole CA.

This is similar to the problem of transforming a list of IO actions into an IO action that returns a list. List is an instance of Traversable, so we can use the sequence function on it. Likewise, we must make our type U an instance of Traversable.

The easiest way is to have the instance derived for you automatically. Just use the DeriveTraversable extension and add Traversable to the deriving clause when declaring the U datatype.

But it doesn´t work. More precisely, it works only for one half of the universe! I wanted to print the CA's new state, 8 cells around the center (the showca function is defined in the gist):

    putStrLn . showca 8 
             . lower 
             . flip evalRand (mkStdGen 7) 
             . sequence 
             $ extend localRule ca

This printed only the left side of the universe. When it tried to print the first cell to the right, the whole computation just hanged. Why?

sequence is defined in terms of traverse. The default definition provided by the compiler first traverses the left side of the universe, then the center cell and the right side last. The StdGen generator is threaded throughout. But each side is infinite; when traversing the left side, we can get the updated generator to use on the right side only after an infinite number of steps. Laziness doesn't help here, because there is a data dependency.

This stumped me for a while, but finally I came up with an alternate version of traverse which doesn't suffer this problem:

    instance Traversable U where
        traverse f (U lstream focus rstream) = 
            let pairs =  liftA unzip 
                       . sequenceA . fmap (traversepair f) 
                       $ zip lstream rstream 
                traversepair f (a,b) = (,) <$> f a <*> f b
                rebuild c (u,v) = U u c v
            in rebuild <$> f focus <*> pairs

The trick is to zip the two streams together before traversing them, and unzip them afterwards inside the monad. This way, accessing the value in the first cell to the right doesn't require an infinite number of steps.

That solved, there is still one problem left. When we use evalRand to obtain the new state of the CA, we don't get an updated generator alongside it. And we can't use runRand to get it because the CA is potentially infinite. As we lazily consume the returned CA state, the generator we supplied is still used "under the hood" for generating new values. If we feed the updated generator returned by runRand to the following iteration, the whole computation would hang, again.

Fortunately, the MonadRandom class provides a getSplit operation which "bifurcates" a generator. At each iteration, we split the generator, throwing one half into the black hole of evalRand, and keeping the other half for the next iteration. We now know enough to define a function that, given a generator and an initial state of the CA, produces a lazy stream of all the succesive states of the automaton:

evolve :: EnvT Probs U Bool -> Rand StdGen (EnvT Probs U Bool)
evolve ca = sequence $ extend localRule ca

history :: StdGen -> EnvT Probs U Bool -> Stream (EnvT Probs U Bool)
history seed initialca =     
   let unfoldf (ca,seed) = 
           let (seed',seed'') = runRand getSplit seed 
               nextca = evalRand (evolve ca) seed'
           in  (nextca,(nextca,seed''))
   in unfold unfoldf (initialca,seed)

That's all folks! Happy New Year.

Edit: I wasn't sure of the correctness of my Traversable instance, so I posted a question on Stack Overflow.