F# Immutable data structures for high frequency real-time streaming data

You should consider FSharpx.Collections.Vector. Vector<'T> gives you array-like features, including indexed look-up and update in O(log32(n)), which is within spitting distance of O(1), as well as appending new elements to the end of your sequence. Another implementation of Vector that can be used from F# is Solid Vector. It is very well documented, and some functions perform up to 4X faster at large scale (element count > 10K). Both implementations perform very well up to and possibly beyond 1M elements.
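To make the "spitting distance of O(1)" claim concrete, here is a small sketch in plain F# with no library references. It assumes the vector is backed by a trie with a branching factor of 32, which is what the log32 bound suggests. The helper name trieDepth is made up for this illustration and is not part of either library.

```fsharp
// Hypothetical helper (not part of FSharpx or Solid): the number of levels
// a 32-way trie needs in order to index n elements. An indexed look-up
// touches roughly one node per level.
let trieDepth (n: int) =
    let rec loop (capacity: int64) depth =
        if capacity >= int64 n then depth
        else loop (capacity * 32L) (depth + 1)
    loop 32L 1

// Print the depth for a few sizes, up to the 1M elements mentioned above.
for n in [ 32; 1000; 10000; 1000000 ] do
    printfn "%7d elements -> %d level(s)" n (trieDepth n)
```

Under that assumption, even one million elements sit only four levels deep, so a look-up costs a handful of pointer hops regardless of size.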


In his answer, Jack Fox suggests using either the FSharpx.Collections Vector<'T> or the Solid Vector<'t> by Greg Rosenbaum (https://github.com/GregRos/Solid). I thought I might give back a bit to the community by providing instructions on how to get up and running with each of them.

Using the FSharpx.Collections.Vector<'T>

The process is pretty straightforward:

  1. Download the FSharpx.Core NuGet package using either the Package Manager Console or Manage NuGet Packages for Solution. Both are found in Visual Studio -> Tools -> Library Package Manager.
  2. If you're using it in an F# script file, add #r "FSharpx.Core.dll". You may need to use a full path.

Usage:

open FSharpx.Collections

let ListOfTuples = [(1,true,3.0);(2,false,1.5)] 
let vector = ListOfTuples |> Vector.ofSeq

printfn "Last %A" vector.Last
printfn "Unconj %A" vector.Unconj
printfn "Item(0) %A" (vector.[0])
printfn "Item(1) %A" (vector.[1])
printfn "TryInitial %A" vector.TryInitial
printfn "TryUnconj %A" vector.TryUnconj

Using the Solid.Vector<'T>

Getting set up to use the Solid Vector<'t> is a bit more involved. But the Solid version has a lot more handy functionality and, as Jack pointed out, has a number of performance benefits. It also has a lot of useful documentation.

  1. You will need to download the Visual Studio solution from https://github.com/GregRos/Solid
  2. Once you have downloaded it, you will need to build it, as there is no ready-to-use pre-built dll.
  3. If you're like me, you may run into a number of missing dependencies that prevent the solution from being built. In my case, they were all related to the NUnit testing framework (I use a different one). Just work through downloading/adding each of the dependencies until the solution builds.
  4. Once that is done and the solution is built, you will have a shiny new Solid.dll in the Solid/Solid/bin folder. This is where I went wrong. That is the core dll and is only enough for C# usage. If you only include a reference to Solid.dll, you will be able to create a vector<'T> in F#, but funky things will happen from then on.
  5. To use this data structure in F#, you will need to reference both Solid.dll and Solid.FSharp.dll, which is found in the \Solid\SolidFS\obj\Debug\ folder. You will only need one open statement -> open Solid

Here is some code showing usage in an F# script file:

#r "Solid.dll"
#r "Solid.FSharp.dll" // don't forget this reference

open Solid

let ListOfTuples2 = [(1,true,3.0);(2,false,1.5)] 
let SolidVector = ListOfTuples2 |> Vector.ofSeq

printfn "%A" SolidVector.Last
printfn "%A" SolidVector.First
printfn "%A" (SolidVector.[0])
printfn "%A" (SolidVector.[1])
printfn "Count %A" SolidVector.Count

let test2 = vector { for i in {0 .. 100} -> i }