TransWikia.com

What's new in purely functional data structures since Okasaki?

Theoretical Computer Science Asked on October 30, 2021

Since Chris Okasaki’s 1998 book “Purely functional data structures”, I haven’t seen too many new exciting purely functional data structures appear; I can name just a few:

  • IntMap (also invented by Okasaki in 1998, but not present in that book)
  • Finger trees (and their generalization over monoids)

There are also some interesting ways of implementing already known datastructures, such as using “nested types” or “generalized algebraic datatypes” to ensure tree invariants.

Which other new ideas have appeared since 1998 in this area?

6 Answers

Following up on the 2012 paper linked above, the work on RRB vectors has since been extended and published in ICFP'15.

RRB vector: a practical general purpose immutable sequence http://dl.acm.org/citation.cfm?id=2784739

Answered by Mike Rainey on October 30, 2021

Rangemaps

It is a specialized data structure, but it can be used as a substitute for Martin Erwig's DIET, with slightly different properties, so at least there is one existing data structure to compare it to. The DIET itself was described in an article in JFP in 1998, so perhaps it is not included in Purely Functional Data Structures.

Answered by Complicated see bio on October 30, 2021

I'd add McBride's version of zippers as derivatives of data types.

Answered by none on October 30, 2021

To the excellent notes already made, I’ll add Zippers.

Huet, Gerard. “Functional Pearl: The Zipper” Journal of Functional Programming 7 (5): 549-554, September 1997.

Wikipedia: Zipper (data structure)

Answered by Matt Might on October 30, 2021

New purely functional data structures published since 1998:

Known in 1997, but not discussed in Okasaki's book:

  • Many other styles of balanced search tree. AVL, brother, rank-balanced, bounded-balance, and many other balanced search trees can be (and have been) implemented purely functionally by path copying. Perhaps deserving special mention are:

    • Biased Search Trees, by Samuel W. Bent, Daniel D. Sleator, and Robert E. Tarjan: A key element in Brodal et al.'s 2006 paper and Demaine et al.'s 2008 paper.
  • Infinite sets that admit fast exhaustive search, by Martín Escardó: Perhaps not a data structure per se.

  • Three algorithms on Braun Trees, by Chris Okasaki: Braun trees offer many stack operations in worst-case O(lg n). This bound is surpassed by many other data structures, but Braun trees have a cons operation lazy in its second argument, and so can be used as infinite stacks in some ways that other structures cannot.

  • The relaxed min-max heap: A mergeable double-ended priority queue and The KD heap: An efficient multi-dimensional priority queue, by Yuzheng Ding and Mark Allen Weiss: These happen to be purely functional, though this is not discussed in the papers. I do not think the time bounds achieved are any better than those that can be achieved by using finger trees (of Hinze & Paterson or Kaplan & Tarjan) as k-dimensional priority queues, but I think the structures of Ding & Weiss uses less space.

  • The Zipper, by Gérard Huet: Used in many other data structures (such as Hinze & Paterson's finger trees), this is a way of turning a data structure inside-out.

  • Difference lists are O(1) catenable lists with an O(n) transformation to usual cons lists. They have apparently been known since antiquity in the Prolog community, where they have an O(1) transformation to usual cons lists. The O(1) transformation seems to be impossible in traditional functional programming, but Minamide's hole abstraction, from POPL '98, discusses a way of allowing O(1) append and O(1) transformation within pure functional programming. Unlike the usual functional programming implementations of difference lists, which are based on function closures, hole abstractions are essentially the same (in both their use and their implementation) as Prolog difference lists. However, it seems that for years the only person that noticed this was one of Minamide's reviewers.

  • Uniquely represented dictionaries support insert, update, and lookup with the restriction that no two structures holding the same elements can have distinct shapes. To give an example, sorted singly-linked lists are uniquely represented, but traditional AVL trees are not. Tries are also uniquely represented. Tarjan and Sundar, in "Unique binary search tree representations and equality-testing of sets and sequences", showed a purely functional uniquely represented dictionary that supports searches in logarithmic time and updates in $O(sqrt{n})$ time. However, it uses $Theta(n lg n)$ space. There is a simple representation using Braun trees that uses only linear space but has update time of $Theta(sqrt{n lg n})$ and search time of $Theta(lg^2 n)$

Mostly functional data structures, before, during, and after Okasaki's book:

  • Many procedures for making data structures persistent, fully persistent, or confluently persistent: Haim Kaplan wrote an excellent survey on the topic. See also above the work of Demaine et al., who demonstrate a fully persistent array in $O(m)$ space (where $m$ is the number of operations ever performed on the array) and $O(lg lg n)$ expected access time.

  • 1989: Randomized Search Trees by Cecilia R. Aragon and Raimund Seidel: These were discussed in a purely functional setting by Guy E. Blelloch and Margaret Reid-Miller in Fast Set Operations Using Treaps and by Dan Blandford and Guy Blelloch in Functional Set Operations with Treaps (code). They provide all of the operations of purely functional fingertrees and biased search trees, but require a source of randomness, making them not purely functional. This may also invalidate the time complexity of the operations on treaps, assuming an adversary who can time operations and repeat the long ones. (This is the same reason why imperative amortization arguments aren't valid in a persistent setting, but it requires an adversary with a stopwatch)

  • 1997: Skip-trees, an alternative data structure to Skip-lists in a concurrent approach, by Xavier Messeguer and Exploring the Duality Between Skip Lists and Binary Search Trees, by Brian C. Dean and Zachary H. Jones: Skip lists are not purely functional, but they can be implemented functionally as trees. Like treaps, they require a source of random bits. (It is possible to make skip lists deterministic, but, after translating them to a tree, I think they are just another way of looking at 2-3 trees.)

  • 1998: All of the amortized structures in Okasaki's book! Okasaki invented this new method for mixing amortization and functional data structures, which were previously thought to be incompatible. It depends upon memoization, which, as Kaplan and Tarjan have sometimes mentioned, is actually a side effect. In some cases (such as PFDS on SSDs for performance reasons), this may be inappropriate.

  • 1998: Simple Confluently Persistent Catenable Lists, by Haim Kaplan, Chris Okasaki, and Robert E. Tarjan: Uses modification under the hood to give amortized O(1) catenable deques, presenting the same interface as an earlier (purely functional, but with memoization) version appearing in Okasaki's book. Kaplan and Tarjan had earlier created a purely functional O(1) worst-case structure, but it is substantially more complicated.

  • 2007: As mentioned in another answer on this page, semi-persistent data structures and persistent union-find by Sylvain Conchon and Jean-Christophe Filliâtre

Techniques for verifying functional data structures, before, during, and after Okasaki's book:

Imperative data structures or analyses not discussed in Okasaki's book, but related to purely functional data structures:

  • The Soft Heap: An Approximate Priority Queue with Optimal Error Rate, by Bernard Chazelle: This data structure does not use arrays, and so has tempted first the #haskell IRC channel and later Stack Overflow users, but it includes delete in o(lg n), which is usually not possible in a functional setting, and imperative amortized analysis, which is not valid in a purely functional setting.

  • Balanced binary search trees with O(1) finger updates. In Making Data Structures Persistent, James R Driscoll, Neil Sarnak, Daniel D. Sleator, and Robert E. Tarjan present a method for grouping the nodes in a red-black tree so that persistent updates require only O(1) space. The purely functional deques and finger trees designed by Tarjan, Kaplan, and Mihaescu all use a very similar grouping technique to allow O(1) updates at both ends. AVL-trees for localized search by Athanasios K. Tsakalidis works similarly.

  • Faster pairing heaps or better bounds for pairing heaps: Since Okasaki's book was published, several new analyses of imperative pairing heaps have appeared, including Pairing heaps with O(log log n) decrease Cost by Amr Elmasry and Towards a Final Analysis of Pairing Heaps by Seth Pettie. It may be possible to apply some of this work to Okasaki's lazy pairing heaps.

  • Deterministic biased finger trees: In Biased Skip Lists, by Amitabha Bagchi, Adam L. Buchsbaum, and Michael T. Goodrich, a design is presented for deterministic biased skip lists. Through the skip list/tree transformation mentioned above, it may be possible to make deterministic biased search trees. The finger biased skip lists described by John Iacono and Özgür Özkan in Mergeable Dictionaries might then be possible on biased skip trees. A biased finger tree is suggested by Demaine et al. in their paper on purely functional tries (see above) as a way to reduce the time-and space bounds on finger update in tries.

  • The String B-Tree: A New Data Structure for String Search in External Memory and its Applications by Paolo Ferragina and Roberto Grossi is a well studied data structure combining the benefits of tries and B-trees.

Answered by jbapple on October 30, 2021

Answered by Radu GRIGore on October 30, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP