MapDB 4 update

I resumed work on MapDB 4 recently. Here is an update about my progress and plans.

What is finished

Not much. I have many notes, some code and design.

Plans for March

  • Restart work on MapDB 4
    • get skeleton with most features
    • release milestone 1 (see below)
    • get back into incremental development
  • Catalog all unit tests for MapDB (and java collections)
    • port tests into MapDB 4 code
  • Create a tool to export/import data from MapDB 3
  • Read all issues on GitHub and create a roadmap
  • Maintenance release for MapDB 3 and maybe other versions
  • Fix website and documentation

MapDB 4 milestone 1

  • This milestone should flesh out my design ideas into code
  • Limited usability
    • fixed store size (2GB)
    • several heap memory leaks (support data structures will use on-heap collections)
  • use fixed-size ByteBuffers (in memory, mmap files)
    • 2GB or less (limit for single ByteBuffer)
    • store size change is responsible for many complications
  • will be single threaded
    • single writer, multiple readers (ReadWriteLock)
  • will include snapshots
    • snapshots play a major role in the MapDB 4 design
    • COW, use heap collection t
    • old snapshots discarded by GC
  • very basic durable TX
    • store will use file swap
  • should include most collection types (Map, List, Queues)
    • needed for benchmarking, even if initial performance is bad
  • primitive maps from Eclipse Collections backed by ByteBuffer
    • needed for temporary internal structures to track free space etc.
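
The fixed-size store and the single writer, multiple readers rule can be sketched roughly as follows. This is only an illustration under my own assumptions, not MapDB 4 code: the class name `FixedSizeStore` and its methods are hypothetical, and only `ByteBuffer` and `ReentrantReadWriteLock` come from the JDK.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: a store backed by one ByteBuffer whose size never changes.
final class FixedSizeStore {
    // A single ByteBuffer is indexed by int, which is where the 2GB ceiling comes from.
    private final ByteBuffer buf;
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    FixedSizeStore(int capacityBytes) {
        // Capacity is fixed here; there is no code path that grows the store.
        this.buf = ByteBuffer.allocateDirect(capacityBytes);
    }

    void putLong(int offset, long value) {
        lock.writeLock().lock();          // at most one writer at a time
        try {
            buf.putLong(offset, value);   // absolute put, does not move the buffer position
        } finally {
            lock.writeLock().unlock();
        }
    }

    long getLong(int offset) {
        lock.readLock().lock();           // many readers may hold this lock together
        try {
            return buf.getLong(offset);   // absolute get, safe to share between readers
        } finally {
            lock.readLock().unlock();
        }
    }

    int capacity() {
        return buf.capacity();
    }
}
```

Writing past the end of such a store fails with an `IndexOutOfBoundsException` instead of resizing, which is the trade-off a fixed store size implies.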

Compatibility with older versions

  • MapDB Backup Format
    • there will be new format for backups, imports and exports
    • JSON based
    • MapDB 1, 2, 3 and 4 will get exporters
      • this means modifying old code bases and making new releases with the modifications
    • MapDB 4 will have importer from this format
    • Later maybe support for other dbs (LevelDB, Redis, SQL, Cassandra…)
  • command line tools to convert files into new MapDB 4 format
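
To give a feel for what a JSON based export could look like, here is a minimal sketch. The backup format is not designed yet, so everything here is an assumption made up for illustration: the field names (`name`, `entries`, `key`, `value`), the class `JsonBackupSketch`, and the restriction to a `Map<String, Long>`.

```java
import java.util.Map;

// Hypothetical sketch of an exporter; the real MapDB backup format may look different.
final class JsonBackupSketch {

    // Escape a string so it can sit inside a JSON string literal.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c)); // control characters
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Write one named collection as a JSON object with a list of key/value entries.
    static String export(String collectionName, Map<String, Long> data) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"name\":\"").append(escape(collectionName)).append("\",\"entries\":[");
        boolean first = true;
        for (Map.Entry<String, Long> e : data.entrySet()) {
            if (!first) sb.append(',');
            first = false;
            sb.append("{\"key\":\"").append(escape(e.getKey()))
              .append("\",\"value\":").append(e.getValue()).append('}');
        }
        return sb.append("]}").toString();
    }
}
```

A text format like this is easy for old versions to emit, since an exporter needs no shared binary code with the importer.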

  • MapDB 4 will use different package names
    • it will be possible to use MapDB 4 with older versions when full class names are used (org.mapdb.DB)
    • class names will not conflict with other versions
    • org.mapdb will stay empty
    • org.mapdb.volume (introduced in MapDB3) will not be used
    • org.mapdb.serializer was renamed to org.mapdb.ser

Major changes in MapDB 4

  • eliminate class inheritance
    • always use interfaces
    • specialized classes generated by code generator
    • inheritance causes JIT performance issues when there are multiple inherited classes
    • package size will skyrocket
      • 10MB is ok, 20MB maybe, 50MB bad
      • I know Scala and the issues it causes on Android
  • Volumes are gone
    • it was an IO layer abstraction (ByteBuffer, RandomAccessFile)
    • it is not possible to wrap different types of IO into a single interface
    • use code generator to inline IO directly into Store
      • many store implementations (StoreDirect_RandomAccessFile, StoreDirect_ByteBuffer)…
  • Elsa and other POJO serializations are gone from default config
    • Default serializer will only support basic JVM data types (Long, String…)
    • POJO serialization is too complex to handle with default configuration
      • class rename, performance overhead…
      • POJO serialization must be configured separately
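
A rough sketch of "interfaces only, with IO inlined into each store" might look like the code below. The two class names come from the list above; the `Store` interface, its two methods and the constructors are my assumptions for illustration. In the real plan such classes would be produced by a code generator, while here they are written by hand.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Callers depend only on this interface; there is no shared abstract base class.
interface Store {
    void putLong(long offset, long value);
    long getLong(long offset);
}

// ByteBuffer access is written directly into the store, with no Volume layer in between.
final class StoreDirect_ByteBuffer implements Store {
    private final ByteBuffer buf;

    StoreDirect_ByteBuffer(int size) {
        this.buf = ByteBuffer.allocate(size);
    }

    public void putLong(long offset, long value) {
        buf.putLong(Math.toIntExact(offset), value);
    }

    public long getLong(long offset) {
        return buf.getLong(Math.toIntExact(offset));
    }
}

// Same interface, but the IO calls are RandomAccessFile seeks and reads.
final class StoreDirect_RandomAccessFile implements Store {
    private final RandomAccessFile raf;

    StoreDirect_RandomAccessFile(RandomAccessFile raf) {
        this.raf = raf;
    }

    public void putLong(long offset, long value) {
        try {
            raf.seek(offset);
            raf.writeLong(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public long getLong(long offset) {
        try {
            raf.seek(offset);
            return raf.readLong();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Both classes are final and share no parent class, so each one can be tuned for its own IO type. The cost is duplicated code, which is why a generator is needed and why the package size grows.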

Comments:

Pranas • a year ago

Sounds like real-logic/agrona could solve both the 2G limitation (buffers) and the package size (primitive collections to replace Eclipse Collections), and provide some more clever concurrency primitives.

Money Manager • 2 years ago

Will JSON handling be added as well? XPath-like queries, etc.

Pranas -> Money Manager • a year ago

https://commons.apache.org/… could be used as a wrapper on top of the stores. It’s nice not to have a bloated library …