Opsmas 2025 Day 2: varint updated & tigerwired

TOC:

  • varint
  • tigerwired

merry opsmas 2025 day 2.

varint

In my never-ending quest to make all software as efficient as possible, I collected and extended some variable-length integer strategies a while ago in varint.

It’s updated now with more features, more correctness, more examples, more encoding kinds, and more documentation.

The original variable-length data types are still there:

| Type | Metadata Location | Encoding | Max Bytes | 1-Byte Max | Sortable | Speed | Best For |
|------|-------------------|----------|-----------|------------|----------|-------|----------|
| Tagged | First byte | Big-endian | 9 | 240 | Yes | Fast | Database keys, sorted data |
| External | External | Little-endian | 8 | 255 | No | Fastest | Compact storage, metadata elsewhere |
| Split | First byte | Hybrid | 9 | 63 | No | Fast | Known bit boundaries, packing |
| Chained | Continuation bits | Variable | 9 | 127 | No | Slowest | Legacy compatibility |
| Packed | N/A | Bit-level | N/A | Configurable | Yes | Fast | Fixed-width integer arrays |
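For reference, the Chained row is the classic continuation-bit scheme you probably know from protobuf/LEB128. Here's a minimal standalone sketch of that idea; it is not the library's actual API, and unlike the library's Chained variant (which caps at 9 bytes by spending all 8 bits of the final byte) this plain 7-bit version can take up to 10 bytes for a full 64-bit value:

```c
#include <stdint.h>
#include <stddef.h>

/* Minimal continuation-bit ("Chained") varint sketch, LEB128-style.
 * Each byte carries 7 payload bits; the high bit means "more bytes follow".
 * Values <= 127 fit in a single byte. */
static size_t chained_encode(uint64_t v, uint8_t *out) {
    size_t n = 0;
    while (v >= 0x80) {
        out[n++] = (uint8_t)(v & 0x7F) | 0x80; /* 7 bits + continuation flag */
        v >>= 7;
    }
    out[n++] = (uint8_t)v; /* final byte: continuation flag clear */
    return n;              /* bytes written */
}

static size_t chained_decode(const uint8_t *in, uint64_t *out) {
    uint64_t v = 0;
    size_t n = 0;
    unsigned shift = 0;
    uint8_t b;
    do {
        b = in[n++];
        v |= (uint64_t)(b & 0x7F) << shift;
        shift += 7;
    } while (b & 0x80);
    *out = v;
    return n; /* bytes consumed */
}
```

The 7-bit payload per byte is also why the table lists 127 as the 1-byte max for Chained, while the Tagged format pushes the single-byte range up to 240 in exchange for a tag-decoding step.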

But we have even more now, along with plenty of fully integrated usage examples. The new encodings include (a rough sketch of the delta idea follows the list):

  • Packed Bit Arrays
  • Delta Encoding (varintDelta)
  • Frame-of-Reference (varintFOR)
  • Group Encoding (varintGroup)
  • Patched Frame-of-Reference (varintPFOR)
  • Dictionary Encoding (varintDict)
  • Bitmap Encoding (varintBitmap)
  • Adaptive Encoding (varintAdaptive)
    • Automatic encoding selector that analyzes data characteristics and chooses the optimal strategy (DELTA, FOR, PFOR, DICT, BITMAP, or TAGGED). Achieves 1.35x-6.45x compression without manual encoding selection. Self-describing format with a 1-byte header. Ideal for mixed workloads, log compression, and API responses. Full details in varintAdaptive.h.
  • Floating Point Compression (varintFloat)
  • Run-Length Encoding (varintRLE)
  • Elias Universal Codes (varintElias)
  • SIMD Block-Packed Encoding (varintBP128)
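As a taste of why these help, here is a rough sketch of the delta idea behind varintDelta, using hypothetical helper names rather than the library's API: sorted or timestamp-like sequences turn into tiny gaps, and tiny gaps varint-encode to a byte or so each.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical delta-encoding sketch (not the varintDelta API).
 * Keep the first value as a base, then store only the gaps between
 * consecutive values; each gap can then be varint-encoded (e.g. with the
 * chained_encode() sketch above), so near-consecutive sequences like
 * timestamps or IDs compress to roughly one byte per element. */
static void delta_forward(const uint64_t *vals, size_t n, uint64_t *out) {
    if (n == 0) return;
    out[0] = vals[0];                   /* base value kept as-is */
    for (size_t i = 1; i < n; i++)
        out[i] = vals[i] - vals[i - 1]; /* gaps; assumes non-decreasing input */
}

static void delta_inverse(const uint64_t *deltas, size_t n, uint64_t *out) {
    if (n == 0) return;
    out[0] = deltas[0];
    for (size_t i = 1; i < n; i++)
        out[i] = out[i - 1] + deltas[i]; /* running sum restores originals */
}
```

Frame-of-reference and PFOR apply the same intuition with a per-block reference value instead of a running previous value.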

stats

tigerwired

I was looking into using WiredTiger for a project, but I didn’t want to get its viral license cooties all over my other work, so I wrote a standard “get this away from me, look I’m not touching you” wrapper where you can use the full WiredTiger API surface over local unix sockets.

This is: tigerwired.

Some interesting parts:

  • We patch the WiredTiger build system so it works as a clean built-in sub-dependency of the proxy itself (clone the wiredtiger repo into deps/wiredtiger and tigerwired builds it for you)
  • The proxy supports batched/bulk writing, which gets within a usable fraction of embedded performance for high-throughput operations
  • Uses a custom wire protocol over the unix socket and includes client wrappers (a minimal socket-client sketch is below)
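On the client side, talking to the proxy is just a connect() to a local unix socket. The sketch below shows that plumbing only: the request bytes are a placeholder, not the actual tigerwired wire protocol (use the bundled client wrappers for that), though the socket path matches the benchmark runs below.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Minimal unix-domain-socket client sketch. The socket path matches the
 * benchmark output below; the request bytes are a placeholder, NOT the
 * real tigerwired wire protocol. */
int main(void) {
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_un addr = {0};
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, "/tmp/tw_proxy.sock", sizeof(addr.sun_path) - 1);

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("connect");
        close(fd);
        return 1;
    }

    const char request[] = "placeholder-request"; /* illustrative only */
    if (write(fd, request, sizeof(request)) < 0)
        perror("write");

    char reply[512];
    ssize_t got = read(fd, reply, sizeof(reply)); /* read whatever comes back */
    printf("got %zd bytes back\n", got);

    close(fd);
    return 0;
}
```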

Benchmarks

=== Optimized Workloads ===

>>> Proxy: fused-write (row-string)
=== TigerWired Proxy Benchmark ===

Socket: /tmp/tw_proxy.sock

Benchmark Configuration:
  Workload:      Fused Write (1-RTT)
  Format:        Row (String)
  Records:       100000
  Key Size:      16 bytes
  Value Size:    100 bytes
  Cache Size:    256 MB

Running: Fused Write (1-RTT per insert)...
Results: Fused Write
  Total Time:    737.27 ms
  Throughput:    135636 ops/sec
  Data Rate:     15.73 MB/sec
  Operations:    100000 total, 100000 success, 0 failed
  Latency (us):  min=5.0 avg=7.1 max=101.0
  Percentiles:   p50=6.0 p95=12.0 p99=24.0

=== Benchmark Complete ===

>>> Proxy: fused-read (row-string)
=== TigerWired Proxy Benchmark ===

Socket: /tmp/tw_proxy.sock

Benchmark Configuration:
  Workload:      Fused Read (1-RTT)
  Format:        Row (String)
  Records:       100000
  Key Size:      16 bytes
  Value Size:    100 bytes
  Cache Size:    256 MB

Running: Fused Write (populate data)...
Results: Fused Write
  Total Time:    707.87 ms
  Throughput:    141269 ops/sec
  Data Rate:     16.39 MB/sec
  Operations:    100000 total, 100000 success, 0 failed
  Latency (us):  min=4.0 avg=6.8 max=1917.0
  Percentiles:   p50=6.0 p95=11.0 p99=22.0

Running: Fused Read (1-RTT per search)...
Results: Fused Read
  Total Time:    1335.68 ms
  Throughput:    74868 ops/sec
  Data Rate:     1.20 MB/sec
  Operations:    100000 total, 100000 success, 0 failed
  Latency (us):  min=5.0 avg=7.3 max=3445.0
  Percentiles:   p50=6.0 p95=11.0 p99=24.0

=== Benchmark Complete ===

>>> Proxy: batch-write (row-string)
=== TigerWired Proxy Benchmark ===

Socket: /tmp/tw_proxy.sock

Benchmark Configuration:
  Workload:      Batch Write
  Format:        Row (String)
  Records:       100000
  Key Size:      16 bytes
  Value Size:    100 bytes
  Cache Size:    256 MB

Running: Batch Write (batch_size=1000)...
Results: Batch Write
  Total Time:    68.78 ms
  Throughput:    1453805 ops/sec
  Data Rate:     168.64 MB/sec
  Operations:    100000 total, 100000 success, 0 failed
  Latency (us):  min=0.4 avg=0.5 max=0.6
  Percentiles:   p50=0.5 p95=0.6 p99=0.6

=== Benchmark Complete ===

>>> Proxy: batch-read (row-string)
=== TigerWired Proxy Benchmark ===

Socket: /tmp/tw_proxy.sock

Benchmark Configuration:
  Workload:      Batch Read
  Format:        Row (String)
  Records:       100000
  Key Size:      16 bytes
  Value Size:    100 bytes
  Cache Size:    256 MB

Running: Batch Write (populate data, batch_size=1000)...
Results: Batch Write
  Total Time:    80.65 ms
  Throughput:    1239941 ops/sec
  Data Rate:     143.83 MB/sec
  Operations:    100000 total, 100000 success, 0 failed
  Latency (us):  min=0.5 avg=0.6 max=0.8
  Percentiles:   p50=0.6 p95=0.7 p99=0.8

Running: Batch Read (batch_size=1000)...
Results: Batch Read
  Total Time:    26.72 ms
  Throughput:    3743075 ops/sec
  Data Rate:     434.20 MB/sec
  Operations:    100000 total, 100000 success, 0 failed
  Latency (us):  min=0.2 avg=0.3 max=0.6
  Percentiles:   p50=0.2 p95=0.4 p99=0.6

=== Benchmark Complete ===

Usage Details

TigerWired supports all the WiredTiger storage models:

| Model | Key Type | Value Type | Best For |
|-------|----------|------------|----------|
| Row Store | Variable | Variable | General purpose, variable-size data |
| Column Store (Fixed) | Record number | Fixed-size | Time series, append-only logs |
| Column Store (Variable) | Record number | Variable | Columnar analytics, sparse data |
| LSM Tree | Variable | Variable | Write-heavy workloads |
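In embedded WiredTiger terms, those models correspond to key_format/value_format choices passed to session->create(). A rough sketch follows; the table names are made up, and this is direct embedded usage rather than the proxy protocol:

```c
#include <wiredtiger.h>

/* Sketch: how the storage models above map onto WiredTiger create() configs.
 * Table names are hypothetical; error handling omitted for brevity. */
void create_tables(WT_SESSION *session) {
    /* Row store: variable-size string keys and raw-byte values. */
    session->create(session, "table:kv",
                    "key_format=S,value_format=u");

    /* Fixed-length column store: record-number keys, fixed 8-bit values. */
    session->create(session, "table:flags",
                    "key_format=r,value_format=8t");

    /* Variable-length column store: record-number keys, string values. */
    session->create(session, "table:logs",
                    "key_format=r,value_format=S");

    /* LSM tree: same formats as a row store, write-optimized layout. */
    session->create(session, "lsm:events",
                    "key_format=S,value_format=u");
}
```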

Configuration

| Use Case | Recommended Model | Key Format | Value Format |
|----------|-------------------|------------|--------------|
| General KV | Row Store | S or Q | u |
| User Data | Row Store | S (username) | SS... |
| Time Series | Column Fixed | r | Q or packed |
| Logs | Column Variable | r | S or u |
| Analytics | Column Groups | Q | Multiple groups |
| High Write | LSM | Q or S | u |
| Documents | Row Store | S (doc_id) | u (JSON) |
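For the General KV row above (key_format=S, value_format=u), the embedded read/write path looks roughly like the sketch below; tigerwired's fused and batch operations presumably end up driving the same cursor calls on the server side of the socket. This is illustrative embedded WiredTiger usage, not tigerwired code, and the table name is hypothetical.

```c
#include <string.h>
#include <wiredtiger.h>

/* Sketch of the General KV case: string key (S), raw-byte value (u). */
int put_and_get(WT_SESSION *session) {
    WT_CURSOR *cursor;
    int ret = session->open_cursor(session, "table:kv", NULL, NULL, &cursor);
    if (ret != 0) return ret;

    /* Insert: 'u' values are passed as WT_ITEM (pointer + length). */
    WT_ITEM value = { .data = "hello world", .size = 11 };
    cursor->set_key(cursor, "user:1234");
    cursor->set_value(cursor, &value);
    ret = cursor->insert(cursor);

    /* Point read of the same key. */
    cursor->set_key(cursor, "user:1234");
    if ((ret = cursor->search(cursor)) == 0) {
        WT_ITEM found;
        cursor->get_value(cursor, &found); /* found.data / found.size */
    }

    cursor->close(cursor);
    return ret;
}
```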

The numbers below assume typical hardware (modern x86-64, NVMe storage):

Small Keys/Values (16B keys, 100B values)

| Operation | Embedded | Proxy | Batch/Fused | Overhead |
|-----------|----------|-------|-------------|----------|
| Point read (cached) | 1-5 μs | 20-50 μs | 10-15 μs | 10-20x → 2-3x |
| Sequential write | 1-2 μs | 20-25 μs | 7-8 μs | 10-20x → 4-7x |
| Sequential scan | 0.2 μs/row | 7-21 μs/row | 0.2 μs/row | 35-100x → 1x |
| Random read | 1-2 μs | 14-16 μs | 10 μs | 8-14x → 6-8x |
| Batch write (1000 items) | 1-1.5 ms | 2-2.5 s | 120-160 ms | 1600x → 100x |
| Batch read (1000 items) | 18-20 ms | 728-2000 ms | 24 ms | 40-100x → 1.2x |

Large Keys/Values (128B keys, 256B values, 1M records)

| Operation | Embedded | Proxy (10K) | Fused (100K) | Batch (1M) | vs Embedded |
|-----------|----------|-------------|--------------|------------|-------------|
| Sequential write | 492K ops/sec | 44K ops/sec | 106K ops/sec | 582K ops/sec | 118% |
| Sequential read | 3.35M ops/sec | - | - | 2.07M ops/sec | 62% |
| Random read | 382K ops/sec | - | 54K ops/sec | - | - |
| Throughput (MB/sec) | 189 MB/s | 17 MB/s | 41 MB/s | 223 MB/s | 118% |
| Avg Latency | 1.8 μs | 22.5 μs | 9.1 μs | 1.5 μs | 83% |

Batch Size Scaling (100K records, 128B/256B):

  • Batch 100: 503K ops/sec (102% of embedded)
  • Batch 500: 565K ops/sec (115% of embedded)
  • Batch 1000: 581K ops/sec (118% of embedded)
  • Batch 5000: 676K ops/sec (137% of embedded)

stats