ball-blog
  • Welcome, I'm ball
  • Machine Learning - Basic
    • Entropy
    • Cross Entropy
    • KL-Divergence
    • Monte-Carlo Method
    • Variational Auto Encoder
    • SVM
    • Adam Optimizer
    • Batch Normalization
    • Tokenizer
    • Rotary Positional Encoding
    • Vector Quantized VAE(VQ-VAE)
    • DALL-E
    • Diffusion Model
    • Memory Layers at Scale
    • Chain-of-Thought
  • Einsum
  • Linear Algebra
    • Linear Transformation
    • Determinant
    • Eigen-Value Decomposition(EVD)
    • Singular-Value Decomposition(SVD)
  • AI Accelerator
    • CachedAttention
    • SGLang
    • CacheBlend
  • Reinforcement Learning
    • Markov
  • Policy-Improvement Algorithm
  • Machine Learning - Transformer
    • Attention is All you need
    • Why do we need a mask in Transformer
    • Linear Transformer
    • kr2en Translator using Tranformer
    • Segment Anything
    • MNIST, CIFAR10 Classifier using ViT
    • Finetuning PaliGemma using LoRA
    • LoRA: Low-Rank Adaptation
  • EGTR: Extracting Graph from Transformer for SGG
  • Machine Learning - Mamba
    • Function Space(Hilbert Space)
    • HIPPO Framework
    • Linear State Space Layer
    • S4(Structures Space for Sequence Model)
    • Parallel Scan Algorithm
    • Mamba Model
  • Computer System
    • Memory Ordering: Release/Acquire 1
    • Memory Ordering: Release/Acquire 2
    • BUDAlloc
    • Lock-free Hash Table
    • Address Sanitizer
  • App development
    • Bluetooth connection in linux
    • I use Bun!
    • Using Tanstack-query in Frontend
    • Deploying AI Service using EC2
  • Problem Solving
    • Bipartite Graph
    • Shortest Path Problem in Graph
    • Diameter of a Tree
  • Scribbles
Powered by GitBook
On this page
  • Hash Table
  • How about lock-free
  • Implementation details

Was this helpful?

Edit on GitHub
  1. Computer System

Lock-free Hash Table

Last updated 3 months ago

Was this helpful?

As current programs try to fully utilize their CPU & GPU, concurrent programming is getting much more important. At this post, I want to talk about utilizing hash table in concurrent programming.

Hash Table

Hashtable is the most popular data structure in programming. It is an efficient key-value store, which only takes O(1)O(1)O(1) time for lookup.

However, hashtable cannot be used in concurrent programming. If multiple thread tries to R/W hash table, then data might get polluted. How can we overcome this problem?

Using Global lock

Yes, this can be the solution, but it will suffer performance bottleneck from getting the lock.

Using Bucket locks

If we give locks for each buckets, then performance bottleneck can be resolved. However, the problem happens when the bucket gets full, and we have to apply a new hash function and redistribute elements to the buckets. In this case, we would have to block every access to the hash table.

How about lock-free

If we cannot use a lock, then why not lock-free?

The key point of implementing a lock-free datastructure is to make a single commit point using CAS(Compare-And-Swap). Please refer to other materials to understand about implementing a lock-free.

We use two structure to achieve lock-free hashtable: bucket-lookup table, and lock-free sorted linked list

Bucket-lookup table

Bucket-lookup table points to the bucket starting node for given key. This can be easily made as lock-free.

Lock-free sorted linked list

Elements and bucket nodes are sorted in a special way. Thanks to special ordering, it is very easy to split a bucket into half. (Which is the 'cause of problem' in bucket lock implementation)

Implementation details

Due to KAIST Honor Code, I cannot open the source code of the assignment. If you have any questions, please feel free to contact.

618KB
Split-Ordered Lists- Lock-Free Extensible Hash Tables.pdf
pdf
Figure of Hash Table
Figure of lock-free hash table.
Figure of lock-free hashtable operation of assigning a new bucket
LogoSplit-ordered lists: Lock-free extensible hash tables: Journal of the ACM: Vol 53, No 3Journal of the ACM