Sharding Notation Explorer

A sharding assigns mesh axes to the logical dimensions of an array. Blocks are coloured by their data identity: the same colour appearing on several devices means that block is replicated.

Logical view: A[I_X, J_Y]

Global shape: bf16[1024, 2048]
Local (per-device) shape: bf16[512, 1024]
Blocks: 2 × 2
Replication factor: 1×
Bytes per device: 1.05 MB
Total bytes on mesh: 4.19 MB
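
The figures above follow from simple arithmetic, which the sketch below reproduces. It is an illustration rather than the explorer's own code: the function name `shard_stats` and its argument layout are hypothetical, it assumes every dimension divides evenly by its shard count, and it assumes "total bytes on mesh" counts every replica.

```python
from math import prod

def shard_stats(global_shape, dim_axes, mesh, itemsize):
    """dim_axes[i] is a tuple of mesh-axis names that shard array dimension i."""
    # Shards along a dimension = product of the sizes of its mesh axes
    # (prod of an empty sequence is 1, i.e. the dimension is unsharded).
    shards = [prod(mesh[a] for a in axes) for axes in dim_axes]
    local_shape = tuple(g // s for g, s in zip(global_shape, shards))
    used = {a for axes in dim_axes for a in axes}
    # Mesh axes the sharding never mentions replicate each block.
    replication = prod(size for name, size in mesh.items() if name not in used)
    bytes_per_device = prod(local_shape) * itemsize
    return {
        "local_shape": local_shape,
        "blocks": tuple(shards),
        "replication": replication,
        "bytes_per_device": bytes_per_device,
        # Assumption: the mesh total sums what every device holds.
        "total_bytes": bytes_per_device * prod(mesh.values()),
    }

# A[I_X, J_Y] as bf16 (2 bytes per element) on a 2x2 mesh.
stats = shard_stats((1024, 2048), [("X",), ("Y",)], {"X": 2, "Y": 2}, itemsize=2)
print(stats)
```

For this configuration the sketch gives 1,048,576 bytes per device and 4,194,304 bytes in total, which round to the 1.05 MB and 4.19 MB shown when a megabyte is taken as 10^6 bytes.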

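A ×n badge marks a block that is replicated on n devices. Replication occurs when the sharding leaves one or more mesh axes unused; n is the product of the sizes of those unused axes.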
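
To make the replication rule concrete, the sketch below groups devices that hold the same block. It is a hypothetical helper (`replica_groups` is not part of the explorer) and assumes the mesh is given as a mapping from axis name to size: devices that agree on every used axis share a block, so each group's size is the replication factor.

```python
from itertools import product

def replica_groups(used_axes, mesh):
    """Group device coordinates by their position along the used mesh axes."""
    names = list(mesh)
    groups = {}
    for coords in product(*(range(mesh[n]) for n in names)):
        # Only coordinates along used axes select a block; the rest are ignored.
        key = tuple(c for n, c in zip(names, coords) if n in used_axes)
        groups.setdefault(key, []).append(coords)
    return groups

mesh = {"X": 2, "Y": 2}
# Hypothetical sharding A[I_X, J]: only X is used, so Y replicates each block.
groups = replica_groups({"X"}, mesh)
for key, devices in groups.items():
    print(key, devices, f"x{len(devices)}")
```

With only X in use, each of the two blocks lands on two devices, so both would carry a ×2 badge. Passing an empty set of used axes yields a single group containing all four devices, which is the fully replicated case.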

Device mesh view (2×2, X vertical / Y horizontal)

TPU 0 (0,0): block (0,0)    TPU 1 (0,1): block (0,1)
TPU 2 (1,0): block (1,0)    TPU 3 (1,1): block (1,1)

Each device holds the block selected by its coordinates along the mesh axes that the sharding uses. Try I_XY vs I_YX to see how the subscript order changes the traversal order of the grid.
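
The effect of subscript order can be sketched for a single dimension sharded over both mesh axes, once in the order X then Y and once in the order Y then X. The helper name `block_along_dim` is hypothetical, and the sketch assumes the first-listed axis is the major (slowest-varying) one; this is an assumption about the convention, not something the page states.

```python
def block_along_dim(coords, axes, mesh):
    """Block index along one array dimension sharded over `axes`, major axis first."""
    index = 0
    for axis in axes:
        # Each further axis subdivides the blocks chosen by the axes before it.
        index = index * mesh[axis] + coords[axis]
    return index

mesh = {"X": 2, "Y": 2}
# Devices in the order TPU 0..3, with coordinates (x, y).
devices = [(x, y) for x in range(mesh["X"]) for y in range(mesh["Y"])]

order_xy = [block_along_dim({"X": x, "Y": y}, ("X", "Y"), mesh) for x, y in devices]
order_yx = [block_along_dim({"X": x, "Y": y}, ("Y", "X"), mesh) for x, y in devices]

print("X then Y:", order_xy)
print("Y then X:", order_yx)
```

Under that assumption, X then Y assigns blocks 0, 1, 2, 3 to TPU 0 through TPU 3, while Y then X assigns 0, 2, 1, 3: the same four blocks, walked across the grid in a different order.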