Sharding Notation Explorer
A sharding assigns mesh axes to the logical dimensions of an array. Blocks are coloured by their data identity: the same colour appearing on several devices means that block is replicated.
Logical view: A[I_X, J_Y]
- Global shape: bf16[1024, 2048]
- Local (per-device) shape: bf16[512, 1024]
- Blocks: 2 × 2
- Replication factor: 1×
- Bytes per device: 1.05 MB
- Total bytes on mesh: 4.19 MB
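The figures in this readout follow from the global shape, the mesh, and the sharding by simple arithmetic. The sketch below reproduces them in plain Python. The function name `describe_sharding`, the dictionary-based mesh, and the tuple-of-axis-names spec format are assumptions made for illustration, not the explorer's own code. It also assumes 2 bytes per bf16 element and decimal megabytes (1 MB = 10^6 bytes).

```python
from math import prod


def describe_sharding(global_shape, spec, mesh, bytes_per_elem=2):
    """Summarise a sharding (hypothetical helper, for illustration only).

    global_shape: tuple of array dimension sizes.
    spec: one tuple of mesh-axis names per array dimension;
          an empty tuple means that dimension is not sharded.
    mesh: dict mapping mesh-axis name to its size.
    """
    # Number of blocks along each array dimension = product of the
    # sizes of the mesh axes assigned to that dimension.
    blocks = tuple(prod(mesh[a] for a in axes) for axes in spec)
    for dim, n in zip(global_shape, blocks):
        if dim % n:
            raise ValueError(f"dimension {dim} is not divisible by {n}")

    local_shape = tuple(dim // n for dim, n in zip(global_shape, blocks))

    # Mesh axes not named in the spec replicate every block.
    used = {a for axes in spec for a in axes}
    replication = prod(size for name, size in mesh.items() if name not in used)

    bytes_per_device = prod(local_shape) * bytes_per_elem
    return {
        "local_shape": local_shape,
        "blocks": blocks,
        "replication": replication,
        "bytes_per_device": bytes_per_device,
        "total_bytes": bytes_per_device * prod(mesh.values()),
    }


mesh = {"X": 2, "Y": 2}

# A[I_X, J_Y]: dimension I sharded over X, dimension J over Y.
info = describe_sharding((1024, 2048), (("X",), ("Y",)), mesh)

# A[I_X, J]: Y is unused, so each block lives on two devices.
replicated = describe_sharding((1024, 2048), (("X",), ()), mesh)

print(info)
print(replicated)
```

For A[I_X, J_Y] this gives 1,048,576 bytes per device and 4,194,304 bytes on the mesh, which round to the 1.05 MB and 4.19 MB shown above. Leaving Y unused doubles both numbers, because each device then holds a block twice as large.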
A ×n badge marks a block that is replicated on n devices. Replication occurs along every mesh axis that the sharding does not use.
Device mesh view (2×2, X vertical / Y horizontal)
TPU 0 (0,0)    TPU 1 (0,1)
TPU 2 (1,0)    TPU 3 (1,1)
Each device holds the block selected by its coordinates along the sharded axes. Try I_XY vs I_YX to see how the order of the mesh axes in the subscript changes the order in which blocks are assigned across the device grid.
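As a rough illustration of how device coordinates select a block, the sketch below computes the block index each device would hold under three shardings. It assumes that when one array dimension is sharded over several mesh axes, the first-listed axis is the most significant (so I_XY steps through Y fastest). To my understanding this matches the convention JAX uses for multi-axis partition specs, but treat it as an assumption here. The helper names `block_index` and `assignment` are hypothetical.

```python
from itertools import product

mesh = {"X": 2, "Y": 2}

# Every device as a dict of mesh coordinates, e.g. {"X": 0, "Y": 1}.
devices = [
    dict(zip(mesh, coords))
    for coords in product(*(range(size) for size in mesh.values()))
]


def block_index(coords, axes, mesh):
    """Block index along one array dimension sharded over `axes`.

    Assumption: the first-listed mesh axis is the most significant digit.
    An unsharded dimension (no axes) always yields block 0.
    """
    idx = 0
    for axis in axes:
        idx = idx * mesh[axis] + coords[axis]
    return idx


def assignment(spec):
    """Map each device coordinate (x, y) to the block indices it holds."""
    return {
        (d["X"], d["Y"]): tuple(block_index(d, axes, mesh) for axes in spec)
        for d in devices
    }


a_xy = assignment((("X",), ("Y",)))  # A[I_X, J_Y]
i_xy = assignment((("X", "Y"), ()))  # A[I_XY, J]
i_yx = assignment((("Y", "X"), ()))  # A[I_YX, J]

for name, table in [("I_X, J_Y", a_xy), ("I_XY, J", i_xy), ("I_YX, J", i_yx)]:
    print(name, table)
```

Under this convention, I_XY gives devices (0,0), (0,1), (1,0), (1,1) row blocks 0, 1, 2, 3, while I_YX gives them row blocks 0, 2, 1, 3. The data on the mesh is the same in both cases; only which device holds which block changes.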