Circulating active barrier (CAB) is a new low-cost, high-performance hardware mechanism for synchronizing multiple processing elements (PEs) in networks of workstations at fine-grained programmed barriers. CAB is significantly less complex than other hardware barrier synchronization mechanisms with equivalent performance: it uses only a single conductor, such as a wire or a copper run on a printed-circuit board, to circulate barrier packets between PEs. When a PE checks in at a barrier, the CAB hardware decrements the count associated with that barrier in a bit-serial fashion as the barrier packet passes through, and then monitors the packets until all PEs have checked in at the barrier. The ring has no clocked sequential logic in the serial loop. A cluster controller (CC) generates packets for active barriers, removes packets when they are no longer needed, and resets counters when all PEs have seen the zero count. A hierarchy of PEs can be achieved by connecting the CCs in intercluster rings. Under conservative timing assumptions, the expected synchronization times with optimal clustering are shown to be under 1 μs for as many as 4096 PEs in multiprocessor workstations or 1024 single-processor workstations. The ideal number of clusters for a two-dimensional hierarchy of $N$ PEs is shown to be $[N(D+G)/(I+G)]^{1/2}$, where $G$ is the gate propagation delay, $D$ is the inter-PE delay, and $I$ is the intercluster transmission time. CAB allows rapid, contention-free check-in and proceed-from-barrier and is applicable to a wide variety of system architectures and topologies.
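
As an illustration only, the following Python sketch evaluates the ideal-cluster-count expression and models a bit-serial decrement of a barrier count in software. The function names, the sample delay values, and the least-significant-bit-first ordering of the count are assumptions made for this example; the text above does not specify the packet bit order or give concrete delay figures.

```python
import math


def ideal_cluster_count(n_pes, d, g, i):
    """Evaluate [N(D+G)/(I+G)]^(1/2) for a two-dimensional hierarchy.

    n_pes: number of PEs (N)
    d: inter-PE delay (D)
    g: gate propagation delay (G)
    i: intercluster transmission time (I)
    All delays must be given in the same time unit.
    """
    if n_pes <= 0:
        raise ValueError("n_pes must be positive")
    if d + g <= 0 or i + g <= 0:
        raise ValueError("delay sums must be positive")
    return math.sqrt(n_pes * (d + g) / (i + g))


def bit_serial_decrement(bits_lsb_first):
    """Subtract 1 from a count streamed one bit at a time.

    Assumption: bits arrive least significant first. While a borrow is
    pending, each bit is inverted; the borrow clears after the first 1.
    A count of zero wraps around to all ones.
    """
    borrow = True
    out = []
    for bit in bits_lsb_first:
        if borrow:
            out.append(bit ^ 1)
            borrow = (bit == 0)
        else:
            out.append(bit)
    return out


if __name__ == "__main__":
    # Hypothetical delays in arbitrary units, chosen only for illustration.
    print(ideal_cluster_count(4096, d=1.0, g=1.0, i=7.0))
    # The count 4 is [0, 0, 1] in LSB-first order.
    print(bit_serial_decrement([0, 0, 1]))
```

The decrement needs only one bit of state (the pending borrow), which is why it can be applied as the packet streams past. When $D = I$, the expression reduces to $\sqrt{N}$, so the number of clusters equals the number of PEs per cluster.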