An efficient, broadband SiGe HBT cascode nonuniform distributed power amplifier (NDPA) is presented for low-cost, fully integrated Si-based phased arrays. The optimum load impedance at each SiGe HBT cascode in a four-stage NDPA core is obtained by simultaneously scaling the characteristic impedance (Z0) of the collector transmission lines (TLs) and tapering the SiGe HBT emitter area. A novel compact, lumped-element, two-section λ/4 output impedance transformer (OIT) is proposed to lower the NDPA load impedance (ZL) from 50 to 25 Ω over more than a decade of bandwidth (BW). Each λ/4 impedance transformer is realized by four cascaded CLC π-networks integrated into a single three-turn symmetric inductor to achieve compact size, high passive efficiency, and a high LC cutoff frequency (fc). A systematic design approach for a lumped-element λ/4 impedance transformer with an arbitrary Z0 is described in detail. The prototype NDPA was fabricated in a 0.13-μm SiGe HBT BiCMOS technology. The proposed SiGe HBT cascode NDPA supports a high-linearity (HL) mode and a high-gain (HG) mode, each suited to a specific application. In the HL/HG modes, the NDPA attains a peak power gain of 10.3/12.5 dB, a saturated output power (Pout) of 21.3/21.5 dBm, and a power-added efficiency (PAE) of 12.2%/12.5%–21.6%/22.0%, with a 3-dB BW from 1.5 to 24.0 GHz. With a 6-Gbit/s 64-QAM modulated signal, the NDPA delivers an average Pout of 13.0 dBm with a PAE of 10.0%.
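As a rough illustration of the transformer described above, the following Python sketch sizes a two-section λ/4 transformer from 50 to 25 Ω and approximates each section with four cascaded CLC π-networks. It is not the design procedure of this work. It assumes a textbook maximally flat (binomial) choice of section impedances and a simple low-frequency artificial-line approximation for the π-networks. The function names and the design frequency `f0` are hypothetical and chosen only for illustration.

```python
import math


def binomial_two_section(z_source, z_load):
    """Section impedances of a two-section binomial quarter-wave transformer.

    Assumption: textbook maximally flat taper, not necessarily the taper
    used in the paper.
    """
    ratio = z_load / z_source
    z1 = z_source * ratio ** 0.25  # section next to the source side
    z2 = z_source * ratio ** 0.75  # section next to the load side
    return z1, z2


def lumped_quarter_wave(z0, f0, n_sections=4):
    """Approximate a quarter-wave line of impedance z0 by n CLC pi-networks.

    Assumption: low-frequency artificial-line model, where the total line
    inductance (z0 * delay) and capacitance (delay / z0) are split evenly
    across the sections. f0 is a hypothetical design frequency in Hz.
    Returns (series L per section, shunt C at each end of a section,
    LC cutoff frequency of one section).
    """
    delay = 1.0 / (4.0 * f0)            # delay of a quarter-wave line at f0
    l_sec = z0 * delay / n_sections     # series inductance per section (H)
    c_sec = delay / (z0 * n_sections)   # total shunt capacitance per section (F)
    f_cut = 1.0 / (math.pi * math.sqrt(l_sec * c_sec))
    return l_sec, c_sec / 2.0, f_cut


if __name__ == "__main__":
    z1, z2 = binomial_two_section(50.0, 25.0)
    for z in (z1, z2):
        l_sec, c_end, f_cut = lumped_quarter_wave(z, f0=10e9)  # f0 is illustrative
        print(f"Z0={z:.2f} ohm  L={l_sec * 1e12:.1f} pH  "
              f"C_end={c_end * 1e15:.1f} fF  fc={f_cut / 1e9:.1f} GHz")
```

Under these assumptions the two section impedances come out near 42.0 Ω and 29.7 Ω, and the cutoff frequency of a four-section line is 16·f0/π, about five times the design frequency. This is consistent with the abstract's point that splitting each λ/4 section into several π-networks keeps fc high.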