Recent experiments have demonstrated the generation of widely spaced parametric sidebands that can evolve into “clustered” optical frequency combs in Kerr microresonators. Here we describe the physics that underpins the formation of such clustered comb states. In particular, we show that the phase matching required for the initial sideband generation is such that at least one of the sidebands experiences anomalous dispersion, enabling that sideband to drive frequency comb formation via degenerate and non-degenerate four-wave mixing. We validate our proposal through a combination of experimental observations made in a magnesium-fluoride microresonator and corresponding numerical simulations. We also investigate the coherence properties of the resulting clustered frequency combs. Our findings provide valuable insights into the generation and dynamics of widely spaced parametric sidebands and clustered frequency combs in Kerr microresonators.