Recent evidence shows that children can use cross-situational statistics to learn new object labels under referential ambiguity (e.g., Smith & Yu, 2008). Such evidence has been interpreted as support for proposals that statistical information about word-referent co-occurrence plays a powerful role in word learning. But object labels make up only a fraction of the vocabulary children acquire, and arguably represent the simplest case of word learning based on observations of world scenes. Here we extended the study of cross-situational word learning to a new segment of the vocabulary, action verbs, to permit a stronger test of the role of statistical information in word learning. In two experiments, 2.5-year-olds encountered two novel verbs on each trial, either intransitive (e.g., "She's pimming!"; Experiment 1) or transitive (e.g., "She's pimming her toy!"; Experiment 2), while viewing two action events. The consistency with which each verb accompanied each action across trials provided the only source of information about the intended referent of each verb. The 2.5-year-olds used cross-situational consistency in verb learning, but also showed significant limits on their ability to do so as the sentences and scenes became slightly more complex. These findings help to define the role of cross-situational observation in word learning.