Learning to grok: Emergence of in-context learning and skill in modular arithmetic tasks