Hi all,

The official paper (http://arxiv.org/abs/2307.09288) mentions “Llama 2-Chat Temporal Perception” abilities (p. 33), and Figure 22 illustrates the model’s “time awareness.”

But how exactly did they provide the date/year context to the model?

I’d like to run a similar benchmark on Llama 2 and other LLMs, but the paper doesn’t describe the setup in enough detail to reproduce it.
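My best guess, based on the paper’s description that each question was paired with the date it was posed, is that the date is simply injected as context in the prompt. Here’s a minimal Python sketch of what I’m imagining — the template string and function name are my own assumptions, not the paper’s actual format:

```python
from datetime import date

def build_temporal_prompt(question: str, asked_on: date) -> str:
    """Prepend a date context to a question, mimicking the
    'date the question was posed' context the paper describes.
    NOTE: this template is a guess, not the paper's exact prompt."""
    return (
        f"Context: the current date is {asked_on.isoformat()}.\n\n"
        f"Question: {question}"
    )

# Example in the spirit of Figure 22:
prompt = build_temporal_prompt(
    "How long ago did Barack Obama become president?",
    date(2023, 1, 1),
)
print(prompt)
```

Is this roughly the right idea, or did they feed the date through some other channel (e.g. the system prompt)?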