Could you use a traditional embedding, and then somehow search for a vector that represents the semantic feature you are interested in? What I mean is: since LLMs can understand the concept of numbers, and numbers are a pretty fundamental part of language, presumably (but not necessarily) there is a direction in the high-dimensional embedding space that represents the concept of “how many”. I’m thinking, of course, along the lines of the classic “king” - “male” + “female” = “queen” example, where you could define a “gender” vector from “male”, “female”, and perhaps a set of other related word pairs.
I’m not sure how feasible that is at all - I’m just curious whether it’s something you explored or read about as you were doing this?
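For what it’s worth, the static-embedding version of this is easy to sketch. Below is a minimal illustration using numpy and pretrained GloVe vectors loaded via gensim; the model name and the word pairs are just arbitrary choices for the example, and whether an analogous “how many” direction exists in an LLM’s representation space is exactly the open part of the question.

```python
# Minimal sketch: derive a semantic direction from word pairs and project
# other words onto it. GloVe via gensim is just one convenient source of
# pretrained vectors; any static embedding would do.
import numpy as np
import gensim.downloader as api

kv = api.load("glove-wiki-gigaword-100")  # downloads on first use

# Average the (female - male) difference over several pairs rather than
# relying on a single pair, which reduces noise in the direction estimate.
pairs = [("man", "woman"), ("king", "queen"), ("he", "she"), ("father", "mother")]
gender = np.mean([kv[f] - kv[m] for m, f in pairs], axis=0)
gender /= np.linalg.norm(gender)

# Projection of each (normalized) word vector onto the direction;
# more positive values lean toward the "female" side here.
for word in ["queen", "king", "mother", "father"]:
    v = kv[word] / np.linalg.norm(kv[word])
    print(word, round(float(v @ gender), 3))

# In principle the same recipe could target "how many", e.g. with pairs
# like ("one", "two"), ("two", "three") - whether that direction is as
# clean as gender in practice is an empirical question.
```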
One big reason for this is that there is a difference between prediction and inference. Most machine learning papers are not testing a hypothesis about a parameter; they are reporting predictive performance on held-out data, and there is no null hypothesis attached to that, so a p-value would not mean much.

That said, ML definitely does get applied to inference as well, but in those cases the lack of p-values is often one of the lesser complaints.
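To make the distinction concrete, here’s a small sketch on made-up data contrasting the two framings: scikit-learn’s regression API is built around held-out predictive performance and exposes no p-values at all, while statsmodels’ OLS is built around coefficient inference and reports them directly.

```python
# Illustration only: same synthetic data, two framings.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, 0.0, -2.0]) + rng.normal(size=200)

# Prediction framing (scikit-learn): the deliverable is held-out accuracy,
# and the fitted object has no p-values anywhere.
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LinearRegression().fit(X_tr, y_tr)
print("held-out R^2:", model.score(X_te, y_te))

# Inference framing (statsmodels): the deliverable is a hypothesis test on
# each coefficient, reported as p-values.
import statsmodels.api as sm

ols = sm.OLS(y, sm.add_constant(X)).fit()
print("coefficient p-values:", ols.pvalues)
```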