[D] What role does data quality play in the LLM scaling laws?

IAmBlueNebula@alien.top · 1 year ago


thedabking123@alien.top · 1 year ago

Measuring and improving the quality of NLP datasets in a comprehensive way is probably the main headache there.

You can measure and improve quality along many dimensions, and practitioners disagree on which ones matter (accuracy, completeness, consistency, timeliness, validity, and uniqueness are common ways to slice data quality). There's no single consistent measure for some of those either.
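To make the "many dimensions" point concrete, here's a rough Python sketch that scores a list of text records on three of them. Everything in it is an assumption for illustration: the `quality_report` name, the whitespace/lowercase normalization used for uniqueness, and the `min_chars` rule standing in for validity. Different reasonable choices would give different numbers, which is pretty much the problem.

```python
def quality_report(texts, min_chars=20):
    """Toy per-dimension scores in [0, 1] for a list of text records.

    All three definitions are illustrative assumptions, not standard metrics.
    """
    n = len(texts)
    if n == 0:
        return {"completeness": 0.0, "uniqueness": 0.0, "validity": 0.0}

    # Completeness: share of records that are neither missing nor blank.
    non_empty = [t for t in texts if t is not None and t.strip()]
    completeness = len(non_empty) / n

    # Uniqueness: share of non-empty records that stay distinct after
    # lowercasing and collapsing whitespace (a crude exact-dedup rule).
    normalized = [" ".join(t.lower().split()) for t in non_empty]
    uniqueness = len(set(normalized)) / len(normalized) if normalized else 0.0

    # Validity: share of all records that pass an arbitrary length rule.
    validity = sum(len(t.strip()) >= min_chars for t in non_empty) / n

    return {
        "completeness": completeness,
        "uniqueness": uniqueness,
        "validity": validity,
    }


# Hypothetical sample: two near-identical sentences, a blank, a missing
# value, and one very short record.
sample = [
    "The cat sat on the mat and looked out the window.",
    "the cat  sat on the mat and looked out the window.",
    "",
    None,
    "ok",
]
report = quality_report(sample)
print(report)
```

Note that the same dataset scores differently on each dimension, and nothing here touches accuracy, consistency, or timeliness, which are much harder to check automatically.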