From my understanding (I’m a sound designer), it’s less about capturing content above the upper threshold of human hearing and more about having additional samples to play with when manipulating content. If you halve the playback speed of a 48 kHz recording, you’re effectively working with 24 kHz material: everything drops an octave, so nothing is left above 12 kHz. Content recorded at higher sample rates tends to hold up better in these kinds of scenarios.
Same goes for bit depth when you’re making radical alterations to dynamics: big gain boosts pull the noise floor up along with the quiet material.
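
To put rough numbers on both points, here's a minimal Python sketch. The function names are made up for illustration, and the roughly 6.02 dB-per-bit figure is the usual rule of thumb for ideal quantization, not a measurement of any real converter or plugin.

```python
import math

# Rule of thumb: each bit of depth is worth about 6.02 dB of dynamic range.
DB_PER_BIT = 20 * math.log10(2)


def bandwidth_after_slowdown(sample_rate_hz: float, speed: float) -> float:
    """Highest frequency left in the material after varispeed playback.

    `speed` is the playback speed factor (0.5 = half speed). The source can
    hold content up to sample_rate_hz / 2, and slowing it down scales every
    frequency by the same factor.
    """
    return (sample_rate_hz / 2.0) * speed


def noise_floor_after_gain(bit_depth: int, gain_db: float) -> float:
    """Approximate quantization noise floor in dBFS after boosting by gain_db.

    Assumes an ideal fixed-point recording; real-world noise is usually higher.
    """
    return -DB_PER_BIT * bit_depth + gain_db


if __name__ == "__main__":
    print(bandwidth_after_slowdown(48_000, 0.5))  # 12000.0
    print(bandwidth_after_slowdown(96_000, 0.5))  # 24000.0

    # Same +36 dB boost applied to a quiet passage at two bit depths.
    for bits in (16, 24):
        print(bits, round(noise_floor_after_gain(bits, 36.0), 1))
```

So a 96 kHz source slowed to half speed still fills the whole audible band, where a 48 kHz source runs out at 12 kHz. Likewise the 24-bit file keeps its noise floor about 48 dB lower than the 16-bit one after the same boost.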