For onboard audio (headphone out), Cubase reports 0.896ms in and 2.250ms out at 96kHz/32 samples. FWIW, RME at 96kHz/64 samples shows the same input latency (this seems to be a macOS input limit) but 1.339ms output. That output figure beats the very best Windows results on the DAWBench chart, though I haven’t verified it with a loopback test. That’s over Thunderbolt (tested with the DriverKit drivers).
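For anyone wanting to put those numbers in context, here's a rough sketch of the arithmetic. One buffer lasts samples / sample rate, and whatever is left of the reported figure is lumped together here as "overhead" (converters, driver safety offsets and so on). That split is my assumption about how the reported value is made up, not something Cubase breaks out, and the function names are just made up for illustration.

```python
RATE_HZ = 96_000  # sample rate used for both sets of figures


def buffer_ms(samples: int, rate_hz: int = RATE_HZ) -> float:
    """Duration of one audio buffer in milliseconds."""
    return samples / rate_hz * 1000.0


def summarize(samples: int, in_ms: float, out_ms: float, rate_hz: int = RATE_HZ) -> dict:
    """Split reported latencies into one buffer's duration plus the remainder.

    Treating the remainder as "overhead" is an assumption for illustration only.
    """
    buf = buffer_ms(samples, rate_hz)
    return {
        "buffer_ms": buf,
        "round_trip_ms": in_ms + out_ms,
        "input_overhead_ms": in_ms - buf,
        "output_overhead_ms": out_ms - buf,
    }


# Figures as reported by Cubase in the post: (buffer size, input ms, output ms)
reported = {
    "onboard 96k/32": (32, 0.896, 2.250),
    "RME 96k/64": (64, 0.896, 1.339),
}

for name, (samples, in_ms, out_ms) in reported.items():
    s = summarize(samples, in_ms, out_ms)
    print(
        f"{name}: buffer {s['buffer_ms']:.3f}ms, "
        f"round trip {s['round_trip_ms']:.3f}ms, "
        f"input overhead {s['input_overhead_ms']:.3f}ms, "
        f"output overhead {s['output_overhead_ms']:.3f}ms"
    )
```

These are reported values only, so the same caveat applies: a loopback test is the only way to confirm the real round trip.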
ASIO, of course, is just a layer on top of Core Audio on the Mac. But it’s a tale of two driver types. A KEXT-based ASIO driver is more like ASIO on Windows, in that it can place itself ahead of what the macOS kernel thinks is important in terms of priority. In theory, this means it will hold its performance better when the system is under a heavy CPU load. Soundcards that go the standard “device compliant” route, or use DriverKit, cannot do this, and aren’t “really” ASIO in the same sense as Windows devices. RME offer both types.
I use the DriverKit version since Apple will remove KEXT support at some future date, so there’s little point in getting used to something you know will be removed. Also, situations where that last few percent is required just haven’t been common for me with Apple Silicon.
The bottom line is, unless you require more I/O, you don't really need anything beyond the onboard audio for low-latency use on the Mac.