The statements about "Long LBA" seem to originate from that one Seagate product manager. What she probably meant is that even though 48-bit LBA is the standard on hardware, 32-bit OSes and drivers may not support all 48 bits, since (a) handling a 48-bit value in 32-bit code is more hassle, and (b) MBR only supports 32-bit sector addresses for partitions anyway, so why bother.
When the 128GB barrier was broken by adopting 48-bit LBA, the change had to happen on two sides: on the hardware side, where the ATA specification went up from 28 bits, and on the OS/driver side, which had to stop hard-coding the old 28-bit limit. Current well-written drivers probably do support the full 48 bits, but it's easy to see how somewhere along the chain, someone took the easy way out and supports only 32 bits in their 32-bit drivers. Given that 32-bit OSes are on the way out anyway, it may not be worth the effort to make sure all of that works.
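To put numbers on those limits, here is a quick sketch of the arithmetic, assuming the traditional 512-byte sector. The function names are made up for illustration.

```python
SECTOR_BYTES = 512  # assumption: traditional 512-byte sectors


def max_addressable_bytes(lba_bits, sector_bytes=SECTOR_BYTES):
    """Largest capacity reachable with an LBA field of the given width."""
    return (1 << lba_bits) * sector_bytes


def human(n):
    """Format an exact power-of-two byte count in binary units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    i = 0
    while n >= 1024 and n % 1024 == 0 and i < len(units) - 1:
        n //= 1024
        i += 1
    return f"{n} {units[i]}"


for bits in (28, 32, 48):
    print(f"{bits}-bit LBA: {human(max_addressable_bytes(bits))}")
# 28-bit LBA: 128 GiB
# 32-bit LBA: 2 TiB
# 48-bit LBA: 128 PiB
```

So a driver that quietly truncates to 32 bits lands on exactly the same 2TB ceiling as MBR, which is probably why nobody noticed for so long.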
As you said, the real issue (for Windows at least) is booting a GPT disk from a BIOS-based (non-EFI) computer. The Protective MBR is designed to make the entire disk look like a single partition of unknown type, so that a BIOS/MBR-aware computer won't even touch it. You can create a Hybrid disk, where the MBR also contains other entries for partitions below the 2TB barrier. But such Hybrid disks are fragile (easy to clobber with either MBR or GPT partition tools) and are no longer officially GPT disks. You're also not booting any GPT partitions; you're booting MBR partitions. That might be OK if you just want to use the GPT partition as a data drive.
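The difference between a Protective and a Hybrid MBR comes down to what sits in the four-entry partition table in sector 0. A rough sketch of how a tool might tell them apart, assuming the standard MBR layout (table at byte 446, 16-byte entries, type byte at offset 4 of each entry, 0x55AA signature at byte 510) and the 0xEE type used for the GPT protective entry. `make_mbr` and `classify` are hypothetical helpers, not part of any real tool.

```python
import struct

TABLE_OFFSET = 446      # partition table starts here in sector 0
ENTRY_SIZE = 16         # four 16-byte entries
GPT_PROTECTIVE = 0xEE   # partition type of the protective entry


def make_mbr(entries):
    """Build a fake 512-byte MBR from (type, start_lba, sector_count) tuples."""
    sector = bytearray(512)
    for i, (ptype, start, count) in enumerate(entries):
        off = TABLE_OFFSET + i * ENTRY_SIZE
        sector[off + 4] = ptype
        struct.pack_into("<II", sector, off + 8, start, count)  # 32-bit fields
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


def classify(mbr):
    """Return 'protective', 'hybrid', or 'mbr' for a 512-byte sector 0."""
    if len(mbr) < 512 or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    types = [mbr[TABLE_OFFSET + i * ENTRY_SIZE + 4] for i in range(4)]
    used = [t for t in types if t != 0]
    if used == [GPT_PROTECTIVE]:
        return "protective"
    if GPT_PROTECTIVE in used:
        return "hybrid"
    return "mbr"


print(classify(make_mbr([(0xEE, 1, 0xFFFFFFFF)])))                   # protective
print(classify(make_mbr([(0xEE, 1, 2047), (0x07, 2048, 1000000)])))  # hybrid
```

Note that the start and size fields are packed as 32-bit values, which is the MBR limit mentioned above. On a disk larger than 2TB the protective entry simply saturates at 0xFFFFFFFF sectors.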
And why can't BIOS boot GPT? The short answer is that nobody ever taught it to, and adding that capability would take a smarter BIOS. And that's what EFI is for.
4KB sectors would give you 16TB disks with 32-bit LBA. (And fewer, larger sectors means potentially less I/O overhead.) But all the OSes and drivers and even some apps would have to be written to support sector sizes other than 512 bytes. All it takes is one place where the sector size is hard-coded at 512 for something to break. So 4KB sectors are not an "easy" solution either, because they would take a lot of work from many parties. But if you're going to write future software to support GPT (which is sector-size agnostic) and variable-size sectors anyway, they might become common practice at some point.
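As a small illustration of both points, here is the 16TB figure and the kind of bug a hard-coded 512 causes. The example uses the GPT header, which lives at LBA 1; the function names are hypothetical.

```python
GPT_HEADER_LBA = 1  # the GPT header sits in the sector right after the MBR


def byte_offset(lba, sector_bytes):
    """Correct: scale the LBA by the disk's actual logical sector size."""
    return lba * sector_bytes


def byte_offset_hardcoded(lba):
    """Buggy on 4KB-sector disks: assumes every sector is 512 bytes."""
    return lba * 512


# Capacity reachable with 32-bit LBA and 4KB sectors, in TiB
print((1 << 32) * 4096 // 2**40)              # 16

# Where the GPT header really is on a 4KB-sector disk vs. where
# the hard-coded version goes looking for it
print(byte_offset(GPT_HEADER_LBA, 4096))      # 4096
print(byte_offset_hardcoded(GPT_HEADER_LBA))  # 512
```

On a 512-byte disk the two functions agree, which is exactly why this kind of assumption survives in code for years.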