How does Proof of Stake (PoS) improve scalability compared to Proof of Work (PoW)?
As @hd49728 and @BlackHatCoiner wrote, scalability of a single coin on a chain is normally not tied to the consensus model (PoS, PoW or whatever).
However, I have indeed seen this claim several times, so I'll elaborate a bit. In some situations it can appear that PoS is "more scalable", but in most cases it isn't.
First of all, if you have a universe of different tokens (like ERC-20s or memecoins on Solana), at first glance you might think you can improve scalability by managing these tokens on several independent PoS blockchains instead of on a single PoW blockchain (like Bitcoin), because PoS blockchains "don't waste energy" (or at least waste less). The more chains, the higher the scalability. However, a similar concept exists in PoW: merged mining. A single proof of work can be reused for Bitcoin and Namecoin, for example, or for Litecoin and Dogecoin. So there is no real advantage.
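To illustrate the merged mining idea, here is a minimal toy sketch in Python. It is not how Bitcoin/Namecoin or Litecoin/Dogecoin actually encode things: the function names, the header layout and the targets are all made up for illustration. The only point is that one hash, found with one nonce search, can be checked against the targets of two chains.

```python
import hashlib

def sha256d(data: bytes) -> int:
    """Double SHA-256, returned as an integer so it can be compared to a target."""
    return int.from_bytes(hashlib.sha256(hashlib.sha256(data).digest()).digest(), "big")

def merged_mine(parent_header: bytes, aux_block_hash: bytes,
                parent_target: int, aux_target: int,
                max_nonce: int = 1_000_000):
    """Toy merged mining: search one nonce, test the hash against both targets.

    Assumption: the commitment to the auxiliary chain's block is simply
    concatenated to the parent header. Real protocols embed it differently.
    """
    commitment = hashlib.sha256(aux_block_hash).digest()
    for nonce in range(max_nonce):
        candidate = parent_header + commitment + nonce.to_bytes(8, "little")
        h = sha256d(candidate)
        # The auxiliary chain is assumed to have the easier (larger) target.
        if h <= aux_target:
            return {
                "nonce": nonce,
                "hash": h,
                "valid_for_aux": True,
                # The very same work also counts for the parent chain
                # if it happens to meet the harder target as well.
                "valid_for_parent": h <= parent_target,
            }
    return None

# Toy targets, far easier than anything on a real network.
result = merged_mine(b"parent-header", b"aux-block",
                     parent_target=2**244, aux_target=2**248)
```

The miner does the hashing once. Whether the result counts for one chain or both depends only on which targets the hash meets.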
Second, it has long been a "scalability dream" of several altcoin communities to split the operation of a blockchain platform (or "coin") into several databases, with each node validating only one of these databases or a subset of transactions. This decreases the work for every full node. This is called "sharding". Sharding apparently doesn't work well with proof of work, but there is some research on it for proof-of-stake platforms.
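The basic idea of sharding can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the transaction format, the function names and the rule "shard = hash of the sender modulo number of shards" are hypothetical and not taken from any real platform. It also ignores transactions that touch accounts in different shards, which any real design has to handle.

```python
import hashlib

def shard_of(account: str, num_shards: int) -> int:
    """Map an account to a shard deterministically via its hash."""
    digest = hashlib.sha256(account.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def split_by_shard(transactions, num_shards: int):
    """Group transactions by the shard of their sender."""
    shards = {i: [] for i in range(num_shards)}
    for tx in transactions:
        shards[shard_of(tx["sender"], num_shards)].append(tx)
    return shards

# Hypothetical transactions, just to show the split.
txs = [{"sender": f"account{i}", "amount": i} for i in range(100)]
shards = split_by_shard(txs, num_shards=4)

# A node assigned to shard 0 would only validate this subset.
my_workload = shards[0]
```

A node that follows only one of the four shards validates only the transactions assigned to that shard instead of all 100, which is where the hoped-for scalability gain comes from.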
However, as of 2024 there is no working blockchain platform to my knowledge which supports sharding in its originally envisioned form. Ethereum, which has had sharding on its roadmap for more than 5 years, is probably abandoning it.
You can consider "second layers" like Ethereum's rollups "shards". But then you could also consider Bitcoin sidechains like Nomic or tBTC "shards" of Bitcoin, and their degree of centralization is similar to that of Ethereum's rollups, although Ethereum makes it a bit more convenient to manage sidechain pegs because of its Turing-complete scripting language.
Third, PoS algorithms based on BFT principles do indeed offer an advantage regarding latency. Nodes need to agree on the transaction set immediately when a block is produced, while in PoW a disagreement can surface several minutes later, causing a blockchain reorganization ("orphaning" an already created block).
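As a rough illustration of the "agree immediately" part: BFT-style protocols typically treat a block as final once validators holding more than two thirds of the stake have voted for it. The sketch below assumes exactly that rule. The function name, validator names and stake numbers are hypothetical, and real protocols add several voting rounds, timeouts and penalties on top.

```python
def is_final(votes: dict, stakes: dict) -> bool:
    """Return True if the 'yes' votes hold more than 2/3 of the total stake."""
    total = sum(stakes.values())
    yes = sum(stakes[v] for v, approved in votes.items() if approved and v in stakes)
    # Integer arithmetic avoids rounding problems at exactly 2/3.
    return 3 * yes > 2 * total

# Hypothetical validator set.
stakes = {"alice": 40, "bob": 30, "carol": 30}

final_with_two = is_final({"alice": True, "bob": True}, stakes)   # 70 of 100 voted yes
final_with_one = is_final({"alice": True, "bob": False}, stakes)  # 40 of 100 voted yes
```

Once the threshold is reached the block is considered settled, so there is no waiting for further confirmations as in PoW.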
However, this faster agreement can only reduce the block interval (to a few seconds), not the storage or bandwidth requirements for nodes, and it comes with security and decentralization tradeoffs. A coin like Solana may offer better latency and higher "throughput" than Bitcoin, but it has very high requirements for full nodes. So you either need to buy or rent an expensive PC or server with 128 GB RAM and an excellent, datacenter-grade Internet connection, or rely on a light node.
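A quick back-of-the-envelope calculation shows why a shorter block interval does not make life easier for full nodes. The numbers below are illustrative assumptions, not measurements of any real chain: a flat 1 MB block, once every 600 seconds versus once every 2 seconds.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def yearly_growth_gb(block_size_bytes: float, block_interval_s: float) -> float:
    """Chain growth per year in GB for a given block size and block interval."""
    bytes_per_second = block_size_bytes / block_interval_s
    return bytes_per_second * SECONDS_PER_YEAR / 1e9

# Assumed 1 MB blocks in both cases; only the interval differs.
slow_chain = yearly_growth_gb(1_000_000, 600)  # about 52.6 GB per year
fast_chain = yearly_growth_gb(1_000_000, 2)    # about 15,768 GB per year

print(f"600 s interval: {slow_chain:.1f} GB/year")
print(f"  2 s interval: {fast_chain:.1f} GB/year")
```

With the same block size, cutting the interval by a factor of 300 multiplies the data every full node has to download, verify and store by the same factor. The throughput gain is paid for by the nodes.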