This all ties back to checking whether nodes really have the data stored. The check itself is the easy part: you call the node and check whether the data is stored. The question is which way you are going to go about it. The most efficient is through the use of a committee.
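One way a committee check could look, as a minimal sketch only. It assumes a simple hash challenge: each committee member sends a fresh random nonce, the node must answer with SHA-256(nonce + data), and a majority vote decides. The names `StorageNode`, `proof`, and `committee_check` are hypothetical, and it assumes the committee can compute the expected answer from a reference copy of the data.

```python
import hashlib
import os
import random


def proof(nonce: bytes, data: bytes) -> bytes:
    """Answer to a challenge: hash of the nonce followed by the data."""
    return hashlib.sha256(nonce + data).digest()


class StorageNode:
    """Hypothetical node; data=None models a node that dropped the data."""

    def __init__(self, data):
        self.data = data

    def respond(self, nonce: bytes) -> bytes:
        if self.data is None:
            # Without the data the node can only hash the nonce alone.
            return hashlib.sha256(nonce).digest()
        return proof(nonce, self.data)


def committee_check(node, reference_data, members, committee_size, rng):
    """Pick a random committee; each member challenges the node once.

    Returns True if a strict majority of members saw a correct answer.
    """
    committee = rng.sample(members, committee_size)
    votes = 0
    for _member in committee:
        nonce = os.urandom(16)  # fresh nonce so old answers cannot be replayed
        if node.respond(nonce) == proof(nonce, reference_data):
            votes += 1
    return votes * 2 > committee_size


members = [f"member-{i}" for i in range(20)]
honest = StorageNode(b"repo contents")
dropped = StorageNode(None)
honest_ok = committee_check(honest, b"repo contents", members, 5, random.Random(0))
dropped_ok = committee_check(dropped, b"repo contents", members, 5, random.Random(0))
```

The random nonce is what stops a node from storing only a precomputed hash instead of the data itself. A real system would also need the committee selection to be unpredictable to the node being checked.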
I just thought maybe we do not need to spread the stored data across the network. Maybe it would work better if you let users decide which projects/repos they want to host, as is done in the BitTorrent protocol.
So if I want to protect some project from censorship, I could host only that project's repo.
Could be that too. Still, this could all be incorporated together: if you have a system such as the one mentioned above, there is nothing preventing someone from also hosting their preferred projects.
To gain the most out of decentralization it would be best to put all 3 together. Doing just one, such as this, would not provide a guarantee, as only the popular projects would really be decentralized.
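To make the popularity point concrete, here is a small sketch with made-up data. It counts how many hosts each repo ends up with when hosting is purely opt-in. The function names, the repo names, and the `min_replicas` threshold are all hypothetical.

```python
from collections import Counter


def replica_counts(choices, all_repos):
    """Count hosts per repo, including repos nobody chose (count 0)."""
    counts = Counter({repo: 0 for repo in all_repos})
    for hosted in choices.values():
        counts.update(hosted)  # each host counts once per repo it hosts
    return dict(counts)


def under_replicated(counts, min_replicas):
    """Repos with fewer hosts than the desired minimum."""
    return sorted(repo for repo, n in counts.items() if n < min_replicas)


# Made-up example: three users each pick what they want to host.
choices = {
    "alice": {"popular"},
    "bob": {"popular"},
    "carol": {"popular", "niche"},
}
all_repos = ["popular", "niche", "obscure"]

counts = replica_counts(choices, all_repos)
at_risk = under_replicated(counts, min_replicas=2)
```

With these inputs the popular repo gets three hosts, the niche one gets a single host, and the obscure one gets none, so the last two would rely on whatever network-wide storage is combined with the opt-in hosting.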