3. Isn't it cleaner for type-1 if k_i is the direct output of a HMAC?
Yes, it seems better to do k_i=I_L instead of k_i=k_par+I_L (mod n)
If I'm guessing correctly, the reason that BIP32 specifies that for I_L>=n that resulting keypair is invalid is because we would like to have the privkey chosen uniformly at random from the valid range, so if we take I_L modulo n then the smaller privkey values will have a little bit higher probability?
Edit: instead of specifying that k_i is invalid, we could specify for example k_i=SHA256(I_L) in this case (and invoke SHA256 again if k_i is still invalid, and so on), so that we couldn't have forced gaps in the indices of the HD wallet? But if the probability is less than 1/2^127 then specifying that i is invalid is also fine.
I still agree with thanke that for type-1 derivation it makes slightly more sense to have k_i=I_L and specify that k_i is invalid if I_L=0 or I_L>=n, rather than to have k_i=k_par+I_L (mod n) and specify that k_i is invalid if I_L>=n. Note that the OP of this thread also specifies for type-1 that "Privatekey(type,n) is then simply set to H(n|S|type)". With k_i=I_L we could still treat d=k_i-k_par as the difference, so d*G+K_par is the corresponding pubkey, in case this d in needed for compatibility with type-2 in some possible context.