An alternative way to generate entropy is not to use the sum of the outputs as the seed, but to write down the dice results (1, 2, ..., 6) and SHA256 the resulting string. That way, you can use a fixed number of dice rolls.
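For concreteness, the procedure being described would look something like the sketch below (Python; the exact way the rolls are turned into a string is my own assumption, and 99 rolls of a fair die carry about 99 × log2(6) ≈ 256 bits of entropy only if the die and the transcription are perfect):

    import hashlib

    # Sketch of the proposal above, NOT a recommendation (see below).
    # Assumed encoding: concatenate the rolls as ASCII digits and hash once.
    rolls = [3, 1, 6, 5, 2, 4]   # in practice ~99 rolls of a fair six-sided die
    transcript = "".join(str(r) for r in rolls).encode("ascii")
    seed = hashlib.sha256(transcript).hexdigest()
    print(seed)                  # 64 hex characters offered as the 256-bit seed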
I wouldn't do this. I am by no means knowledgeable in this field, but I know enough to know that by using SHA256 as a randomness extractor like this, you will almost certainly end up with much less entropy than you think you are achieving.
Here are a few relevant quotes from the original HKDF paper:
We end by observing that most of today’s standardized KDFs (e.g., [4, 5, 57, 40]) do not differentiate between the extract and expand phases but rather combine the two in ad-hoc ways under a single cryptographic hash function (refer to Section 8 for a description and discussion of these KDF schemes). This results in ad-hoc designs that are hard to justify with formal analysis and which tend to “abuse” the hash function, requiring it to behave in an “ideally random” way even when this is not strictly necessary in most KDF applications (these deficiencies are present even in the simple case where the source of keying material is fully random)
Efficient constructions of generic (hence randomized) statistical extractors exist such as those built on the basis of universal hash functions [15]. However, in spite of their simplicity, combinatorial and algebraic constructions present significant limitations for their practical use in generic KDF applications. For example, statistical extractors require a significant difference (called the gap) between the min-entropy m of the source and the required number m′ of extracted bits (in particular, no statistical extractor can achieve a statistical distance, on arbitrary sources, better than 2^(-(m-m′)/2) [60, 63]). That is, one can use statistical extractors (with its provable properties) only when the min-entropy of the source is significantly higher than the length of output. These conditions are met by some applications, e.g., when sampling a physical random number generator or when gathering entropy from sources such as system events or human typing (where higher min-entropy can be achieved by repeated sampling). In other cases, very notably when extracting randomness from computational schemes such as the Diffie-Hellman key exchange, the available gap may not be sufficient (for example, when extracting 160 bits from a DH over a 192-bit group). In addition, depending on the implementation, statistical extractors may require from several hundred bits of randomness (or salt) to as many bits of salt as the number of input bits.
However, there is little hope that one could prove anything like this for regular cryptographic hash functions such as SHA; so even if the assumption is well defined for a specific hash function and a specific group (or collection of groups), validating the assumption for standard hash functions is quite hopeless. This is even worse when requiring that a family of hash functions behaves as a generic extractor (i.e., suitable for arbitrary sources) as needed in a multi-purpose KDF.
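To put the "gap" bound above into numbers: in the paper's Diffie-Hellman example, m = 192 and m′ = 160, so the best statistical distance any extractor can guarantee on arbitrary sources is 2^(-(192-160)/2) = 2^-16, far from the near-zero distance you want for key material. And for contrast with the ad-hoc single-hash approach, here is a minimal sketch of the extract-then-expand structure the paper advocates, i.e. HKDF (RFC 5869), built from the Python standard library's hmac module; the salt, info string and output length are placeholder values of my choosing, not anything prescribed by the paper:

    import hashlib
    import hmac
    import os

    def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
        # Extract: concentrate the (imperfect) input keying material into a
        # fixed-size pseudorandom key, keyed by a salt.
        if not salt:
            salt = b"\x00" * hashlib.sha256().digest_size
        return hmac.new(salt, ikm, hashlib.sha256).digest()

    def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
        # Expand: stretch the pseudorandom key into `length` bytes of output
        # keying material, per RFC 5869.
        okm, block, counter = b"", b"", 1
        while len(okm) < length:
            block = hmac.new(prk, block + info + bytes([counter]),
                             hashlib.sha256).digest()
            okm += block
            counter += 1
        return okm[:length]

    # Placeholder usage: a random salt and some imperfect keying material.
    ikm = b"stand-in for e.g. a transcript of dice rolls"
    prk = hkdf_extract(os.urandom(32), ikm)                # extract phase
    seed = hkdf_expand(prk, b"example-wallet-seed", 32)    # expand phase

Note that even this separation only gives guarantees under assumptions about HMAC-SHA256; it cannot conjure entropy that the input never had.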
There is a lot more to securely generating entropy than just feeding what you think is a long enough, random enough string into a SHA256 function and being happy with the output. I would stick to either /dev/urandom or a physical process which can generate your entropy directly, such as flipping a coin. Anything beyond that introduces too many possibilities for error, many of which the average user doesn't even know exist.
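For the /dev/urandom route, the whole job is a one-liner; this sketch uses Python's os.urandom and secrets, both of which read from the operating system's CSPRNG (on Linux, the same pool behind /dev/urandom), and the 32-byte length is just an example:

    import os
    import secrets

    seed = os.urandom(32)             # 256 bits straight from the OS CSPRNG
    seed = secrets.token_bytes(32)    # equivalent convenience wrapper
    print(seed.hex())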