I think you are writing each one and zero out as a whole byte, when you should be packing them as individual bits, eight to a byte.
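Here's what you should do to make your program faster. Set a counter like i to 0 and a byte_value to 0, and then each time you perform a subtraction, do byte_value |= bit << i (where bit is the 0 or 1 you just produced), followed by i = (i + 1) % 8. Note the |=: a plain assignment would throw away the bits you have already stored. Only do a write when i wraps back to 0, that is, after every 8 iterations, and reset byte_value to 0 afterwards. You can make the writing process even faster by waiting until you have filled thousands of bytes like this, and then writing them all at once in one batch.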
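Since I can't see your code, here is a rough sketch of the idea in Python rather than a drop-in fix. The names write_bits, out and buffer_size are made up for illustration, the default batch size of 4096 bytes is arbitrary, and I am assuming the first bit goes in the lowest position of each byte and that a leftover partial byte is padded with zero bits.

```python
import io

def write_bits(bits, out, buffer_size=4096):
    """Pack 0/1 values into bytes (lowest bit first) and write them in batches."""
    buf = bytearray()   # bytes waiting to be written
    byte_value = 0      # the byte currently being filled
    i = 0               # position of the next bit inside byte_value
    for bit in bits:
        byte_value |= (bit & 1) << i
        i = (i + 1) % 8
        if i == 0:
            # Eight bits collected: store the byte and start a new one.
            buf.append(byte_value)
            byte_value = 0
            if len(buf) >= buffer_size:
                # One write call for the whole batch.
                out.write(buf)
                buf.clear()
    if i != 0:
        # Assumption: pad the final partial byte with zero bits.
        buf.append(byte_value)
    if buf:
        out.write(buf)

out = io.BytesIO()
write_bits([1, 0, 1, 1, 0, 0, 0, 0, 1], out)
print(out.getvalue())
```

If you pad the last byte like this, the reader needs some other way to know how many bits are real, for example a bit count stored at the start of the file.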