Thanks for the challenge! It's been a blast trying to solve it for the past two days. There are so many pieces of information to consider, though connecting them all together is quite difficult.
One thing I'm struggling with: do we actually know the plaintext S1? If the words have letters omitted, if there are symbols or numbers that mark the word order, if extra obfuscation characters are added for more security, and if everything is placed in a securely randomized position, doesn't that make S1 different from the actual plaintext? Or should we assume that the substitution rules themselves cover those changes in position and the placement of order markers (if there are any), applied to the 12-word seed phrase in order?
"arena brisk seminar..."
Do we actually start from the seed phrase and end up with C1, or are there arbitrary personal choices involved BEFORE applying the substitution rules, choices that are outside the cards' reach? If the latter is the case, I believe it's much more difficult. I'm not sure if it's unsolvable, but this "starting point" is really important.
Either way, I keep finding new things to try and new ways to connect them. It raises some important questions about how such a product might work, for better or worse.