I've got an idea based on how human memory works:
For memory, recognition is way easier than recall. For example, when you're learning a foreign language, it's a lot more difficult to write the Japanese kanji for "love" (it's "愛") than to recognize it among other candidates. Likewise, it's easier to understand some Japanese words than to produce them yourself. In other words, speaking is generally more difficult than understanding.
Secondly, the brain "loves" visual information way more than language and words.
So:
Imagine, for example, a steganographic image of a room. The steganography is not there to hide the code; a picture is used only because the brain likes pictures, not words. The code is inside a very small portion of the image, on which Average Joe must click. There are lots of elements in the picture (a room, a landscape, a city street, etc.), so it's not easy for him to "know" beforehand what the cue is. But if he sees it, he recognizes it. Again, this relies on the fact that recognition is superior to recall.
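To make the click step concrete, here is a minimal sketch of how the check could work. It assumes the secret is stored as a rectangle in pixel coordinates; the names Hotspot and verify_click are made up for illustration and are not from any existing library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hotspot:
    """Hypothetical secret region: a rectangle in image pixel coordinates."""
    x: int       # left edge
    y: int       # top edge
    width: int
    height: int

    def contains(self, click_x: int, click_y: int) -> bool:
        # Left/top edges are inclusive, right/bottom edges are exclusive.
        return (self.x <= click_x < self.x + self.width
                and self.y <= click_y < self.y + self.height)


def verify_click(hotspot: Hotspot, click_x: int, click_y: int) -> bool:
    """Return True if the user's click landed inside the secret region."""
    return hotspot.contains(click_x, click_y)


if __name__ == "__main__":
    # Assumed example: the cue is a 20x20 pixel object somewhere in the room.
    cue = Hotspot(x=100, y=200, width=20, height=20)
    print(verify_click(cue, 110, 210))  # click on the cue
    print(verify_click(cue, 500, 300))  # click elsewhere in the room
```

The user never has to recall coordinates; he only has to spot the familiar object and click it, which is the recognition part of the idea.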
To increase the protection, the image must be large enough, and you could even use 3 or 4 different images like this (it's a bit like reCAPTCHA, actually).
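As a rough illustration of why image size and the number of images matter, the sketch below estimates the chance that a purely random click sequence succeeds. It assumes clicks are uniformly random and that each image has one rectangular target; the function name and the sizes used are assumptions for the example only.

```python
from fractions import Fraction
import math


def random_guess_probability(image_w: int, image_h: int,
                             spot_w: int, spot_h: int,
                             n_images: int) -> Fraction:
    """Chance that uniformly random clicks hit the target on every image."""
    per_image = Fraction(spot_w * spot_h, image_w * image_h)
    return per_image ** n_images


def guess_entropy_bits(probability: Fraction) -> float:
    """Equivalent strength in bits against uniformly random guessing."""
    return -math.log2(float(probability))


if __name__ == "__main__":
    # Assumed sizes: a 1920x1080 picture with a 40x40 pixel target.
    for n in (1, 3, 4):
        p = random_guess_probability(1920, 1080, 40, 40, n)
        print(n, "image(s):", p, "~", round(guess_entropy_bits(p), 1), "bits")
```

With these assumed sizes, a single image gives a 1 in 1296 chance per random guess, and each extra image multiplies that factor again. This is an optimistic estimate: a real attacker would probably click on the most eye-catching objects first rather than at random.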