Your idea about starting at a larger node is a good one, you would certainly want to debug on a cheap process.
There's nothing to debug at the transistor level that is process-independent. In fact, even the transistor model changed from BSIM3 to BSIM4-family when you move from cheap to expensive processes.
The general topology of the models is already well known and open sourced:
http://bsim.berkeley.edu/models/What is secret? The parameter values of those models. And even if you use MOSIS/Europractice or similar program you won't be able to publish those secret values. Without those you can't optimize in any sensible way beyond "sandbag the hell out of it and keep your fingers crossed". KnC did that already.