It is also perhaps worth considering the case in which the alternative model $y = Z\gamma + \omega$ contains the same regressors as the true model (i.e., $Z = X$). Then the above proof shows that the probability limit of the estimated error variance is smaller when the true value of $\rho$ is used to perform the Orcutt transformation than when any other value $\rho_0$ is used. (This, in fact, constitutes a relatively simple proof of the consistency of the maximum likelihood estimate of $\rho$, for a correctly specified model.)
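The claim can be illustrated numerically. The following is a minimal simulation sketch, not the paper's own procedure: it assumes a linear model with AR(1) errors, takes the Orcutt transformation to be quasi-differencing ($y_t - \rho_0 y_{t-1}$ regressed on $x_t - \rho_0 x_{t-1}$), and uses hypothetical helper names (`simulate_ar1_regression`, `transformed_error_variance`) and illustrative parameter values (true $\rho = 0.5$, unit innovation variance, one random regressor plus an intercept). Under these assumptions the estimated error variance, computed over a grid of trial values $\rho_0$, should be smallest near the true $\rho$.

```python
import numpy as np


def simulate_ar1_regression(n, beta, rho, sigma=1.0, seed=0):
    """Simulate y = X beta + u with AR(1) errors u_t = rho * u_{t-1} + eps_t.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Intercept plus one standard-normal regressor.
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    eps = rng.normal(scale=sigma, size=n)
    u = np.empty(n)
    # Draw the first error from the stationary distribution of the AR(1) process.
    u[0] = eps[0] / np.sqrt(1.0 - rho**2)
    for t in range(1, n):
        u[t] = rho * u[t - 1] + eps[t]
    y = X @ beta + u
    return y, X


def transformed_error_variance(y, X, rho0):
    """Quasi-difference y and X with a trial value rho0, run OLS on the
    transformed data, and return the residual variance estimate."""
    y_star = y[1:] - rho0 * y[:-1]
    X_star = X[1:] - rho0 * X[:-1]
    coef, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
    resid = y_star - X_star @ coef
    dof = len(y_star) - X.shape[1]
    return float(resid @ resid / dof)


true_rho = 0.5
y, X = simulate_ar1_regression(n=20000, beta=np.array([1.0, 2.0]), rho=true_rho)

# Trial values rho0 = -0.9, -0.8, ..., 0.9.
grid = np.linspace(-0.9, 0.9, 19)
variances = [transformed_error_variance(y, X, r) for r in grid]
best_rho0 = float(grid[int(np.argmin(variances))])

print("trial rho0 with smallest estimated error variance:", round(best_rho0, 2))
```

In this setup the transformed disturbance is $\varepsilon_t + (\rho - \rho_0) u_{t-1}$, so its variance exceeds the innovation variance by a term proportional to $(\rho - \rho_0)^2$. That is why the grid minimum is expected to sit at or next to the true value in large samples.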
Bush et al. (Fri,) studied this question.