I’m seeing someone do something in a paper that strikes me as odd. So let me ask some of you stats guys what you think:
Suppose I run a regression to see what effect independent variables X1, X2, …, Xn have on Y. I come up with my regression coefficients on each of them. The coefficient on X3 is (say) 5.
Then I take out X3 from my list, and run the regression again. Obviously the coefficients on the other variables change.
The thing is, the R-squared in both regressions is about the same. I.e., when I took out variable X3, the “fit” of my predicted curve to the actual curve stayed about the same.
Would it be correct for me to say, “According to my regression analysis, X3 has no effect on Y”?
UPDATE: I didn’t want to say what the regression was initially, because now politics will get involved. But it’s from this paper (Tables 2 and 3 at the end). My obvious concern with the economics of it is that “refiner margin” is closely related to the implementation of Tier 2. I.e., the way Tier 2 regulations would (possibly) raise gasoline prices is by first reducing refiner margins.
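To make the worry concrete, here is a minimal simulated sketch in Python. Everything in it is made up for illustration and has nothing to do with the paper's actual data: the sample size, the 0.1 noise scale, the "true" coefficients, and the helper name `ols_fit` are all my own assumptions. It builds an X3 that is closely related to X1, gives X3 a real effect of 5 on Y, and then compares the fit with and without X3.

```python
import numpy as np


def ols_fit(X, y):
    """Return (coefficients, R-squared) for an OLS fit with an intercept.

    Hypothetical helper for this sketch; coefficients[0] is the intercept.
    """
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, r2


rng = np.random.default_rng(0)
n = 20_000

# Assumption: X3 is almost a copy of X1 (think "refiner margin" vs. "Tier 2").
x1 = rng.normal(size=n)
x3 = x1 + 0.1 * rng.normal(size=n)

# Assumption: X3 truly matters, with a coefficient of 5.
y = 2.0 * x1 + 5.0 * x3 + rng.normal(size=n)

beta_full, r2_full = ols_fit(np.column_stack([x1, x3]), y)
beta_reduced, r2_reduced = ols_fit(x1, y)

print("with X3:    coef on X1 =", beta_full[1], " coef on X3 =", beta_full[2])
print("without X3: coef on X1 =", beta_reduced[1])
print("R-squared with X3:", r2_full, " without X3:", r2_reduced)
```

In this toy setup the R-squared barely moves when X3 is dropped, because X1 soaks up almost all of X3's contribution (its coefficient jumps to roughly the sum of the two). So, at least in a case like this one, an unchanged R-squared by itself would not justify saying "X3 has no effect on Y."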