1. They are related to each other. For a simple linear regression, SEE = sqrt(SSE / (n - 2)) (more generally, sqrt(SSE / (n - k - 1)) with k predictors).
2. Conceptually, the regression fits a line through the points so that the SSE is as small as possible.
3. The SEE estimates how far, on average, the observed values fall from the regression line (it is the standard deviation of the residuals, not of the sample mean). This estimate is higher when SSE is higher and, for a given SSE, lower when the sample size is larger.
As the sample size n increases (for a given SSE), the SEE decreases, so the regression predictions become more precise.
And since SSE is a sum of SQUARED errors, taking the square root brings the standard error of estimate back to the original units of the dependent variable, which makes it easier to interpret and compare.
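The relationship above can be sketched with a small worked example. This is a minimal illustration using made-up data: it fits an ordinary least-squares line, sums the squared residuals (SSE), and then computes SEE = sqrt(SSE / (n - 2)).

```python
import math

# Toy data (hypothetical values, purely for illustration)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Closed-form OLS slope and intercept for one predictor
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x

# SSE: sum of squared residuals around the fitted line
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
sse = sum(r ** 2 for r in residuals)

# SEE: root of SSE divided by the residual degrees of freedom (n - 2)
see = math.sqrt(sse / (n - 2))
print(f"slope={slope:.3f}, SSE={sse:.4f}, SEE={see:.4f}")
```

Because the errors were squared before summing, SSE is in squared units of y; taking the root returns SEE to the units of y, which is why it is the more interpretable of the two.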