Hey Folks,
Mind if I pick your brains for a few minutes? Does anyone have a couple of sentences or a short, concise paragraph explaining the difference between Binomial Distribution/Standard Deviation v. Multiple Regression in statistical analysis or proof?
I keep confusing myself. 8^]
Cordially,
Rush
elrushbo-[at]-theobviousadelphia.net
Remove the obvious...
Binomial Distribution/Standard Deviation v. Multiple Regression.
The binomial distribution is built from the binomial coefficient N!/(n!(N-n)!), which shows up a lot in simple combinatorial and random distributions. It can also show up as (n1+n2)!/(n1!n2!), or extended to (n1+n2+n3)!/(n1!n2!n3!) etc. The ! means factorial, i.e. n*(n-1)*(n-2)*...*3*2*1.
Standard deviation is a kind of average difference from the mean. More precisely it is the square root of the average (difference from the mean)^2. For the binomial case with p = 1/2 the variance is N/4, so the standard deviation is roughly sqrt(N)/2.
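Here's a minimal sketch in Python that checks those formulas numerically (the values N = 20 and p = 1/2 are just made up for illustration):

from math import comb, sqrt

N, p = 20, 0.5

# Binomial coefficient N!/(n!(N-n)!) and the binomial pmf built from it.
pmf = [comb(N, n) * p**n * (1 - p)**(N - n) for n in range(N + 1)]

mean = sum(n * pmf[n] for n in range(N + 1))               # N*p = 10
var = sum((n - mean)**2 * pmf[n] for n in range(N + 1))    # N*p*(1-p) = 5

print(mean, var, sqrt(var))   # 10.0, 5.0, ~2.236 = sqrt(N)/2 for p = 1/2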
Multiple regression is a technique used to predict a value based on multiple variables. It usually looks something like V=c1*v1+c2*v2+c3*v3+...
The trick is finding the best c's. This is usually done by minimizing the squared differences between the known Vs and the values the formula predicts from the vs (least squares).
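A rough sketch in Python of that fitting step, with made-up numbers (none of the data below comes from the thread):

import numpy as np

# Five observations of three predictors v1, v2, v3 and the value V to predict.
X = np.array([[1.0, 2.0, 0.5],
              [2.0, 1.0, 1.5],
              [3.0, 4.0, 2.0],
              [4.0, 3.0, 2.5],
              [5.0, 5.0, 3.0]])
V = np.array([3.1, 4.2, 8.9, 9.8, 13.1])

# Add a column of ones so the model V = c0 + c1*v1 + c2*v2 + c3*v3 gets an
# intercept; lstsq finds the c's that minimize the squared errors.
A = np.column_stack([np.ones(len(V)), X])
coeffs, residuals, rank, _ = np.linalg.lstsq(A, V, rcond=None)
print(coeffs)       # the fitted c0, c1, c2, c3
print(A @ coeffs)   # the model's predictions for the known Vs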
RE: Hey Folks, Mind if I
In multiple regression analysis you're trying to explain the variability of the criterion variable using the set of predictor variables (with the coefficient of correlation between random variables, and in its squared form, the coefficient of determination, which indicates the amount of variance in the criterion variable that is accounted for by the variation in the predictor variables). So basically you're trying to explain phenomena or predict future events.
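As an illustration of that coefficient of determination, here is a small Python sketch with invented data (one predictor for simplicity, so R^2 is just the squared correlation):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # predictor
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # criterion

slope, intercept = np.polyfit(x, y, 1)    # simple linear regression
y_hat = slope * x + intercept

ss_res = np.sum((y - y_hat) ** 2)         # unexplained variation
ss_tot = np.sum((y - y.mean()) ** 2)      # total variation
r_squared = 1 - ss_res / ss_tot           # share of variance explained

r = np.corrcoef(x, y)[0, 1]               # correlation coefficient
print(r_squared, r ** 2)                  # the two agree with one predictor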
The binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p.
The variance of a random variable is a measure of its statistical dispersion (how far from the expected value it typically is), and it is the square of the standard deviation (which basically measures how spread out the values in a data set are).
RE: In multiple regression
WOW!!!
Chipper, I think perhaps you might want to think about taking one or two of those tests again, in the a.m., and maybe an hour after breakfast.
I hope you never have an accident on those roller-blades; your coconut houses too fine an instrument to risk damage. Check the Thanksgiving thread.
microcraft
"The arc of history is long, but it bends toward justice" - MLK
Hey Mark and
Hey Mark and Chipper,
Thank you both for contributing your thoughts. I was ending up with something more like Mark's and less like Chipper's. However, the readers will be laymen like myself, and that one isn't as user-friendly, though it is better than what I came up with, and accuracy is important.
How about this:
In multiple regression analysis you're trying to explain why a specific variable, x, varies the way it does, by using other variables, a, b, or c, that have an effect on x. In essence, you're trying to explain why x is the way it is, or you're trying to predict the future value for x.
Binomial distribution is the probability that x would have a specific value as compared to the probability that x would have other specific values.
Standard deviation is simply the likelihood of a value for x as x gets farther from its expected value.
Cordially,
Rush
elrushbo-[at]-theobviousadelphia.net
Remove the obvious...
RE: How about this: In
Anyone? Anyone? Bueller?
Cordially,
Rush
elrushbo-[at]-theobviousadelphia.net
Remove the obvious...
RE: Hey Mark and
Just one thing about regression. You can't truly explain, in a causal sense, why x is the way x is. Correlation and regression are related (although you didn't ask about correlation). Correlation lets you assess the relationship between x and y, and doesn't say whether x causes y, y causes x, or a third (or more) variable can explain the relationship you see. Regression (and its extension, multiple regression) is used to predict a value of y given certain scores on x. The general equation for regression is y = bx + a. The regression coefficient "b" will be the same as the correlation coefficient "r" when x and y are in standard (Z) scores.
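A small Python sketch of that last point, with made-up numbers (nothing here is from the post itself): after converting x and y to standard (Z) scores, the fitted slope b matches the correlation r.

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.5, 3.1, 4.0, 6.2, 7.1])

zx = (x - x.mean()) / x.std()   # standard (Z) scores
zy = (y - y.mean()) / y.std()

b = np.polyfit(zx, zy, 1)[0]    # slope of the regression zy = b*zx + a
r = np.corrcoef(x, y)[0, 1]     # Pearson correlation coefficient
print(b, r)                     # the two numbers match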
That's what I can pull out of my head from my god only knows how many stats courses. I didn't do much regression. Mostly ANOVA and related techniques.
Kathryn :o)
Einstein@Home Moderator
RE: RE: How about
You seem to be thinking of the well-known Chebyshev inequality. From http://en.wikipedia.org/wiki/Pafnuty_Chebyshev :
Chebyshev's inequality says that the probability that the outcome of a random variable with standard deviation σ is no less than aσ away from its mean is no more than 1/a² ...
This is just a theorem, though - it's not used as a definition of variance.
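A quick empirical check, just as an illustration (the exponential sample below is an arbitrary choice, not something from the post):

import numpy as np

# Chebyshev: for any distribution, P(|X - mu| >= a*sigma) <= 1/a^2.
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100_000)   # a deliberately skewed sample

mu, sigma = x.mean(), x.std()
for a in (1.5, 2.0, 3.0):
    frac = np.mean(np.abs(x - mu) >= a * sigma)
    print(a, frac, 1 / a**2)                   # observed fraction vs. the bound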
The "variance" is a rather technical thing. As you hint, it describes the "spreadout-ness" - how far we (kind of) "expect" an actual realization x of X to "freak out" from it's "center".
The number E[ |X-mu| ] (i.e. the expected value of the distance from the mean) could be used instead, but for some rather sophisticated theoretical reasons the variance ( E[ (X-mu)^2 ] ) is preferred.
The variance is somewhat related to the entropy of a distribution, but incorporates the actual numerical values that X may take as well. (The entropy is defined as the integral or sum of -f(x)ln(f(x)) over all possible x.)
So, while the *entropy* describes only the distribution itself, the *variance* also incorporates the numbers the random variable assigns to the different events.
The best way to learn this is to work out an actual example with pen and paper. Suggestion:
Calculate the different parameters (mu, variance, std and entropy) for the random variables corresponding to tossing a fair coin.
Then do the same thing wrt a fair die.
Then generalize this to the discrete uniform distribution with n possibilities.
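Here's a rough sketch in Python of that exercise, under the assumption that the coin's faces are labelled 1 and 2 and the die's 1 through 6 (so both are special cases of the discrete uniform distribution over 1..n):

import math

def uniform_stats(n):
    # mu, variance, std and entropy for n equally likely outcomes 1..n.
    values = range(1, n + 1)
    p = 1.0 / n
    mu = sum(v * p for v in values)                  # (n + 1) / 2
    var = sum((v - mu) ** 2 * p for v in values)     # (n^2 - 1) / 12
    entropy = -sum(p * math.log(p) for _ in values)  # ln(n)
    return mu, var, math.sqrt(var), entropy

for n in (2, 6, 10):   # fair coin, fair die, a generic uniform case
    print(n, uniform_stats(n))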
Also, you should consider viewing a distribution not as an actual physical property of some system under study, but rather as a measure of *uncertainty*. In many instances this can significantly aid comprehension. Both viewpoints are correct, but don't confuse them. Many textbooks do.
Greetings, Mr. Ragnar Schroder