Looking for some linear algebra/MV normal tricks.

So, just to give a sense of my background: I work in theory and method development in statistical genetics, but I don't have a ton of formal training in math or statistics, with the exception of a few classes in linear algebra, diff-eq, probability theory, stochastic processes, etc., and about 2 years on the job at this point. So I'm somewhat conversant in a lot of basic statistical and mathematical areas, but there are likely a lot of standard tricks and whatnot that I'm just not familiar with.

Briefly, my situation is this: I have data in the form of a single vector Z, which, under the null hypothesis, is expected to be a draw from a multivariate normal distribution with a mean of zero and a variance-covariance matrix which I'll call F. The alternative hypothesis that I'm interested in is that there is more variance among the elements of Z than we would expect if Z were truly a draw from that null distribution. Essentially, I know that F underlies the distribution of Z, but there may be another process operating, and if so, I need to know what effect that process has had on Z.
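To make the setup concrete, here's a minimal sketch of the null model in Python; the small AR(1)-style covariance is just a made-up stand-in for F, not anything from my actual data:

```python
# Minimal sketch of the null model: Z is a single draw from a multivariate
# normal with mean zero and covariance F. The F below is a made-up example,
# standing in for whatever covariance structure the real problem supplies.
import numpy as np

rng = np.random.default_rng(0)

n = 5
idx = np.arange(n)
F = 0.9 ** np.abs(idx[:, None] - idx[None, :])  # strong correlation between nearby elements

Z = rng.multivariate_normal(mean=np.zeros(n), cov=F)  # one draw of Z under the null
print(Z)
```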

I can test the null hypothesis simply by recognizing that Z^T F^{-1} Z is distributed as a chi-squared random variable under the null, and thus if the value of this test statistic is way out in the right tail, then I've got something that looks more like my alternative hypothesis than the null, which is great. What I'd really like to be able to do on top of that, however, is to essentially ask the data where this signal is coming from. For example, some of the signal could plausibly be coming from having two elements of Z that are expected to be very tightly correlated under the null, but which are actually not very close. Alternatively, one could imagine having two clusters of elements, where the difference between the two clusters is much larger than expected given the covariance matrix.
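For reference, here's a sketch of that global test (under the null the statistic is chi-squared with degrees of freedom equal to the length of Z); the helper name and the toy F/Z are just mine for illustration:

```python
# Sketch of the global test: under the null, Z^T F^{-1} Z is chi-squared
# distributed with degrees of freedom equal to the length of Z.
import numpy as np
from scipy import stats

def excess_variance_test(Z, F):
    """Return the test statistic Z^T F^{-1} Z and its right-tail p-value."""
    stat = float(Z @ np.linalg.solve(F, Z))  # solve F x = Z rather than inverting F
    pval = stats.chi2.sf(stat, df=len(Z))
    return stat, pval

# Toy example, reusing the made-up F and a simulated Z from the sketch above.
rng = np.random.default_rng(0)
n = 5
idx = np.arange(n)
F = 0.9 ** np.abs(idx[:, None] - idx[None, :])
Z = rng.multivariate_normal(np.zeros(n), F)
print(excess_variance_test(Z, F))
```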

One trick I know I can play is to use conditional multivariate normal distributions to ask whether certain elements or groups of elements look unusual, given the values observed for another set of elements. This is nifty, and quite handy, and I can definitely learn something about the data from it, but it seems like there should be some simple linear algebra tricks that would just tell me which axes in the data are most interesting, without me having to specify and test a bunch of different hypotheses with the conditional MVN approach.
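Here's roughly what I mean by the conditional MVN trick, using the standard conditioning formulas for a zero-mean multivariate normal; the function name, the choice of which elements to query, and the toy F are all just illustrative:

```python
# Sketch of the conditional MVN check: given Z ~ N(0, F) under the null,
# how unusual does one block of elements look given the observed values of
# the remaining elements?
import numpy as np
from scipy import stats

def conditional_mvn_check(Z, F, query_idx):
    """Conditional null distribution of Z[query_idx] given the other elements,
    plus a chi-squared p-value for how unusual the query block looks."""
    query_idx = np.asarray(query_idx)
    other_idx = np.setdiff1d(np.arange(len(Z)), query_idx)

    F_qq = F[np.ix_(query_idx, query_idx)]
    F_qo = F[np.ix_(query_idx, other_idx)]
    F_oo = F[np.ix_(other_idx, other_idx)]

    # Standard MVN conditioning: mean = F_qo F_oo^{-1} z_o,
    # covariance = F_qq - F_qo F_oo^{-1} F_oq.
    W = np.linalg.solve(F_oo, F_qo.T).T          # F_qo @ F_oo^{-1}
    cond_mean = W @ Z[other_idx]
    cond_cov = F_qq - W @ F_qo.T

    # Chi-squared test of the observed query block against that conditional null.
    resid = Z[query_idx] - cond_mean
    stat = float(resid @ np.linalg.solve(cond_cov, resid))
    pval = stats.chi2.sf(stat, df=len(query_idx))
    return cond_mean, cond_cov, stat, pval

# Example: how unusual do elements 0 and 1 look, given elements 2 through 4?
rng = np.random.default_rng(0)
n = 5
idx = np.arange(n)
F = 0.9 ** np.abs(idx[:, None] - idx[None, :])
Z = rng.multivariate_normal(np.zeros(n), F)
print(conditional_mvn_check(Z, F, [0, 1]))
```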

Any suggestions?
