Will posted earlier this week about how to produce manhattan plots of GWAS results using Stata, so I thought I'd share how I do this in R using ggplot2.
First, if you've never used ggplot2, you'll need to add it to your R installation by typing:
install.packages("ggplot2")
Once you've done that, copy and paste this command to download the functions I wrote necessary to produce these plots. If you'd like to see the source code yourself, copy the URL into your web browser.
source("http://dl.dropbox.com/u/66281/0_Permanent/qqman.r")
Next, read in the PLINK results file to a data frame. Substitute plink.qassoc with your own results filename.
mydata=read.table("plink.qassoc", header=TRUE)
Finally, run this function which will produce and save in the current directory both a QQ-plot and a manhattan plot of your results:
qqman(mydata)
QQ-Plot (these are simulated GWAS results):
Manhattan plot (click to enlarge):
A few notes: First, if you're doing this on a linux machine from your Windows computer, you'll need to be running the previously mentioned XMing server on your Windows computer for the plot to save correctly. Second, the qqman() function calls the manhattan() function, which is extremely slow and memory-intensive. It may take about 3 minutes to run for each dataset. The memory issue isn't a problem on 64-bit machines, but you may run out of memory on 32-bit machines if you're doing this with GWAS data. I'm going to try to improve this in the future. Finally, using that source command you also downloaded a function I wrote called qqmanall(), which does just what it sounds like - if you run it on a linux machine with no arguments it reads in ALL of the plink GWAS results stored in the current directory, and creates QQ and manhattan plots for all of them with a common upper limit for the y-axis corresponding to the most significant result. Enjoy.
...
Update Thursday, January 21, 2010: I neglected to mention yesterday the format of the plink.assoc or plink.qassoc files, in case you want to produce the same plots using results from another software other than plink. When you load your .assoc files in a data frame, the relevant columns are named "CHR", "BP", and "P". You can use this plotting function as long as you have these three columns in your data frame, regardless of whether you use PLINK or not.
The latest version of the code is reproduced below:
*** Update April 25, 2011: This code has gone through a major revision. Please see the updated code and tutorial here. ***
...
*** Update April 25, 2011: This code has gone through a major revision. Please see the updated code and tutorial here. ***