r functions: pca

Summary

Started a GitHub repository to put online R functions I've created for some common types of analysis and plots. The aim is to have a core set of functions that make figures look prettier, even for preliminary analysis. A couple of examples are included, focusing on the first function: principal component analysis.

A GitHub repository can be found at: R plotting functions.

For those who want to dive right in, the git repository (includes a readme):

R plotting functions

Update: The PCA script presented here could be greatly simplified using ggplot2, a very useful graphics package for R. I'll leave the implementation for a future post.


Been writing a lot of R functions and trying to make them generalizable and accepting of multiple input types. Since I was helped by others who posted code and other useful information about R online, thought I should contribute back.

There are a multitude of packages to help create useful and pretty plots with R. But sometimes it is also helpful to have functions that combine features of these packages into one nice example. That is what I hope to achieve with this repository. Each type of plot or analysis will have its own functions, examples and plots. This will allow users to verify that the functions work. Further, I hope those who just want to see a particular R feature implemented within a wider, working function can benefit from this.

left: USA Crime PCA: high population urban centers cluster in this high-dimension analysis.

I have included two example images created with my first function, a script to do principal component analysis on arbitrary datasets. By getting the scores from the PCA object R creates, I can create a plot that is softer on the eyes than biplot or other standard functions. In addition, it allows me to pass in an arbitrary list and have the function highlight the items from that list on the component graph. This allows easier visualization and understanding for human readers.
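To make the idea concrete, here is a minimal sketch in base R of pulling the scores out of a PCA object and highlighting a chosen subset. It is not the function from the repository: the name `plot_pca_highlight`, its arguments, the colors and the use of the built-in `mtcars` data are all assumptions made for illustration.

```r
# Hypothetical helper (an illustration, not the repository's function).
# data:      numeric data frame or matrix with row names
# highlight: character vector of row names to emphasize on the plot
plot_pca_highlight <- function(data, highlight = character(0)) {
  # Center and scale so variables with large units do not dominate
  pca <- prcomp(data, center = TRUE, scale. = TRUE)

  # The scores (observations projected onto the components) live in pca$x
  scores <- pca$x[, 1:2]
  is_highlighted <- rownames(scores) %in% highlight

  # Percent of total variance captured by each component, for axis labels
  pct <- 100 * pca$sdev^2 / sum(pca$sdev^2)

  plot(scores[, "PC1"], scores[, "PC2"],
       pch = 19,
       col = ifelse(is_highlighted, "firebrick", "grey70"),
       xlab = sprintf("PC1 (%.1f%% of variance)", pct[1]),
       ylab = sprintf("PC2 (%.1f%% of variance)", pct[2]),
       main = "PCA scores", bty = "n")

  # Label only the highlighted rows to keep the plot uncluttered
  if (any(is_highlighted)) {
    text(scores[is_highlighted, "PC1"], scores[is_highlighted, "PC2"],
         labels = rownames(scores)[is_highlighted], pos = 3, cex = 0.7)
  }

  invisible(list(pca = pca, scores = scores, highlighted = is_highlighted))
}

# Usage on a built-in dataset: highlight two cars by row name
result <- plot_pca_highlight(mtcars, highlight = c("Mazda RX4", "Fiat 128"))
```

Returning the PCA object and scores invisibly keeps the call usable for plotting alone while still letting you inspect the numbers afterwards.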

The first example looks at multiple crime statistics in the USA across states. Analyzing each statistic individually might not tell us much about crime in the US as it relates to each state, but by doing PCA we see that there is some relation between these variables and that states with large urban populations group together, as seen in the clustering of the 70th percentile states.
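A rough sketch of how such a percentile subset could be picked and checked against the PCA scores follows. It assumes R's built-in `USArrests` dataset as a stand-in for the crime data and assumes the 70th percentile is taken over its `UrbanPop` column; the post does not confirm either choice.

```r
# Assumption: the built-in USArrests data stands in for the post's crime data
crime <- USArrests

# Assumption: "70th percentile states" means states at or above the
# 70th percentile of urban population share
cutoff <- quantile(crime$UrbanPop, probs = 0.70)
urban_states <- rownames(crime)[crime$UrbanPop >= cutoff]

# PCA on the standardized variables, keeping the first two components
pca <- prcomp(crime, center = TRUE, scale. = TRUE)
scores <- as.data.frame(pca$x[, 1:2])
scores$urban <- rownames(scores) %in% urban_states

# Mean position of each group along PC1 and PC2
group_means <- aggregate(cbind(PC1, PC2) ~ urban, data = scores, FUN = mean)
print(group_means)

# Plot all states in grey and the high-urban-population subset in color
plot(scores$PC1, scores$PC2, pch = 19,
     col = ifelse(scores$urban, "steelblue", "grey70"),
     xlab = "PC1", ylab = "PC2", bty = "n")
```

Note that `UrbanPop` is itself one of the variables fed into the PCA here, so some separation of the highlighted group is expected by construction.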

Next, I included some preliminary data, mostly uninformative to the uninitiated but visually nice, looking at biophysical protein properties across the entire yeast genome, with the kinases highlighted to show that this analysis can properly group related protein subsets. Obviously this is a rough first analysis, but you get the picture.

S. cer protein properties: yeast kinases group together when analyzing several biophysical protein properties.

Alright, this was supposed to be short, so I'll end it here. In the future I'll include code and explain the thought process behind it.

-biafra
bahanonu [at] alum.mit.edu

©2006-2025 | Site created & coded by Biafra Ahanonu | Updated 21 October 2024
biafra ahanonu