Introduction to Modeling in R

Joshua Wiley, M.A.

Senior Analyst — Elkhart Group

Overview

Examples using linear regression and linear regression with smooths

ggplot2 for graphics and visualization

Synthesize models and visualization

Basics

require(ggplot2)
(p <- ggplot(d, aes(mentalHealth)) + geom_histogram(binwidth = 2))

How would you summarize the distribution?

Basics

You could use the expectation \[E(y) = 52.92 \]

p + geom_vline(xintercept = mean(d$mentalHealth), size = 2, colour = "red")
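One reason the mean is a natural one-number summary is that it is the constant minimizing the sum of squared deviations. The following is a minimal sketch on simulated data; the vector y and the helper sse() are made up for illustration and are not the mentalHealth variable from the slides.

```r
set.seed(1)
y <- rnorm(200, mean = 50, sd = 10)  # hypothetical stand-in for d$mentalHealth

# sum of squared deviations around a candidate summary value
sse <- function(center) sum((y - center)^2)

# numerically search for the constant that minimizes the criterion
best <- optimize(sse, interval = range(y))$minimum

# agrees with the sample mean up to the optimizer's tolerance
all.equal(best, mean(y), tolerance = 1e-4)
```

Swapping the squared deviation for an absolute deviation in sse() would lead to the median instead, which is one way to see that the choice of summary depends on the loss you care about.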

Basics

(p <- ggplot(d, aes(mastery, mentalHealth)) + geom_point())

How would you summarize this distribution?

Basics

You could use the conditional expectation \( E(y | f(x)) \), where \(f(\cdot)\) is some function, say a linear one

m <- lm(mentalHealth ~ mastery, data = d)

\[ \widehat{\text{mentalHealth}} = 36.9 + 5.14 \cdot \text{mastery} \]

p + geom_abline(intercept = coef(m)[1], slope = coef(m)[2])
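To make the link between lm() and the conditional expectation concrete, here is a small sketch on simulated data. The data frame sim and its columns x and y are assumptions for illustration only. The fitted values are the estimated conditional expectation at each observed x, and predict() evaluates it at new values.

```r
set.seed(2)
x <- rnorm(100)
y <- 2 + 3 * x + rnorm(100)   # simulated; true intercept 2, true slope 3
sim <- data.frame(x = x, y = y)

m_sim <- lm(y ~ x, data = sim)
b <- coef(m_sim)

# fitted values are intercept + slope * x for each observation
manual <- unname(b[1] + b[2] * sim$x)
all.equal(manual, unname(fitted(m_sim)))

# estimated conditional expectation at new x values
predict(m_sim, newdata = data.frame(x = c(-1, 0, 1)))
```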

Multiple dimensions

The conditional expectation generalizes to \(\mathbb{R}^{N}\), but graphs do not.

coef(lm(mentalHealth ~ mastery + neuroticism, d))[["mastery"]]
## [1] -1

Under certain conditions, \( E(y | f(x_1, x_2)) \approx E( y | f(x_1) ~~ | ~~ f(x_2 | f(x_1))~) \)

m1 <- lm(cbind(mentalHealth, mastery) ~ neuroticism, d)
rd <- as.data.frame(resid(m1))  # outcome and predictor, each residualized on neuroticism
coef(lm(mentalHealth ~ mastery, data = rd))[["mastery"]]
## [1] -1
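For linear models fitted by least squares, the agreement between the two coefficients is exact rather than approximate (this result is usually called the Frisch-Waugh-Lovell theorem). The sketch below checks it on simulated data; the variables x1, x2, and y are hypothetical stand-ins for neuroticism, mastery, and mentalHealth.

```r
set.seed(3)
n  <- 500
x1 <- rnorm(n)                          # plays the role of neuroticism
x2 <- 0.6 * x1 + rnorm(n)               # plays the role of mastery, correlated with x1
y  <- 1 + 2 * x2 - 1.5 * x1 + rnorm(n)  # plays the role of mentalHealth
sim <- data.frame(y, x1, x2)

# coefficient on x2 from the full two-predictor model
b_full <- coef(lm(y ~ x2 + x1, data = sim))[["x2"]]

# residualize both y and x2 on x1, then regress residual on residual
r <- as.data.frame(resid(lm(cbind(y, x2) ~ x1, data = sim)))
b_partial <- coef(lm(y ~ x2, data = r))[["x2"]]

all.equal(b_full, b_partial)
```

Because the residualized variables live in two dimensions, they can be drawn in an ordinary scatterplot whose slope is the adjusted coefficient.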

Multiple dimensions

We can take advantage of this for data and model visualization

p <- ggplot(d, aes(mastery, mentalHealth)) + geom_point() + stat_smooth(method = "lm")
require(gridExtra)
grid.arrange(p, p %+% rd, ncol = 2)

Multiple dimensions

This concept also generalizes: \( E( y | f(x_1, \ldots, x_{n - 1}) ~~ | ~~ f(x_n | f(x_1, \ldots, x_{n - 1}))~) \)

rd <- as.data.frame(resid(lm(cbind(mentalHealth, mastery) ~ age + 
    hope + neuroticism, d)))
p %+% rd

Multiple dimensions

require(GGally)
ggpairs(d[, c(1, 2, 11, 12)], lower = list(continuous = "smooth"))

Other Functional Forms

So far we have used simple linear functional forms, but \(f(\cdot)\) can be any function

When the form is unknown, splines or smooth terms are useful, especially for "nuisance" variables

Many options: the mgcv package has the gam function for generalized additive models, with a general smooth function s() that defaults to thin-plate regression splines

Also the splines package, with B-splines bs() and natural cubic splines ns(), which you can use in any modeling function that accepts a formula

require(mgcv)
require(splines)
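As a rough sketch of how these options compare, the example below fits a straight line, a natural cubic spline inside lm(), and a gam() smooth to simulated data with a known nonlinear shape. The data frame sim, the sine-shaped signal, and the choice df = 5 are assumptions for illustration, not values from the slides.

```r
library(mgcv)
library(splines)

set.seed(4)
sim <- data.frame(x = runif(300, 0, 10))
sim$y <- sin(sim$x) + rnorm(300, sd = 0.3)   # simulated nonlinear relationship

m_lin <- lm(y ~ x, data = sim)               # straight line
m_ns  <- lm(y ~ ns(x, df = 5), data = sim)   # natural cubic spline inside lm()
m_gam <- gam(y ~ s(x), data = sim)           # s() default: thin-plate regression spline

# residual standard deviations; smaller means the fit tracks the data more closely
c(linear = sigma(m_lin), ns = sigma(m_ns), gam = sqrt(m_gam$sig2))
```

With ns() the flexibility is fixed in advance through df, whereas gam() estimates the amount of smoothing from the data, with k setting only an upper bound on flexibility.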

Other Functional Forms

# fit and store models
m.linear <- lm(positiveAffect ~ deltaTime, d)
m.quadratic <- lm(positiveAffect ~ deltaTime + I(deltaTime^2), d)
m.smooth <- gam(positiveAffect ~ s(deltaTime, k = 5), data = d)

# data set of raw and predicted values
pdat <- cbind(d[, c("deltaTime", "positiveAffect")],
          linear = fitted(m.linear),
          quadratic = fitted(m.quadratic),
          smooth = fitted(m.smooth))

# plot of raw data with separate coloured lines by model
p <- ggplot(pdat, aes(deltaTime, positiveAffect)) + geom_point() +
  geom_line(aes(y = linear), colour = "green", size=1.2) +
  geom_line(aes(y = quadratic), colour = "blue", size=1.2) +
  geom_line(aes(y = smooth), colour = "red", size=1.2)

Other Functional Forms

[Figure: positiveAffect against deltaTime, with the linear (green), quadratic (blue), and smooth (red) fits overlaid]

Other Functional Forms

Days since treatment, visual acuity (logMAR), and age are nuisance variables; we care about mastery and positive affect

require(reshape2)  # provides melt()
ldat <- melt(d[, c("deltaTime", "logMAR1", "age", "mastery",
  "positiveAffect")], id.vars="positiveAffect")

ggplot(ldat, aes(value, positiveAffect)) + geom_point() +
  stat_smooth(se=FALSE) + facet_wrap(~variable, scales="free_x")

Other Functional Forms

# smooth model - k = 10 for high flexibility
m.smooth <- gam(positiveAffect ~ s(deltaTime, k = 10) +
  s(logMAR1, k = 10) + s(age, k = 10), data = d)

# linear model only
m.linear <- lm(positiveAffect ~ deltaTime + logMAR1 + age, data = d)

# pull out residuals in a data set
pdat <- data.frame(mastery = d$mastery,
  residualPositiveAffect = c(resid(m.smooth), resid(m.linear)),
  Type = rep(c("smooth", "linear"), each = nrow(d)))

# make the graph (separate by model type)
p <- ggplot(pdat, aes(mastery, residualPositiveAffect)) +
  geom_point() + stat_smooth(method="lm") +
  facet_wrap(~ Type)

Other Functional Forms

[Figure: residual positive affect against mastery, one panel per model type (linear, smooth)]
# correlation between residuals
cor(resid(m.smooth), resid(m.linear))
## [1] 0.96

Review

Packages

Graphics Functions

Model Functions
