Merge branch 'master' of github.com:hadley/r4ds
garrettgman committed Apr 1, 2016
2 parents ec1e1d8 7954100 commit 53dc4b8
Showing 8 changed files with 190 additions and 200 deletions.
341 changes: 155 additions & 186 deletions data-structures.Rmd

Large diffs are not rendered by default.

Binary file added diagrams/data-structures-overview.png
Binary file added diagrams/data-structures.graffle
Binary file not shown.
2 changes: 1 addition & 1 deletion functions.Rmd
@@ -1,4 +1,4 @@
```{r, include = FALSE}
```{r setup, include = FALSE}
library(stringr)
```

35 changes: 26 additions & 9 deletions import.Rmd
@@ -7,12 +7,11 @@ library(readr)

## Overview

You can't apply any of the tools you've applied so far to your own work, unless you can get your own data into R. In this chapter, you'll learn how to import:
You can't apply any of the tools you've applied so far to your own work, unless you can get your own data into R. In this chapter, you'll learn how to:

* Flat files (like csv) with readr.
* Database queries with DBI.
* Data from web APIs with httr.
* Binary file formats (like excel or sas), with haven and readxl.
* Import flat files (like csv) with readr.
*
* Cache intermediate results in a fast file format like feather or RDS.

The common link between all these packages is they all aim to take your data and turn it into a data frame in R, so you can tidy it and then analyse it.

@@ -245,10 +244,28 @@ The settings you are most like to need to change are:
* Parse these example files.
* Parse this fixed width file.
## Databases
## Other file formats
## Web APIs
* Excel: readxl
* SPSS: haven
* Stata: haven
* SAS: haven
## Binary files
Databases. All powered by the DBI package which provides a common interface.
Needs to discuss how data types in different languages are converted to R. Similarly for missing values.
* RPostgres
* RMySQL
* RSQLite
* Avoid JDBC un
Hierarchical:
* XML: xml2
* JSON: jsonlite
## Binary file formats
Feather.
RDS.
8 changes: 6 additions & 2 deletions index.rmd
@@ -1,8 +1,12 @@
---
knit: "bookdown::render_book"
title: "R for Data Science"
output:
- bookdown::gitbook
author: ["Garrett Grolemund", "Hadley Wickham"]
description: "This book will teach you how to do data science with R: You'll learn how to get your data into R, get it into the most useful structure, transform it, visualise it and model it. In this book, you will find a practicum of skills for data science. Just as a chemist learns how to clean test tubes and stock a lab, you'll learn how to clean data and draw plots---and many other things besides. These are the skills that allow data science to happen, and here you will find the best practices for doing each of these things with R. You'll learn how to use the grammar of graphics, literate programming, and reproducible research to save time. You'll also learn how to manage cognitive resources to facilitate discoveries when wrangling, visualizing, and exploring data."
url: 'http\://r4ds.had.co.nz/'
github-repo: hadley/r4ds
twitter-handle: hadley
cover-image: cover.png
---

# Welcome
2 changes: 1 addition & 1 deletion variation.Rmd
@@ -26,7 +26,7 @@ Rectangular data provides a clear record of variation, but that doesn't mean it
mat <- as.data.frame(matrix(morley$Speed + 299500, ncol = 10))
knitr::kable(mat, caption = "*The speed of light is* the *universal constant, but variation obscures its value, here demonstrated by Albert Michelson in 1879. Michelson measured the speed of light 100 times and observed 30 different values (in km/sec).*", col.names = c("\\s", "\\s", "\\s", "\\s", "\\s", "\\s", "\\s", "\\s", "\\s", "\\s"))
knitr::kable(mat, caption = "*The speed of light is* the *universal constant, but variation obscures its value, here demonstrated by Albert Michelson in 1879. Michelson measured the speed of light 100 times and observed 30 different values (in km/sec).*", col.names = rep("", ncol(mat)))
```


2 changes: 1 addition & 1 deletion work.Rmd
@@ -8,7 +8,7 @@ Throughout this book we work with "tibbles" instead of the traditional data fram
library(tibble)
```

## Creating tibbles
## Creating tibbles {#tibbles}

The majority of the functions that you'll use in this book already produce tibbles. But if you're working with functions from other packages, you might need to coerce a regular data frame to a tibble. You can do that with `as_data_frame()`:
