Fix/data-transform (hadley#1398)
* fix wrong references, inconsistency between sentence and code, and typos

* Update data-transform.qmd

* Update data-transform.qmd

* Update data-transform.qmd

* Update logicals.qmd

---------

Co-authored-by: Mine Cetinkaya-Rundel <[email protected]>
mitsuoxv and mine-cetinkaya-rundel authored Apr 10, 2023
1 parent b9f4ad6 commit e5a847f
Showing 1 changed file with 6 additions and 6 deletions.
12 changes: 6 additions & 6 deletions data-transform.qmd
@@ -58,7 +58,7 @@ glimpse(flights)
```

In both views, the variables names are followed by abbreviations that tell you the type of each variable: `<int>` is short for integer, `<dbl>` is short for double (aka real numbers), `<chr>` for character (aka strings), and `<dttm>` for date-time.
- These are important because the operations you can perform on a column depend so much on its "type", and these types are used to organize the chapters in the next section of the book.
+ These are important because the operations you can perform on a column depend so much on its "type".

### dplyr basics

@@ -102,7 +102,7 @@ We'll also discuss `distinct()` which finds rows with unique values but unlike `
`filter()` allows you to keep rows based on the values of the columns[^data-transform-1].
The first argument is the data frame.
The second and subsequent arguments are the conditions that must be true to keep the row.
- For example, we could find all flights that arrived more than 120 minutes (two hours) late:
+ For example, we could find all flights that departed more than 120 minutes (two hours) late:

[^data-transform-1]: Later, you'll learn about the `slice_*()` family which allows you to choose rows based on their positions.
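
The wording change in this hunk matters because a departure-based and an arrival-based condition keep different rows. A minimal sketch, assuming a small made-up tibble (`toy_flights`, hypothetical) in place of the `flights` data and assuming the chapter's code filters on `dep_delay`:

```r
library(dplyr)

# Hypothetical toy data standing in for `flights`; delays are in minutes.
toy_flights <- tibble(
  flight    = c(1, 2, 3, 4),
  dep_delay = c(5, 130, 121, 200),
  arr_delay = c(150, 90, 125, NA)
)

# Keep rows whose departure delay exceeds 120 minutes.
late_departures <- toy_flights |> filter(dep_delay > 120)

# The arrival-based condition keeps a different set of rows;
# rows where the condition evaluates to NA are dropped.
late_arrivals <- toy_flights |> filter(arr_delay > 120)
```

Here `late_departures` holds flights 2, 3, and 4, while `late_arrivals` holds flights 1 and 3, so the sentence and the code need to name the same column.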

@@ -225,7 +225,7 @@ flights |>

### Exercises

- 1. In a single pipeline, find all flights that meet all of the following conditions:
+ 1. In a single pipeline, find all flights that meet each of the following conditions:

- Had an arrival delay of two or more hours
- Flew to Houston (`IAH` or `HOU`)
@@ -251,7 +251,7 @@ flights |>

## Columns

- There are four important verbs that affect the columns without changing the rows: `mutate()` creates new columns that are derived from the existing columns, `select()` changes which columns are present; `rename()` changes the names of the columns; and `relocate()` changes the positions of the columns.
+ There are four important verbs that affect the columns without changing the rows: `mutate()` creates new columns that are derived from the existing columns, `select()` changes which columns are present, `rename()` changes the names of the columns, and `relocate()` changes the positions of the columns.
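
As a rough illustration of the four column verbs named in this sentence, the sketch below applies each one to a small made-up tibble (`toy`, hypothetical; not the `flights` data). Every result keeps both rows and changes only the columns.

```r
library(dplyr)

# Hypothetical toy data; column names mirror a few used in the chapter.
toy <- tibble(
  distance = c(100, 200),
  air_time = c(50, 40),
  carrier  = c("AA", "UA")
)

with_speed <- toy |> mutate(speed = distance / air_time * 60)  # adds a derived column
fewer      <- toy |> select(carrier, distance)                 # keeps only the named columns
renamed    <- toy |> rename(airline = carrier)                 # new_name = old_name
moved      <- toy |> relocate(carrier)                         # moves carrier to the front by default
```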

### `mutate()` {#sec-mutate}

@@ -479,7 +479,7 @@ flights |>
arrange(desc(speed))
```

- Even though this pipeline has four steps, it's easy to skim because the verbs come at the start of each line: start with the `flights` data, then filter, then group, then summarize.
+ Even though this pipeline has four steps, it's easy to skim because the verbs come at the start of each line: start with the `flights` data, then filter, then mutate, then select, then arrange.

What would happen if we didn't have the pipe?
We could nest each function call inside the previous call:
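
The corrected sentence lists the steps as filter, mutate, select, arrange. A minimal sketch of such a four-step pipeline and its nested equivalent, assuming a made-up tibble (`toy_flights`, hypothetical) and an illustrative choice of columns rather than the chapter's exact code:

```r
library(dplyr)

# Hypothetical toy data; values are invented for illustration.
toy_flights <- tibble(
  dest     = c("IAH", "IAH", "HOU"),
  distance = c(1400, 1400, 1428),
  air_time = c(200, 210, 204)
)

# Piped form: one verb per line, read top to bottom.
piped <- toy_flights |>
  filter(dest == "IAH") |>
  mutate(speed = distance / air_time * 60) |>
  select(dest, speed) |>
  arrange(desc(speed))

# Nested form: the same calls, read from the inside out.
nested <- arrange(
  select(
    mutate(
      filter(toy_flights, dest == "IAH"),
      speed = distance / air_time * 60
    ),
    dest, speed
  ),
  desc(speed)
)
```

Both forms give the same result; the piped version simply puts the verbs in the order they run.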
@@ -575,7 +575,7 @@ This means subsequent operations will now work "by month".
### `summarize()` {#sec-summarize}

The most important grouped operation is a summary, which, if being used to calculate a single summary statistic, reduces the data frame to have a single row for each group.
- In dplyr, this is operation is performed by `summarize()`[^data-transform-3], as shown by the following example, which computes the average departure delay by month:
+ In dplyr, this operation is performed by `summarize()`[^data-transform-3], as shown by the following example, which computes the average departure delay by month:

[^data-transform-3]: Or `summarise()`, if you prefer British English.
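
To make the "one row per group" behavior concrete, here is a minimal sketch of a grouped summary, assuming a made-up tibble (`toy_flights`, hypothetical) with no missing delays; the column name `avg_delay` is an illustrative choice:

```r
library(dplyr)

# Hypothetical toy data: five flights across two months.
toy_flights <- tibble(
  month     = c(1, 1, 2, 2, 2),
  dep_delay = c(10, 20, 0, 30, 60)
)

# Group by month, then reduce each group to a single row.
avg_by_month <- toy_flights |>
  group_by(month) |>
  summarize(avg_delay = mean(dep_delay))
```

The result has two rows, one per month, with average delays of 15 and 30 minutes.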
