Occasionally we need to derive variables from existing information. A good example of this is conversion between scales. With our current data set, the variable “Above.Ground.Sighter.Measurement” is ostensibly in feet, although the data dictionary could be clearer about this. To bring this to a Canadian context, we might need these measurements in a metric equivalent.
To do this, we could simply multiply the vector by 0.3048:
sq_data$Above.Ground.Sighter.Measurement <- sq_data$Above.Ground.Sighter.Measurement * 0.3048
An alternative approach would be to create a new variable and populate it with the converted values. This way we have the measurements in their untouched imperial scale as well as their modified metric scale.
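To make the difference between the two approaches concrete, here is a minimal sketch on a toy data frame. The data frame `toy`, the copy `overwritten`, and the columns `height.f` and `height.m` are hypothetical stand-ins, not part of the squirrel data.

```r
# Toy data frame standing in for sq_data (hypothetical values, in feet)
toy <- data.frame(id = c("a", "b", "c"), height.f = c(10, 25, 40))

# Approach 1: overwrite the column in place; the values in feet are lost
overwritten <- toy
overwritten$height.f <- overwritten$height.f * 0.3048

# Approach 2: add a new column; the original values in feet are kept
toy$height.m <- toy$height.f * 0.3048

colnames(overwritten)  # still two columns
colnames(toy)          # three columns: id, height.f, height.m
```

With the second approach, both scales remain available side by side, at the cost of one extra column to document.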
This exercise will allow us to explore a couple of issues. The first is that our variable names with this data set are often quite verbose, not ideal for coding. So first we’ll make some changes to our variable names, keeping in mind of course that this is something we should have been thinking about from the beginning when we created our data dictionary!
First, we’ll create a vector of the existing variable names for our data set:
(sq_colnames <- colnames(sq_data))
## [1] "X"
## [2] "Y"
## [3] "Unique.Squirrel.ID"
## [4] "Hectare"
## [5] "Shift"
## [6] "Date"
## [7] "Hectare.Squirrel.Number"
## [8] "Age"
## [9] "Primary.Fur.Color"
## [10] "Highlight.Fur.Color"
## [11] "Combination.of.Primary.and.Highlight.Color"
## [12] "Color.notes"
## [13] "Location"
## [14] "Above.Ground.Sighter.Measurement"
## [15] "Specific.Location"
## [16] "Running"
## [17] "Chasing"
## [18] "Climbing"
## [19] "Eating"
## [20] "Foraging"
## [21] "Other.Activities"
## [22] "Kuks"
## [23] "Quaas"
## [24] "Moans"
## [25] "Tail.flags"
## [26] "Tail.twitches"
## [27] "Approaches"
## [28] "Indifferent"
## [29] "Runs.from"
## [30] "Other.Interactions"
## [31] "Lat.Long"
This is a two parter.
First, in sq_colnames, re-name ‘Above.Ground.Sighter.Measurement’ to be more concise and indicate its scale; in fact, re-name it to agsm.f. sq_colnames is a vector, so we’ll want to use indexing by number.
Second, using colnames(), update the variable names of sq_data to match those now in the vector sq_colnames.
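A plain language approach to this might be: list the names in sq_colnames and find the index number of the one to change, change the value at that index to agsm.f, and then set the column names of sq_data to sq_colnames.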
sq_colnames # list the columns and their index numbers
sq_colnames[14] <- "agsm.f" # change the value at index point 14 to 'agsm.f'
colnames(sq_data) <- sq_colnames # re-assign the values in colnames(sq_data) to match those in sq_colnames
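Hard-coding the index 14 works only as long as the column order never changes. As a sketch of a slightly more defensive variant, the position can be looked up by name with `which()`. The short vector `demo_names` and the variable `idx` below are hypothetical stand-ins used for illustration, not objects from the exercise.

```r
# Hypothetical stand-in for sq_colnames, shortened for illustration
demo_names <- c("Location", "Above.Ground.Sighter.Measurement", "Specific.Location")

# Find the position of the name we want to replace
idx <- which(demo_names == "Above.Ground.Sighter.Measurement")

# Replace the name at that position
demo_names[idx] <- "agsm.f"

demo_names  # the long name has been replaced by "agsm.f"
```

The same pattern should carry over to the full `sq_colnames` vector, where `idx` would come out as 14 given the listing above.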
We now have an easy variable name to type and adapt!
Creating a new variable in a dataframe is simply a matter of assigning some values to the desired variable.
sq_data$agsm.m <- sq_data$agsm.f * 0.3048 # create the variable agsm.m and populate it with the values in agsm.f times 0.3048
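One behaviour worth keeping in mind when deriving a variable this way is that arithmetic leaves missing values missing: any row that is `NA` in the source column is `NA` in the new one. A minimal sketch, assuming a hypothetical data frame `toy` with made-up values:

```r
# Hypothetical data frame with one missing measurement (values in feet)
toy <- data.frame(agsm.f = c(10, NA, 30))

# Derive the metric column exactly as above
toy$agsm.m <- toy$agsm.f * 0.3048

toy                 # the second row is NA in both columns
is.na(toy$agsm.m)   # flags which converted values are missing
```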
Occasionally we need to rename variables, but when doing so we should be wary about cascading impacts on our documentation. We can easily add variables to an existing data frame.
Function | Description |
---|---|
colnames | access the column names of a data frame. |