is.numeric(numeric_data[3]) #to check the data type. retuns false or true
[1] TRUE
typeof(7L)
[1] "integer"
Character data
Characters are also called strings. Anything between quotation marks “” is treated as character
typeof("what is the date today?") #tells the type of data
[1] "character"
my_string <-"The instructor said, \"R is cool,\" and the class agreed."cat(my_string) # cat() prints the arguments
The instructor said, "R is cool," and the class agreed.
Logical
x<-c(4,5,6,7) #this one asks if 7 is found in the object x7%in% x
[1] TRUE
class(TRUE) #it tells that the class of true is logical
[1] "logical"
Factor data
When you use factor, you are telling R that this is categorical data with levels. This can be very helpful in various types of statistical analysis.
myfactor <-factor("B", levels =c("A", "B","C")) # B is a factor which has three levels A,B and Cmyfactor
[1] B
Levels: A B C
#Tidy data
Untidy data can be hard for us and the computer to read and do anlaysis on it. In tidy data, every column is variable, every row is an observation and every cell is a single value.
#This is an example of untidy data. #It shows that itemprice has two values in each cell#It is hard to read as it shows the data for all three years repeatedly which can be confusing to analyze.
tidy_data <-read.csv("CopyOfdata/tidy_data.csv")tidy_data #This is an example of tidy data. It shows how each observatio ihas its own row and each value has its own cell.
library(ggplot2)survey_data <-read.csv("https://psyteachr.github.io/ads-v2/data/survey_data.csv")survey_ggplot <-ggplot(survey_data, aes(x = wait_time, y = call_time)) +geom_point(colour="red") +geom_smooth(method =lm) +scale_x_continuous(name ="Wait Time (seconds)",breaks =seq(from=0, to=600, by=60))+scale_y_continuous(name ="Call time (seconds)",breaks =seq(from =0, to =600, by =30))+labs(title ="The relationship between wait time and call time",subtitle ="2020 Call Data",caption ="Figure 1. As wait time increases, call time increases.")survey_ggplot