R - PROGRAMMING[BCS358B]
1. Demonstrate the steps for installation of R and R Studio. Perform the following:
a) Assign different types of values to variables and display the type of each variable. Assign types such as Double, Integer, Logical, Complex and Character and understand the difference between each data type.
b) Demonstrate Arithmetic and Logical Operations with simple examples.
c) Demonstrate generation of sequences and creation of vectors.
d) Demonstrate Creation of Matrices.
e) Demonstrate the Creation of Matrices from Vectors using Binding Function.
f) Demonstrate element extraction from vectors, matrices and arrays.
1) A) Assign different types of values to variables and display the type of each variable. Assign types such as Double, Integer, Logical, Complex and Character and understand the difference between each data type.
Numeric Data type in R
Program:
x = 5.6
print(class(x))
print(typeof(x))
Output:
Even when a whole number such as 5 is assigned to a variable y without the L suffix, R still stores it as a double (class "numeric").
Program:
y = 5
print(class(y))
print(typeof(y))
Output:
Integer Data type in R
Program:
x = as.integer(5)
print(class(x))
print(typeof(x))
y = 5L
print(class(y))
print(typeof(y))
Output:
Logical Data type in R
Program:
x = 4
y = 3
z = x > y
print(class(z))
print(typeof(z))
Output:
Complex Data type in R
Program:
x = 4 + 3i
print(class(x))
print(typeof(x))
Output:
Character Data type in R
Program:
char = "Geeksforgeeks"
print(class(char))
print(typeof(char))
Output:
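The five types above can be compared side by side. The following is a minimal sketch using only base R; the list name values is illustrative and not part of the original programs.

```r
# Collect one value of each basic type in a named list
values <- list(
  double    = 5.6,
  integer   = 5L,
  logical   = TRUE,
  complex   = 4 + 3i,
  character = "R"
)

# typeof() reports the internal storage type, class() the object class
types <- sapply(values, typeof)
classes <- sapply(values, class)
print(types)
print(classes)

# A plain 5 is stored as a double; the L suffix is what makes it an integer
plain_is_integer <- is.integer(5)
suffixed_is_integer <- is.integer(5L)
print(c(plain_is_integer, suffixed_is_integer))
```

Note that typeof() and class() differ only for the double value, where class() reports "numeric".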
1) B) Demonstrate Arithmetic and Logical Operations with simple examples.
Arithmetic Operators
Program:
vec1 <- c(0, 2)
vec2 <- c(2, 3)
cat ("Addition of vectors :", vec1 + vec2, "\n")
cat ("Subtraction of vectors :", vec1 - vec2, "\n")
cat ("Multiplication of vectors :", vec1 * vec2, "\n")
cat ("Division of vectors :", vec1 / vec2, "\n")
cat ("Modulo of vectors :", vec1 %% vec2, "\n")
cat ("Power operator :", vec1 ^ vec2)
Output:
Logical Operators
Program:
vec1 <- c(0,2)
vec2 <- c(TRUE,FALSE)
cat ("Element wise AND :", vec1 & vec2, "\n")
cat ("Element wise OR :", vec1 | vec2, "\n")
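# && and || compare single values only; since R 4.3.0 they raise an error for longer vectors
cat ("Logical AND :", vec1[1] && vec2[1], "\n")
cat ("Logical OR :", vec1[1] || vec2[1], "\n")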
cat ("Negation :", !vec1)
Output:
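The difference between the element-wise operators (&, |) and the single-value operators (&&, ||) can be seen in a short sketch. It assumes R 4.3.0 or later, where && and || reject operands longer than one element; the variable names are illustrative.

```r
a <- c(0, 2)
b <- c(TRUE, FALSE)

# & works element by element; non-zero numbers count as TRUE
elementwise_and <- a & b

# && expects single values, so one element is picked explicitly
first_and <- a[1] && b[1]

# any() and all() reduce a whole logical vector to a single value
any_true <- any(a & b)
all_true <- all(a | b)

print(elementwise_and)
print(first_and)
print(c(any_true, all_true))
```

When a condition for if() has to summarise a whole vector, any() or all() is usually the clearer choice than indexing one element.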
1) C) Demonstrate generation of sequences and creation of vectors.
Program:
vec1 <- seq(1, 10, by = 2)
vec2 <- seq(1, 10, length.out = 7)
print(vec1)
print(vec2)
Output:
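Besides seq(), base R offers a few other common ways to generate sequences and vectors. A small sketch, with illustrative variable names:

```r
v1 <- 1:5                          # colon operator: consecutive integers
v2 <- seq_len(4)                   # integers from 1 to 4
v3 <- rep(c("a", "b"), times = 2)  # repeat the whole vector twice
v4 <- rep(c("a", "b"), each = 2)   # repeat each element twice
v5 <- seq(10, 1, by = -3)          # decreasing sequence with a negative step
v6 <- seq_along(v3)                # one index per element of v3

print(v1)
print(v2)
print(v3)
print(v4)
print(v5)
print(v6)
```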
1) D) Demonstrate Creation of Matrices.
Create a Matrix in R
Program:
# create a 2 by 3 matrix
matrix1 <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3, byrow = TRUE)
print(matrix1)
Output:
1) E) Demonstrate the Creation of Matrices from Vectors using Binding Function
- cbind() function
- rbind() function
- matrix() function
Program:
x <- c(1:5)
y <- c(11:15)
z <- c(21:25)
o <- matrix(c(x, y, z), ncol = 3)
m <- cbind(x, y, z)
n <- rbind(x, y, z)
print(o)
print(m)
print(n)
class(o)
class(m)
class(n)
Output:
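Part f) of the exercise, element extraction from vectors, matrices and arrays, has no program in this write-up. The following is one possible sketch in base R; the sample objects v, m and arr are made up for illustration.

```r
# Vector: positive, negative and logical indexing
v <- c(10, 20, 30, 40, 50)
second <- v[2]            # single element
ends <- v[c(1, 5)]        # several positions at once
without_first <- v[-1]    # negative index drops an element
large <- v[v > 25]        # logical condition as index

# Matrix: [row, column]; an empty slot selects everything
m <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
cell <- m[2, 3]
first_row <- m[1, ]
second_col <- m[, 2]

# Array: one index per dimension
arr <- array(1:24, dim = c(2, 3, 4))
last_cell <- arr[2, 3, 4]
first_slice <- arr[, , 1]

print(large)
print(second_col)
print(first_slice)
```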
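2. Assess the Financial Statement of an Organization being supplied with 2 vectors of data: Monthly Revenue and Monthly Expenses for the Financial Year. (You can create your own sample data vector for this experiment.) Calculate the following financial metrics: profit for each month, profit after tax (30% tax rate), profit margin, the best and worst months, and the good and bad months relative to the yearly mean.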
Program:
#Data
revenue <- c(14574.49, 7606.46, 8611.41, 9175.41, 8058.65, 8105.44, 11496.28, 9766.09,
10305.32, 14379.96, 10713.97, 15433.50)
expenses <- c(12051.82, 5695.07, 12319.20, 12089.72, 8658.57, 840.20, 3285.73, 5821.12,
6976.93, 16618.61, 10054.37, 3803.96)
revenue
expenses
#profit per month
profit <- revenue - expenses
profit
#30% tax value
tax_30_per <- round(profit * 0.30, 0)
tax_30_per
#profit after tax
profit_after_tax <- profit - tax_30_per
profit_after_tax
#profit margin in %
profit.margin <- round(profit_after_tax/revenue, 2)*100
profit.margin <- paste(profit.margin,"%")
#best month
best_month <- max(profit_after_tax)
#worst month
worst_month <- min(profit_after_tax)
best_month
worst_month
mean_for_year <- mean(profit_after_tax)
mean_for_year
#sorting vector in ascending order
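profit_sort_asc <- sort(profit_after_tax, decreasing = FALSE)
#walk up the sorted profits: bad_month ends as the largest value not above the mean,
#good_month as the smallest value above the mean
for (i in profit_sort_asc) {
  if (i > mean_for_year) {
    good_month = i
    break
  } else {
    bad_month = i
  }
}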
#good month
good_month
#bad month
bad_month
#csv print
data <- data.frame(revenue,expenses)
print(data)
write.csv(data,"F:BE_BIT\\display.csv")
print ('CSV file written Successfully :)')
Output:
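The program above reports profit values rather than the months they belong to. A vectorised sketch that returns month names is given here; it assumes the data starts in January (the source does not say when the financial year begins) and reuses the same sample vectors.

```r
revenue <- c(14574.49, 7606.46, 8611.41, 9175.41, 8058.65, 8105.44, 11496.28, 9766.09,
             10305.32, 14379.96, 10713.97, 15433.50)
expenses <- c(12051.82, 5695.07, 12319.20, 12089.72, 8658.57, 840.20, 3285.73, 5821.12,
              6976.93, 16618.61, 10054.37, 3803.96)

profit <- revenue - expenses
profit_after_tax <- profit - round(profit * 0.30, 0)

# Assumption: the first value belongs to January
names(profit_after_tax) <- month.name
mean_for_year <- mean(profit_after_tax)

# Logical indexing selects every month above or below the yearly mean
good_months <- names(profit_after_tax)[profit_after_tax > mean_for_year]
bad_months <- names(profit_after_tax)[profit_after_tax < mean_for_year]

# which.max() and which.min() keep the names of the selected elements
best_month <- names(which.max(profit_after_tax))
worst_month <- names(which.min(profit_after_tax))

print(good_months)
print(bad_months)
print(c(best = best_month, worst = worst_month))
```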
3. Develop a program to create two 3 X 3 matrices A and B and perform the following operations: a) Transpose of the matrix b) addition c) subtraction d) multiplication
Code:
# Create matrices A and B
A <- matrix(1:9, nrow = 3)
B <- matrix(9:1, nrow = 3)
# Display matrices A and B
print("Matrix A:")
print(A)
print("Matrix B:")
print(B)
# Transpose of the matrices
print("Transpose of Matrix A:")
print(t(A))
print("Transpose of Matrix B:")
print(t(B))
# Addition of matrices
print("Addition of A and B:")
print(A + B)
# Subtraction of matrices
print("Subtraction of A and B:")
print(A - B)
# Multiplication of matrices
print("Multiplication of A and B:")
print(A %*% B)
Output:
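The program uses %*% for multiplication, which is the row-by-column matrix product. The plain * operator multiplies matching entries instead. A short sketch of the difference, using the same A and B, together with a check of the rule that the transpose of a product equals the product of the transposes in reverse order:

```r
A <- matrix(1:9, nrow = 3)
B <- matrix(9:1, nrow = 3)

# Element-wise product: each entry of A times the matching entry of B
elementwise <- A * B

# Matrix product: rows of A combined with columns of B
matrix_product <- A %*% B

# Transpose rule: t(A %*% B) equals t(B) %*% t(A)
transpose_rule_holds <- all(t(A %*% B) == t(B) %*% t(A))

print(elementwise)
print(matrix_product)
print(transpose_rule_holds)
```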
4. Develop a program to find the factorial of given number using recursive function calls.
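What is factorial? How to find it using recursion?
The factorial of a non-negative integer n, written n!, is the product of all positive integers up to n, with 0! = 1. Since n! = n * (n-1)!, a function can compute it by calling itself on n-1 until it reaches 1.
Algorithm
STEP 1: Call function recur_fact()
STEP 2: Pass the number as num to function.
STEP 3: Check if the number > 1 or not, if yes do step 4 otherwise step 5
STEP 4: Return num * recur_fact(num - 1)
STEP 5: Return 1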
Program:
recur_fact <- function(num) {
  if (num <= 1) {
    return(1)
  } else {
    return(num * recur_fact(num - 1))
  }
}
print(paste("The factorial of 10 is", recur_fact(10)))
Output:
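One way to gain confidence in the recursive function is to compare it with R's built-in factorial() over a range of inputs. A minimal sketch; the range 0 to 10 is an arbitrary choice.

```r
recur_fact <- function(num) {
  # Base case: 0! and 1! are both 1
  if (num <= 1) {
    return(1)
  }
  # Recursive case: n! = n * (n - 1)!
  num * recur_fact(num - 1)
}

inputs <- 0:10
recursive_results <- sapply(inputs, recur_fact)
builtin_results <- factorial(inputs)

# TRUE when both approaches agree for every input
results_match <- isTRUE(all.equal(recursive_results, builtin_results))
print(results_match)
```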
5. Develop an R Program using functions to find all the prime numbers up to a specified number by the method of Sieve of Eratosthenes.
Program:
prime_numbers <- function(n) {
if (n >= 2) {
x = seq(2, n)
prime_nums = c()
for (i in seq(2, n)) {
if (any(x == i)) {
prime_nums = c(prime_nums, i)
x = c(x[(x %% i) != 0], i)
}
}
return(prime_nums)
}
else
{
stop("Input number should be at least 2.")
}
}
prime_numbers(12)
Output:
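The program above filters the candidate list with the modulo operator. The textbook form of the Sieve of Eratosthenes instead marks multiples in a logical vector and needs no division. The following is a sketch of that variant; the function name sieve is illustrative.

```r
sieve <- function(n) {
  if (n < 2) {
    return(integer(0))
  }
  # is_prime[k] stays TRUE until k is crossed out as a multiple
  is_prime <- rep(TRUE, n)
  is_prime[1] <- FALSE
  i <- 2
  while (i * i <= n) {
    if (is_prime[i]) {
      # Smaller multiples of i were already crossed out by smaller primes
      is_prime[seq(i * i, n, by = i)] <- FALSE
    }
    i <- i + 1
  }
  which(is_prime)
}

print(sieve(12))
print(sieve(30))
```

The outer loop can stop at the square root of n because every composite number up to n has a prime factor no larger than that.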
6. The built-in data set mammals contains data on body weight versus brain weight.
Develop R commands to:
a) Find the Pearson and Spearman correlation coefficients. Are they similar?
b) Plot the data using the plot command.
c) Plot the logarithm (log) of each variable and see if that makes a difference.
Program:
setwd("F:/BIT") #Change Directory
# Read the data once; mammals.csv is assumed to contain the columns bodywt and brainwt
mammals <- read.csv("mammals.csv")
print(mammals)
# Part a: Find the Pearson and Spearman correlation coefficients. Are they similar?
pearson_corr <- cor(mammals$brainwt, mammals$bodywt, method = "pearson")
spearman_corr <- cor(mammals$brainwt, mammals$bodywt, method = "spearman")
print(paste("Pearson correlation coefficient:", pearson_corr))
print(paste("Spearman correlation coefficient:", spearman_corr))
# Part b: Plot the data using the plot command
plot(mammals$bodywt, mammals$brainwt, xlab = "Body Weight", ylab = "Brain Weight", main = "Body Weight vs. Brain Weight")
# Part c: Plot the logarithm (log) of each variable and see if that makes a difference
plot(log(mammals$bodywt), log(mammals$brainwt), xlab = "log(Body Weight)", ylab = "log(Brain Weight)", main = "log(Body Weight) vs. log(Brain Weight)")
Output:
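The program reads a CSV file, but a mammals data set also ships with the MASS package, where the columns are named body and brain. A sketch using that copy, assuming MASS is installed (it is normally bundled with R):

```r
library(MASS)  # provides the mammals data set
data(mammals)

# Correlations on the raw scale
pearson_raw <- cor(mammals$body, mammals$brain, method = "pearson")
spearman_raw <- cor(mammals$body, mammals$brain, method = "spearman")

# Correlations after taking logarithms of both variables
pearson_log <- cor(log(mammals$body), log(mammals$brain), method = "pearson")
spearman_log <- cor(log(mammals$body), log(mammals$brain), method = "spearman")

print(c(pearson_raw = pearson_raw, spearman_raw = spearman_raw))
print(c(pearson_log = pearson_log, spearman_log = spearman_log))
```

Spearman's coefficient depends only on ranks, and the logarithm preserves order, so it is the same before and after the transformation. Pearson's coefficient measures linear association and can change when the log makes the relationship closer to a straight line.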
8. Let us use the built-in dataset airquality, which has daily air quality measurements in New York, May to September 1973. Develop an R program to generate histograms by using appropriate arguments for the following statements.
a) Assigning names, using the air quality data set.
b) Change colors of the Histogram
c) Remove Axis and Add labels to Histogram
d) Change Axis limits of a Histogram
e) Add Density curve to the histogram
Code:
# Load the dataset
data(airquality)
# a) Assigning names, using the air quality data set
names(airquality) <- c("Ozone", "Solar.R", "Wind", "Temp", "Month", "Day")
# b) Change colors of the Histogram
hist(airquality$Ozone, col = "skyblue", main = "Histogram of Ozone Levels", xlab = "Ozone Levels")
# c) Remove Axis and Add labels to Histogram
hist(airquality$Wind, col = "lightgreen", main = "", xlab = "", ylab = "", axes = FALSE)
axis(1, at = seq(0, max(airquality$Wind), by = 5), labels = seq(0, max(airquality$Wind), by = 5))
axis(2)
title(main = "Histogram of Wind Speed", xlab = "Wind Speed", ylab = "Frequency")
# d) Change Axis limits of a Histogram
hist(airquality$Temp, col = "salmon", main = "Histogram of Temperature", xlim = c(50, 100), ylim = c(0, 30))
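# e) Add Density curve to the histogram
# freq = FALSE puts the histogram on the density scale; na.rm = TRUE drops the missing Solar.R values
hist(airquality$Solar.R, freq = FALSE, col = "lightblue", main = "Histogram of Solar Radiation", xlab = "Solar Radiation")
lines(density(airquality$Solar.R, na.rm = TRUE), col = "red")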
9. Design a data frame in R for storing about 20 employee details. Create a CSV file named “input.csv” that defines all the required information about the employee such as id, name, salary, start_date, dept. Import into R and do the following analysis.
a) Find the total number of rows & columns
b) Find the maximum salary
c) Retrieve the details of the employee with maximum salary
d) Retrieve all the employees working in the IT Department.
e) Retrieve the employees in the IT Department whose salary is greater than 20000 and write these details into another file “output.csv”
Program:
setwd("F:/BIT")
data <- read.csv("input.csv")
print(data)
print(is.data.frame(data))
# a) Total number of rows and columns
print(ncol(data))
print(nrow(data))
# b) Maximum salary
sal <- max(data$salary)
print(sal)
# c) Details of the employee with the maximum salary
retval <- subset(data, salary == max(salary))
print(retval)
# d) Employees working in the IT department
retval <- subset(data, dept == "IT")
print(retval)
# e) IT employees with salary greater than 20000, written to output.csv
info <- subset(data, salary > 20000 & dept == "IT")
print(info)
write.csv(info, "output.csv", row.names = FALSE)
newdata <- read.csv("output.csv")
print(newdata)
# Extra query: employees who joined after 2014-01-01
retval <- subset(data, as.Date(start_date) > as.Date("2014-01-01"))
print(retval)
Output:
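The program assumes that input.csv already exists. One way to create such a file from R is sketched here. The employee names, salaries and dates are made up for illustration, only 5 rows are used instead of 20 to keep the example short, and the file is written to a temporary directory rather than F:/BIT.

```r
# Hypothetical sample records; every value here is invented for the example
employees <- data.frame(
  id = 1:5,
  name = c("Asha", "Ravi", "Meena", "John", "Sara"),
  salary = c(18000, 25000, 32000, 21000, 15000),
  start_date = c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11", "2015-03-27"),
  dept = c("IT", "Operations", "IT", "HR", "IT")
)

# Write the file, then read it back as the analysis program would
input_path <- file.path(tempdir(), "input.csv")
write.csv(employees, input_path, row.names = FALSE)
data <- read.csv(input_path)

# Same filter as part e): IT employees earning more than 20000
it_high <- subset(data, dept == "IT" & salary > 20000)
print(dim(data))
print(it_high)
```

Passing row.names = FALSE keeps write.csv() from adding an extra unnamed index column, so the file reads back with the same five columns.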
10. Use the built-in dataset mtcars, a popular dataset describing the design and fuel consumption patterns of 32 different automobiles. The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models). Format: a data frame with 32 observations on 11 variables:
a) mpg Miles/(US) gallon.
b) cyl Number of cylinders.
c) disp Displacement (cu.in.).
d) hp Gross horsepower.
e) drat Rear axle ratio.
f) wt Weight (lb/1000).
g) qsec 1/4 mile time.
h) vs V/S.
i) am Transmission (0 = automatic, 1 = manual).
j) gear Number of forward gears.
k) carb Number of carburetors.
Develop R program, to solve the following:
a) What is the total number of observations and variables in the dataset?
b) Find the car with the largest hp and the least hp using suitable functions.
c) Plot histogram / density for each variable and determine whether continuous variables are normally distributed or not. If not, what is their skewness?
d) What is the average difference of gross horse power(hp) between automobiles with 3 and 4 number of cylinders(cyl)? Also determine the difference in their standard deviations.
Program:
install.packages("dplyr")
install.packages("explore")
library(dplyr)
library(explore)
mtcars %>% explore_tbl()
mtcars %>% describe()
mtcars %>%
explore_all()
mtcars %>%
explore(gear)
mtcars %>%
select(gear, mpg, hp, cyl, am) %>%
explore_all(target = gear)
data <- mtcars %>%
mutate(highmpg = if_else(mpg > 25, 1, 0, 0)) %>%
select(-mpg)
data %>% explore(highmpg)
data %>%
select(highmpg, cyl, disp, hp) %>%
explore_all(target = highmpg)
data %>%
select(highmpg, drat, wt, qsec, vs) %>%
explore_all(target = highmpg)
data %>%
select(highmpg, am, gear, carb) %>%
explore_all(target = highmpg)
data %>%
explain_tree(target = highmpg)
data %>% explore(wt, target = highmpg)
data %>% explore(wt, target = highmpg, split = FALSE)
mtcars %>% explore(wt, mpg)
mtcars %>%
explain_tree(target = hp, minsplit=15)
mtcars %>%
select(hp, cyl, mpg) %>% explore_all(target = hp)
Output:
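The explore package gives an interactive overview, but questions a) to d) can also be answered directly in base R. The following is a sketch under a few assumptions: the skewness helper is a hand-written moment-based formula (base R has no skewness function), the Shapiro-Wilk test is used as one possible normality check, and because mtcars contains only 4-, 6- and 8-cylinder cars, part d) compares 4 and 6 cylinders instead of 3 and 4.

```r
data(mtcars)

# a) Number of observations (rows) and variables (columns)
dims <- dim(mtcars)
print(dims)

# b) Cars with the largest and the smallest gross horsepower
car_max_hp <- rownames(mtcars)[which.max(mtcars$hp)]
car_min_hp <- rownames(mtcars)[which.min(mtcars$hp)]
print(c(largest = car_max_hp, least = car_min_hp))

# c) Moment-based sample skewness (hypothetical helper, not a base R function)
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}
skew_values <- sapply(mtcars, skewness)
# Shapiro-Wilk p-values; small values suggest a departure from normality
shapiro_p <- sapply(mtcars, function(x) shapiro.test(x)$p.value)
print(round(skew_values, 2))
print(round(shapiro_p, 4))

# d) Mean and standard deviation of hp for each cylinder count
hp_mean <- tapply(mtcars$hp, mtcars$cyl, mean)
hp_sd <- tapply(mtcars$hp, mtcars$cyl, sd)
mean_diff <- unname(hp_mean["6"] - hp_mean["4"])
sd_diff <- unname(hp_sd["6"] - hp_sd["4"])
print(c(mean_diff = mean_diff, sd_diff = sd_diff))
```

A positive skewness value indicates a longer right tail, a negative value a longer left tail. Histograms and density curves for each variable can be drawn with hist() and lines(density()) as in experiment 8.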
11. Demonstrate the progression of salary with years of experience using a suitable data set (you can create your own dataset). Plot the graph visualizing the best fit line on the plot of the given data points. Plot a curve of Actual Values vs. Predicted Values to show their correlation and the performance of the model. Interpret the meaning of the slope and y-intercept of the line with respect to the given data. Implement using the lm function. Save the graphs and coefficients in files. Attach the predicted values of salaries as a new column to the original data set and save the data as a new CSV file.
Code:
# Step 1: Create a dataset
years_of_experience <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
salaries <- c(30000, 35000, 40000, 45000, 50000, 55000, 60000, 65000, 70000, 75000)
data <- data.frame(Experience = years_of_experience, Salary = salaries)
# Step 2: Plot the data points
plot(data$Experience, data$Salary, main = "Salary vs. Years of Experience",
     xlab = "Years of Experience", ylab = "Salary", pch = 16, col = "blue")
# Step 3: Fit a linear regression model
model <- lm(Salary ~ Experience, data = data)
# Step 4: Add the best fit line to the plot
abline(model, col = "red")
# Step 5: Predict values using the model
predicted_values <- predict(model)
# Step 6: Plot actual vs predicted values
plot(data$Salary, predicted_values, main = "Actual vs. Predicted Values",
     xlab = "Actual Salary", ylab = "Predicted Salary", col = "blue", pch = 16)
abline(0, 1, col = "red") # Add a diagonal line for reference
# Step 7: Interpret coefficients
slope <- coef(model)[2]
intercept <- coef(model)[1]
cat("Slope:", slope, "\n")
cat("Y-intercept:", intercept, "\n")
# Step 8: Save the graphs
png("Salary_vs_Experience.png")
plot(data$Experience, data$Salary, main = "Salary vs. Years of Experience",
     xlab = "Years of Experience", ylab = "Salary", pch = 16, col = "blue")
abline(model, col = "red")
dev.off()
png("Actual_vs_Predicted.png")
plot(data$Salary, predicted_values, main = "Actual vs. Predicted Values",
     xlab = "Actual Salary", ylab = "Predicted Salary", col = "blue", pch = 16)
abline(0, 1, col = "red")
dev.off()
# Step 9: Attach predicted values as a new column to the original dataset
data$Predicted_Salary <- predicted_values
# Step 10: Save the dataset as a new CSV file
write.csv(data, file = "new_dataset.csv", row.names = FALSE)
print(data)
Output:
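The task also asks for the coefficients to be saved in a file and interpreted. A possible sketch follows; the file name coefficients.csv and the use of a temporary directory are assumptions made for the example. With this sample data the salaries lie exactly on a straight line, so the slope is 5000 (each additional year of experience adds 5000 to the predicted salary) and the y-intercept is 25000 (the predicted salary at zero years of experience).

```r
# Same sample data as in the program above
data <- data.frame(Experience = 1:10, Salary = seq(30000, 75000, by = 5000))
model <- lm(Salary ~ Experience, data = data)

# coef() returns a named vector: "(Intercept)" and "Experience"
coefs <- coef(model)
coef_table <- data.frame(Term = names(coefs), Estimate = unname(coefs))
print(coef_table)

# Hypothetical output location for the coefficient table
coef_path <- file.path(tempdir(), "coefficients.csv")
write.csv(coef_table, coef_path, row.names = FALSE)
```

Real salary data would not fit perfectly, in which case the slope is the average change in salary per extra year, and the intercept is meaningful only if zero years of experience is within or near the observed range.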