
Dear TFTH, I have a final exam on 20/06/2024, and I need an expert who has completed order No XXXXXXXXXX. Please provide everything I want as a Word file. I mentioned it in the file.

Dear Participants,
Please find below the Time Series Forecasting Project instructions:
· You have to submit 2 files:
1. Answer Report: In this, you need to submit all the answers to all the questions in a sequential manner. It should include the detailed explanation of the approach used, insights, inferences, all outputs of codes like graphs, tables etc. Your report should not be filled with codes. You will be evaluated based on the business report.
Note: In the business report, there should be a proper interpretation of all the tasks performed along with actionable insights. Only the presence of interpretation of the models is not sufficient to be eligible for full marks in each of the criteria mentioned in the rubric. Marks will be deducted wherever inferences are not clearly mentioned.
2. Jupyter Notebook file: This is a must and will be used for reference while evaluating.
Any assignment found copied/ plagiarized with another person will not be graded and marked as zero. Please ensure timely submission as a post-deadline assignment will not be accepted.
Problem 1 for the Data Set : Shoesales.csv
You are an analyst in the IJK shoe company and you are expected to forecast the sales of the pairs of shoes for the upcoming 12 months from where the data ends. The data for the pair of shoe sales have been given to you from January 1980 to July 1995.
Problem 2 for the Data Set SoftDrink.csv:
You are an analyst in the RST soft drink company and you are expected to forecast the sales of the production of the soft drink for the upcoming 12 months from where the data ends. The data for the production of soft drink has been given to you from January 1980 to July 1995.
Please do perform the following questions on each of these two data sets separately.
1. Read the data as an appropriate Time Series data and plot the data.
2. Perform appropriate Exploratory Data Analysis to understand the data and also perform decomposition.
3. Split the data into training and test. The test data should start in 1991.
4. Build various exponential smoothing models on the training data and evaluate the model using RMSE on the test data.
Other models such as regression, naïve forecast models, simple average models etc. should also be built on the training data, and their performance should be checked on the test data using RMSE.
5. Check for the stationarity of the data on which the model is being built using appropriate statistical tests and also mention the hypothesis for the statistical test. If the data is found to be non-stationary, take appropriate steps to make it stationary. Check the new data for stationarity and comment.
Note: Stationarity should be checked at alpha = 0.05.
6. Build an automated version of the ARIMA/SARIMA model in which the parameters are selected using the lowest Akaike Information Criteria (AIC) on the training data and evaluate this model on the test data using RMSE.
7. Build ARIMA/SARIMA models based on the cut-off points of ACF and PACF on the training data and evaluate this model on the test data using RMSE.
8. Build a table with all the models built along with their corresponding parameters and the respective RMSE values on the test data.
9. Based on the model-building exercise, build the most optimum model(s) on the complete data and predict 12 months into the future with appropriate confidence intervals/bands.
10. Comment on the model thus built and report your findings and suggest the measures that the company should be taking for future sales.
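As a rough illustration of steps 3 and 4, the following R sketch splits a monthly series by calendar time and compares two baseline forecasts by test RMSE. It is only a sketch under stated assumptions: `AirPassengers` (a series that ships with R) stands in for the assignment data, the cut-off year 1959 stands in for 1991, and the helper `rmse` is a hypothetical name introduced here.

```r
# Stand-in monthly series shipped with R (1949-01 to 1960-12).
# It is NOT the assignment data; swap in the series read from the CSV file.
series <- AirPassengers

# Split by calendar time (the assignment's cut-off would be 1991; 1959 is used here)
train <- window(series, end = c(1958, 12))
test  <- window(series, start = c(1959, 1))

# Root mean squared error between actual and forecast values (hypothetical helper)
rmse <- function(actual, forecast) sqrt(mean((actual - forecast)^2))

# Naive forecast: repeat the last training observation over the test horizon
naive_forecast <- rep(as.numeric(tail(train, 1)), length(test))

# Simple average forecast: repeat the training mean over the test horizon
mean_forecast <- rep(mean(train), length(test))

# Collect the test RMSE of each baseline in one table
results <- data.frame(
  model = c("Naive", "Simple average"),
  test_rmse = c(rmse(as.numeric(test), naive_forecast),
                rmse(as.numeric(test), mean_forecast))
)
results
```

The same `results` data frame can be extended with one row per additional model, which is the shape step 8 asks for.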
Extended Project - Time Series Forecasting Project
Criteria [Pts]
1. Read the data as an appropriate Time Series data and plot the data. [2.0 pts]
2. Perform appropriate Exploratory Data Analysis to understand the data and also perform decomposition. [5.0 pts]
3. Split the data into training and test. The test data should start in 1991. [2.0 pts]
4. Build various exponential smoothing models on the training data and evaluate the model using RMSE on the test data. Other models such as regression, naïve forecast models, simple average models etc. should also be built on the training data and check the performance on the test data using RMSE. (Please do try to build as many models as possible and as many iterations of models as possible with different parameters.) [16.0 pts]
5. Check for the stationarity of the data on which the model is being built on using appropriate statistical tests and also mention the hypothesis for the statistical test. If the data is found to be non-stationary, take appropriate steps to make it stationary. Check the new data for stationarity and comment. Note: Stationarity should be checked at alpha = 0.05. [3.0 pts]
6. Build an automated version of the ARIMA/SARIMA model in which the parameters are selected using the lowest Akaike Information Criteria (AIC) on the training data and evaluate this model on the test data using RMSE. [8.0 pts]
7. Build ARIMA/SARIMA models based on the cut-off points of ACF and PACF on the training data and evaluate this model on the test data using RMSE. [8.0 pts]
8. Build a table (create a data frame) with all the models built along with their corresponding parameters and the respective RMSE values on the test data. [2.0 pts]
9. Based on the model-building exercise, build the most optimum model(s) on the complete data and predict 12 months into the future with appropriate confidence intervals/bands. [3.0 pts]
10. Comment on the model thus built and report your findings and suggest the measures that the company should be taking for future sales. (Please explain and summarise the various steps performed in this project. There should be proper business interpretation and actionable insights present.) [5.0 pts]
Please reflect on all that you learnt and fill this reflection report. You have to copy the link and paste it on the URL bar of your respective browser. https://docs.google.com/forms/d/e/1FAIpQLSeBxE1cfP7ugyx8sa1JFGg_Nkv-jlEztsszbc9US911oWo2KQ/viewform [0.0 pts]
Quality of Business Report (Please refer to the Evaluation Guidelines for Business report checklist. Marks in this criteria are at the moderator's discretion) [6.0 pts]
Total Points: 60.0
All the very best!
Regards,
Program Office
Answered 1 days After Jun 18, 2024

Solution

Baljit answered on Jun 20 2024
12 Votes
Student Surname: Zergani
Student Firstname: Bobby
Student ID: 22114223
Subject Name: Thinking About Data
Subject Code: COMP1014
Loading of exam data
# Loading of data
library(readr)
exam_data <- read_csv("exam2024.csv", show_col_types = FALSE)
head(exam_data, 5)
## # A tibble: 5 × 7
##   financial_year state area             source amount source_group area_category
## 1           1997 NSW   Administration   Austr…    315 Government   Other area
## 2           1997 NSW   Administration   State…    120 Non-Governm… Other area
## 3           1997 NSW   Administration   Priva…    314 Non-Governm… Other area
## 4           1997 NSW   Aids and applia… Austr…     65 Government   Other area
## 5           1997 NSW   Aids and applia… Indiv…    168 Non-Governm… Other area
Q1(a):
We will use the Chi-Square Test of Independence for this question.
The hypotheses for the test are:
• Null Hypothesis (H0): The area category of funds is independent of the source of funding group.
• Alternative Hypothesis (H1): The area category of funds is not independent of the source of funding group.
Let's perform the test using R:
# contingency table for the source funding and area category
contingency_table <- table(exam_data$source_group, exam_data$area_category)
print(contingency_table)
##
##                  Dental services Medical services Other area Research
##   Government                 120              120       1440      120
##   Non-Government             479              360       3760      257
# Chi-Square Test of Independence
chi_square_test <- chisq.test(contingency_table)

# results
print(chi_square_test)
##
## Pearson's Chi-squared test
##
## data: contingency_table
## X-squared = 21.423, df = 3, p-value = 8.599e-05
# the p-value
print(chi_square_test$p.value)
## [1] 8.598895e-05
In the result above, the test statistic is 21.423 and the p-value is 8.599e-05, which is lower than the significance level of 0.05. We therefore reject the null hypothesis: there is significant evidence that the area category of funds is not independent of the source of funding group.
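As a cross-check, the statistic can be recomputed by hand from the counts in the contingency table printed above. This is a minimal sketch, not the author's method: the matrix `observed` is typed in from that printed table, and the variable names are introduced here for illustration.

```r
# Counts copied from the contingency table printed above
observed <- matrix(
  c(120, 120, 1440, 120,
    479, 360, 3760, 257),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    source_group  = c("Government", "Non-Government"),
    area_category = c("Dental services", "Medical services", "Other area", "Research")
  )
)

# Expected counts under independence: (row total * column total) / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson statistic and its p-value on (rows - 1) * (cols - 1) degrees of freedom
statistic <- sum((observed - expected)^2 / expected)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(statistic, df = df, lower.tail = FALSE)

round(expected, 1)
round(statistic, 3)  # 21.423, matching the chisq.test output above
p_value              # below 0.05, so the null hypothesis of independence is rejected
```

Comparing `observed` with `expected` also shows where the departure from independence comes from: the Government row has fewer Dental services records and more Research records than independence would predict.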
Q1(b):
Independence means that one variable has no effect on the other. The two operate independently, like two separate events that don't affect each other's outcomes. For example, flipping a coin doesn't change the chances of rolling a specific number on a die. Each event happens without influencing the other.
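The coin-and-die example can be checked numerically: for independent events, the probability of both happening equals the product of their individual probabilities. The sketch below assumes a fair coin and a fair six-sided die; the names `outcomes`, `p_heads`, `p_four` and `p_both` are introduced here for illustration.

```r
# Sample space for one fair coin flip and one fair die roll:
# 2 * 6 = 12 equally likely outcomes
outcomes <- expand.grid(coin = c("H", "T"), die = 1:6, stringsAsFactors = FALSE)
outcomes$prob <- 1 / nrow(outcomes)

# Marginal probabilities of "heads" and of "rolling a 4"
p_heads <- sum(outcomes$prob[outcomes$coin == "H"])
p_four  <- sum(outcomes$prob[outcomes$die == 4])

# Joint probability of "heads and a 4"
p_both <- sum(outcomes$prob[outcomes$coin == "H" & outcomes$die == 4])

# Independence: the joint probability equals the product of the marginals
all.equal(p_both, p_heads * p_four)
```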
Q2(a) :
We will use a two-sample t-test for this question.
The hypotheses for the test are:
• Null Hypothesis (H0): The mean expenditure for “Public hospitals” is the same for both “Australian Government” and “Individuals”.
• Alternative Hypothesis (H1): The mean expenditure for “Public hospitals” differs between “Australian Government” and “Individuals”.
Let's perform the test using R:
# Required libraries
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)

# Subset the data for Public hospitals and Australian Government / Individuals
public_hospitals_data <- exam_data %>%
  filter(area == "Public hospitals" & source %in% c("Australian Government", "Individuals"))

# Plot of data
ggplot(public_hospitals_data, aes(x = source, y = amount, fill = source)) +
  geom_boxplot() +
  labs(x = "Funding Source", y = "Amount of Expenditure",
       title = "Expenditure for Public Hospitals by Funding Source") +
  theme_minimal()
The box plot shows clear differences in expenditure patterns between the “Australian Government” and “Individuals” funding sources for “Public hospitals”. These findings suggest that the source of funding plays a crucial role in determining the distribution and variability of expenditures in the healthcare sector, potentially influencing resource allocation and healthcare service provisions...
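A box plot alone does not test the hypotheses stated in Q2(a). The test itself is not shown in this excerpt, but one way it could be run is sketched below. Everything here is an assumption: the data frame `toy` holds simulated placeholder amounts (not values from exam2024.csv), and Welch's unequal-variance version is used because it is the default of R's `t.test`.

```r
# Simulated placeholder amounts; NOT values from exam2024.csv
set.seed(1)
toy <- data.frame(
  source = rep(c("Australian Government", "Individuals"), each = 30),
  amount = c(rnorm(30, mean = 500, sd = 80), rnorm(30, mean = 450, sd = 60))
)

# Welch two-sample t-test (var.equal = FALSE is the default); H0: equal group means
welch <- t.test(amount ~ source, data = toy)

welch$statistic  # t statistic
welch$parameter  # Welch-adjusted degrees of freedom
welch$p.value    # reject H0 at the 5% level if this is below 0.05
```

With the real data, the same call would take `public_hospitals_data` in place of `toy`.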