University | Singapore University of Social Sciences (SUSS) |
Subject | ANL252: Python for Data Analytics |
Question 1
Consider the following data, which contain 20 rows and 3 columns: X1, X2, and Y.
X1 | X2 | Y |
4 | 0.2 | 1.16 |
6 | 0.1 | 0.06 |
8 | 0.3 | -1.79 |
4 | 0.6 | 1.55 |
10 | 0.1 | -4.88 |
1 | 0.4 | 1.37 |
9 | 0.6 | -1.25 |
5 | 0.3 | -1.1 |
2 | 0.5 | 3.23 |
7 | 0.5 | -2.71 |
8 | 0.1 | -0.99 |
2 | 0.9 | 3.23 |
2 | 0.8 | 4.55 |
8 | 1 | 2.7 |
7 | 0.9 | -1.13 |
9 | 0.1 | -0.88 |
1 | 0.2 | 2.08 |
4 | 0.2 | 1.62 |
6 | 0.7 | -0.9 |
9 | 0.7 | 0.46 |
(a) Construct a Python program to store the above data in a NumPy array.
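One possible way to approach part (a), as a minimal sketch: the table is typed in row by row as a nested list and passed to `np.array`. The variable name `data` is an assumption made here for illustration.

```python
import numpy as np

# Each inner list is one row of the table, in the order [X1, X2, Y].
data = np.array([
    [4, 0.2, 1.16],
    [6, 0.1, 0.06],
    [8, 0.3, -1.79],
    [4, 0.6, 1.55],
    [10, 0.1, -4.88],
    [1, 0.4, 1.37],
    [9, 0.6, -1.25],
    [5, 0.3, -1.1],
    [2, 0.5, 3.23],
    [7, 0.5, -2.71],
    [8, 0.1, -0.99],
    [2, 0.9, 3.23],
    [2, 0.8, 4.55],
    [8, 1, 2.7],
    [7, 0.9, -1.13],
    [9, 0.1, -0.88],
    [1, 0.2, 2.08],
    [4, 0.2, 1.62],
    [6, 0.7, -0.9],
    [9, 0.7, 0.46],
])

print(data.shape)  # (20, 3)
print(data.dtype)  # float64, because integers and floats are mixed
```

Because the rows mix integers and decimals, NumPy stores the whole array as floating point, which is convenient for the arithmetic in the later parts.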
(b) Suppose a linear regression was fitted on these data. The estimated model is
$\hat{Y} = 2 - 0.5 X_1 + 2.5 X_2$,
where Ŷ is the predicted (or expected) value of Y, and X1 and X2 are the observed values of the columns X1 and X2. Design a Python program to compute Ŷ for every row of the array and store the results in a separate NumPy array.
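A sketch of how part (b) might be done with vectorised NumPy arithmetic, assuming the model is Ŷ = 2 - 0.5·X1 + 2.5·X2. The names `x1`, `x2`, `y`, `data`, and `y_hat` are illustrative choices, and the array is rebuilt here from column lists only so that the block runs on its own.

```python
import numpy as np

# Columns of the table, in row order.
x1 = [4, 6, 8, 4, 10, 1, 9, 5, 2, 7, 8, 2, 2, 8, 7, 9, 1, 4, 6, 9]
x2 = [0.2, 0.1, 0.3, 0.6, 0.1, 0.4, 0.6, 0.3, 0.5, 0.5,
      0.1, 0.9, 0.8, 1, 0.9, 0.1, 0.2, 0.2, 0.7, 0.7]
y = [1.16, 0.06, -1.79, 1.55, -4.88, 1.37, -1.25, -1.1, 3.23, -2.71,
     -0.99, 3.23, 4.55, 2.7, -1.13, -0.88, 2.08, 1.62, -0.9, 0.46]
data = np.column_stack([x1, x2, y])  # shape (20, 3)

# Evaluate the fitted model for all rows at once:
# column 0 holds X1 and column 1 holds X2.
y_hat = 2 - 0.5 * data[:, 0] + 2.5 * data[:, 1]

print(y_hat.shape)  # (20,)
print(y_hat)
```

Slicing whole columns avoids an explicit loop over the rows; a `for` loop that applies the same formula row by row would give the same values.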
(c) The residuals of the model, ê, are calculated by:
$\hat{e} = Y - \hat{Y}$,
where Y is the actual value stored in the original NumPy array and Ŷ is the predicted value of Y computed in (b). Use a Python program to compute ê for every row of the array and store the results in a separate NumPy array.
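A minimal sketch for part (c), assuming the residual is the actual value minus the predicted value and that the model is Ŷ = 2 - 0.5·X1 + 2.5·X2. The names `data`, `y_hat`, and `residuals` are illustrative, and the data are repeated so the block runs on its own.

```python
import numpy as np

# Columns of the table, in row order.
x1 = [4, 6, 8, 4, 10, 1, 9, 5, 2, 7, 8, 2, 2, 8, 7, 9, 1, 4, 6, 9]
x2 = [0.2, 0.1, 0.3, 0.6, 0.1, 0.4, 0.6, 0.3, 0.5, 0.5,
      0.1, 0.9, 0.8, 1, 0.9, 0.1, 0.2, 0.2, 0.7, 0.7]
y = [1.16, 0.06, -1.79, 1.55, -4.88, 1.37, -1.25, -1.1, 3.23, -2.71,
     -0.99, 3.23, 4.55, 2.7, -1.13, -0.88, 2.08, 1.62, -0.9, 0.46]
data = np.column_stack([x1, x2, y])

# Predicted values from the fitted model.
y_hat = 2 - 0.5 * data[:, 0] + 2.5 * data[:, 1]

# Residual = actual Y (column 2) minus predicted Y, element by element.
residuals = data[:, 2] - y_hat

print(residuals)
print("mean residual:", residuals.mean())
```

Printing the mean alongside the array gives a quick numerical check of the zero-mean part of the assumption before any chart is drawn.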
(d) One of the main assumptions of linear regression is that the residuals are normally distributed with zero mean and constant variance. Create a histogram of the residuals using the matplotlib package. Adjust the parameters of the chart so that the ticks on the x-axis can be read clearly, a title is given to the chart, and both axes are labeled. Finally, discuss whether you agree, based on this histogram, that the assumption of normality with zero mean is valid (checking the constant variance is not required here).
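The histogram in part (d) could be drawn along the following lines. This is a sketch only: the number of bins (8), the figure size, the label texts, and the output file name `residual_histogram.png` are arbitrary choices made here, not requirements of the question.

```python
import matplotlib.pyplot as plt
import numpy as np

# Rebuild the residuals so that the block runs on its own.
x1 = np.array([4, 6, 8, 4, 10, 1, 9, 5, 2, 7, 8, 2, 2, 8, 7, 9, 1, 4, 6, 9])
x2 = np.array([0.2, 0.1, 0.3, 0.6, 0.1, 0.4, 0.6, 0.3, 0.5, 0.5,
               0.1, 0.9, 0.8, 1, 0.9, 0.1, 0.2, 0.2, 0.7, 0.7])
y = np.array([1.16, 0.06, -1.79, 1.55, -4.88, 1.37, -1.25, -1.1, 3.23, -2.71,
              -0.99, 3.23, 4.55, 2.7, -1.13, -0.88, 2.08, 1.62, -0.9, 0.46])
residuals = y - (2 - 0.5 * x1 + 2.5 * x2)

fig, ax = plt.subplots(figsize=(8, 5))
counts, edges, _ = ax.hist(residuals, bins=8, edgecolor="black")

# Put one tick on every bin edge, rounded and rotated so they stay readable.
ax.set_xticks(edges)
ax.set_xticklabels([f"{e:.2f}" for e in edges], rotation=45)

ax.set_title("Histogram of residuals")
ax.set_xlabel("Residual (Y - predicted Y)")
ax.set_ylabel("Frequency")
fig.tight_layout()
fig.savefig("residual_histogram.png")  # or plt.show() in an interactive session
```

When reading the chart, the points to look for are whether the bars are roughly symmetric, bell-shaped, and centered near zero. With only 20 observations the shape depends noticeably on the bin count, so it can be worth trying a few values of `bins`.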
(e) The constant variance assumption can be checked by a scatter plot in which the x-axis represents the predicted values Ŷ and the y-axis represents the residuals ê. If the scatter plot does not show any pattern and the data points are spread at more or less the same level, the constant variance assumption can be considered valid. Write a Python program to create such a scatter plot for checking the constant variance assumption. Adjust the parameters of the chart so that the ticks on both axes can be read clearly, a title is given to the chart, and both axes are labeled. Finally, discuss whether you agree that the constant variance assumption is valid based on this scatter plot.
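A residuals-versus-predicted scatter plot for part (e) might be sketched as follows. The dashed reference line at zero, the tick label size, the label texts, and the file name `residuals_vs_predicted.png` are illustrative assumptions rather than part of the question.

```python
import matplotlib.pyplot as plt
import numpy as np

# Rebuild predictions and residuals so that the block runs on its own.
x1 = np.array([4, 6, 8, 4, 10, 1, 9, 5, 2, 7, 8, 2, 2, 8, 7, 9, 1, 4, 6, 9])
x2 = np.array([0.2, 0.1, 0.3, 0.6, 0.1, 0.4, 0.6, 0.3, 0.5, 0.5,
               0.1, 0.9, 0.8, 1, 0.9, 0.1, 0.2, 0.2, 0.7, 0.7])
y = np.array([1.16, 0.06, -1.79, 1.55, -4.88, 1.37, -1.25, -1.1, 3.23, -2.71,
              -0.99, 3.23, 4.55, 2.7, -1.13, -0.88, 2.08, 1.62, -0.9, 0.46])
y_hat = 2 - 0.5 * x1 + 2.5 * x2
residuals = y - y_hat

fig, ax = plt.subplots(figsize=(8, 5))
points = ax.scatter(y_hat, residuals)

# A horizontal line at zero makes it easier to judge the vertical spread.
ax.axhline(0, color="grey", linestyle="--", linewidth=1)

ax.tick_params(axis="both", labelsize=10)
ax.set_title("Residuals versus predicted values")
ax.set_xlabel("Predicted value of Y")
ax.set_ylabel("Residual (Y - predicted Y)")
fig.tight_layout()
fig.savefig("residuals_vs_predicted.png")  # or plt.show() in an interactive session
```

A funnel shape, in which the vertical spread widens or narrows as the predicted value grows, would suggest non-constant variance; a band of roughly even height around the zero line would support the assumption.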
Question 2
The data of 19 students in a secondary school class are stored in a .csv data file named "class.csv". Gender, age, height, and weight are the features of the students that have been recorded. Employ your Python programming skills to carry out the tasks below.
Include your Python program code in the answers and show it in the "Consolas" or "Courier New" font (size 12). Take a screenshot of the program output if required.
(a) Prepare a Python program to read in the data from the .csv text file and convert them into a pandas DataFrame. Check for existing missing data in the dataset and adjust the reader accordingly.
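The contents of class.csv are not shown, so the sketch below reads a small, made-up stand-in from a string instead. The column names (`Gender`, `Age`, `Height`, `Weight`), the sample values, and the use of `.` and blank cells as missing-value codes are all assumptions for illustration; with the real file, `io.StringIO(csv_text)` would be replaced by the path `"class.csv"` and `na_values` by whatever codes the file actually uses.

```python
import io

import pandas as pd

# Hypothetical stand-in for class.csv (values and missing codes are invented).
csv_text = """Gender,Age,Height,Weight
M,14,69.0,112.5
F,13,56.5,84.0
F,13,.,98.0
F,14,62.8,102.5
M,,63.5,102.5
,12,57.3,83.0
M,12,59.8,.
"""

# First pass: read every cell as text to see how missing entries are coded.
raw = pd.read_csv(io.StringIO(csv_text), dtype=str, keep_default_na=False)
for col in raw.columns:
    print(col, sorted(raw[col].unique()))

# Second pass: tell the reader which extra codes mean "missing".
# Blank cells are already treated as missing by default.
df = pd.read_csv(io.StringIO(csv_text), na_values=["."])

print(df.dtypes)
print(df.isna().sum())
```

Reading everything as text first makes unusual placeholders visible; without `na_values`, a column containing `.` would be loaded as text rather than as numbers.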
(b) The data should be sorted by the age of the students in descending order and then by their gender in ascending order. Employ the corresponding Python syntax to carry out this task.
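Sorting on two keys with different directions can be sketched with `DataFrame.sort_values`, which accepts a list of columns and a matching list of `ascending` flags. The small DataFrame and its column names are made up here so that the block runs without class.csv.

```python
import pandas as pd

# Hypothetical sample rows standing in for the class data.
df = pd.DataFrame({
    "Gender": ["M", "F", "F", "M", "F"],
    "Age": [12, 14, 13, 14, 13],
    "Height": [59.8, 62.8, 56.5, 69.0, 65.3],
    "Weight": [99.5, 102.5, 84.0, 112.5, 98.0],
})

# Age descending first, then Gender ascending within each age.
df_sorted = df.sort_values(by=["Age", "Gender"], ascending=[False, True])

print(df_sorted)
```

By default `sort_values` returns a new DataFrame and places rows with missing sort keys last; adding `.reset_index(drop=True)` would renumber the rows after sorting if that is preferred.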
(c) Identify the location of the missing values in the DataFrame. Report the rows and columns where the missing data are found.
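One way to report where the missing values sit is to build a Boolean mask with `isna()` and convert its `True` positions into row and column labels. The sample DataFrame and its column names below are hypothetical stand-ins for the class data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with a few missing entries.
df = pd.DataFrame({
    "Gender": ["M", None, "F", "F"],
    "Age": [14, 13, np.nan, 12],
    "Height": [69.0, 56.5, 62.8, np.nan],
    "Weight": [112.5, 84.0, 102.5, 83.0],
})

mask = df.isna()  # True wherever a value is missing

# Positions of the True cells, then translated into index and column labels.
row_pos, col_pos = np.where(mask.to_numpy())
missing_rows = [df.index[r] for r in row_pos]
missing_cols = [df.columns[c] for c in col_pos]

for r, c in zip(missing_rows, missing_cols):
    print(f"row {r}, column {c}")

# Summary: number of missing values per column.
print(mask.sum())
```

The per-column counts from `mask.sum()` give a quick overview, while the row and column pairs answer the question of exactly which cells are affected.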
(d) If missing values are detected in the DataFrame, they have to be treated according to the columns they belong to. Here are the instructions on how to deal with the missing data in each column:
Gender: replace missing values with the most frequent gender
Age: replace missing values with the median age
Height: replace missing values with the mean height
Weight: replace missing values with the mean weight
Design your own Python program to determine the corresponding statistics for each column to replace the missing values in it.
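The replacement rules above can be sketched as follows: compute one statistic per column, then pass them to `fillna` as a dictionary keyed by column name. The sample DataFrame and column names are hypothetical, and using the built-in pandas methods `mode`, `median`, and `mean` is one interpretation of the task; the statistics could also be computed by hand.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one missing value in each column.
df = pd.DataFrame({
    "Gender": ["M", "F", "F", None, "F"],
    "Age": [14, 13, np.nan, 12, 15],
    "Height": [69.0, 56.5, 62.5, np.nan, 60.0],
    "Weight": [112.5, 84.0, np.nan, 83.0, 100.5],
})

# Statistics are computed from the non-missing values only
# (pandas skips NaN by default).
fill_values = {
    "Gender": df["Gender"].mode()[0],  # most frequent gender
    "Age": df["Age"].median(),         # median age
    "Height": df["Height"].mean(),     # mean height
    "Weight": df["Weight"].mean(),     # mean weight
}
print(fill_values)

# Replace the missing entries column by column.
df_filled = df.fillna(fill_values)
print(df_filled)
```

If two genders are tied for the highest frequency, `mode()` returns both in sorted order and `[0]` picks the first, so a tie would need a deliberate decision.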