CVE2112 HDB Resale Price Analysis Assignment: Data Visualization & Summary Stats in R Program Project, Singapore

University: Singapore Institute of Technology (SIT)
Subject: CVE2112 Data Analysis

Project Description

This is an individual project that applies descriptive statistics and data visualization techniques to analyse resale prices of HDB (public housing) flats in Singapore from 2017 to 2025. Students will use real-world data sourced from data.gov.sg (Singapore’s official public data repository).

The objective is to develop proficiency in preparing, processing, analyzing, and visualizing data using RStudio. Students will also be encouraged to use Generative AI (GenAI) tools (such as ChatGPT, Copilot, Gemini) to aid in R programming tasks, including data wrangling and plotting.

Learning Outcome

By completing this continuous assessment (CA), students will be able to:

  • Perform data cleaning and transformation on real-world datasets using R.
  • Apply descriptive statistics to summarize housing market data.
  • Generate meaningful visual representations of data trends and distributions.
  • Interpret data analysis results and draw relevant conclusions.
  • Utilize GenAI tools to assist in the development of data analysis code and visualizations.

Dataset Overview

The dataset used in this project was downloaded from data.gov.sg, Singapore’s official open data repository. The dataset contains records of HDB resale flat prices based on the registration date of the transactions, covering the period from January 2017 to April 2025.

The dataset “ResaleHdbFlat2017to2025.csv” is provided in CSV (Comma-Separated Values) format. It can be opened using spreadsheet software such as Microsoft Excel, which allows you to view the tabular structure of the data before importing it into RStudio for analysis. An example of how the dataset appears in Excel is shown below.

The dataset contains 11 columns, each representing a specific attribute of the HDB resale transaction:

  1. Month – The month and year of the transaction (formatted as YYYY-MM).
  2. Town – The town or estate where the flat is located.
  3. Flat type – The classification of the flat based on size and layout (e.g., 3 ROOM, 4 ROOM).
  4. Block – The block number of the HDB flat.
  5. Street name – The name of the street where the flat is located.
  6. Storey range – The floor level range of the unit (e.g., 10 TO 12).
  7. Floor area (sqm) – The size of the flat in square metres.
  8. Flat model – The design model of the flat (e.g., New Generation, Model A).
  9. Lease commence date – The year the 99-year lease started.
  10. Remaining lease – The number of years left on the lease at the time of resale.
  11. Resale price – The transaction price in Singapore dollars.

For this project, students are required to analyze only the transactions from the town where they currently reside. This ensures that each student conducts a focused and personalized analysis based on a familiar location.

Students will load the dataset into RStudio, filter it to include only their selected town, and then carry out data cleaning, statistical analysis, and visualization tasks as outlined in the assignment instructions.

SEE ANNEX A FOR THE LIST OF TOWN AND FLAT TYPE

To begin your analysis in RStudio, follow these simple steps to read the dataset and filter it by the name of your town.

# Load the necessary libraries
library(tidyverse)

# Read the CSV file into R (ensure the file is in your working directory)
hdb_data <- read_csv("ResaleHdbFlat2017to2025.csv")

# Replace "YOUR TOWN" with the name of your town in CAPITAL LETTERS (e.g., "TAMPINES")
selected_town <- "YOUR TOWN"

# Filter the dataset to include only transactions from your town
town_data <- hdb_data %>%
  filter(town == selected_town)

# View the first few rows of the data
head(town_data)
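
As a possible next step, the sketch below shows one way to convert the YYYY-MM month field into a proper date and to check for missing values. It runs on a few dummy rows rather than the real file, and it assumes the columns are named month, flat_type and resale_price; adjust the names if your file differs. Dropping rows with a missing price is only one option, not a required approach.

```r
library(dplyr)

# Dummy rows for illustration only (not real transactions).
# Column names are an assumption about the CSV headers.
demo <- data.frame(
  month        = c("2017-01", "2017-02", "2025-04"),
  flat_type    = c("3 ROOM", "4 ROOM", "4 ROOM"),
  resale_price = c(300000, 420000, NA)
)

prepared <- demo %>%
  mutate(
    # Append a day so that as.Date() can parse the YYYY-MM string
    month_date = as.Date(paste0(month, "-01")),
    # Extract the calendar year for later yearly summaries
    year       = as.integer(format(month_date, "%Y"))
  )

# Count missing values per column before deciding how to handle them
missing_counts <- colSums(is.na(prepared))

# One simple option: drop rows that have no resale price
cleaned <- prepared %>%
  filter(!is.na(resale_price))
```

The same mutate() and filter() steps could then be applied to town_data instead of the dummy data frame.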

Tasks

Students are to focus their analysis on only one town—specifically, the town where they currently reside. They are expected to perform the tasks below and submit a structured data analysis report.

Task Descriptions

  1. DATA PREPARATION
    • Import and inspect the dataset.
    • Convert date fields to proper date/time format.
    • Filter the data to include only records from your residential town.
    • Handle any missing values or inconsistencies.
  2. SUMMARY STATISTICS
    Provide descriptive statistics for resale prices in your town, broken down by flat type (i.e., 1 ROOM, 2 ROOM, 3 ROOM, 4 ROOM, 5 ROOM, EXECUTIVE, MULTI-GENERATION):
    • Mean, median, minimum, maximum, Q1 and Q3
    • Standard deviation, interquartile range (IQR), coefficient of variation (cov)
    • Number of transactions per flat type

    Tabulate the above summary statistics in the following table format

    Town          Flat type*   Number of transactions   Mean   Median   SD   Min   Q1   Q3   Max   IQR   cov
    “YOUR TOWN”   1 ROOM
                  2 ROOM
                  3 ROOM
                  4 ROOM
                  5 ROOM
                  EXEC…
                  MULTI…

    * if applicable

  3. DATA VISUALISATIONS
    Create the following plots for your town, broken down by flat type, using the ggplot2 package:
    1. Histogram with KDE (Kernel Density Estimate): Visualize the distribution of resale prices, with a smooth density curve overlay.
    2. Cumulative Distribution Plot: Show the cumulative proportion of resale prices to understand pricing thresholds.
    3. Time Series Plot: Analyze how the average resale price changes from year to year.
    4. Pie Chart: Show the proportion of total transactions by flat type.
    5. Box Plot with Jittered Data Points: Compare the spread and central tendency of resale prices across flat types.
    6. Any other plots that you think may be useful (e.g., further filtered by remaining lease or storey range).

    Ensure all visualizations are clearly labelled and interpreted.

    Creativity is encouraged — feel free to enhance your visuals with themes, colors, or annotations to improve clarity.

    SEE ANNEX B FOR EXAMPLE OF DATA VISUALISATION PLOTS
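
The summary statistics in Task 2 could be computed along the following lines. This is a minimal sketch on dummy prices, assuming columns named flat_type and resale_price; the output column names are illustrative only. The coefficient of variation is taken here as the standard deviation divided by the mean, expressed as a ratio rather than a percentage.

```r
library(dplyr)

# Dummy prices for illustration only (not real transactions)
town_demo <- data.frame(
  flat_type    = c("3 ROOM", "3 ROOM", "3 ROOM", "4 ROOM", "4 ROOM", "4 ROOM"),
  resale_price = c(300000, 320000, 340000, 450000, 500000, 550000)
)

summary_table <- town_demo %>%
  group_by(flat_type) %>%
  summarise(
    n            = n(),                                      # number of transactions
    mean_price   = mean(resale_price),
    median_price = median(resale_price),
    sd_price     = sd(resale_price),
    min_price    = min(resale_price),
    q1           = quantile(resale_price, 0.25, names = FALSE),
    q3           = quantile(resale_price, 0.75, names = FALSE),
    max_price    = max(resale_price),
    iqr          = IQR(resale_price),
    cov          = sd_price / mean_price,                    # coefficient of variation
    .groups      = "drop"
  )

print(summary_table)
```

With real data that still contains missing prices, each function would also need na.rm = TRUE, or the missing rows would need to be removed beforehand.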

Report Submission Requirements

Students must submit a written report in PDF format that includes the following clearly labeled sections. The report should be concise, well-organized, and supported with tables, plots, and commentary. You are encouraged to apply creativity in your presentation and data visualization to effectively communicate your findings.

  1. Introduction
    • Briefly introduce the project and your selected town.
    • State the objective of the analysis (e.g., to explore resale price trends, flat type distributions, and lease effects in your town).
  2. Data Preparation
    • Describe the data cleaning and preprocessing steps you performed.
    • Mention any assumptions made or filters applied (e.g., filtering by town, handling missing values, converting date formats).
  3. Summary Statistics
    • Present summary statistics in a table.
    • Comment on price levels, variation, and trends.
  4. Data Visualisations
    • Include and describe each of the required plots.
    • Highlight key observations.
  5. Observations and Insights
    • Discuss any interesting patterns or anomalies you discovered.
    • Provide your interpretation of the data in the context of your town.
    • Consider factors like flat type, lease duration, and price variation.
  6. Use of GenAI in Learning
    • Describe how you used Generative AI tools to write or debug R code, understand statistical concepts, and generate visualizations.
    • Reflect on how these tools supported your learning.
  7. Conclusion
    • Summarize your key findings.
    • Reflect on what you learned about your town’s HDB resale market.
    • Mention any limitations and areas for further data exploration.

Guiding questions. Use these to help structure your analysis and reflection:

  1. What is the most common flat type in your town?
  2. How have resale prices changed over time?
  3. How does your town compare to national averages (if known or researched)?
  4. Are there specific flat models or storey ranges that are more common in your town?
  5. Does your town show signs of gentrification or aging infrastructure based on lease dates and prices?
  6. Which flat type has the highest average resale price?
  7. Are there any outliers or unusual trends in the data?
  8. How does the remaining lease affect resale prices?
  9. What challenges did you face during the analysis?
  10. How did GenAI tools help you in this project?
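
Several of the guiding questions (1, 2 and 6) can be answered with short grouped summaries. The sketch below uses dummy rows and assumes columns named month, flat_type and resale_price; it is one possible approach, not a prescribed method.

```r
library(dplyr)

# Dummy rows for illustration only (not real transactions)
demo <- data.frame(
  month        = c("2017-03", "2017-09", "2018-05", "2018-11", "2019-02"),
  flat_type    = c("3 ROOM", "4 ROOM", "4 ROOM", "4 ROOM", "5 ROOM"),
  resale_price = c(300000, 400000, 420000, 440000, 550000)
)

# Question 1: number of transactions per flat type, most common first
type_counts <- demo %>%
  count(flat_type, sort = TRUE)

# Question 2: average resale price per year (year taken from the YYYY-MM string)
yearly_mean <- demo %>%
  mutate(year = as.integer(substr(month, 1, 4))) %>%
  group_by(year) %>%
  summarise(mean_price = mean(resale_price), .groups = "drop")

# Question 6: flat type with the highest average resale price
top_type <- demo %>%
  group_by(flat_type) %>%
  summarise(mean_price = mean(resale_price), .groups = "drop") %>%
  slice_max(mean_price, n = 1)
```

The yearly_mean table is also a natural input for the time series plot, and type_counts for the pie chart.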

SEE ANNEX C FOR THE ASSESSMENT RUBRIC

Other submission instructions

Please submit a soft copy of your written report in PDF format, together with your RStudio script file (.R), to the LMS dropbox by Sunday, 1st June 2025, 23:59 hrs.

Plagiarism is strictly prohibited.

  • Submitting work that is copied from another student will result in a zero mark.

You are encouraged to discuss ideas and collaborate on learning, but:

  • Do not copy code or text from others
  • Your submission must reflect your own understanding and work

Use of Generative AI tools (e.g., ChatGPT, Copilot) is allowed, but you must:

  • Use them responsibly
  • Acknowledge how you used them in your report

ANNEX A – LIST OF TOWN AND FLAT TYPE (based on the csv file provided)

LIST OF TOWN

  1. ANG MO KIO
  2. BEDOK
  3. BISHAN
  4. BUKIT BATOK
  5. BUKIT MERAH
  6. BUKIT PANJANG
  7. BUKIT TIMAH
  8. CENTRAL AREA
  9. CHOA CHU KANG
  10. CLEMENTI
  11. GEYLANG
  12. HOUGANG
  13. JURONG EAST
  14. JURONG WEST
  15. KALLANG/WHAMPOA
  16. MARINE PARADE
  17. PASIR RIS
  18. PUNGGOL
  19. QUEENSTOWN
  20. SEMBAWANG
  21. SENGKANG
  22. SERANGOON
  23. TAMPINES
  24. TOA PAYOH
  25. WOODLANDS
  26. YISHUN

LIST OF FLAT TYPE

  1. 1 ROOM
  2. 2 ROOM
  3. 3 ROOM
  4. 4 ROOM
  5. 5 ROOM
  6. EXECUTIVE
  7. MULTI-GENERATION

ANNEX B – EXAMPLE OF DATA VISUALISATION PLOTS

The following are example data visualisation plots created using dummy data to illustrate the expected outputs. Students are encouraged to be creative — feel free to enhance your visuals with themes, color palettes, annotations, or layout adjustments to improve clarity and presentation.
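
For reference, three of the required plot types could be built with ggplot2 roughly as follows. This is a sketch on randomly generated dummy data, not real resale prices; the column names flat_type and resale_price, the bin count and the styling choices are all assumptions that you should adapt.

```r
library(ggplot2)

set.seed(1)  # reproducible dummy data, not real resale prices
plot_demo <- data.frame(
  flat_type    = rep(c("3 ROOM", "4 ROOM"), each = 200),
  resale_price = c(rnorm(200, 350000, 30000), rnorm(200, 500000, 50000))
)

# Histogram scaled to density so the KDE curve shares the same y-axis
hist_kde <- ggplot(plot_demo, aes(x = resale_price)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30,
                 fill = "grey80", colour = "white") +
  geom_density() +
  facet_wrap(~ flat_type, scales = "free_x") +
  labs(title = "Distribution of resale prices (dummy data)",
       x = "Resale price (SGD)", y = "Density")

# Empirical cumulative distribution of prices, one curve per flat type
cdf_plot <- ggplot(plot_demo, aes(x = resale_price, colour = flat_type)) +
  stat_ecdf() +
  labs(title = "Cumulative distribution of resale prices (dummy data)",
       x = "Resale price (SGD)", y = "Cumulative proportion",
       colour = "Flat type")

# Box plot with jittered points; outliers are hidden in the box layer
# because the jitter layer already draws every observation
box_plot <- ggplot(plot_demo, aes(x = flat_type, y = resale_price)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.2, alpha = 0.3) +
  labs(title = "Resale price by flat type (dummy data)",
       x = "Flat type", y = "Resale price (SGD)")
```

Calling print(hist_kde), print(cdf_plot) or print(box_plot) displays each plot in the RStudio Plots pane.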

ANNEX C – ASSESSMENT RUBRIC
