This is a personal project on Data Science working on NYC neighborhood housing analysis

tgh101/nyc_hsp_project

Housing Sales Prices & Venues Data Analysis of New York City

A. Introduction

A.1. Business Problem

New York is one of the largest metropolises in the world: over 8 million people live there, at a population density of 10,715 people per square kilometer. As a resident of the city, I chose New York for this project. The city is divided into 5 main districts. Because these districts are squeezed into an area of approximately 783 square kilometers, the city has a highly intertwined and mixed structure [1].

As these figures show, New York has both a large population and a high population density. In such a crowded city, owners of shops and social venues tend to locate where the population is dense. From an investor's point of view, we expect a preference for districts where real estate costs are lower and where the type of business they want to open is less concentrated. City residents may likewise prefer areas with lower real estate values, and they may also choose a district according to its density of social venues. However, information that could guide investors and residents in this way is currently hard to obtain.

To address these problems, we can build a map and an information chart in which the real estate index is overlaid on New York and each district is clustered according to its venue density.
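As a rough illustration of clustering districts by venue density, here is a minimal sketch. It assumes k-means as the clustering method, which this README does not specify, and it uses made-up venue shares for four hypothetical districts (`D1` to `D4`); none of the numbers come from the project's data.

```python
# Hypothetical venue-category shares per district: (cafes, restaurants, parks).
# These values are invented for illustration only.
districts = ["D1", "D2", "D3", "D4"]
venue_shares = [
    (0.60, 0.30, 0.10),
    (0.10, 0.20, 0.70),
    (0.55, 0.35, 0.10),
    (0.15, 0.15, 0.70),
]


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=10):
    """Plain k-means; the first k points serve as the initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


labels, centroids = kmeans(venue_shares, k=2)
for name, label in zip(districts, labels):
    print(name, "-> cluster", label)
```

With real data, the vectors would hold per-district venue frequencies from the Foursquare API, and the number of clusters would need to be chosen by inspection or a criterion such as the elbow method.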

A.2. Project Summary

This project uses data science techniques to investigate the following questions:

  • Can the surrounding venues affect the price of real estate?
  • What kinds of surrounding venues affect the price, and to what extent?
  • Can we use the surrounding venues to estimate the value of a home relative to the average price in its area? And with what degree of confidence?
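One simple way to start on these questions is to measure how strongly venue counts and prices move together, then fit a line through them. The sketch below uses Pearson correlation and ordinary least squares on a single feature; both the method and the sample values are assumptions made for illustration, not results or data from this project.

```python
from math import sqrt

# Hypothetical samples: (neighborhood, venue count, average unit price in USD).
# Invented for this sketch; not taken from the Kaggle or Foursquare data.
samples = [
    ("A", 10, 400_000),
    ("B", 20, 500_000),
    ("C", 30, 600_000),
    ("D", 40, 650_000),
    ("E", 50, 850_000),
]

venue_counts = [count for _, count, _ in samples]
prices = [price for _, _, price in samples]


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    slope = cov / var
    return slope, my - slope * mx


r = pearson(venue_counts, prices)
slope, intercept = fit_line(venue_counts, prices)
print(f"correlation: {r:.3f}")
print(f"price = {slope:.0f} * venues + {intercept:.0f}")
```

A correlation close to 1 or -1 would suggest a strong linear relationship, but it says nothing about causation. A fuller analysis would use one feature per venue category and report confidence intervals for the estimates.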

The data will be:

  • Average price of one standard residential unit in each of New York City's neighborhoods (Kaggle).
  • Venues surrounding each neighborhood (Foursquare API).

Target audiences will be:

  • Home buyers, who can roughly estimate how the value of a target house compares with the area average.
  • Planners, who can decide which venues to place around their development so that the price is maximized.
  • Anyone wondering whether a building under construction will affect the value of their home.
