Berlin is known worldwide for its nightlife. The data set we will work with stores the geographical locations of clubs and bars in Berlin. The data comes from OpenStreetMap (OSM): it was downloaded from GEOFABRIK on June 25, 2017, and contains OpenStreetMap data as of June 22, 2017.

We download the data and read the osm_pois_p.shp file, which contains the points of interest in Berlin, using the st_read() function from the sf package.

# create a temporary file path for the download
temp <- tempfile()
download.url <- "https://userpage.fu-berlin.de/soga/300/30100_data_sets/spatial/"
zipfile <- "osm_pois_p.zip"

## download the file
download.file(paste0(download.url, zipfile), temp, mode = "wb")
## unzip the file(s) into the working directory
unzip(temp)
## delete the temporary file
unlink(temp)
library(rgeos)
library(sf)
## read in the data
berlin.features <- st_read("osm_pois_p.shp")
## Reading layer `osm_pois_p' from data source `/Users/jokr/Documents/soga/osm_pois_p.shp' using driver `ESRI Shapefile'
## Simple feature collection with 49091 features and 4 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 13.08407 ymin: 52.33794 xmax: 13.75799 ymax: 52.66951
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs

The short report gives the layer name, the data source, and the driver (ESRI Shapefile). It also states that the data set contains 49091 records (denoted as features), represented as rows, and 4 attributes (denoted as fields), represented as columns. The geometry of each feature is stored in an additional column.

The column names show that the category of each point is stored in the fclass column.

colnames(berlin.features)
## [1] "osm_id"   "code"     "fclass"   "name"     "geometry"

By applying the unique() function we get an overview of the different categories represented in the data set.

unique(berlin.features$fclass)
##   [1] camera_surveillance tourist_info        restaurant         
##   [4] telephone           biergarten          recycling_glass    
##   [7] kiosk               memorial            fire_station       
##  [10] bank                post_office         toilet             
##  [13] cinema              library             police             
##  [16] hospital            post_box            chemist            
##  [19] hotel               motel               arts_centre        
##  [22] cafe                fast_food           optician           
##  [25] convenience         car_sharing         shelter            
##  [28] viewpoint           attraction          pub                
##  [31] supermarket         bakery              sports_shop        
##  [34] pharmacy            atm                 playground         
##  [37] theatre             school              car_rental         
##  [40] gift_shop           museum              guesthouse         
##  [43] beverages           bar                 travel_agent       
##  [46] embassy             butcher             pitch              
##  [49] stationery          outdoor_shop        kindergarten       
##  [52] fountain            university          monument           
##  [55] beauty_shop         sports_centre       hostel             
##  [58] tower               bench               waste_basket       
##  [61] recycling_clothes   artwork             nightclub          
##  [64] windmill            drinking_water      computer_shop      
##  [67] furniture_shop      recycling           community_centre   
##  [70] bicycle_shop        car_dealership      vending_any        
##  [73] public_building     bookshop            florist            
##  [76] doityourself        laundry             park               
##  [79] comms_tower         hairdresser         clothes            
##  [82] mobile_phone_shop   bicycle_rental      video_shop         
##  [85] picnic_site         jeweller            college            
##  [88] veterinary          dentist             water_tower        
##  [91] hunting_stand       vending_machine     swimming_pool      
##  [94] track               newsagent           shoe_shop          
##  [97] mall                doctors             vending_cigarette  
## [100] courthouse          greengrocer         town_hall          
## [103] toy_shop            nursing_home        recycling_paper    
## [106] garden_centre       department_store    vending_parking    
## [109] theme_park          car_wash            graveyard          
## [112] water_well          water_works         chalet             
## [115] ruins               prison              zoo                
## [118] battlefield         stadium             archaeological     
## [121] camp_site           observation_tower   wayside_shrine     
## [124] alpine_hut          castle              food_court         
## [127] dog_park            water_mill          caravan_site       
## [130] golf_course        
## 130 Levels: alpine_hut archaeological arts_centre artwork ... zoo
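
The unique() function shows which categories exist, but not how many features fall into each of them. A minimal sketch of how the categories could be counted with table(), using a small made-up factor fclass.toy as a stand-in for berlin.features$fclass (the values below are hypothetical and not taken from the data set):

```r
# hypothetical stand-in for berlin.features$fclass (values are made up)
fclass.toy <- factor(c("bar", "pub", "bar", "nightclub", "cafe", "bar", "pub"))

# distinct categories, in order of first appearance
unique(fclass.toy)

# number of features per category, most frequent first
fclass.counts <- sort(table(fclass.toy), decreasing = TRUE)
fclass.counts
```

Applied to the real column, the same two calls would list all 130 categories together with their counts.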

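Now we subset our data set to include only the categories nightclub and bar, combining the two conditions with the logical OR operator |. Because berlin.features is an sf object, the geometry column is kept even though we select only the fclass and name columns.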

berlin.locations <- 
  berlin.features[berlin.features$fclass=='nightclub' |
                    berlin.features$fclass=='bar',
                  c('fclass', 'name')]
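
When more than two categories are of interest, chaining comparisons with | becomes verbose. One possible alternative is the %in% operator, sketched here on a small made-up data frame features.toy that stands in for berlin.features (the rows and the helper vector wanted are hypothetical):

```r
# hypothetical stand-in for berlin.features (rows are made up)
features.toy <- data.frame(
  fclass = factor(c("bar", "cafe", "nightclub", "pub", "bar")),
  name   = c("A", "B", "C", "D", "E")
)

# keep rows whose category is one of the wanted values
wanted <- c("nightclub", "bar")
locations.toy <- features.toy[features.toy$fclass %in% wanted,
                              c("fclass", "name")]
locations.toy
```

For a plain data frame, %in% also returns FALSE for missing values, whereas == returns NA, so the two approaches can differ if the category column contains NA entries.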

We clean up the data set by renaming the category column and dropping unused levels from the factor variable using the droplevels() function. Then we plot the relative frequency of the categories in our data set by combining the table(), prop.table() and barplot() functions.

# rename column from fclass to loc
colnames(berlin.locations)[1] <- 'loc'
# drop unused levels 
berlin.locations$loc <- droplevels(berlin.locations$loc)
# plot
barplot(prop.table(table(berlin.locations$loc)))
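
To illustrate why the unused levels are dropped, here is a minimal sketch on a small made-up factor loc.toy (the values are hypothetical). After subsetting, a factor still carries the levels of the removed categories, and table() would report them with a count of zero:

```r
# hypothetical stand-in for the category column (values are made up)
loc.toy <- factor(c("bar", "cafe", "nightclub", "bar", "bar"))
loc.toy <- loc.toy[loc.toy != "cafe"]

# the unused level "cafe" is still attached to the factor
levels(loc.toy)

# droplevels() removes it
loc.toy <- droplevels(loc.toy)
levels(loc.toy)

# relative frequencies of the remaining categories; they sum to 1
loc.freq <- prop.table(table(loc.toy))
loc.freq
```

Passing loc.freq to barplot() would then draw one bar per remaining category, without empty bars for categories that were filtered out.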