This site is prepared for the GIS workshop at Universität Wien. The main purpose of the workshop is to encourage Japanologie researchers to understand GIS and public statistics regarding Japan.
In this workshop, how to use GIS in R will be lectured by hands-on. R could be available from the following URL https://cran.r-project.org/. R studio https://www.rstudio.com/ is also recommended to use because of its beautiful GUI.
When using GIS in R, it is necessary to use several packages. In order to use packages, we need to install them by using install.packages()
function like this. ggplot2
package can be installed via github.
Once you install packages, usually you don’t need to install them next time. When some of them are upgraded, it is recommended to install the packages again.
install.packages(c("maptools","classInt","RColorBrewer", "readr",
"magrittr","dplyr","kokudosuuchi", "estatapi",
"sf","purrr", "needs", "DT"))
devtools::install_github("tidyverse/ggplot2")
Installed packages becomes active by using library()
function.
library(maptools)
library(classInt)
library(RColorBrewer)
library(readr)
library(magrittr)
library(dplyr)
library(kokudosuuchi)
library(estatapi)
library(sf)
library(purrr)
library(needs)
library(ggplot2)
library(DT)
magrittr()
of needs
package should be prioritized so that pipes %>%
are available.
needs::prioritize(magrittr)
Create a new folder named as gisdata
for the working space. In this case, gisdata
folder will be made under the user folder.
setwd()
is used to move directory and getwd()
returns the current directory.
setwd("./")
dir.create("gisdata")
setwd("gisdata")
getwd()
Once you made a working directory, you don’t need to use the first and the second line of the above chunk.
First of all, we download a 1km mesh estiamted population data set from Kokudo Suuchi Joho (KSJ) from MLIT (Ministry of Lnad Infrastructure and Transportation) web site and then make a thematic map of estimated population of Kanagawa in 2020.
Kokudo Suuchi Joho Web site is as following; http://nlftp.mlit.go.jp/ksj/index.html
getKSJSummary()
of kokudosuuchi
package provides summary information of Kokudo Suuchi Joho via API
kokudosuuchi::getKSJSummary() %>%
filter(title=="1kmメッシュ別将来推計人口(H29国政局推計)(shape形式版)")
## # A tibble: 1 x 5
## identifier title field1 field2 areaType
## <chr> <chr> <chr> <chr> <chr>
## 1 m1000 1kmメッシュ別将来推計人口(H29国政局推計)(shape形式版)… 各種統計… " " 3
After getting URL of Kokudo Suuchi Joho by getKSJURL()
, Kokudo Suuchi Joho data could be downloaded by curl_download
of curl
package.
url <- kokudosuuchi::getKSJURL("m1000", prefCode = 14) %>%
dplyr::select(zipFileUrl) %>%
dplyr::pull(1)
tf <- tempfile(fileext = ".zip")
curl::curl_download(url, tf)
unzip(tf, exdir = getwd(), junkpaths = TRUE)
unlink(tf)
rm(tf)
st_read
of sf
package is used to read ArcGIS shape files.
d <- sf::st_read('Mesh3_POP_14.shp')
Plot a map of estimated population in 2020 of Kanagawa prefecture by using ggplot2
as followings.
ggplot2::ggplot(data = d, aes(fill=POP2020)) +
geom_sf() +
scale_fill_distiller(palette="RdYlGn") +
theme_bw() +
labs(title = "Population of Kanagawa in 2020 (est.)")
In this case, class of the attribute we’d like to show in the map is defined according to quantile
style by using classIntervals()
funciton of classInt
package.
Find another class style from help(classIntervals)
or the following URL (in Japanese).
http://web.sfc.keio.ac.jp/~maunz/wiki/index.php?asakura_sp_chap04
Class interval of population 2020 is five by defining n=5
. Color palet of the class interval is defined by using findClours()
. Here, a grey scale color pallet is applied.
pal0 <- c("white","grey","grey2")
ci <- classInt::classIntervals(d$POP2020, n=5, style = "quantile")
ci_Col <- classInt::findColours(ci,pal0)
You can plot a white map as following.
Color of borders of mesh (or polygon boundary) can be defined by border=""
.
plot(sf::st_geometry(d), border="grey", col="white")
A thematic map can also be drawn like this. Here, title and legend are added by using title()
and legend()
.
plot(sf::st_geometry(d), border="grey" ,col=ci_Col)
title("Population of Kanagawa in 2020 (Estimated)", cex=1.4)
legend("topleft",fill=attr(ci_Col,"palette"), cex=0.8,
legend=names(attr(ci_Col,"table")),bty="n")
The following chunk shows another way to make a thematic map by using pipe %>%
.
In this case, RColorBrewer
is applied to define a color pallet. Quantitative pallets, sequential pallets and diverging pallets can be employed in RColorBrewer
. (i.e. see https://moderndata.plot.ly/create-colorful-graphs-in-r-with-rcolorbrewer-and-plotly/)
d %>%
sf::st_geometry(d) %>%
plot(col=classInt::findColours(ci, rev(RColorBrewer::brewer.pal(7, "RdYlGn"))))
A grey scale map can be depicted like this.
d %>%
sf::st_geometry(d) %>%
plot(col=classInt::findColours(ci, pal0), border="grey")
You can select areas arbitrary by using filter()
function of dplyr
package. In this case, Yokohama city boundary is selected by CITY_CODE
.
市区町村コード or city codes are available from the following URL. http://www.soumu.go.jp/denshijiti/code.html
yoko <- dplyr::filter(d, CITY_CODE>=14101&CITY_CODE<=14143)
yoko %>%
sf::st_geometry(yoko) %>%
plot(col=classInt::findColours(ci, rev(RColorBrewer::brewer.pal(7, "RdYlGn"))))
Secondary, we’d like to dray a map of 500m mesh scale population in 2020 of Yokohama.
Download a 500m mesh estiamted population data set from KSJ and make a thematic map.
kokudosuuchi::getKSJSummary() %>%
filter(title=="500mメッシュ別将来推計人口(H29国政局推計)(shape形式版)")
url <- kokudosuuchi::getKSJURL("m500", prefCode = 14) %>%
select(zipFileUrl) %>%
pull(1)
tf <- tempfile(fileext = ".zip")
curl::curl_download(url, tf)
unzip(tf, exdir = getwd(), junkpaths = TRUE)
unlink(tf)
rm(tf)
d2 <- sf::st_read('Mesh4_POP_14.shp')
d2 %>%
sf::st_geometry(d2) %>%
plot(col=classInt::findColours(
classInt::classIntervals(d2$POP2020, n=5, style = "sd"),
rev(RColorBrewer::brewer.pal(7, "RdYlGn"))),
border="grey")
Now, we’d like to draw a Yokohama city area population and then to overlay a railway map and administrative boundary map to make a thematic map.
A railway GIS data and a boundary GIS data are also downloaded from KSJ.
# kokudosuuchi::getKSJSummary() %>% filter(title=="鉄道")
url <- kokudosuuchi::getKSJURL("N02") %>%
select(zipFileUrl) %>%
pull(1)
tf <- tempfile(fileext = ".zip")
curl::curl_download(url, tf)
unzip(tf, exdir = getwd(), junkpaths = TRUE)
unlink(tf)
rm(tf)
rw_ln <- sf::st_read('N02-05-g_RailroadSection.shp')
# kokudosuuchi::getKSJSummary() %>% filter(title=="行政区域")
kokudosuuchi::getKSJURL("N03", prefCode = 14) %>%
filter(year==2015) %>% select(zipFileUrl) %>%
pull(1) -> url
tf <- tempfile(fileext = ".zip")
curl::curl_download(url, tf)
unzip(tf, exdir = getwd(), junkpaths = TRUE)
unlink(tf)
rm(tf)
# Read a boundary of Kanagawa Pref.
ad <- sf::st_read('N03-15_14_150101.shp')
# Select Yokohama City boundary and plot it
ad %>%
filter(N03_007 %in% c(14101:14118)) -> y_ad
# Select Yokohama city: City codes is from 14101 to 14118
y2 <- dplyr::filter(d2, CITY_CODE>=14101&CITY_CODE<=14118)
y2 %>%
sf::st_geometry(y2) %>%
plot(col=classInt::findColours(
classInt::classIntervals(y2$POP2020, n=5, style = "sd"),
rev(RColorBrewer::brewer.pal(7, "RdYlGn"))),
border="grey")
plot(rw_ln, add=TRUE)
## Warning in plot.sf(rw_ln, add = TRUE): ignoring all but the first attribute
plot(y_ad, col="transparent", border="red", add=TRUE)
## Warning in plot.sf(y_ad, col = "transparent", border = "red", add = TRUE):
## ignoring all but the first attribute
title("Population of Yokohama CIty in 2020 (Estimated)")
legend("bottomright",
fill=attr(
classInt::findColours(
classInt::classIntervals(d$POP2020, n=5, style = "quantile"),
rev(RColorBrewer::brewer.pal(7, "RdYlGn"))),"palette"),
cex=1.0,
legend=names(attr(
classInt::findColours(
classInt::classIntervals(d$POP2020, n=5, style = "quantile"),
rev(RColorBrewer::brewer.pal(7, "RdYlGn"))),"table")),bty="n")
Here we show a case to download multiple boundary data of Kanto region (Saitama, Chiba, Tokyo and Kanagawa).
preflist <- 11:14
urls <- kokudosuuchi::getKSJURL("N03", prefCode = preflist) %>%
filter(year==2015) %$% zipFileUrl
d <- purrr::map(urls, getKSJData,cache_dir = "cached_zip")
zip_exdir <- tempfile()
dir.create(zip_exdir)
setwd("cached_zip")
map(list.files(), unzip, exdir = zip_exdir)
mdata <- file.path(zip_exdir,
list.files(zip_exdir),
paste("N03-15_",11:14,"_150101.shp", sep="")) %>%
map(st_read) %>%
do.call(rbind, .)
setwd("../")
st_write(mdata, "N03-15_Kanto_150101.shp")
Here, we’d like to demonstrate how to bind a shape file of Tokyo 23ku(wards) GIS data and a house land price (Koji Chika Data of House).
Land price (Koji Chika) data of Tokyo in 2018 is available from Tokyo metropolitan government’s web site. Here, land price data is provided as a point data of table format.
A CSV file (comma (,) deliminated file) can be read by read_csv()
of readr
.
Select land price points of houses from the data set by using select()
and filter()
.
lp_url <- sprintf("http://www.zaimu.metro.tokyo.jp/kijunchi/30kouji/30kouji_chiten_data.csv")
curl::curl_download(lp_url, "lp_tky.csv")
lph_tky <- readr::read_csv("lp_tky.csv", skip=1,
col_types = cols(都道府県市区町村コード = "c"),
locale = locale(encoding = "CP932")) %>%
select(CODE = 2,
LP2018 = 10,
Chiseki = 13,
Yoto = 34) %>%
filter(Yoto %in% c("1住居", "1中専", "1低専", "2住居", "2中専", "2低専", "準住居"))
And then, mean house land prices of “Ku” are calculated by summarize()
of dply
by CODE
. This is like making a pivot table in EXCEL.
lph_tky %>%
dplyr::group_by(CODE) %>%
dplyr::summarise(LP2018.Mean = mean(LP2018))
Tokyo metropolitan prefecture’s boundary is also available from KSJ.
kokudosuuchi::getKSJURL("N03", prefCode = 13) %>%
filter(year==2015) %>% select(zipFileUrl) %>%
pull(1) -> url
tf <- tempfile(fileext = ".zip")
curl::curl_download(url, tf)
unzip(tf, exdir = getwd(), junkpaths = TRUE)
unlink(tf)
rm(tf)
Read a boundary of Tokyo-to and select Tokyo 23ku boundary.
tky_bd <- sf::st_read('N03-15_13_150101.shp') %>%
dplyr::group_by(N03_007) %>%
dplyr::summarise(geometry = st_union(geometry)) %>%
dplyr::ungroup()
tky_bd %>%
filter(N03_007 %in% c(13101:13123)) -> tky23ku_bd
To bind a shape file and a table file, use inner_join()
of dplyr
.
# Join a shape file and a table
tky23ku_lph <- dplyr::inner_join(tky23ku_bd, lph_tky, by = c("N03_007" = "CODE"))
## Warning: Column `N03_007`/`CODE` joining factor and character vector,
## coercing into character vector
Finally, you can depict a thematic map of house land price data of Tokyo 23 Ku in 2018.
plot(tky23ku_lph["LP2018"])
e-Stat URL: https://www.e-stat.go.jp/
Before downloading data sets, it is necessary to register e-Stat API from the following URL https://www.e-stat.go.jp/mypage/user/preregister
You need to enter your e-Stat api ID as myappID
You can search stats table by stas data ID of e-Stat if you know it. estat_getStatsList()
of estatapi
returns the list of the stats.
# Search stats list by stats data ID of e-stat if known
StatsList <- estatapi::estat_getStatsList(
appId = myappID,
lang = c("J"),
searchWord = "",
statsDataID = "0000010109"
)
str(StatsList)
## Classes 'tbl_df', 'tbl' and 'data.frame': 10000 obs. of 13 variables:
## $ @id : chr "0003110904" "0003110914" "0003110917" "0003110918" ...
## $ STAT_NAME : chr "国民経済計算" "国民経済計算" "国民経済計算" "国民経済計算" ...
## $ GOV_ORG : chr "内閣府" "内閣府" "内閣府" "内閣府" ...
## $ STATISTICS_NAME : chr "国民経済計算年次推計 平成2年基準 平成10年度確報 フロー編" "国民経済計算年次推計 平成2年基準 平成10年度確報 フロー編" "国民経済計算年次推計 平成2年基準 平成10年度確報 フロー編" "国民経済計算年次推計 平成2年基準 平成10年度確報 フロー編" ...
## $ TITLE : chr "1.統合勘定 2. 国民可処分所得と処分勘定 年度" "1.統合勘定 1. 国内総生産と総支出勘定 年度" "1.統合勘定 2. 国民可処分所得と処分勘定 暦年" "1.統合勘定 3. 資本調達勘定 (1)実物取引 暦年" ...
## $ CYCLE : chr "年次" "年次" "年次" "年次" ...
## $ SURVEY_DATE : chr "199801-199812" "199801-199812" "199801-199812" "199801-199812" ...
## $ OPEN_DATE : chr "2015-09-30" "2015-09-30" "2015-09-30" "2015-09-30" ...
## $ SMALL_AREA : chr "0" "0" "0" "0" ...
## $ MAIN_CATEGORY : chr "企業・家計・経済" "企業・家計・経済" "企業・家計・経済" "企業・家計・経済" ...
## $ SUB_CATEGORY : chr "国民経済計算" "国民経済計算" "国民経済計算" "国民経済計算" ...
## $ OVERALL_TOTAL_NUMBER: chr "616" "748" "616" "396" ...
## $ UPDATED_DATE : chr "2016-08-15" "2016-08-15" "2016-08-15" "2016-08-15" ...
estat_getMetaInfo()
returns meta data information of the stats.
# Get metadata information
meta_info <- estatapi::estat_getMetaInfo(
appId = myappID,
statsDataId = "0000010109"
)
meta_info
## $tab
## # A tibble: 1 x 3
## `@code` `@name` `@level`
## <chr> <chr> <chr>
## 1 00001 観測値 1
##
## $cat01
## # A tibble: 384 x 4
## `@code` `@name` `@level` `@unit`
## <chr> <chr> <chr> <chr>
## 1 I1101 I1101_平均余命(0歳)(男) 1 年
## 2 I1102 I1102_平均余命(0歳)(女) 1 年
## 3 I1201 I1201_平均余命(20歳)(男) 1 年
## 4 I1202 I1202_平均余命(20歳)(女) 1 年
## 5 I1301 I1301_平均余命(40歳)(男) 1 年
## 6 I1302 I1302_平均余命(40歳)(女) 1 年
## 7 I1401 I1401_平均余命(60歳)(男) 1 年
## 8 I1402 I1402_平均余命(60歳)(女) 1 年
## 9 I1501 I1501_平均余命(65歳)(男) 1 年
## 10 I1502 I1502_平均余命(65歳)(女) 1 年
## # ... with 374 more rows
##
## $area
## # A tibble: 48 x 3
## `@code` `@name` `@level`
## <chr> <chr> <chr>
## 1 00000 全国 1
## 2 01000 北海道 2
## 3 02000 青森県 2
## 4 03000 岩手県 2
## 5 04000 宮城県 2
## 6 05000 秋田県 2
## 7 06000 山形県 2
## 8 07000 福島県 2
## 9 08000 茨城県 2
## 10 09000 栃木県 2
## # ... with 38 more rows
##
## $time
## # A tibble: 42 x 3
## `@code` `@name` `@level`
## <chr> <chr> <chr>
## 1 1975100000 1975年度 1
## 2 1976100000 1976年度 1
## 3 1977100000 1977年度 1
## 4 1978100000 1978年度 1
## 5 1979100000 1979年度 1
## 6 1980100000 1980年度 1
## 7 1981100000 1981年度 1
## 8 1982100000 1982年度 1
## 9 1983100000 1983年度 1
## 10 1984100000 1984年度 1
## # ... with 32 more rows
##
## $.names
## # A tibble: 4 x 2
## id name
## <chr> <chr>
## 1 tab 観測値
## 2 cat01 I 健康・医療
## 3 area 地域
## 4 time 調査年
Now, stats data can be obtained by using estat_getStatsData()
. By adding condition on statsDataId
, cdCat01
or lvArea
, you can narrowe down the condition of stats.
# Get stats from e-Stat
SDS_df <- estatapi::estat_getStatsData(
appId = myappID,
statsDataId = "0000010109", # 社会・人口統計体系 都道府県データ 健康・医療
cdCat01 = "I1601", # 健康寿命
# cdCat01 = c("I1601", "I1602"), # 健康寿命
lvArea = 2 # Level of Area
# lvTime = 1 # Level of Time
# cdtime = "2010100000" # 2010年度
# surveyYears = 2010
)
## Fetching record 1-94... (total: 94 records)
h2013_df <- subset(SDS_df, SDS_df$調査年 %in% "2013年度") %>%
as.data.frame()
h2013_df$KENCODE <- as.numeric(h2013_df$area_code)/1000
head(h2013_df)
## tab_code 観測値 cat01_code I 健康・医療 area_code 地域
## 1 00001 観測値 I1601 I1601_健康寿命(男) 01000 北海道
## 2 00001 観測値 I1601 I1601_健康寿命(男) 02000 青森県
## 3 00001 観測値 I1601 I1601_健康寿命(男) 03000 岩手県
## 4 00001 観測値 I1601 I1601_健康寿命(男) 04000 宮城県
## 5 00001 観測値 I1601 I1601_健康寿命(男) 05000 秋田県
## 6 00001 観測値 I1601 I1601_健康寿命(男) 06000 山形県
## time_code 調査年 unit value KENCODE
## 1 2013100000 2013年度 年 71.11 1
## 2 2013100000 2013年度 年 70.29 2
## 3 2013100000 2013年度 年 70.68 3
## 4 2013100000 2013年度 年 71.99 4
## 5 2013100000 2013年度 年 70.71 5
## 6 2013100000 2013年度 年 71.34 6
str(h2013_df)
## 'data.frame': 47 obs. of 11 variables:
## $ tab_code : chr "00001" "00001" "00001" "00001" ...
## $ 観測値 : chr "観測値" "観測値" "観測値" "観測値" ...
## $ cat01_code : chr "I1601" "I1601" "I1601" "I1601" ...
## $ I 健康・医療: chr "I1601_健康寿命(男)" "I1601_健康寿命(男)" "I1601_健康寿命(男)" "I1601_健康寿命(男)" ...
## $ area_code : chr "01000" "02000" "03000" "04000" ...
## $ 地域 : chr "北海道" "青森県" "岩手県" "宮城県" ...
## $ time_code : chr "2013100000" "2013100000" "2013100000" "2013100000" ...
## $ 調査年 : chr "2013年度" "2013年度" "2013年度" "2013年度" ...
## $ unit : chr "年" "年" "年" "年" ...
## $ value : num 71.1 70.3 70.7 72 70.7 ...
## $ KENCODE : num 1 2 3 4 5 6 7 8 9 10 ...
Prefecture boundary GIS data is available my web site as following.
# Download GIS data set from my web page
url.gis <- sprintf("http://web.sfc.keio.ac.jp/~maunz/asakura_sp/asakura_sp_data_1101.zip")
tf <- tempfile(fileext = ".zip")
curl::curl_download(url.gis, tf)
unzip(tf, exdir = getwd(), junkpaths = TRUE)
# Read jpn_pref.shp = prefecture boundary data
jpn_pref <- sf::st_read('jpn_pref.shp')
Finally, by joining a prefecture boundary GIS data and a house land price data, you can plot a thematic map. Here, we plot “Kenko Jumyo (Health life expectancy)” of male in 2013 by prefectural level on the map.
# Join SDS health care data and jpn_pref.shp
jpn_pref_h <- dplyr::inner_join(jpn_pref, h2013_df, by = c("PREF_CODE" = "KENCODE"))
# Plot a map
ggplot(data = jpn_pref_h, aes(fill= value)) +
geom_sf() +
scale_fill_distiller(palette="RdYlGn") +
theme_bw() +
labs(title = "Health life expectancy of male in Japan (2013)")