---
title: "Cartograflow"
subtitle: "Filtering Origin-Destination Matrix for Thematic Flow Mapping"
author: "Françoise Bahoken, Sylvain Blondeau"
date: "`r Sys.Date()`"
output: html_vignette
vignette: >
  \usepackage[ps2pdf,
              bookmarks=true,
              dvipsone=pdftex,                                
              backref,
              ]{hyperref}
  %\VignetteIndexEntry{CartogRaflow}
  %\VignetteEncoding{UTF-8}
  %\SweaveUTF8
  %\VignetteEngine{knitr::rmarkdown}
#editor_options: 
#chunk_output_type: inline
---
`Cartograflow` is designed to filter origin-destination (OD) flow matrix for thematic mapping purposes.
## Description of functions
**1. Preparing flow data sets:** 
 
**1.1 General functions**
You can use long "L" or matrix "M" [n*n] flow dataset formats. 
-- `flowtabmat()` is to transform "L" to "M" formats, also to build an empty square matrix from spatial codes.
-- `flowcarre()` is to square a matrix.
-- `flowjointure()` is to performs a spatial join between a flow dataset and a spatial features layer or an external matrix.
-- `flowstructmat()` fixes an unpreviously codes shift in the flow dataset "M" format. If necessary this function is to be used with `flowjointure` and `flowtabmat`.
**1.2. Flow computation:**
-- `flowtype()` is to compute several types of flow from an asymmetric matrix: 
 
x= `flux` for remaining initial flow (Fij) 
x= `transpose` for reverse flow value (Fji) 
x= `bivolum` for bilateral volum, as gross flow (FSij)
x= `bibal` for bilateral balance, as net flow (FBij) 
x= `biasym` for asymetry of bilateral flow (FAij) 
x= `bimin` for minimum of bilateral flow (minFij) 
x= `bimax` for maximum of bilateral flow (maxFij) 
x= `birange` for bilateral flow range (rangeFij) 
x= `bidisym` for bilateral disymetry as (FDij) 
-- `flowplaces()` is to compute several types of flow places oriented from an asymmetric:
ie. as a dataframe that describes the flows from Origin / destination point of view 
 
x= `ini` for the number of incoming links (as in-degree) 
x= `outi` for the number of outcoming links (as out-degree) 
x= `degi` for the total number of links (as in and out degrees) 
x= `intra` for total intra zonal interaction (if main diagonal is not empty 
x= `Dj` for the total flows received by (j) place 
x= `voli` for the total volume of flow per place 
x= `bali` for the net balance of flow per place 
x= `asyi` for the asymetry of flow per place 
x= `allflowplaces` for computing all the above indicators 
**1.3. Flow reduction:**
-- `flowlowup()` is to extracts the upper or the lower triangular part of a matrix - preferably for symmetrical matrixes.
x= `up` for the part above the main diagonal 
x= `low` for the part below the main diagonal 
-- `flowreduct()` is to reduce the flow dataset regarding another matrix, e.g. distances travelled. 
 
`metric` is the metric of the distance matrix :
- metric= `continuous` (e.g. for kilometers) 
- metric= `ordinal` (e.g. for `k` contiguity) 
If the metric is `continuous` (e.g for filtering flows by kilometric distances travelled), use:
`d.criteria` is for selecting the minimum or the maximum distance criteria 
- d.criteria= `dmin` for keeping only flows up to a _dmin_ criterion in km 
- d.criteria= `dmax` for selecting values less than a _dmax_ criterion in km 
`d` is the value of the selected `dmin` or `dmax` criteria.
Notice that these arguments can be used as a filter criterion in `flowmap()`.
See Cartograflow_distance and Cartograflow_ordinal_distance Vignettes for examples.
URL: https://github.com/fbahoken/cartogRaflow/tree/master/vignettes
**2. Flows filtering:** 
 
**2.1. Filtering from flow concentration analysis** 
**Flow concentration analysis:** 
-- `flowgini()` performs a Gini's concentration analysis of the flow features, by computing _Gini coefficient_ and plotting interactive _Lorenz curve_.
To be use before `flowanalysis()`
See Cartograflow_concentration Vignette for example.
URL: https://github.com/fbahoken/cartogRaflow/tree/master/vignettes
**Flow filtering according to a concentration criterion:**
-- `flowanalysis()` computes filters criterions based on:
- argument `critflow` is to filter the flows according to their significativity (% of total of flow information) ; 
- argument `critlink` is to filter the flows according to their density (% of total features) 
These arguments can be used as `filter` criterion in `flowmap()`.
See Cartograflow_concentration Vignette for example.
URL: https://github.com/fbahoken/cartogRaflow/tree/master/vignettes
**2.2. Spatial / territorial filtering of flows** 
**Flow filtering based on a continuous distance criterion**
-- `flowdist()` computes a _continous distance_ matrix from spatial features (area or points). The result is a matrix of the distances travelled between ODs, with flows filtered or not.
See Cartograflow_distance Vignette for example.
URL: https://github.com/fbahoken/cartogRaflow/tree/master/vignettes
**Flow filtering based on an ordinal distance / neighbourhood criterion**:
-- `flowcontig()` compute an _ordinal distance_ matrix from spatial features (area). The result is a matrix of adjacency or k-contiguity of the ODs.
- `background` is the areal spatial features ;
- code` is the spatial features codes ; 
- `k` is to enter the number (k:1,2,...,k) of the contiguity matrix to be constructed : if (k=1), ODs places are adjacent, then the flow have to cross only 1 boundary, else (k=k) ODs places are distant from n borders ;
- `algo` is the algorithm to use for ordinal distance calculation (also Default is "automatic" for "Dijkstra's") ; 
Notice that the function automatically returns the maximum (k) number of the spatial layer.
See Cartograflow_distance_ordinal Vignette for example.
**3. Flow mapping**
-- `flowmap()` is to plot flows as segments or arrows, by acting on the following arguments:
 
- `filter` is to filter or not flow's information or features 
- `threshold` is used to set the filtering level of the flows when filter="True" 
- `taille` is the value of the width of the flow feature 
- `a.head` is the arrow head parameter (in, out, in and out) 
- `a.length` is the length of the edges of the arrow head (in inches) 
- `a.angle` is the angle from the shaft of the arrow to the edge of the arrow head 
- `a.col` is the arrow's color 
- `plota` is to add spatial features as map background to the flows's plot 
- `add` is to allow to overlay flow features on external spatial features background 
## Examples of applications
-- **Useful packages**
Best external R package to use: 
{dplyr} {sf} {igraph} {rlang} {cartography}
```{r include=FALSE, message=FALSE}
rm(list=ls())
library(sf)
library(dplyr)
library(cartograflow)
library(cartography)
knitr::opts_chunk$set(fig.width=6, fig.height=6)
```
**1. Load datasets**
--------------------
**Flow dataset**
```{r flowdata_preprocess, warning=FALSE, echo=TRUE}
# Load Statistical information
tabflow<-read.csv2("./data/MOBPRO_ETP.csv", header=TRUE, sep=";",stringsAsFactors=FALSE,
                   encoding="UTF-8", dec=".",check.names=FALSE)
```
```{r var_typing, echo=FALSE, warning=FALSE}
# Variable typing
tabflow$i<-as.character(tabflow$i)
tabflow$j<-as.character(tabflow$j)
tabflow$Fij<-as.numeric(tabflow$Fij)
tabflow$count<-as.numeric(tabflow$count)
str(tabflow)
```
**Select variable and change matrix format**
```{r flowdata_reverse, echo=TRUE, message=FALSE, warning=FALSE}
# Selecting useful variables for changing format
tabflow<-tabflow %>% select(i,j,Fij)
# From list (L) to matrix (M) format
matflow <-flowtabmat(tabflow,matlist="M")
head(matflow[1:4,1:4])
dim(matflow)
```
```{r flowdata_reverseM, message=FALSE, warning=FALSE, include=FALSE}
# From matrix (M) to list (L) format
tabflow<-flowtabmat(tab=matflow,
                    matlist="L")
colnames(tabflow)<-c("i","j","Fij")
head(tabflow)
```
**Geographical dataset**
```{r data_preprocess, message=FALSE, warning=FALSE, include=FALSE}
# Load a list of geo codes
ID_CODE<-read.csv2("./data/COD_GEO_EPT.csv",
                   header=TRUE,sep=";",stringsAsFactors=FALSE,encoding="UTF-8", dec=".",   check.names=FALSE)
#head(ID_CODE)
CODE<-ID_CODE%>% dplyr::select(COD_GEO_EPT)
colnames(CODE)<-c("CODGEO")
#head(CODE)
```
**2. Flow types computing**
--------------------
```{r vara_typing2, message=FALSE, warning=FALSE, include=FALSE}
# Variable typing
tabflow$i<-as.character(tabflow$i)
tabflow$j<-as.character(tabflow$j)
tabflow$Fij<-as.numeric(tabflow$Fij)
as.data.frame(tabflow)
```
*Compute bilateral flows types : eg. volum, balance, bilateral maximum and all types*
```{r data_computing, echo=TRUE, message=FALSE, warning=FALSE}
# Bilateral volum (gross) FSij:  
tabflow_vol<-flowtype(tabflow, format="L", origin="i", destination="j", fij="Fij",  x= "bivolum" )
# Matrix format (M= : matflow_vol<-flowtype(matflow, format="M", "bivolum")
# Bilateral balance (net ) FBij:  
tabflow_net<-flowtype(tabflow, format="L", origin="i", destination="j", fij="Fij", x="bibal")
# Bilateral maximum (maxFij): 
tabflow_max<-flowtype(tabflow, format="L", origin="i", destination="j", fij="Fij", x="bimax")
# Compute all types of bilateral flows, in one 11 columns
tabflow_all<-flowtype(tabflow,format="L", origin="i", destination="j", fij="Fij", x="alltypes")
head(tabflow_all) 
```
**3. Direct flow mapping**
---------------------------
**3.1. Plot all origin-destination without any filtering criterion** 
The result will reveal a graphic complexity ("spaghetti-effect"")
Plot links
```{r maps_links, echo=TRUE, fig.show='hold', fig.width=6, message=FALSE, warning=FALSE, ECHO=FALSE}
library(sf)
map<-st_read("./data/MGP_TER.shp")
# Add and overlay spatial background 
par(bg = "NA")
# Graphic parameters
par(mar=c(0,0,1,0))
extent <- c(2800000, 1340000, 6400000, 4800000)
resolution<-150
plot(st_geometry(map), col = NA, border=NA, bg="#dfe6e1")
plot(st_geometry(map), col = "light grey", add=TRUE)
# Flowmapping of all links
flowmap(tab=tabflow,
        fij="Fij",
        origin.f = "i",
        destination.f = "j",
        bkg = map,
        code="EPT_NUM",
        nodes.X="X",
        nodes.Y = "Y",
        filter=FALSE,
        add=TRUE
        )
library(cartography)
# Map cosmetics
layoutLayer(title = "All origin-destination for commuting in Greater Paris, 2017",
           coltitle ="black",
           author = "Cartograflow, 2020",
           sources = "Data : INSEE, 2017 ; Basemap : APUR, RIATE, 2018.",
           scale = 2,
           tabtitle = FALSE,
           frame = TRUE,
           col = "grey"
            )
# North arrow
north("topright")
```
**3.2. Plot the above-average flows** 
```{r maps_flowmean, echo=TRUE, fig.show='hold', fig.width=6, message=FALSE, warning=FALSE, ECHO=FALSE}
library(sf)
map<-st_read("./data/MGP_TER.shp")
# Add and overlay spatial background 
par(bg = "NA")
# Graphic parameters
par(mar=c(0,0,1,0))
extent <- c(2800000, 1340000, 6400000, 4800000)
resolution<-150
plot(st_geometry(map), col = NA, border=NA, bg="#dfe6e1")
plot(st_geometry(map), col = "light grey", add=TRUE)
# Flow mapping above-average flows
flowmap(tab=tabflow,
        fij="Fij",
        origin.f = "i",
        destination.f = "j",
        bkg = map,
        code="EPT_NUM",
        nodes.X="X",
        nodes.Y = "Y",
        filter=TRUE,
        threshold =(mean(tabflow$Fij)),  #mean value is the level of threshold
        taille=20,           
        a.head = 1,
        a.length = 0.11,
        a.angle = 30,
        a.col="#138913",
        add=TRUE)
# Map Legend
legendPropLines(pos="topleft",
                title.txt="Commuters > 13220 ",
                title.cex=0.8,   
                cex=0.5,
                values.cex= 0.7,  
                var=c(mean(tabflow$Fij),max(tabflow$Fij)), 
                lwd=5, 
                frame = FALSE,
                col="#138913",
                values.rnd = 0
                )
#Map cosmetic
layoutLayer(title = "Commuters up to above-average in Greater Paris",
           coltitle ="black",
           author = "Cartograflow, 2020",
           sources = "Data : INSEE, 2017 ; Basemap : APUR, RIATE, 2018.",
           scale = 2,
           tabtitle = FALSE,
           frame = TRUE,
           col = "grey"
            )
# North arrow
north("topright")
```
**3.3. Plot the net flows of bilateral flows** 
```{r maps_flownet, echo=TRUE, fig.show='hold', fig.width=6, message=FALSE, warning=FALSE, ECHO=FALSE}
#library(sf)
map<-st_read("./data/MGP_TER.shp")
# Net matrix reduction
tabflow_net <- tabflow_net %>% filter(.data$FBij>=0)
# Net matrix thresholding
Q80<-quantile(tabflow_net$FBij,0.95)
# Add and overlay spatial background 
par(bg = "NA")
# Graphic parameters
par(mar=c(0,0,1,0))
extent <- c(2800000, 1340000, 6400000, 4800000)
resolution<-150
plot(st_geometry(map), col = NA, border=NA, bg="#dfe6e1")
plot(st_geometry(map), col = "light grey", add=TRUE)
# Flow mapping above-average flows
flowmap(tab=tabflow_net,
        fij="FBij",
        origin.f = "i",
        destination.f = "j",
        bkg = map,
        code="EPT_NUM",
        nodes.X="X",
        nodes.Y = "Y",
        filter=TRUE,
        threshold = Q80,
        taille=12,           
        a.head = 1, 
        a.length = 0.11,
        a.angle = 30,
        a.col="#4e8ef5",
        add=TRUE)
# Map Legend
legendPropLines(pos="topleft",
                title.txt="Commuters > 5722 ",
                title.cex=0.8,   
                cex=0.5,
                values.cex= 0.7,  
                var=c(Q80,max(tabflow_net$FBij)), 
                lwd=12, 
                frame = FALSE,
                col="#4e8ef5",
                values.rnd = 0
                )
#Map cosmetic
layoutLayer(title = "Net commuters in Greater Paris (20% strongest)",
           coltitle ="black",
           author = "Cartograflow, 2020",
           sources = "Data : INSEE, 2017 ; Basemap : APUR, RIATE, 2018.",
           scale = 2,
           tabtitle = FALSE,
           frame = TRUE,
           col = "grey"
            )
# North arrow
north("topright")
```
## Sample datasets
-- _Statistical dataset_ : 
- INSEE - Base flux de mobilité (2015)
- URL : https://www.insee.fr/fr/statistiques/fichier/3566008/rp2015_mobpro_txt.zip
-- _Geographical dataset_ :
- municipalities : IGN, GEOFLA 2015 v2.1 
- Greater Paris : APUR, UMS 2414 RIATE, 2018.
## See also
https://github.com/fbahoken/cartogRaflow/tree/master/vignettes
-- cartograflow_general.html 
-- cartograflow_concentration.html 
-- cartograflow_distance.html 
-- cartograflow_ordinal_distance.hmtl 
## Reference
-- Bahoken Francoise (2016), Programmes pour R/Rtudio annexés, in :  _Contribution à la cartographie d'une matrix de flux_, Thèse de doctorat, Université Paris 7, pp. 480-520.