This function returns a data.frame where each row provides one or several goodness-of-fit measures between a simulated and an observed Origin-Destination matrix.

## Usage

```
gof(
  sim,
  obs,
  measures = "all",
  distance = NULL,
  bin_size = 2,
  use_proba = FALSE,
  check_names = FALSE
)
```

## Arguments

- sim
an object of class `TDLM` (output of `run_law_model()`, `run_law()` or `run_model()`). A matrix or a list of matrices can also be used (see Note).

- obs
a square matrix representing the observed mobility flows.

- measures
a vector of string(s) indicating which goodness-of-fit measure(s) to choose (see Details). If `"all"` is specified, then all measures will be calculated.

- distance
a square matrix representing the distance between locations. Only necessary for the distance-based measures.

- bin_size
a numeric value indicating the size of the bins used to discretize the distance distribution when computing CPC_d (2 "km" by default).

- use_proba
a boolean indicating if the `proba` matrix should be used instead of the simulated OD matrix to compute the measure(s). Only valid for the output from `run_law_model()` with argument `write_proba = TRUE` (see Note).

- check_names
a boolean indicating if the location IDs are used as matrix rownames and colnames and if they should be checked (see Note).

## Value

A data.frame providing one or several goodness-of-fit measure(s) between simulated OD(s) and an observed OD. Each row corresponds to one simulated matrix, in the order of the elements of the list (or list of lists); names are used if provided.

## Details

Let \(n\) be the number of locations, \(T_{ij}\) the observed flow between location \(i\) and location \(j\) (argument `obs`), \(\tilde{T}_{ij}\) a simulated flow between location \(i\) and location \(j\) (a matrix from argument `sim`), \(N=\sum_{i,j=1}^n T_{ij}\) the sum of observed flows and \(\tilde{N}=\sum_{i,j=1}^n \tilde{T}_{ij}\) the sum of simulated flows.

Several goodness-of-fit measures are available (`measures = c("CPC", "NRMSE", "KL", "CPL", "CPC_d", "KS")`): the Common Part of Commuters (CPC) (Gargiulo et al. 2012; Lenormand et al. 2012; Lenormand et al. 2016),

\(\displaystyle CPC(T,\tilde{T}) = \frac{2\cdot\sum_{i,j=1}^n \min(T_{ij},\tilde{T}_{ij})}{N + \tilde{N}}\)

the Normalized Root Mean Square Error (NRMSE),

\(\displaystyle NRMSE(T,\tilde{T}) = \sqrt{\frac{\sum_{i,j=1}^n (T_{ij}-\tilde{T}_{ij})^2}{N}}\)

the Kullback–Leibler divergence (KL) (Kullback and Leibler 1951),

\(\displaystyle KL(T,\tilde{T}) = \sum_{i,j=1}^n \frac{T_{ij}}{N}\log\left(\frac{T_{ij}}{N}\frac{\tilde{N}}{\tilde{T}_{ij}}\right)\)

the Common Part of Links (CPL) (Lenormand et al. 2016),

\(\displaystyle CPL(T,\tilde{T}) = \frac{2\cdot\sum_{i,j=1}^n 1_{T_{ij}>0} \cdot 1_{\tilde{T}_{ij}>0}}{\sum_{i,j=1}^n 1_{T_{ij}>0} + \sum_{i,j=1}^n 1_{\tilde{T}_{ij}>0}}\)

the Common Part of Commuters based on the distance (Lenormand et al. 2016), noted CPC_d. With \(N_k\) (and \(\tilde{N}_k\)) the sum of observed (and simulated) flows at a distance falling in the bin `[bin_size*k - bin_size, bin_size*k[`, it is defined as

\(\displaystyle CPC_d(T,\tilde{T}) = \frac{2\cdot\sum_{k=1}^{\infty} \min(N_{k},\tilde{N}_{k})}{N+\tilde{N}}\)
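The binning behind CPC_d can be illustrated with a short base R sketch. It is not the package's implementation: `cpc_d()` is a hypothetical helper, the matrices are made-up toy values, and the bin index is assumed to be `floor(distance / bin_size) + 1`, which matches the left-closed, right-open bins described above.

```r
# Toy 3 x 3 observed and simulated OD matrices and a distance matrix
# (made-up values, for illustration only).
obs <- matrix(c(0, 10, 5,
                8,  0, 2,
                4,  6, 0), nrow = 3, byrow = TRUE)
sim <- matrix(c(0, 12, 3,
                6,  0, 0,
                5,  4, 5), nrow = 3, byrow = TRUE)
distance <- matrix(c(0, 1, 3,
                     1, 0, 5,
                     3, 5, 0), nrow = 3, byrow = TRUE)

# Hypothetical helper: CPC computed on flows aggregated by distance bin.
cpc_d <- function(obs, sim, distance, bin_size = 2) {
  # Assumed bin index k: distance lies in [bin_size * (k - 1), bin_size * k[
  k <- as.vector(floor(distance / bin_size) + 1)
  # N_k and N~_k: total observed and simulated flow in each bin
  n_obs <- tapply(as.vector(obs), k, sum)
  n_sim <- tapply(as.vector(sim), k, sum)
  2 * sum(pmin(n_obs, n_sim)) / (sum(obs) + sum(sim))
}

cpc_d(obs, sim, distance, bin_size = 2)
```

Because flows are pooled within each bin before taking the minimum, CPC_d compares the distance distributions of the flows rather than the individual origin-destination pairs.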

Finally, the Kolmogorov-Smirnov statistic and p-value (Massey 1951), noted KS, are based on the observed and simulated flow distance distributions and computed with the `ks_test()` function from the `Ecume` package.
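The four measures that only need the two matrices (CPC, NRMSE, KL and CPL) can be sketched in base R directly from the formulas above. This is a minimal illustration on made-up toy matrices, not the code used by `gof()`; the helper names are hypothetical, and the handling of zero cells in `kl()` is an assumption since the formulas do not specify it.

```r
# Toy 3 x 3 observed and simulated OD matrices (made-up values).
obs <- matrix(c(0, 10, 5,
                8,  0, 2,
                4,  6, 0), nrow = 3, byrow = TRUE)
sim <- matrix(c(0, 12, 3,
                6,  0, 0,
                5,  4, 5), nrow = 3, byrow = TRUE)

# Common Part of Commuters: twice the shared flow volume over N + N~.
cpc <- function(obs, sim) {
  2 * sum(pmin(obs, sim)) / (sum(obs) + sum(sim))
}

# NRMSE as written above: sum of squared errors divided by N, then rooted.
nrmse <- function(obs, sim) {
  sqrt(sum((obs - sim)^2) / sum(obs))
}

# Kullback-Leibler divergence between the two normalized matrices.
# Assumption: cells with a zero observed flow contribute 0; a positive
# observed flow facing a zero simulated flow makes the result Inf.
kl <- function(obs, sim) {
  p <- obs / sum(obs)
  q <- sim / sum(sim)
  keep <- p > 0
  sum(p[keep] * log(p[keep] / q[keep]))
}

# Common Part of Links: twice the number of links present in both
# matrices over the total number of links in each.
cpl <- function(obs, sim) {
  2 * sum(obs > 0 & sim > 0) / (sum(obs > 0) + sum(sim > 0))
}

c(CPC = cpc(obs, sim), NRMSE = nrmse(obs, sim),
  KL = kl(obs, sim), CPL = cpl(obs, sim))
```

CPC and CPL lie between 0 and 1 and equal 1 when the two matrices coincide, whereas NRMSE and KL are 0 in that case and grow as the matrices diverge.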

## Note

By default, if `sim` is an output of `run_law_model()` the measure(s) are computed only for the simulated OD matrices and not the `proba` matrix (included in the output when `write_proba = TRUE`). The argument `use_proba` can be used to compute the measure(s) based on the `proba` matrix instead of the simulated OD matrix. In this case the argument `obs` should also be a proba matrix.
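The note does not say how to turn an observed OD matrix into a proba matrix. The sketch below assumes that a proba matrix is simply a matrix of non-negative values rescaled so that all its cells sum to 1; both this assumption and the helper `to_proba()` are illustrative and not taken from the package.

```r
# Hypothetical helper: rescale an OD matrix so that its cells sum to 1.
to_proba <- function(od) {
  stopifnot(is.matrix(od), all(od >= 0), sum(od) > 0)
  od / sum(od)
}

# Toy observed OD matrix (made-up values).
obs <- matrix(c(0, 10, 5,
                8,  0, 2,
                4,  6, 0), nrow = 3, byrow = TRUE)

obs_proba <- to_proba(obs)
sum(obs_proba)  # 1, up to floating-point error
```

Under that assumption, `obs_proba` would be the matrix to pass as `obs` when `use_proba = TRUE`.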

All the inputs should be based on the same number of locations sorted in the same order. It is recommended to use the location IDs as matrix rownames and colnames and to set `check_names = TRUE` to verify that everything is in order before running this function (`check_names = FALSE` by default). Note that the function `check_format_names()` can be used to check the validity of all the inputs before running the package's main functions.
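The kind of consistency this note asks for can be illustrated in base R. The helper `same_location_ids()` below is hypothetical and is not a substitute for `check_format_names()`, whose exact checks are not described here; it only tests that every matrix is square and carries the same IDs, in the same order, as rownames and colnames.

```r
# Hypothetical helper: TRUE if all matrices are square and share the same
# location IDs, in the same order, as rownames and colnames.
same_location_ids <- function(...) {
  mats <- list(...)
  ids <- rownames(mats[[1]])
  if (is.null(ids)) {
    return(FALSE)  # no IDs to compare against
  }
  all(vapply(mats, function(m) {
    nrow(m) == ncol(m) &&
      identical(rownames(m), ids) &&
      identical(colnames(m), ids)
  }, logical(1)))
}

# Toy matrices (made-up values): `sim` has the same IDs in another order.
ids <- c("A", "B", "C")
obs <- matrix(1:9, nrow = 3, dimnames = list(ids, ids))
sim <- obs[c("B", "A", "C"), c("B", "A", "C")]

same_location_ids(obs, obs)  # TRUE
same_location_ids(obs, sim)  # FALSE: same IDs, different order
```

A mismatch in order is the case to watch for, since the measures compare cells position by position and would silently pair the wrong locations.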

## References

Lenormand M, Bassolas A, Ramasco JJ (2016).
“Systematic comparison of trip distribution laws and models.”
*Journal of Transport Geography*, **51**, 158--169.

Gargiulo F, Lenormand M, Huet S, Baqueiro Espinosa O (2012).
“Commuting network model: getting to the essentials.”
*Journal of Artificial Societies and Social Simulation*, **15**(2), 13.

Lenormand M, Huet S, Gargiulo F, Deffuant G (2012).
“A Universal Model of Commuting Networks.”
*PLoS ONE*, **7**, e45985.

Kullback S, Leibler RA (1951).
“On Information and Sufficiency.”
*The Annals of Mathematical Statistics*, **22**(1), 79--86.

Massey FJ (1951).
“The Kolmogorov-Smirnov test for goodness of fit.”
*Journal of the American Statistical Association*, **46**(253), 68--78.

## Author

Maxime Lenormand (maxime.lenormand@inrae.fr)

## Examples

```
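# Attach the package and load its example datasets
library(TDLM)
data(mass)
data(distance)
data(od)

# Column 1 of `mass` is used as the mass of origins and destinations,
# columns 2 and 3 as the out-going and in-coming trip margins
mi <- as.numeric(mass[, 1])
mj <- mi
Oi <- as.numeric(mass[, 2])
Dj <- as.numeric(mass[, 3])

# Simulate one OD matrix with an exponential gravity law and a
# doubly constrained model
res <- run_law_model(
  law = "GravExp", mass_origin = mi, mass_destination = mj,
  distance = distance, opportunity = NULL, param = 0.01,
  model = "DCM", nb_trips = NULL, out_trips = Oi, in_trips = Dj,
  average = FALSE, nbrep = 1, maxiter = 50, mindiff = 0.01,
  write_proba = FALSE,
  check_names = FALSE
)

# Compare the simulated OD matrix with the observed one using the CPC
gof(
  sim = res, obs = od, measures = "CPC", distance = NULL, bin_size = 2,
  use_proba = FALSE,
  check_names = FALSE
)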
#> Simulation CPC
#> 1 replication_1 0.4574413
```