**Intensity** is the average density of points or, in
other words, the expected number of points per unit area. The intensity
may be constant (*uniform*) or may vary from location to location
(*non-uniform* or *inhomogeneous*).

One approach to evaluate the intensity is to divide the study area into quadrats and count the numbers of points falling into each quadrat. If the points have uniform intensity, and are completely random, then the quadrat counts should be Poisson random numbers with constant mean. We may test that hypothesis by using the \(\chi^2\) goodness-of-fit test statistic

\[\chi^2 = \sum\frac{(\text{observed}-\text{expected})^2}{\text{expected}}.\]

In R, we use the `quadratcount()`

and
`quadrat.test()`

functions of the `spatstat`

package. First, we load the *Berlin city data sets* we created in
the previous section.

```
load(url("https://userpage.fu-berlin.de/soga/data/r-data/ppp_data_berlin.RData"))
load(url("https://userpage.fu-berlin.de/soga/data/r-data/berlin_district.RData"))
```

```
library(spatstat)
par(mar = c(0, 0, 1.5, 1.5),
cex.main = 2)
qc_locations <- quadratcount(ppp_locations, nx = 3, ny = 4)
plot(ppp_locations, pch = 3, cex = 0.6)
plot(qc_locations, add = TRUE, textargs = list(col = "red"))
```

`quadrat.test(qc_locations)`

```
##
## Chi-squared test of CSR using quadrat counts
##
## data:
## X2 = 1659.4, df = 10, p-value < 2.2e-16
## alternative hypothesis: two.sided
##
## Quadrats: 11 tiles (irregular windows)
## Tessellation is marked
```

The very low *p*-value indicates that the intensity is
non-constant. Such a point process is referred to as an inhomogeneous
Poisson process (IPP), and its intensity can be estimated
non-parametrically, e.g. with kernel smoothing (Bivand et
al. 2008).

Let us repeat this approach for the `ppp_random`

data
set.

Exercise:Evaluate the intensity of the data set`ppp_random`

by dividing the area into quadrats, counting the points and performing the \(\chi^2\)-Test. Which kind of Poisson point process do you expect?

```
## Your code here...
qc_random <- NULL
```

```
## configurate figure extends accordingly
par(mar = c(0, 0, 1.5, 1.5),
cex.main = 2)
## create ppp object
qc_random <- quadratcount(ppp_random, nx = 3, ny = 4)
plot(ppp_random, pch = 3, cex = 0.6)
plot(qc_random, add = TRUE, textargs = list(col = "red"))
```

```
## perform Chi-squared test
quadrat.test(qc_random)
```

`## Warning: Some expected counts are small; chi^2 approximation may be inaccurate`

```
##
## Chi-squared test of CSR using quadrat counts
##
## data:
## X2 = 12.402, df = 10, p-value = 0.5181
## alternative hypothesis: two.sided
##
## Quadrats: 11 tiles (irregular windows)
## Tessellation is marked
```

The high *p*-value \((p >
0.05)\) indicates a homogeneous Poisson process (HPP).

The intensity of homogeneous Poisson processes is constant with \(\lambda(u) = \lambda = \frac{n}{\vert A \vert}\), where \(n\) is the number of points observed in region \(A\) and \(\vert A \vert\) is the area of region \(A\).

There are different ways to estimate the intensity of an IPP. One of them is a non-parametrically estimation by means of kernel smoothing. If we observe \(n\) points \(\{x_i\}_{i=1}^n\), then the form of a kernel smoothing estimator is

\[\hat{\lambda}(x) = \frac{1}{h^2}\sum_{i=1}^n\frac{\kappa(\frac{\Vert x-x_i\Vert}{h})}{q(\Vert x \Vert)}\text{,} \]

where \(x_i \in \{x_1, x_2,..., x_n\}\) is the observed point, \(h\) is the bandwidth, \(q(\Vert x \Vert)\) is a border correction to compensate for observations missing due to edge effects, and \(\kappa(u)\) is a bivariate and symmetrical kernel function.

R currently implements a two-dimensional quartic kernel function:

\[ \kappa(u) = \begin{cases} \frac{3}{\pi}(1-\Vert u \Vert^2)^2 & \text{if } u \in (-1, 1) \\ 0 & \text{otherwise} \end{cases} \]

where \(\Vert u \Vert^2=u_1^2+u_2^2\) is the squared norm of point \(u=(u_1, u_2)\).

There is no general rule for selecting the bandwidth \(h\), which governs the level of smoothing. Small bandwidths result in spikier maps, and large bandwidths result in smoother maps. It is reasonable to use several values depending on the process under consideration and then to choose a value that seems plausible (Bivand et al. 2008).

A kernel density plot is the graphical representation of kernel
smoothing. Several functions are available in R and contributed packages
to visualise kernel density plots. Here, we use the
`density.ppp()`

function from the `spatstat`

package.

The `density.ppp()`

function estimates the intensity
function of the point process that generated the point pattern data. The
argument `sigma`

is taken as the standard deviation of the
isotropic Gaussian kernel, corresponding to the bandwidth parameter. A
small standard deviation results in a spikier density plot and a large
standard deviation results in a smoother density plot.

We code a loop to apply the `density.ppp()`

function on
our three data sets `ppp_locations`

, `ppp_regular`

and `ppp_random`

with three different values for sigma \((1000, 2500, 5000)\), respectively.

```
par(mfrow = c(3, 3), mar = c(0, 0, 1, 2))
sigma <- c(1000, 2500, 5000)
data <- list(ppp_locations, ppp_regular, ppp_random)
main <- c("Locations", "Regular", "Random")
for (i in 1:3) {
for (j in 1:3) {
ds <- density.ppp(data[[i]], sigma = sigma[j])
plot(ds,
main = paste0(main[i], ", sigma: ", sigma[j])
)
plot(data[[i]], add = TRUE, cex = 0.01, regular = FALSE)
}
}
```

Alternatively, to the non-parametrically estimation of the intensity by means of kernel smoothing we may propose a specific function for the intensity whose parameters are estimated by maximizing the likelihood of the point process (Baddeley 2010).

The log-likelihood for the homogeneous Poisson process with intensity \(\lambda\) is

\[\log L(\lambda; x) = n(x)\log \lambda-\lambda \text{ area}(W)\] where \(n(x)\) is the number of points in the data set \(x\). The maximum likelihood estimator of \(\lambda\) is \[\hat{\lambda} = \frac{n(x)}{\text{area}(W)}\] The variance of \(\hat{\lambda}\) is

\[\text{var}[\hat{\lambda}] = \frac{\lambda}{\text{area}(W)}\]

For an inhomogeneous Poisson process with intensity function \(\lambda_\theta(u)\) depending on a parameter \(\theta\), the log-likelihood for \(\theta\) is

\[\log L(\theta;x) = \sum_{i=1}^n \log \lambda_\theta(x_i)- \int_W \lambda_\theta(u)du\;,\]

where \(\int_W \lambda_\theta(u)du\) is the expected number of cases of the IPP with intensity \(\lambda_\theta(u)\) in region \(W\).

The maximum likelihood estimation \(\text{MLE } \hat{\theta}\) is not analytically tractable, so it must be computed using numerical algorithms such as Newton’s method (Baddeley 2010).

`spatstat`

By fitting a model, we need to describe how the point pattern intensity \(\lambda(u)\) depends on the spatial location \(u\) or spatial covariates \(Z(u)\). The intensity function \(\lambda_\theta(u)\) is log-linear in the parameter \(\theta\):

\[\log\lambda_\theta(u)= \theta \cdot S(u)\;,\] where \(S(u)\) is a real-valued or vector-valued function of location \(u\). \(S(u)\) could be a function of the spatial coordinates of \(u\), or an observed covariate \(Z(u)\), or a mixture of both.

The `spatstat`

package provides the `ppm()`

function (*point process model*) to fit a point process model to
an observed point pattern. The first argument in the `ppm()`

function is a formula in the R language describing the spatial trend
model to be fitted. It has the general form
`pattern ~ trend`

, where the left-hand side,
`pattern`

, is usually the name of a spatial point pattern to
which the model should be fitted, and the right-hand side,
`trend`

, is an expression specifying the spatial trend of the
model.

The table outlines the basic formulations of a point process model.

\[ \begin{array}{ll} \mathtt{R} & \text{Intensity} \\ \hline \mathtt{ppm(P, \tilde{}1)} & \log \lambda((x,y)) = \theta_0\\ \mathtt{ppm(P, \tilde{}x)} & \log \lambda((x,y)) = \theta_0+\theta_1x\\ \mathtt{ppm(P, \tilde{}x+y)} & \log \lambda((x,y)) = \beta_0+\theta_1x+\theta_2y\\ \mathtt{ppm(P, \tilde{}polynom(x,y,3))} & \text{3rd order polynomial in } x \text{ and } y\\ \mathtt{ppm(P, \tilde{}I(y > 23))} & \text{different constants above and below the line } y = 23\\ \end{array} \]

The `ppm()`

function documentation is quite extensive:
Type `help(ppm)`

into your console for further
information.

Now let us fit some models.

A homogeneous Poisson model, e.g. \(\lambda_\theta((x,y)) = \exp(\theta_0)\) can be fitted by

```
base_loc <- ppm(ppp_locations ~1)
base_loc
```

```
## Stationary Poisson process
## Fitted to point pattern dataset 'ppp_locations'
## Intensity: 1.04584e-06
## Estimate S.E. CI95.lo CI95.hi Ztest Zval
## log(lambda) -13.77069 0.032721 -13.83482 -13.70656 *** -420.8518
```

An inhomogeneous Poisson model with an intensity that is log-linear, the means linear in the Cartesian coordinates, e.g. \(\lambda_\theta((x, y)) = \exp(\theta_0 + \theta_1x + \theta_2y)\), is fitted by

`loglin_loc <- ppm(ppp_locations ~ x + y)`

We can use the `coef()`

function to extract the model
coefficients.

`coef(loglin_loc)`

```
## (Intercept) x y
## -1.493209e+02 2.852400e-06 2.287357e-05
```

This specific model can be written in the form

\(\lambda_\theta((x, y)) = \exp(-149.3209128 + 2.8524004\times 10^{-6}x + 2.2873568\times 10^{-5}y)\)

An inhomogeneous Poisson model with an intensity that is log-quadratic, that means it is quadratic in the Cartesian coordinates \(x\) and \(y\); e.g. \(\lambda_\theta((x, y)) = \exp(\theta_0 + \theta_1x + \theta_2y+ \theta_3x^2 + \theta_4y^2 +\theta_5x y)\), is fitted by

```
logquad_loc <- ppm(ppp_locations ~ polynom(x, y, 2))
coef(logquad_loc)
```

```
## (Intercept) x y I(x^2) I(x * y)
## -1.379287e+06 8.132716e-02 4.622414e-01 -2.583714e-08 -6.873808e-09
## I(y^2)
## -3.919100e-08
```

So far, we fitted three different models for the
`ppp_locations`

data set. For the purpose of comparison,
repeat this procedure and fit the same models on the random point data
set `ppp_random`

.

Exercise:Fit the previous three models on the random point data set`ppp_random`

and compare them with the models for the`ppp_locations`

data set.

```
## Your code here...
base_rand <- NULL
loglin_rand <- NULL
logquad_rand <- NULL
```

```
base_rand <- ppm(ppp_random ~1)
base_rand
```

```
## Stationary Poisson process
## Fitted to point pattern dataset 'ppp_random'
## Intensity: 2.138709e-07
## Estimate S.E. CI95.lo CI95.hi Ztest Zval
## log(lambda) -15.35789 0.07235746 -15.49971 -15.21608 *** -212.2503
```

```
loglin_rand <- ppm(ppp_random ~ x + y)
coef(loglin_rand)
```

```
## (Intercept) x y
## -1.230230e+02 -1.440396e-06 1.867609e-05
```

```
logquad_rand <- ppm(ppp_random ~ polynom(x, y, 2))
coef(logquad_rand)
```

```
## (Intercept) x y I(x^2) I(x * y)
## -1.621871e+04 7.177928e-03 4.559182e-03 8.747276e-11 -1.256704e-09
## I(y^2)
## -3.034431e-10
```

Now that we fitted a bunch of models on two different data sets, we use these models for prediction.

The value returned by the model-fitting function `ppm`

is
an object of class `ppm`

representing the fitted model,
analogous to other model objects in R. Thus, standard operations such as
`plot`

, `predict`

or `coef`

can be
applied.

We combine the `plot()`

command with the
`predict()`

command to visualize the log-linear and the
log-quadratic models for the `ppp_random`

data set. We add
the point coordinates to enhance the visualization.

```
par(mar = c(0, 0, 1.5, 1.5),
mfrow = c(1, 2),
cex.main = 2)
plot(predict(loglin_rand), main = "log-linear")
points(ppp_random, pch = 16, cex = 0.2)
plot(predict(logquad_rand), main = "log-quadratic")
points(ppp_random, pch = 16, cex = 0.2)
```