# (Tutorial) Basic Usage of the Manifold TSM R Package

## von-Mises Fisher distribution

We simulate \(n=1000\) samples from a von-Mises distribution with mean direction \((\pi/2, \pi)\) on the sphere, using the `rvmf`

function from the `Rfast`

package.

```
# Simulated VMF
n = 1000
centre = c(pi/2, pi)
centre_euclid = sphere_to_euclid(centre)
set.seed(4)
z = rvmf(n, centre_euclid, k = 6)
x = euclid_to_sphere(z)
```

Now this can be estimated directly, using various methods such as MLE, but if we could only see a portion of this data, then truncated score matching can be used. Say there is a boundary set up such that all measurements below a constant line of \(\phi = \pi/2 + 0.2\) were missing. This can be constructed as follows

```
dV = cbind(rep(pi/2+0.2, 200), seq(0, 2*pi, len=200))
outside_boundary = x[,1] > (pi/2+0.2)
truncated_x = x[outside_boundary,]
```

The mean direction can be estimated by calling `sphere_sm`

with a specified boundary function, one of `"Haversine"`

or `"Euclidean"`

.

```
est_hav = sphere_sm(truncated_x, dV, g = "Haversine", family = vmf(k=6))
est_euc = sphere_sm(truncated_x, dV, g = "Euclidean", family = vmf(k=6))
```

Note that the `family`

argument is `vmf(k=6)`

. This is a wrapper function that specifies which distribution to use. The argument `k=6`

corresponds to the concentration parameter being fixed a t 6, if this werenâ€™t supplied and `vmf()`

was used alone, the concentration parameter will be esitmated. By default the `sphere_sm`

function uses the von Mises Fisher distribution without specifying a value for `k`

. Now that the points are estimated, we can plot them using `ggplot`

.

```
plot_data = data.frame(colat = c(x[!as.logical(outside_boundary),1],
x[as.logical(outside_boundary),1]),
lon = c(x[!as.logical(outside_boundary),2],
x[as.logical(outside_boundary),2]),
Data = c(rep("Not-Observed", sum(!as.logical(outside_boundary))),
rep("Observed", sum(outside_boundary))))
centre_data = data.frame(
colat = c(est_hav$mu[1], est_euc$mu[1], centre[1]),
lon = c(est_hav$mu[2], est_euc$mu[2], centre[2]),
Centres = c("Haversine", "Projected Euclidean", "Real Centre")
)
ggplot(plot_data) + geom_point(aes(x=lon, y=colat, col=Data), alpha=.7, size=2) +
scale_color_brewer(type="qual", palette=3) + theme_minimal() +
xlab(expression(theta)) + ylab(expression(phi)) +
geom_point(data=centre_data, aes(lon, colat, fill=Centres), size=4, shape = "diamond filled")
```

The point has been estimated reasonably well. Now lets try the same approach *without* fixing the concentration parameter.

```
est_hav = sphere_sm(truncated_x, dV, g = "Haversine", family = vmf())
est_euc = sphere_sm(truncated_x, dV, g = "Euclidean", family = vmf())
```

Again, the centre has been found with good accuracy. We can inspect the output of `sphere_sm`

to see what value was estimated for `k`

.

```
est_hav
#> $mu
#> colat lon
#> 1.539329 3.112377
#>
#> $k
#> k
#> 5.415736
#>
#> $family
#> [1] "von Mises Fisher"
#>
#> $val
#> [1] -2.044959
est_euc
#> $mu
#> colat lon
#> 1.584297 3.111693
#>
#> $k
#> k
#> 5.481328
#>
#> $family
#> [1] "von Mises Fisher"
#>
#> $val
#> [1] -1.713839
```

### A different Boundary

We can experiment with a different boundary, implementing the same approach to the same dataset but a different truncation region. Start by simulating different data from the same distribution and constructing the boundary as before:

```
set.seed(7)
z = rvmf(n, centre_euclid, k = 6)
x = euclid_to_sphere(z)
dV = cbind(c(rep(1.2, 200), seq(1.2, 3, len=200)), c(seq(0, 3.6, len=200), rep(3.6, len=200)))
outside_boundary = x[,2] < 3.6 & x[,1] > 1.2
truncated_x = x[as.logical(outside_boundary),]
```

So in this case, the boundary is anything that fits \(\phi > 1.2\) and \(\theta < 3.6\), roughly a box around the top left portion of the data. We call `sphere_sm`

as before.

```
est_hav = sphere_sm(truncated_x, dV, g = "Haversine", family=vmf())
est_euc = sphere_sm(truncated_x, dV, g = "Euclidean", family=vmf())
```

The results are slightly less accurate than before, but still hold.

## Kent Distribution

We adopt a similar approach for the Kent distribution, simulating data, truncating it at an artificial boundary and experimenting with fitting the model to see how well the estimated mean direction \(\mu\) is. Firstly, we use a boundary similar to before, truncated at \(\phi = \pi/2+0.1\).

```
z = rkent(n, k = 10, m = centre_euclid, b=3)
x = euclid_to_sphere(z)
dV = cbind(rep(pi/2+0.1, 500), seq(0, 2*pi, len=500))
outside_boundary = x[,1] > (pi/2 + 0.1)
truncated_x = x[outside_boundary,]
```

Since the Kent distribution has five parameters, estimation can be difficult and slow in some cases. For the Kent distribution, we assume that the concentration parameter `k`

and ovalness parameter `b`

are known, and specified in advance.

```
est_hav = sphere_sm(truncated_x, dV, g="Haversine", family=kent(k=10, b=3))
est_euc = sphere_sm(truncated_x, dV, g="Euclidean", family=kent(k=10, b=3))
```

And another boundary, similar to before.

```
set.seed(12)
z = rkent(n, m=centre_euclid, k = 10, b=3)
dV = cbind(c(rep(1.2, 200), seq(1.2, 3, len=200)), c(seq(0, 3.6, len=200), rep(3.6, len=200)))
x = euclid_to_sphere(z)
outside_boundary = x[,2] < 3.6 & x[,1] > 1.2
truncated_x = x[as.logical(outside_boundary),]
est_hav = sphere_sm(truncated_x, dV, g = "Haversine", family=kent(k=10, b=3))
est_euc = sphere_sm(truncated_x, dV, g = "Euclidean", family=kent(k=10, b=3))
```