The \(U\)-statistic method is built on theoretical arguments phrased in the language of smooth manifolds. The convergence rate of the statistic serves as a proxy for the estimated dimension, and the method accounts, at least partially, for the scale and influence of extrinsic curvature. It returns an integer-valued estimate, so the result does not need to be rounded in practice.

est.Ustat(X, maxdim = min(ncol(X), 15))

Arguments

X

an \((n\times p)\) matrix or data frame whose rows are observations.

maxdim

the largest candidate dimension the algorithm is allowed to investigate.
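
As a rough illustration of how the default `maxdim = min(ncol(X), 15)` in the usage line resolves, the base-R sketch below evaluates the same expression on inputs of different widths. The helper `default_maxdim()` and the toy inputs `X_small`, `X_wide`, and `df_small` are hypothetical names introduced only for this example and are not part of the package.

```r
# Hypothetical helper (not part of the package): mirrors the documented
# default expression `min(ncol(X), 15)` from the usage line.
default_maxdim <- function(X) {
  min(ncol(X), 15)
}

# Toy inputs: rows are observations, columns are ambient coordinates.
set.seed(1)
X_small  <- matrix(rnorm(100 * 3),  nrow = 100, ncol = 3)
X_wide   <- matrix(rnorm(100 * 40), nrow = 100, ncol = 40)
df_small <- as.data.frame(X_small)   # a data frame is accepted as well

default_maxdim(X_small)   # 3: fewer than 15 columns, so the column count is used
default_maxdim(X_wide)    # 15: the search is capped at 15 candidate dimensions
default_maxdim(df_small)  # 3: ncol() works on data frames too
```

Under this default, data with fewer than 15 columns are searched up to the ambient dimension, while wider data are capped at 15 unless `maxdim` is set explicitly.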

Value

a named list containing

estdim

estimated intrinsic dimension.

References

Hein M, Audibert J (2005). “Intrinsic Dimensionality Estimation of Submanifolds in \(R^d\).” In Proceedings of the 22nd International Conference on Machine Learning, 289-296.

Author

Kisung You

Examples

# \donttest{
## create 3 datasets of intrinsic dimension 2.
X1 = aux.gensamples(dname="swiss")
X2 = aux.gensamples(dname="ribbon")
X3 = aux.gensamples(dname="saddle")

## acquire an estimate for intrinsic dimension
out1 = est.Ustat(X1)
out2 = est.Ustat(X2)
out3 = est.Ustat(X3)

## print the results
line1 = paste0("* est.Ustat : 'swiss'  gives ",round(out1$estdim,2))
line2 = paste0("* est.Ustat : 'ribbon' gives ",round(out2$estdim,2))
line3 = paste0("* est.Ustat : 'saddle' gives ",round(out3$estdim,2))
cat(paste0(line1,"\n",line2,"\n",line3))
#> * est.Ustat : 'swiss'  gives 2
#> * est.Ustat : 'ribbon' gives 2
#> * est.Ustat : 'saddle' gives 2
# }