Title: | Online Change Point Detection for Matrix-Valued Time Series |
---|---|
Description: | We provide two algorithms for monitoring change points with online matrix-valued time series, under the assumption of a two-way factor structure. The algorithms are based on different calculations of the second moment matrices. One is based on stacking the columns of matrix observations, while another is by a more delicate projected approach. A well-known fact is that, in the presence of a change point, a factor model can be rewritten as a model with a larger number of common factors. In turn, this entails that, in the presence of a change point, the number of spiked eigenvalues in the second moment matrix of the data increases. Based on this, we propose two families of procedures - one based on the fluctuations of partial sums, and one based on extreme value theory - to monitor whether the first non-spiked eigenvalue diverges after a point in time in the monitoring horizon, thereby indicating the presence of a change point. This package also provides some simple functions for detecting and removing outliers, imputing missing entries and testing moments. See more details in He et al. (2021)<doi:10.48550/arXiv.2112.13479>. |
Authors: | Yong He [aut], Xinbing Kong [aut], Lorenzo Trapani [aut], Long Yu [aut, cre] |
Maintainer: | Long Yu <[email protected]> |
License: | GPL-2 | GPL-3 |
Version: | 0.1.2 |
Built: | 2024-10-28 04:19:17 UTC |
Source: | https://github.com/cran/OLCPM |
A data matrix containing approximated critical values for the scaled Wiener process. A supplement to Table 1 in Horvath et al. (2004). The rows are for different gamma on a grid of 50 points in [0,0.49) with step size equal to 0.01, while the columns are for different alpha on a grid of 50 points in [0.001,0.5) with step size equal to 0.001. All the values are based on 100000 replications of simulated Wiener process on an equal grid of 20000 points in [0,1], thus can be slightly different from those in Horvath et al. (2004).
cv.table
cv.table
A matrix of size 50*500.
the scaling parameter
significance level (1-alpha)-th quantile
#' @references Horvath L, Huskova M, Kokoszka P, et al (2004). Monitoring changes in linear models. Journal of statistical Planning and Inference, 126(1): 225-251.
This function generates matrix-valued time series under a two-way factor structure with/without a change point.
gen.data( Sample_T, p1, p2, k1, k2, tau = 0.5, change = 0, pp = 0.3, a = 0, cc = 0 )
gen.data( Sample_T, p1, p2, k1, k2, tau = 0.5, change = 0, pp = 0.3, a = 0, cc = 0 )
Sample_T |
positive integer indicating the length of series. |
p1 |
positive integer indicating the row dimension. |
p2 |
positive integer indicating the column dimension. |
k1 |
positive integer indicating the number of row factors. |
k2 |
positive integer indicating the number of column factors. |
tau |
a real number in |
change |
the type of change, taking 0 for no change point, taking 1 for the case that the loading matrix R changes, taking other values for the case that a new row factor occurs. |
pp |
a number in |
a |
a number in |
cc |
a number in |
See the paper He et al. (2021).
a array.
Yong He, Xinbing Kong, Lorenzo Trapani, Long Yu
He Y, Kong X, Trapani L, & Yu L(2021). Online change-point detection for matrix-valued time series with latent two-way factor structure. arXiv preprint, arXiv:2112.13479.
# set parameters k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.3) print("the dimension of Y is:") print(dim(Y))
# set parameters k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.3) print("the dimension of Y is:") print(dim(Y))
This function calculates the rolling eigenvalue series for the monitoring process, based on the “flat” version of sample covanriance matrix.
gen.psi.tau.flat( Y, k, m = ceiling(max(20, (dim(Y)[3])^(r/(r + 2)))), delta, r = 8 )
gen.psi.tau.flat( Y, k, m = ceiling(max(20, (dim(Y)[3])^(r/(r + 2)))), delta, r = 8 )
Y |
the observed |
k |
a positive integer determining which eigenvalue to monitor.
|
m |
a positive integer ( |
delta |
a number in |
r |
a positive integer indicating the order of the transformation
function |
The rolling eigenvalue series will start at the stage , with length
.
a matrix, whose three columns are the original,
rescaled, and transformed eigenvalue series, respectively.
Yong He, Xinbing Kong, Lorenzo Trapani, Long Yu
He Y, Kong X, Trapani L, & Yu L(2021). Online change-point detection for matrix-valued time series with latent two-way factor structure. arXiv preprint, arXiv:2112.13479.
## generate data k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 kmax=8 r=8 m=p2 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.3) # calculate delta temp=log(p1)/log(m*p2) delta=epsilon*(temp<=0.5)+(epsilon+1-1/(2*temp))*(temp>0.5) # calculate psi.tau psi2=gen.psi.tau.flat(Y,k1+1,m,delta,r) print(psi2)
## generate data k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 kmax=8 r=8 m=p2 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.3) # calculate delta temp=log(p1)/log(m*p2) delta=epsilon*(temp<=0.5)+(epsilon+1-1/(2*temp))*(temp>0.5) # calculate psi.tau psi2=gen.psi.tau.flat(Y,k1+1,m,delta,r) print(psi2)
This function calculates the rolling eigenvalue series for the monitoring process, based on the projected version of sample covanriance matrix.
gen.psi.tau.proj( Y, k, m = ceiling(max(20, (dim(Y)[3])^(r/(r + 2)))), delta, r = 8, kmax = 3 )
gen.psi.tau.proj( Y, k, m = ceiling(max(20, (dim(Y)[3])^(r/(r + 2)))), delta, r = 8, kmax = 3 )
Y |
the observed |
k |
a positive integer determining which eigenvalue to monitor.
|
m |
a positive integer ( |
delta |
a number in |
r |
a positive integer indicating the order of the transformation
function |
kmax |
a positive integer indicating the column number of the
projection matrix, should be larger than 0 but smaller than |
The rolling eigenvalue series will start at the stage , with length
.
a matrix, whose three columns are the original,
rescaled, and transformed eigenvalue series, respectively.
Yong He, Xinbing Kong, Lorenzo Trapani, Long Yu
He Y, Kong X, Trapani L, & Yu L(2021). Online change-point detection for matrix-valued time series with latent two-way factor structure. arXiv preprint, arXiv:2112.13479.
## generate data k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 kmax=8 r=8 m=p2 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.3) # calculate delta temp=log(p1)/log(m*p2) delta=epsilon*(temp<=0.5)+(epsilon+1-1/(2*temp))*(temp>0.5) # calculate psi.tau psi2=gen.psi.tau.proj(Y,k1+1,m,delta,r,kmax) print(psi2)
## generate data k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 kmax=8 r=8 m=p2 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.3) # calculate delta temp=log(p1)/log(m*p2) delta=epsilon*(temp<=0.5)+(epsilon+1-1/(2*temp))*(temp>0.5) # calculate psi.tau psi2=gen.psi.tau.proj(Y,k1+1,m,delta,r,kmax) print(psi2)
This function calculates critical values for the partial-sum and worst-case statistics.
getcv(alpha = 0.05, method = "ps", eta = 0.5, simul = 0)
getcv(alpha = 0.05, method = "ps", eta = 0.5, simul = 0)
alpha |
a number in |
method |
“ps” for the partial-sum staistic, others for the worst-case statistic. |
eta |
a number in |
simul |
logical value, woking only for "ps" method with
|
For the partial-sum statistic with or the worst-case
statistic, the critical value is simply
. For the
partial-sum statistic with
not equal to 0.5, the critical
value of the scaled Wiener process is approximated by simulated data or from
our preserved table cv.table, covering
in
with step size equal to 0.01 and
in
with step size equal to 0.001. See more details for the
test statistics in He et al. (2021).
a real number.
Yong He, Xinbing Kong, Lorenzo Trapani, Long Yu
He Y, Kong X, Trapani L, & Yu L(2021). Online change-point detection for matrix-valued time series with latent two-way factor structure. arXiv preprint, arXiv:2112.13479.
## Not run: getcv(0.05,method="ps",eta=0.25) getcv(0.05,method="ps",eta=0.25,simul=1) getcv(0.10,method="wc") ## End(Not run)
## Not run: getcv(0.05,method="ps",eta=0.25) getcv(0.05,method="ps",eta=0.25,simul=1) getcv(0.10,method="wc") ## End(Not run)
This function imputes missing entries in a numeric vector by linear interpolation.
impute.linear(x)
impute.linear(x)
x |
a numeric data vector, where |
A missing entry will be imputed by linear interpolation with the two nearest values before and after it in the vector. When all the values before (after) it are missing, use the two nearest values after (before) it, instead.
a new numeric vector, with the same size of the original vector, while all the missing entries have been imputed.
Yong He, Xinbing Kong, Lorenzo Trapani, Long Yu
a=c(NA,NA,3,NA,NA,6,NA,NA) b=c(1,2,3,4.5,5,NA,6.5,7,NA) impute.linear(a) impute.linear(b)
a=c(NA,NA,3,NA,NA,6,NA,NA) b=c(1,2,3,4.5,5,NA,6.5,7,NA) impute.linear(a) impute.linear(b)
This function tests whether the number of row factors is equal or larger than a given integer, under a two-way factor model, using flat version of sample covariance.
ITP_noproj( Y, k = 1, alpha = 0.05, epsilon = 0.05, r = 8, M = 100, S = 100, fq = 1/4 )
ITP_noproj( Y, k = 1, alpha = 0.05, epsilon = 0.05, r = 8, M = 100, S = 100, fq = 1/4 )
Y |
data, a |
k |
an positive integer indicating which eigenvalue to test. |
alpha |
a number in (0,1), indicating the significance of the test. |
epsilon |
a small positive number in (0,1), indicating the size of scaling. |
r |
a positive number indicating the order of the power function for transforming the rescaled eigenvalue. |
M |
a large integer for the number of Gaussian variables in the randomized test. |
S |
another large integer for the number of replications in the strong rule. Usually |
fq |
a number in (0,0.5), controlling the threshold function of the strong rule. |
See He et al. (2023)
a logical value. 1 for "the number of row factors is smaller than k". 0 for "at least k row factors exists".
He Y, Kong X, Trapani L, & Yu L (2023). One-way or two-way factor model for matrix sequences? Journal of Econometrics, 235(2), 1981-2004.
k1=3 k2=3 Sample_T=100 p1=40 p2=20 Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=0) ITP_noproj(Y,k=1,M=Sample_T,S=Sample_T) ITP_noproj(Y,k=4,M=Sample_T,S=Sample_T)
k1=3 k2=3 Sample_T=100 p1=40 p2=20 Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=0) ITP_noproj(Y,k=1,M=Sample_T,S=Sample_T) ITP_noproj(Y,k=4,M=Sample_T,S=Sample_T)
This function tests whether the number of row factors is equal or larger than a given integer, under a two-way factor model, using projected version of sample covariance.
ITP_proj( Y, k = 1, alpha = 0.05, kmax = 4, epsilon = 0.05, r = 8, M = 100, S = 100, fq = 1/4 )
ITP_proj( Y, k = 1, alpha = 0.05, kmax = 4, epsilon = 0.05, r = 8, M = 100, S = 100, fq = 1/4 )
Y |
data, a |
k |
an positive integer indicating which eigenvalue to test. |
alpha |
a number in (0,1), indicating the significance of the test. |
kmax |
a positive integer smaller than p2, indicating the upper bound for the factor numbers, and the dimension of projection matrix. |
epsilon |
a small positive number in (0,1), indicating the size of scaling. |
r |
a positive number indicating the order of the power function for transforming the rescaled eigenvalue. |
M |
a large integer for the number of Gaussian variables in the randomized test. |
S |
another large integer for the number of replications in the strong rule. Usually |
fq |
a number in (0,0.5), controlling the threshold function of the strong rule. |
See He et al. (2023)
a logical value. 1 for "the number of row factors is smaller than k". 0 for "at least k row factors exists".
He Y, Kong X, Trapani L, & Yu L (2023). One-way or two-way factor model for matrix sequences? Journal of Econometrics, 235(2), 1981-2004.
k1=3 k2=3 Sample_T=100 p1=40 p2=20 Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=0) ITP_proj(Y,k=1,M=Sample_T,S=Sample_T) ITP_proj(Y,k=4,M=Sample_T,S=Sample_T)
k1=3 k2=3 Sample_T=100 p1=40 p2=20 Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=0) ITP_proj(Y,k=1,M=Sample_T,S=Sample_T) ITP_proj(Y,k=4,M=Sample_T,S=Sample_T)
This function determined the numbers of row and column factors for matrix variate data under a two-way factor model, using a projected method.
kpe(Y, kmax = 4, c = 0)
kpe(Y, kmax = 4, c = 0)
Y |
data, a |
kmax |
a positive integer smaller than p2, indicating the upper bound for the factor numbers, and the dimension of projection matrix. |
c |
a non-negative but small number, to ensure the denominator of the eigenvalue ratio statistics is not 0. |
a 2-dimensional vector containing the estimated number of row and column factors, respectively.
Yu L, He Y, Kong X, & Zhang X (2022). Projected estimation for large-dimensional matrix factor models. Journal of Econometrics, 229(1),201-217.
k1=3 k2=3 Sample_T=100 p1=40 p2=20 kmax=8 Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=0) kpe(Y,kmax)
k1=3 k2=3 Sample_T=100 p1=40 p2=20 kmax=8 Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=0) kpe(Y,kmax)
This function determines the number of row factors under a two-way factor structure, using randomized test method.
KSTP( Y, alpha = 0.05, type = "proj", kmax = 4, epsilon = 0.05, r = 8, M = 100, S = 100, fq = 1/4 )
KSTP( Y, alpha = 0.05, type = "proj", kmax = 4, epsilon = 0.05, r = 8, M = 100, S = 100, fq = 1/4 )
Y |
data, a |
alpha |
a number in (0,1), indicating the significance of the test. |
type |
indicates how to calculate the sample covariance. "flat" for the flat version, while others for the projected version. |
kmax |
a positive integer smaller than p2, indicating the upper bound for the factor numbers, and the dimension of projection matrix. |
epsilon |
a small positive number in (0,1), indicating the size of scaling. |
r |
a positive number indicating the order of the power function for transforming the rescaled eigenvalue. |
M |
a large integer for the number of Gaussian variables in the randomized test. |
S |
another large integer for the number of replications in the strong rule. Usually |
fq |
a number in (0,0.5), controlling the threshold function of the strong rule. |
See He et al. (2023)
an integer for the number of row factors. To determine the number of column factors, just transpose the observation matrices.
He Y, Kong X, Trapani L, & Yu L (2023). One-way or two-way factor model for matrix sequences? Journal of Econometrics, 235(2), 1981-2004.
k1=3 k2=3 Sample_T=100 p1=40 p2=20 Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=0) KSTP(Y) KSTP(aperm(Y,c(1,3,2)))
k1=3 k2=3 Sample_T=100 p1=40 p2=20 Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=0) KSTP(Y) KSTP(aperm(Y,c(1,3,2)))
This function reports the largest moment that exists for a collection of data samples.
moment.determine(x, k.max = 8, alpha = 0.05, R = 400)
moment.determine(x, k.max = 8, alpha = 0.05, R = 400)
x |
a numeric vector of data samples. |
k.max |
a number indicating the upper bound, i.e., at most k.max-th moment exists. |
alpha |
a number in |
R |
the number of standard Gaussian variables generated in the
randomized test; see also |
The procedure will sequentially test the existence of the moment, using the function
in the same
package. As soon as the procedure finds that the
moment does not
exist, it stops and reports at most
moment.
an integer, indicating the largest moment that exists for the data samples.
Yong He, Xinbing Kong, Lorenzo Trapani, Long Yu
x=rt(10000,5) moment.determine(x,10) x=rt(10000,4) moment.determine(x,10)
x=rt(10000,5) moment.determine(x,10) x=rt(10000,4) moment.determine(x,10)
This function tests the existence of k-th moment by randomized method in Trapani (2016).
moment.test(x, k = 16, R = 400)
moment.test(x, k = 16, R = 400)
x |
a numeric vector of data samples. |
k |
a number no smaller than 4, indicating that the procedure will test
the existence of the k-th moment when k is even. Otherwise, the procedure
will test the existence of the |
R |
the number of standard Gaussian variables generated in the randomized test; see more details in Trapani (2016). |
The procedure is adapted from Trapani (2016) with , where
is a tuning parameter to scale the sample moments defined
in Section 3.1 of Trapani (2016). For simplicity, we only test the 4th, 6th,
... 2c-th moments.
a scalar in , indicating the p-value
of the test. The null hypothese is that the k-th moment doesn't exist.
Therefore, a small p-value indicates the existense of the k-th moment.
Yong He, Xinbing Kong, Lorenzo Trapani, Long Yu
Trapani, L. (2016). Testing for (in) finite moments. Journal of Econometrics, 191(1), 57-68.
x=rt(10000,5) moment.test(x,4) x=rt(10000,4) moment.test(x,4)
x=rt(10000,5) moment.test(x,4) x=rt(10000,4) moment.test(x,4)
This function removes outliers in the data, which are far from the sample median.
outlier.remove(x, rg = 3)
outlier.remove(x, rg = 3)
x |
a numeric data vector. |
rg |
a positive number indicating how the outliers are defined; see details. |
An outlier is detected if it deviates from the sample median more than rg times interquantile range.
a list containing the following:
x |
a new
data vector with outliers replaced with |
id |
the locations of the outliers in the data vector. |
Yong He, Xinbing Kong, Lorenzo Trapani, Long Yu
a=c(1:5,NA,10000) outlier.remove(a,3)
a=c(1:5,NA,10000) outlier.remove(a,3)
This function tests multiple change points for matrix-valued online time
series, under a two-way factor structure. A change point will be reported
only when it's the majority vote in multiple replications. The function KSTP
is used to determine the initial number of factors in each regime. This function only outputs
the change points for row factors. For column factors, transpose the data.
test.multiple.robust( Y, k = 1, m = ceiling(max(20, (dim(Y)[3])^(r/(r + 2)))), epsilon1 = 0.25, epsilon2 = 0.05, r = 8, kmax = 4, type = "proj", method = "ps", eta = 0.25, cv = 2.386, S = 100, pr = 0.75 )
test.multiple.robust( Y, k = 1, m = ceiling(max(20, (dim(Y)[3])^(r/(r + 2)))), epsilon1 = 0.25, epsilon2 = 0.05, r = 8, kmax = 4, type = "proj", method = "ps", eta = 0.25, cv = 2.386, S = 100, pr = 0.75 )
Y |
data, a |
k |
a non-negative integer indicating the initial number of factors. |
m |
a positive integer ( |
epsilon1 |
the rescaling parameter taking value in |
epsilon2 |
the rescaling parameter taking value in |
r |
a positive integer indicating the order of the transformation
function |
kmax |
a positive number determining the column number of the projection
matrix, should be larger than 0 but smaller than |
type |
indicates how to calculate the sample covariance. "flat" for the flat version, while others for the projected version. See more details in He et al. (2021). |
method |
indicating the test statistic, “ps” for the partial-sum method; others for the worst-case method. |
eta |
a number between |
cv |
critical value; see also |
S |
an integer indicating the number of replications. |
pr |
an number in |
#' See empirical study in He et al. (2021).
a matrix with two columns. The first column reports the locations of change points. The second column reports the number of row factors after each change point.
Yong He, Xinbing Kong, Lorenzo Trapani, Long Yu
He Y, Kong X, Trapani L, & Yu L(2021). Online change-point detection for matrix-valued time series with latent two-way factor structure. arXiv preprint, arXiv:2112.13479.
## Not run: k1=3 k2=3 Sample_T=100 p1=40 p2=40 kmax=8 r=8 m=p2 # generate data Y1=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.5) Y2=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=0) Y=array(rbind(matrix(Y1,Sample_T,p1*p2),matrix(Y2,Sample_T,p1*p2)),c(Sample_T*2,p1,p2)) # calculate cv for "ps" with eta=0.45 and "wc" cv1=getcv(0.05,method="ps",eta=0.45) cv2=getcv(0.05,method="wc") # test with Y test.multiple.robust(Y,k1,m,epsilon1=0.25,epsilon2=0.05,r,type="proj",kmax,method="ps") test.multiple.robust(Y,k1,m,epsilon1=0.25,epsilon2=0.05,r,type="proj",kmax,method="wc",cv=cv2) test.multiple.robust(Y,k1,m,epsilon1=0.25,epsilon2=0.05,r,type="flat",method="wc",cv=cv2) test.multiple.robust(Y,k1,m,epsilon1=0.25,epsilon2=0.05,r,type="flat",method="ps",eta=0.45,cv=cv1) ## End(Not run)
## Not run: k1=3 k2=3 Sample_T=100 p1=40 p2=40 kmax=8 r=8 m=p2 # generate data Y1=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.5) Y2=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=0) Y=array(rbind(matrix(Y1,Sample_T,p1*p2),matrix(Y2,Sample_T,p1*p2)),c(Sample_T*2,p1,p2)) # calculate cv for "ps" with eta=0.45 and "wc" cv1=getcv(0.05,method="ps",eta=0.45) cv2=getcv(0.05,method="wc") # test with Y test.multiple.robust(Y,k1,m,epsilon1=0.25,epsilon2=0.05,r,type="proj",kmax,method="ps") test.multiple.robust(Y,k1,m,epsilon1=0.25,epsilon2=0.05,r,type="proj",kmax,method="wc",cv=cv2) test.multiple.robust(Y,k1,m,epsilon1=0.25,epsilon2=0.05,r,type="flat",method="wc",cv=cv2) test.multiple.robust(Y,k1,m,epsilon1=0.25,epsilon2=0.05,r,type="flat",method="ps",eta=0.45,cv=cv1) ## End(Not run)
This function tests single change point for matrix-valued online time series, under a two-way factor structure, using ”flat” sample covariance matrix.
test.once.flat( Y, k = 1, m = ceiling(max(20, (dim(Y)[3])^(r/(r + 2)))), epsilon = 0.05, r = 8, decrease = 0, method = "ps", eta = 0.25, cv = 2.386 )
test.once.flat( Y, k = 1, m = ceiling(max(20, (dim(Y)[3])^(r/(r + 2)))), epsilon = 0.05, r = 8, decrease = 0, method = "ps", eta = 0.25, cv = 2.386 )
Y |
data, a |
k |
a positive integer indicating which eigenvalue to monitor.
|
m |
a positive integer ( |
epsilon |
the rescaling parameter taking value in |
r |
a positive integer indicating the order of the transformation
function |
decrease |
a logical value. If decrease=1, testing the decrease of factor number. |
method |
indicating the test statistic, “ps” for the partial-sum method; others for the worst-case method. |
eta |
a number between |
cv |
critical value; see also |
See He et al. (2021).
a list containing:
test |
a logical value. 1 indicating the existence of change point, 0 indicating no change point. |
loc |
an integer larger than m, indicating
the location of change point; or |
Yong He, Xinbing Kong, Lorenzo Trapani, Long Yu
He Y, Kong X, Trapani L, & Yu L(2021). Online change-point detection for matrix-valued time series with latent two-way factor structure. arXiv preprint, arXiv:2112.13479.
k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 r=8 m=p2 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.5) # calculate cv for "ps" with eta=0.45 and "wc" cv1=getcv(0.05,method="ps",eta=0.45) cv2=getcv(0.05,method="wc") ## test with Y, flat version test.once.flat(Y,k1+1,m,epsilon,r,0,method="ps",eta=0.25) test.once.flat(Y,k1+1,m,epsilon,r,0,method="ps",eta=0.45,cv1) test.once.flat(Y,k1+1,m,epsilon,r,0,method="wc",eta=0.5,cv2)
k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 r=8 m=p2 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.5) # calculate cv for "ps" with eta=0.45 and "wc" cv1=getcv(0.05,method="ps",eta=0.45) cv2=getcv(0.05,method="wc") ## test with Y, flat version test.once.flat(Y,k1+1,m,epsilon,r,0,method="ps",eta=0.25) test.once.flat(Y,k1+1,m,epsilon,r,0,method="ps",eta=0.45,cv1) test.once.flat(Y,k1+1,m,epsilon,r,0,method="wc",eta=0.5,cv2)
Based on test.once.flat
, this function
repeats the randomized procedure multiple times and reports the majority
vote, thus more robust.
test.once.flat.robust( Y, k = 1, m = ceiling(max(20, (dim(Y)[3])^(r/(r + 2)))), epsilon = 0.05, r = 8, decrease = 0, method = "ps", eta = 0.25, cv = 2.386, S = 100, pr = 0.75 )
test.once.flat.robust( Y, k = 1, m = ceiling(max(20, (dim(Y)[3])^(r/(r + 2)))), epsilon = 0.05, r = 8, decrease = 0, method = "ps", eta = 0.25, cv = 2.386, S = 100, pr = 0.75 )
Y |
data, a |
k |
a positive integer indicating which eigenvalue to monitor.
|
m |
a positive integer ( |
epsilon |
the rescaling parameter taking value in |
r |
a positive integer indicating the order of the transformation
function |
decrease |
a logical value. If decrease=1, testing the decrease of factor number. |
method |
indicating the test statistic, “ps” for the partial-sum method; others for the worst-case method. |
eta |
a number between |
cv |
critical value; see also |
S |
an integer indicating the number of replications. |
pr |
an number in |
See He et al. (2021).
a list containing:
test |
a logical value. 1 indicating the existence of change point, 0 indicating no change point. |
loc |
an integer larger than m, indicating
the median location of the change point among the positive votes in the
S replications; or |
Yong He, Xinbing Kong, Lorenzo Trapani, Long Yu
He Y, Kong X, Trapani L, & Yu L(2021). Online change-point detection for matrix-valued time series with latent two-way factor structure. arXiv preprint, arXiv:2112.13479.
## Not run: k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 kmax=8 r=8 m=p2 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.5) # calculate cv for "ps" with eta=0.45 and "wc" cv1=getcv(0.05,method="ps",eta=0.45) cv2=getcv(0.05,method="wc") ## test with Y, flat version test.once.flat.robust(Y,k1+1,m,epsilon,r,0,method="ps",eta=0.25) test.once.flat.robust(Y,k1+1,m,epsilon,r,0,method="ps",eta=0.45,cv1) test.once.flat.robust(Y,k1+1,m,epsilon,r,0,method="wc",eta=0.5,cv2) ## End(Not run)
## Not run: k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 kmax=8 r=8 m=p2 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.5) # calculate cv for "ps" with eta=0.45 and "wc" cv1=getcv(0.05,method="ps",eta=0.45) cv2=getcv(0.05,method="wc") ## test with Y, flat version test.once.flat.robust(Y,k1+1,m,epsilon,r,0,method="ps",eta=0.25) test.once.flat.robust(Y,k1+1,m,epsilon,r,0,method="ps",eta=0.45,cv1) test.once.flat.robust(Y,k1+1,m,epsilon,r,0,method="wc",eta=0.5,cv2) ## End(Not run)
This function tests single change point for matrix-valued online time series, under a two-way factor structure, using projected sample covariance matrix.
test.once.proj( Y, k = 1, m = ceiling(max(20, (dim(Y)[3])^(r/(r + 2)))), epsilon = 0.05, r = 8, kmax = 3, decrease = 0, method = "ps", eta = 0.25, cv = 2.386 )
test.once.proj( Y, k = 1, m = ceiling(max(20, (dim(Y)[3])^(r/(r + 2)))), epsilon = 0.05, r = 8, kmax = 3, decrease = 0, method = "ps", eta = 0.25, cv = 2.386 )
Y |
data, a |
k |
a positive integer indicating which eigenvalue to monitor.
|
m |
a positive integer ( |
epsilon |
the rescaling parameter taking value in |
r |
a positive integer indicating the order of the transformation
function |
kmax |
a positive number determining the column number of the
projection matrix, should be larger than 0 but smaller than |
decrease |
a logical value. If decrease=1, testing the decrease of factor number. |
method |
indicating the test statistic, “ps” for the partial-sum method; others for the worst-case method. |
eta |
a number between |
cv |
critical value; see also |
See He et al. (2021).
a list containing:
test |
a logical value. 1 indicating the existence of change point, 0 indicating no change point. |
loc |
an integer larger than m, indicating
the location of change point; or |
Yong He, Xinbing Kong, Lorenzo Trapani, Long Yu
He Y, Kong X, Trapani L, & Yu L(2021). Online change-point detection for matrix-valued time series with latent two-way factor structure. arXiv preprint, arXiv:2112.13479.
k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 kmax=8 r=8 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.5) # calculate cv for "ps" with eta=0.45 and "wc" cv1=getcv(0.05,method="ps",eta=0.45) cv2=getcv(0.05,method="wc") ## test with Y, projection test.once.proj(Y,k1+1,kmax=8) m=p2 test.once.proj(Y,k1+1,m,epsilon,r,kmax,0,method="ps",eta=0.45,cv1) test.once.proj(Y,k1+1,m,epsilon,r,kmax,0,method="wc",eta=0.5,cv2)
k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 kmax=8 r=8 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.5) # calculate cv for "ps" with eta=0.45 and "wc" cv1=getcv(0.05,method="ps",eta=0.45) cv2=getcv(0.05,method="wc") ## test with Y, projection test.once.proj(Y,k1+1,kmax=8) m=p2 test.once.proj(Y,k1+1,m,epsilon,r,kmax,0,method="ps",eta=0.45,cv1) test.once.proj(Y,k1+1,m,epsilon,r,kmax,0,method="wc",eta=0.5,cv2)
Based on test.once.proj
, this function
repeats the randomized procedure multiple times and reports the majority
vote, thus more robust.
test.once.proj.robust( Y, k = 1, m = ceiling(max(20, (dim(Y)[3])^(r/(r + 2)))), epsilon = 0.05, r = 8, kmax = 3, decrease = 0, method = "ps", eta = 0.25, cv = 2.386, S = 100, pr = 0.75 )
test.once.proj.robust( Y, k = 1, m = ceiling(max(20, (dim(Y)[3])^(r/(r + 2)))), epsilon = 0.05, r = 8, kmax = 3, decrease = 0, method = "ps", eta = 0.25, cv = 2.386, S = 100, pr = 0.75 )
Y |
data, a |
k |
a positive integer indicating which eigenvalue to monitor.
|
m |
a positive integer ( |
epsilon |
the rescaling parameter taking value in |
r |
a positive integer indicating the order of the transformation
function |
kmax |
a positive number determining the column number of the
projection matrix, should be larger than 0 but smaller than |
decrease |
a logical value. If decrease=1, testing the decrease of factor number. |
method |
indicating the test statistic, “ps” for the partial-sum method; others for the worst-case method. |
eta |
a number between |
cv |
critical value; see also |
S |
an integer indicating the number of replications. |
pr |
an number in |
See He et al. (2021).
a list containing:
test |
a logical value. 1 indicating the existence of change point, 0 indicating no change point. |
loc |
an integer larger than m, indicating
the median location of the change point among the positive votes in the
S replications; or |
Yong He, Xinbing Kong, Lorenzo Trapani, Long Yu
He Y, Kong X, Trapani L, & Yu L(2021). Online change-point detection for matrix-valued time series with latent two-way factor structure. arXiv preprint, arXiv:2112.13479.
## Not run: k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 kmax=8 r=8 m=p2 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.5) # calculate cv for "ps" with eta=0.45 and "wc" cv1=getcv(0.05,method="ps",eta=0.45) cv2=getcv(0.05,method="wc") ## test with Y, projection test.once.proj.robust(Y,k1+1,m,epsilon,r,kmax,0,method="ps",eta=0.25) test.once.proj.robust(Y,k1+1,m,epsilon,r,kmax,0,method="ps",eta=0.45,cv1) test.once.proj.robust(Y,k1+1,m,epsilon,r,kmax,0,method="wc",eta=0.5,cv2) ## End(Not run)
## Not run: k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 kmax=8 r=8 m=p2 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.5) # calculate cv for "ps" with eta=0.45 and "wc" cv1=getcv(0.05,method="ps",eta=0.45) cv2=getcv(0.05,method="wc") ## test with Y, projection test.once.proj.robust(Y,k1+1,m,epsilon,r,kmax,0,method="ps",eta=0.25) test.once.proj.robust(Y,k1+1,m,epsilon,r,kmax,0,method="ps",eta=0.45,cv1) test.once.proj.robust(Y,k1+1,m,epsilon,r,kmax,0,method="wc",eta=0.5,cv2) ## End(Not run)
This function tests single change point for matrix-valued online time series, under a two-way factor structure, given the transformed eigenvalue series.
test.once.psi(m = 20, psi, method = "ps", eta = 0.25, cv = 2.386)
test.once.psi(m = 20, psi, method = "ps", eta = 0.25, cv = 2.386)
m |
a positive integer ( |
psi |
the transformed eigenvalue series, produced by gen.psi.tau.flat
or gen.psi.tau.proj, with length |
method |
indicating the test statistic, “ps” for the partial-sum method, while others for the worst-case method. |
eta |
a number between |
cv |
critical value, related to the significance level and test
statistic. The default cv is from Horvath et al. (2004), and only works for
|
See He et al. (2021).
a list containing:
test |
a logical value. 1 indicating the existence of change point, 0 indicating no change point. |
loc |
an integer larger than m, indicating
the location of change point; or |
Yong He, Xinbing Kong, Lorenzo Trapani, Long Yu
Horvath L, Huskova M, Kokoszka P, et al (2004). Monitoring changes in linear models. Journal of statistical Planning and Inference, 126(1): 225-251.
He Y, Kong X, Trapani L, & Yu L(2021). Online change-point detection for matrix-valued time series with latent two-way factor structure. arXiv preprint, arXiv:2112.13479.
k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 kmax=8 r=8 m=p2 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.5) # calculate delta temp=log(p1)/log(m*p2) delta=epsilon*(temp<=0.5)+(epsilon+1-1/(2*temp))*(temp>0.5) # calculate psi.tau psi1=gen.psi.tau.proj(Y,k1+1,m,delta,r,kmax) psi2=gen.psi.tau.flat(Y,k1+1,m,delta,r) # calculate cv for "ps" with eta=0.45 and "wc" cv1=getcv(0.05,method="ps",eta=0.45) cv2=getcv(0.05,method="wc") # test with psi1 test.once.psi(m,psi1[,3],method="ps",eta=0.45,cv1) test.once.psi(m,psi1[,3],method="wc",eta=0.5,cv2) # test with psi2 test.once.psi(m,psi2[,3],method="ps",eta=0.45,cv1) test.once.psi(m,psi2[,3],method="wc",eta=0.5,cv2)
k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 kmax=8 r=8 m=p2 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.5) # calculate delta temp=log(p1)/log(m*p2) delta=epsilon*(temp<=0.5)+(epsilon+1-1/(2*temp))*(temp>0.5) # calculate psi.tau psi1=gen.psi.tau.proj(Y,k1+1,m,delta,r,kmax) psi2=gen.psi.tau.flat(Y,k1+1,m,delta,r) # calculate cv for "ps" with eta=0.45 and "wc" cv1=getcv(0.05,method="ps",eta=0.45) cv2=getcv(0.05,method="wc") # test with psi1 test.once.psi(m,psi1[,3],method="ps",eta=0.45,cv1) test.once.psi(m,psi1[,3],method="wc",eta=0.5,cv2) # test with psi2 test.once.psi(m,psi2[,3],method="ps",eta=0.45,cv1) test.once.psi(m,psi2[,3],method="wc",eta=0.5,cv2)
Based on test.once.psi
, this function
repeats the randomized procedure multiple times and reports the majority
vote, thus more robust.
test.once.psi.robust( m = 20, psi, method = "ps", eta = 0.25, cv = 2.386, S = 100, pr = 0.75 )
test.once.psi.robust( m = 20, psi, method = "ps", eta = 0.25, cv = 2.386, S = 100, pr = 0.75 )
m |
a positive integer ( |
psi |
the transformed eigenvalue series, produced by gen.psi.tau.flat
or gen.psi.tau.proj, with length |
method |
indicating the test statistic, “ps” for the partial-sum method, while others for the worst-case method. |
eta |
a number between |
cv |
critical value; see also |
S |
an integer indicating the number of replications. |
pr |
an number in |
See He et al. (2021).
a list containing:
test |
a logical value. 1 indicating the existence of change point, 0 indicating no change point. |
loc |
an integer larger than m, indicating
the median location of the change point among the positive votes in the
S replications; or |
Yong He, Xinbing Kong, Lorenzo Trapani, Long Yu
He Y, Kong X, Trapani L, & Yu L(2021). Online change-point detection for matrix-valued time series with latent two-way factor structure. arXiv preprint, arXiv:2112.13479.
k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 kmax=8 r=8 m=p2 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.5) # calculate delta temp=log(p1)/log(m*p2) delta=epsilon*(temp<=0.5)+(epsilon+1-1/(2*temp))*(temp>0.5) # calculate psi.tau psi1=gen.psi.tau.proj(Y,k1+1,m,delta,r,kmax) psi2=gen.psi.tau.flat(Y,k1+1,m,delta,r) # calculate cv for "ps" with eta=0.45 and "wc" cv1=getcv(0.05,method="ps",eta=0.45) cv2=getcv(0.05,method="wc") # test with psi1 test.once.psi.robust(m,psi1[,3],method="ps",eta=0.45,cv1,S=100,pr=0.75) test.once.psi.robust(m,psi1[,3],method="wc",eta=0.5,cv2,S=100,pr=0.75) # test with psi2 test.once.psi.robust(m,psi2[,3],method="ps",eta=0.45,cv1,S=100,pr=0.75) test.once.psi.robust(m,psi2[,3],method="wc",eta=0.5,cv2,S=100,pr=0.75)
k1=3 k2=3 epsilon=0.05 Sample_T=50 p1=40 p2=20 kmax=8 r=8 m=p2 # generate data Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=1,pp=0.5) # calculate delta temp=log(p1)/log(m*p2) delta=epsilon*(temp<=0.5)+(epsilon+1-1/(2*temp))*(temp>0.5) # calculate psi.tau psi1=gen.psi.tau.proj(Y,k1+1,m,delta,r,kmax) psi2=gen.psi.tau.flat(Y,k1+1,m,delta,r) # calculate cv for "ps" with eta=0.45 and "wc" cv1=getcv(0.05,method="ps",eta=0.45) cv2=getcv(0.05,method="wc") # test with psi1 test.once.psi.robust(m,psi1[,3],method="ps",eta=0.45,cv1,S=100,pr=0.75) test.once.psi.robust(m,psi1[,3],method="wc",eta=0.5,cv2,S=100,pr=0.75) # test with psi2 test.once.psi.robust(m,psi2[,3],method="ps",eta=0.45,cv1,S=100,pr=0.75) test.once.psi.robust(m,psi2[,3],method="wc",eta=0.5,cv2,S=100,pr=0.75)
This function calculates the cumulative explanatory power of the leading row factors, in terms of the explained variance, under a two-way factor structure.
var.exp(Y, k = 2, type = "proj", kmax = 4, plot = 0)
var.exp(Y, k = 2, type = "proj", kmax = 4, plot = 0)
Y |
data, a |
k |
a positive integer indicating the number of factors investigated, should be smaller than p1. |
type |
indicates how to calculate the sample covariance. "flat" for the flat version, while others for the projected version. |
kmax |
a positive integer smaller than p2, indicating the upper bound for the factor numbers, and the dimension of projection matrix. |
plot |
a logical value. When plot=1, a figure of the cumulative explanatory power will be plotted, with x axis being the number of factors, and y axis being the cumulative explained variance. |
a vector with k entries, corresponding to the cumulative explanatory power of the leading k factors.
k1=3 k2=3 Sample_T=100 p1=40 p2=20 Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=0) var.exp(Y,k=5,plot=1)
k1=3 k2=3 Sample_T=100 p1=40 p2=20 Y=gen.data(Sample_T,p1,p2,k1,k2,tau=0.5,change=0) var.exp(Y,k=5,plot=1)