Help for package MissCP

Type:

Package

Title:

Change Point Detection with Missing Values

Version:

0.1.1

Author:

Yanxi Liu [aut, cre], Abolfazl Safikhani [aut]

Maintainer:

Yanxi Liu <liuyanxi@ufl.edu>

Description:

A four-step change point detection method that detects break points in the presence of missing values, proposed by Liu and Safikhani (2023) <https://drive.google.com/file/d/1a8sV3RJ8VofLWikTDTQ7W4XJ76cEj4Fg/view?usp=drive_link>.

License:

GPL-2

Encoding:

UTF-8

Imports:

stats, graphics, mvtnorm, factoextra, Rcpp, ggplot2, glmnet

LinkingTo:

Rcpp, RcppArmadillo

Suggests:

knitr, rmarkdown

VignetteBuilder:

knitr

RoxygenNote:

7.2.3

NeedsCompilation:

yes

Packaged:

2025-02-17 11:48:51 UTC; yanxiliu

Repository:

CRAN

Date/Publication:

2025-02-17 12:30:07 UTC

BIC

Description

Compute the BIC and HBIC values.

Usage

BIC(residual, phi)

Arguments

residual

residual matrix

phi

estimated coefficient matrix of the model

Value

A list object containing the following:

BIC: BIC value
HBIC: HBIC value
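
As a rough illustration of what a BIC-type criterion computes from these two inputs, the base R sketch below assumes a Gaussian log-likelihood term, a residual matrix with one row per time point, and a model size equal to the number of nonzero entries of phi. The name bic_sketch and the formula are assumptions made for illustration; the criterion implemented in MissCP, including the HBIC penalty, may differ.

```r
# Hypothetical helper, not part of MissCP: a generic Gaussian BIC.
bic_sketch <- function(residual, phi) {
  n <- nrow(residual)                      # number of time points (assumed in rows)
  sigma.hat <- crossprod(residual) / n     # residual covariance estimate
  # log-determinant of the covariance (goodness-of-fit term)
  log.det <- as.numeric(determinant(sigma.hat, logarithm = TRUE)$modulus)
  df <- sum(phi != 0)                      # model size: nonzero coefficients
  log.det + log(n) * df / n
}

set.seed(1)
res <- matrix(rnorm(200 * 3), 200, 3)
phi.sparse <- matrix(c(1, 0, 0, 0, 0, 0), 3, 2)
phi.dense <- matrix(1, 3, 2)
# same fit, fewer parameters: the sparse model gets the smaller value
bic_sketch(res, phi.sparse) < bic_sketch(res, phi.dense)
```

With identical residuals, only the penalty term differs, so the sparser coefficient matrix is preferred.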

BIC_threshold

Description

BIC threshold for final parameter estimation

Usage

BIC_threshold(
  beta.final,
  k,
  m.hat,
  brk,
  data_y,
  data_x = NULL,
  b_n = 2,
  nlam = 20
)

Arguments

beta.final

estimated parameter coefficient matrices

k

dimensions of parameter coefficient matrices

m.hat

number of estimated change points

brk

vector of estimated change points

data_y

input data matrix (response), with each column representing the time series component

data_x

input data matrix (predictor), with each column 1

b_n

the block size

nlam

number of hyperparameters for grid search

Value

lambda.val.best, the tuning parameter lambda selected by BIC.

BTIE

Description

Perform the BTIE algorithm to detect structural breaks in large-scale high-dimensional mean shift models.

Usage

BTIE(
  data_y,
  lambda.1.cv = NULL,
  lambda.2.cv = NULL,
  max.iteration = 100,
  tol = 10^(-2),
  block.size = NULL,
  refit = FALSE,
  optimal.block = TRUE,
  optimal.gamma.val = 1.5,
  block.range = NULL
)

Arguments

data_y

input data matrix (response), with each column representing the time series component

lambda.1.cv

tuning parameter lambda_1 for fused lasso

lambda.2.cv

tuning parameter lambda_2 for fused lasso

max.iteration

maximum number of iterations for the fused lasso

tol

tolerance for the fused lasso

block.size

the block size

refit

logical; if TRUE, refit the model; if FALSE, use BIC to find a thresholding value and output the parameter estimates without refitting. Default is FALSE.

optimal.block

logical; if TRUE, use a grid search to find the optimal block size; if FALSE, use the default block size directly. Default is TRUE.

optimal.gamma.val

hyperparameter for the optimal block size, used when optimal.block = TRUE. Default is 1.5.

block.range

the search domain for optimal block size.

Value

A list object, which includes cp.final (see Examples).

Examples

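set.seed(1)
n <- 1000                           # sample size
p <- 50                             # data dimension
brk <- c(333, 666, n + 1)           # two change points, then the end marker n + 1
m <- length(brk)
d <- 5                              # number of nonzero coefficients
# mean parameter matrix with jump size 50
constant.full <- constant_generation(n, p, d, 50, brk)
e.sigma <- as.matrix(1 * diag(p))   # identity noise covariance
data_y <- data_generation(n = n, mu = constant.full, sigma = e.sigma, brk = brk)
data_y <- as.matrix(data_y)
# set 30% of the entries to missing, completely at random
data_y_miss <- MCAR(data_y, 0.3)
# run BTIE with a fixed block size of 30 (no grid search)
temp <- BTIE(data_y_miss, optimal.block = FALSE, block.size = 30)
temp$cp.final                       # selected break points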

Heter_missing

Description

Introduce missing values into the data, assuming they are missing completely at random.

Usage

Heter_missing(data, alpha)

Arguments

data

the complete data, before missing values are introduced

alpha

the list of percentages of missing values relative to the whole data

Value

the data matrix with missing values

MCAR

Description

Introduce missing values into the data, assuming they are missing completely at random.

Usage

MCAR(data, alpha)

Arguments

data

the complete data, before missing values are introduced

alpha

the fraction of missing values relative to the whole data (for example, 0.3)

Value

the data matrix with missing values
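
The missing-completely-at-random mechanism can be sketched in base R as follows. This is a minimal illustration that assumes each entry is dropped independently with probability alpha; mcar_sketch is a hypothetical name, and the implementation of MCAR in MissCP may differ (for example, by removing an exact number of entries).

```r
# Hypothetical base R sketch of MCAR masking; not the MissCP implementation.
mcar_sketch <- function(data, alpha) {
  data <- as.matrix(data)
  # each entry is dropped independently with probability alpha
  mask <- matrix(runif(length(data)) < alpha, nrow = nrow(data))
  data[mask] <- NA
  data
}

set.seed(1)
y <- matrix(rnorm(1000 * 5), 1000, 5)
y.miss <- mcar_sketch(y, 0.3)
mean(is.na(y.miss))   # close to 0.3
```

Because the mask does not depend on the values in the data, the missingness carries no information about the series itself.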

constant_generation

Description

Generate the constant parameter matrix, given the jump size and break points.

Usage

constant_generation(n, p, d, vns, brk)

Arguments

n

the sample size

p

the data dimension

d

the number of nonzero coefficients

vns

the jump size. It can be a vector or a single value. If it is a single value, the same jump size is used for all break points.

brk

the break points' locations

Value

the parameter matrix used to generate data

data_generation

Description

Generate mean shift data.

Usage

data_generation(n, mu, sigma, brk = n + 1)

Arguments

n

the number of data points

mu

the matrix of mean parameters

sigma

covariance matrix of the white noise

brk

vector of change points

Value

data_y, the matrix of generated mean shift data
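
The idea of mean shift data (piecewise-constant means plus white noise) can be sketched in base R. The sketch assumes that brk follows the convention used in the BTIE example, where each entry is the first time point of a new segment and the last entry is n + 1. The function mean_shift_sketch, its jump argument, and the alternating-sign means are assumptions for illustration, not the behaviour of constant_generation or data_generation.

```r
# Hypothetical sketch: piecewise-constant means plus Gaussian noise.
mean_shift_sketch <- function(n, p, d, jump, brk, sigma = diag(p)) {
  starts <- c(1, brk[-length(brk)])        # first index of each segment
  ends <- brk - 1                          # last index of each segment
  mu <- matrix(0, n, p)
  for (s in seq_along(starts)) {
    # alternate the sign so consecutive segments differ by `jump`
    level <- if (s %% 2 == 1) -jump / 2 else jump / 2
    mu[starts[s]:ends[s], seq_len(d)] <- level   # only d components shift
  }
  # rows of the noise matrix are N(0, sigma)
  noise <- matrix(rnorm(n * p), n, p) %*% chol(sigma)
  mu + noise
}

set.seed(1)
y <- mean_shift_sketch(n = 1000, p = 10, d = 3, jump = 4,
                       brk = c(333, 666, 1001))
dim(y)                 # 1000 rows (time points), 10 columns (components)
```

Only the first d columns change level at the break points; the remaining columns are pure noise around zero.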

first.step

Description

Perform the block fused lasso with thresholding to detect candidate break points.

Usage

first.step(
  data_y,
  data_x,
  lambda1,
  lambda2,
  max.iteration = max.iteration,
  tol = tol,
  blocks,
  cv.index,
  fixed_index = NULL,
  nonfixed_index = NULL
)

Arguments

data_y

input data matrix Y, with each column representing the time series component

data_x

input data matrix X

lambda1

tuning parameter lambda_1 for fused lasso

lambda2

tuning parameter lambda_2 for fused lasso

max.iteration

maximum number of iterations for the fused lasso

tol

tolerance for the fused lasso

blocks

the blocks

cv.index

the index of time points for cross-validation

fixed_index

index for the linear regression model in which only some components change.

nonfixed_index

index for the linear regression model in which only some components change.

Value

A list object containing the following:

jump.l2: estimated jump size in L2 norm
jump.l1: estimated jump size in L1 norm
pts.list: estimated change points in the first step
beta.full: estimated parameters in the first step
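
To make the notions of jump size and candidate points concrete, the sketch below replaces the fused lasso fit with plain block means on complete data, measures the L2 norm of the change between consecutive blocks, and keeps the block boundaries whose jump exceeds a threshold. The function block_jump_sketch and its threshold argument are hypothetical; first.step estimates the block parameters with a penalised fit rather than raw means.

```r
# Hypothetical sketch: block means, L2 jump sizes, and thresholding.
block_jump_sketch <- function(data_y, block.size, threshold) {
  n <- nrow(data_y)
  block.id <- ceiling(seq_len(n) / block.size)      # block label per row
  block.mean <- rowsum(data_y, block.id) / as.vector(table(block.id))
  jump <- diff(block.mean)                          # change between blocks
  jump.l2 <- sqrt(rowSums(jump^2))                  # jump size in L2 norm
  # a jump between blocks i and i + 1 maps to the first row of block i + 1
  pts <- which(jump.l2 > threshold) * block.size + 1
  list(jump.l2 = unname(jump.l2), pts.list = unname(pts))
}

set.seed(1)
y <- matrix(rnorm(300 * 4), 300, 4)
y[151:300, ] <- y[151:300, ] + 3      # mean shift of 3 in every component
out <- block_jump_sketch(y, block.size = 30, threshold = 3)
out$pts.list                          # candidate change point
```

Here the true break sits on a block boundary, so the single large jump maps directly to time point 151.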

imputation

Description

Impute the missing values based on the block size.

Usage

imputation(data, block.size)

Arguments

data

data before the imputation

block.size

the block size used to impute the missing values

Value

the data matrix without missing values after imputation
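
One simple way to impute by block is to replace each missing entry with the mean of the observed values in the same block and column. The sketch below shows that idea; impute_block_sketch is a hypothetical name, and the use of block means is an assumption, since the exact rule used by imputation is not stated here.

```r
# Hypothetical sketch: fill NAs with the column mean of their block.
impute_block_sketch <- function(data, block.size) {
  data <- as.matrix(data)
  block.id <- ceiling(seq_len(nrow(data)) / block.size)
  for (b in unique(block.id)) {
    rows <- which(block.id == b)
    for (j in seq_len(ncol(data))) {
      miss <- is.na(data[rows, j])
      # leave the block untouched if it has no observed value in column j
      if (any(miss) && !all(miss)) {
        data[rows[miss], j] <- mean(data[rows, j], na.rm = TRUE)
      }
    }
  }
  data
}

y <- matrix(c(1, NA, 3, 10, NA, 20), ncol = 1)
impute_block_sketch(y, block.size = 3)[, 1]   # 1 2 3 10 15 20
```

Using short blocks keeps the imputed values local, so a mean shift between blocks is not smoothed away.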

imputation2

Description

Impute the missing values based on the change point candidates.

Usage

imputation2(data, cp.candidate)

Arguments

data

data before the imputation

cp.candidate

the change point candidates used to impute the missing values

Value

the data matrix without missing values after imputation
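
Imputation based on change point candidates can be sketched by treating the candidates as segment boundaries and filling each missing entry with the observed mean of its segment. The sketch assumes each candidate is the first time point of a new segment; impute_segment_sketch is a hypothetical name and the segment-mean rule is an assumption, not necessarily what imputation2 does.

```r
# Hypothetical sketch: fill NAs with the column mean of their segment.
impute_segment_sketch <- function(data, cp.candidate) {
  data <- as.matrix(data)
  # segment label: 0 before the first candidate, 1 from it onwards, and so on
  seg.id <- findInterval(seq_len(nrow(data)), sort(cp.candidate))
  for (j in seq_len(ncol(data))) {
    seg.mean <- tapply(data[, j], seg.id, mean, na.rm = TRUE)
    miss <- is.na(data[, j])
    data[miss, j] <- seg.mean[as.character(seg.id[miss])]
  }
  data
}

y <- matrix(c(1, NA, 3, 10, NA, 20), ncol = 1)
impute_segment_sketch(y, cp.candidate = 4)[, 1]   # 1 2 3 10 15 20
```

Segments are usually longer than blocks, so each imputed value averages over more observations once reasonable candidates are available.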

pred

Description

Prediction function

Usage

pred(X, phi, j, p.x, p.y, h = 1)

Arguments

X

data for prediction

phi

parameter matrix

j

the start time point for prediction

p.x

the dimension of data X

p.y

the dimension of data Y

h

the number of observations to predict

Value

prediction matrix

pred.block

Description

Prediction function (block)

Usage

pred.block(X, phi, j, p.x, p.y, h)

Arguments

X

data for prediction

phi

parameter matrix

j

the start time point for prediction

p.x

the dimension of data X

p.y

the dimension of data Y

h

the number of observations to predict

Value

prediction matrix

second.step

Description

Re-impute the missing values and perform the exhaustive search to "thin out" redundant break points.

Usage

second.step(
  data_y,
  data_x,
  max.iteration = max.iteration,
  tol = tol,
  cp.first,
  beta.est,
  blocks,
  data_y_miss
)

Arguments

data_y

input data matrix, with each column representing the time series component

data_x

input data matrix

max.iteration

maximum number of iterations for the fused lasso

tol

tolerance for the fused lasso

cp.first

the selected break points after the first step

beta.est

the parameters estimated by the block fused lasso

blocks

the blocks

data_y_miss

the data y matrix before the first imputation

Value

A list object containing the following:

cp.final: the set of selected break points after the exhaustive search step
beta.hat.list: the estimated coefficient matrix for each segment
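
The exhaustive search idea can be sketched for a single break: every time point in a search window is tried as the split, and the one with the smallest total squared error around the two segment means is kept. This is a simplified illustration on complete data; best_split_sketch and its lower and upper arguments are hypothetical, and second.step additionally re-imputes the missing values and handles several candidates.

```r
# Hypothetical sketch: locate one break by minimising the total squared error.
best_split_sketch <- function(data_y, lower, upper) {
  # squared error of a segment around its own column means
  sse <- function(x) sum(sweep(x, 2, colMeans(x))^2)
  n <- nrow(data_y)
  candidates <- lower:upper               # requires lower >= 2 and upper <= n
  loss <- vapply(candidates, function(t) {
    # t is the first time point of the right-hand segment
    sse(data_y[1:(t - 1), , drop = FALSE]) + sse(data_y[t:n, , drop = FALSE])
  }, numeric(1))
  candidates[which.min(loss)]
}

set.seed(1)
y <- matrix(rnorm(200 * 3), 200, 3)
y[101:200, ] <- y[101:200, ] + 3          # true break at time point 101
best_split_sketch(y, lower = 80, upper = 120)   # close to 101
```

Restricting the search to a window around each first-step candidate keeps the cost linear in the window width rather than in the full series length.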