HiCParser 1.0.0
HiCParser builds on other packages, in particular those that implement the infrastructure needed to handle Hi-C data with several replicates and conditions. It provides parsers for several standard Hi-C data formats and imports them into R as an InteractionSet object.
We hope that HiCParser will be useful for your research. Please use the following information to cite the package and the overall approach. Thank you!
## Citation info
citation("HiCParser")
#> To cite package 'HiCParser' in publications use:
#>
#> Maigne E, Zytnicki M (2025). _HiCParser package to parse HiC data and
#> import them in R_. doi:10.18129/B9.bioc.HiCParser
#> <https://doi.org/10.18129/B9.bioc.HiCParser>,
#> https://github.com/emaigne/HiCParser/HiCParser - R package version
#> 0.99.0, <http://www.bioconductor.org/packages/HiCParser>.
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Manual{,
#> title = {{HiCParser} package to parse HiC data and import them in R},
#> author = {Elise Maigne and Matthias Zytnicki},
#> year = {2025},
#> url = {http://www.bioconductor.org/packages/HiCParser},
#> note = {https://github.com/emaigne/HiCParser/HiCParser - R package version 0.99.0},
#> doi = {10.18129/B9.bioc.HiCParser},
#> }
library("HiCParser")
HiCParser can import Hi-C data sets in various formats:
- Cooler .cool or .mcool files.
- Juicer .hic files.
- HiC-Pro .matrix and .bed files.
- Tabular (.tsv, .csv, …) files.
.cool files
To load .cool files generated by [Cooler][cooler-documentation] [@cooler]:
# Path to each file
paths <- c(
"path/to/condition-1.replicate-1.cool",
"path/to/condition-1.replicate-2.cool",
"path/to/condition-1.replicate-3.cool",
"path/to/condition-2.replicate-1.cool",
"path/to/condition-2.replicate-2.cool",
"path/to/condition-2.replicate-3.cool"
)
# For the sake of the example, we will use the same file, several times
paths <- rep(
system.file("extdata",
"hicsample_21.cool",
package = "HiCParser"
),
6
)
# Condition and replicate of each file. Can be names instead of numbers.
conditions <- c(1, 1, 1, 2, 2, 2)
replicates <- c(1, 2, 3, 1, 2, 3)
# Instantiation of data set
hic.experiment <- parseCool(
paths,
conditions = conditions,
replicates = replicates
)
#> Loading required namespace: rhdf5
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.cool'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.cool'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.cool'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.cool'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.cool'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.cool'.
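The `conditions` and `replicates` vectors describe each file in `paths`, so they have to be given in the same order. When file names follow the `condition-N.replicate-M` pattern of the placeholder paths above, both vectors can be derived from the names rather than typed by hand. The following is a minimal sketch in base R, assuming that naming pattern; the regular expressions are illustrative and not part of HiCParser.

```r
# Hypothetical file names following the placeholder naming scheme above
paths <- c(
    "path/to/condition-1.replicate-1.cool",
    "path/to/condition-1.replicate-2.cool",
    "path/to/condition-1.replicate-3.cool",
    "path/to/condition-2.replicate-1.cool",
    "path/to/condition-2.replicate-2.cool",
    "path/to/condition-2.replicate-3.cool"
)
fileNames <- basename(paths)
# Extract the integer that follows "condition-" and "replicate-"
conditions <- as.integer(sub(".*condition-([0-9]+)\\..*", "\\1", fileNames))
replicates <- as.integer(sub(".*replicate-([0-9]+)\\..*", "\\1", fileNames))
# Sanity check: each (condition, replicate) pair should appear only once
stopifnot(!anyDuplicated(data.frame(conditions, replicates)))
```

The resulting vectors can then be passed to the `conditions` and `replicates` arguments as in the call above.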
.mcool files
To load .mcool files generated by [Cooler][cooler-documentation] [@cooler]:
# Path to each file
paths <- c(
"path/to/condition-1.replicate-1.mcool",
"path/to/condition-1.replicate-2.mcool",
"path/to/condition-1.replicate-3.mcool",
"path/to/condition-2.replicate-1.mcool",
"path/to/condition-2.replicate-2.mcool",
"path/to/condition-2.replicate-3.mcool"
)
# For the sake of the example, we will use the same file, several times
paths <- rep(
system.file("extdata",
"hicsample_21.mcool",
package = "HiCParser"
),
6
)
# Condition and replicate of each file. Can be names instead of numbers.
conditions <- c(1, 1, 1, 2, 2, 2)
replicates <- c(1, 2, 3, 1, 2, 3)
# mcool files can store several resolutions.
# We specify the one we need.
binSize <- 5000000
# Instantiation of data set
# The same function "parseCool" is used for cool and mcool files
hic.experiment <- parseCool(
paths,
conditions = conditions,
replicates = replicates,
binSize = binSize # Specified for .mcool files.
)
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.mcool'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.mcool'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.mcool'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.mcool'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.mcool'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.mcool'.
To load .hic files generated by [Juicer][juicer-documentation] [@juicer]:
# Path to each file
paths <- c(
"path/to/condition-1.replicate-1.hic",
"path/to/condition-1.replicate-2.hic",
"path/to/condition-1.replicate-3.hic",
"path/to/condition-2.replicate-1.hic",
"path/to/condition-2.replicate-2.hic",
"path/to/condition-2.replicate-3.hic"
)
# For the sake of the example, we will use the same file, several times
paths <- rep(
system.file("extdata",
"hicsample_21.hic",
package = "HiCParser"
),
6
)
# Condition and replicate of each file. Can be names instead of numbers.
conditions <- c(1, 1, 1, 2, 2, 2)
replicates <- c(1, 2, 3, 1, 2, 3)
# hic files can store several resolutions.
# We specify the one we need.
binSize <- 5000000
# Instantiation of data set
hic.experiment <- parseHiC(
paths,
conditions = conditions,
replicates = replicates,
binSize = binSize
)
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.hic'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.hic'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.hic'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.hic'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.hic'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.hic'.
Currently, HiCParser supports the hic format up to version 9.
To load .matrix and .bed files generated by [HiC-Pro][hicpro-documentation]
[@hicpro]:
# Path to each matrix file
matrixPaths <- c(
"path/to/condition-1.replicate-1.matrix",
"path/to/condition-1.replicate-2.matrix",
"path/to/condition-1.replicate-3.matrix",
"path/to/condition-2.replicate-1.matrix",
"path/to/condition-2.replicate-2.matrix",
"path/to/condition-2.replicate-3.matrix"
)
# For the sake of the example, we will use the same file, several times
matrixPaths <- rep(
system.file("extdata",
"hicsample_21.matrix",
package = "HiCParser"
),
6
)
# Path to each bed file
bedPaths <- c(
"path/to/condition-1.replicate-1.bed",
"path/to/condition-1.replicate-2.bed",
"path/to/condition-1.replicate-3.bed",
"path/to/condition-2.replicate-1.bed",
"path/to/condition-2.replicate-2.bed",
"path/to/condition-2.replicate-3.bed"
)
# Alternatively, if the same bed file is used, we can provide it only once
bedPaths <- system.file("extdata",
"hicsample_21.bed",
package = "HiCParser"
)
# Condition and replicate of each file. Can be names instead of numbers.
conditions <- c(1, 1, 1, 2, 2, 2)
replicates <- c(1, 2, 3, 1, 2, 3)
# Instantiation of data set
hic.experiment <- parseHiCPro(
matrixPaths = matrixPaths,
bedPaths = bedPaths,
conditions = conditions,
replicates = replicates
)
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.matrix' and '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.bed'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.matrix' and '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.bed'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.matrix' and '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.bed'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.matrix' and '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.bed'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.matrix' and '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.bed'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.matrix' and '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.bed'.
A tabular file is a tab-separated multi-replicate sparse matrix with a header:
chromosome    position 1    position 2    C1.R1    C1.R2    C1.R3    ...
Y             1500000       7500000       145      184      72       ...
The number of interactions between position 1 and position 2 of
chromosome is reported in each condition.replicate column. There is no
limit to the number of conditions and replicates.
To load Hi-C data in this format:
hic.experiment <- parseTabular(
system.file("extdata",
"hicsample_21.tsv",
package = "HiCParser"
),
sep = "\t"
)
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.tsv'.
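To make the tabular layout concrete, the following sketch writes a small file in this format with base R. The first row repeats the example row shown above; the other rows, the object names `toy` and `tsvPath`, and the temporary file are made up for illustration. The column names mirror the header shown above; how strictly `parseTabular` checks them is an assumption that has not been verified here.

```r
# Toy interaction counts. The first row is the example row shown above;
# the remaining rows are invented for illustration only.
toy <- data.frame(
    chromosome = c("Y", "Y", "Y"),
    `position 1` = c(1500000L, 1500000L, 7500000L),
    `position 2` = c(7500000L, 13500000L, 13500000L),
    C1.R1 = c(145L, 12L, 98L),
    C1.R2 = c(184L, 9L, 110L),
    C1.R3 = c(72L, 15L, 64L),
    check.names = FALSE # keep the space in "position 1" and "position 2"
)
# Integer columns avoid scientific notation (e.g. 1.5e+06) in the output
tsvPath <- tempfile(fileext = ".tsv")
write.table(
    toy,
    file = tsvPath,
    sep = "\t",
    quote = FALSE,
    row.names = FALSE
)
# Show the header and the first data row
readLines(tsvPath, n = 2)
```

Such a file could then be given to `parseTabular(tsvPath, sep = "\t")` in the same way as the packaged example file.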
The output is an InteractionSet. This object can store one or several samples.
Please read the documentation of the InteractionSet package to learn more about this format.
library("HiCParser")
hicFilePath <- system.file("extdata", "hicsample_21.hic", package = "HiCParser")
hic.experiment <- parseHiC(
paths = rep(hicFilePath, 6),
binSize = 5000000,
conditions = rep(seq(2), each = 3),
replicates = rep(seq(3), 2)
)
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.hic'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.hic'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.hic'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.hic'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.hic'.
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.hic'.
hic.experiment
#> class: InteractionSet
#> dim: 44 6
#> metadata(0):
#> assays(1): ''
#> rownames: NULL
#> rowData names(1): chromosome
#> colnames: NULL
#> colData names(2): condition replicate
#> type: StrictGInteractions
#> regions: 9
The conditions and replicates are reported in the colData slot:
SummarizedExperiment::colData(hic.experiment)
#> DataFrame with 6 rows and 2 columns
#> condition replicate
#> <integer> <integer>
#> 1 1 1
#> 2 1 2
#> 3 1 3
#> 4 2 1
#> 5 2 2
#> 6 2 3
They correspond to the columns of the assay matrix (containing
interaction values):
head(SummarizedExperiment::assay(hic.experiment))
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 79 79 79 79 79 79
#> [2,] 22 22 22 22 22 22
#> [3,] 3 3 3 3 3 3
#> [4,] 1 1 1 1 1 1
#> [5,] 1 1 1 1 1 1
#> [6,] 2 2 2 2 2 2
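The correspondence between the sample table and the assay columns can be illustrated without any Bioconductor object. The sketch below uses plain-R stand-ins (`sampleInfo` and `counts`, both hypothetical names) filled with the values printed above; on a real object, the same tables would come from `SummarizedExperiment::colData()` and `SummarizedExperiment::assay()`.

```r
# Plain-R stand-ins for the sample table and the assay matrix printed above
sampleInfo <- data.frame(
    condition = rep(1:2, each = 3),
    replicate = rep(1:3, times = 2)
)
counts <- matrix(rep(c(79, 22, 3, 1, 1, 2), times = 6), ncol = 6)

# Column j of 'counts' is described by row j of 'sampleInfo',
# so a logical vector built from the table selects assay columns.
condition1 <- counts[, sampleInfo$condition == 1, drop = FALSE]

# Sum the replicates of each condition, one output column per condition
columnsByCondition <- split(seq_len(ncol(counts)), sampleInfo$condition)
totals <- sapply(
    columnsByCondition,
    function(j) rowSums(counts[, j, drop = FALSE])
)
totals
```

Because all six samples in this vignette come from the same file, both conditions give identical totals here.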
The positions of interactions are in the interactions slot of the object:
InteractionSet::interactions(hic.experiment)
#> StrictGInteractions object with 44 interactions and 1 metadata column:
#> seqnames1 ranges1 seqnames2 ranges2 | chromosome
#> <Rle> <IRanges> <Rle> <IRanges> | <Rle>
#> [1] 21 5000001-10000000 --- 21 5000001-10000000 | 21
#> [2] 21 5000001-10000000 --- 21 10000001-15000000 | 21
#> [3] 21 5000001-10000000 --- 21 15000001-20000000 | 21
#> [4] 21 5000001-10000000 --- 21 20000001-25000000 | 21
#> [5] 21 5000001-10000000 --- 21 25000001-30000000 | 21
#> ... ... ... ... ... ... . ...
#> [40] 21 35000001-40000000 --- 21 40000001-45000000 | 21
#> [41] 21 35000001-40000000 --- 21 45000001-50000000 | 21
#> [42] 21 40000001-45000000 --- 21 40000001-45000000 | 21
#> [43] 21 40000001-45000000 --- 21 45000001-50000000 | 21
#> [44] 21 45000001-50000000 --- 21 45000001-50000000 | 21
#> -------
#> regions: 9 ranges and 1 metadata column
#> seqinfo: 1 sequence from an unspecified genome; no seqlengths
The function mergeInteractionSet merges InteractionSet objects
from the same experiment (for different replicates or conditions).
It merges the data containing the interaction bins and fills the assay matrix accordingly, returning an assay matrix with several columns.
The object returned by the function is an InteractionSet.
Here is a fictitious example:
path <- system.file("extdata", "hicsample_21.cool", package = "HiCParser")
object1 <- parseCool(path, conditions = 1, replicates = 1)
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.cool'.
# Creating an object with a different condition
object2 <- parseCool(path, conditions = 2, replicates = 1)
#>
#> Parsing '/tmp/RtmpvTvvxB/Rinst1c117c190ee654/HiCParser/extdata/hicsample_21.cool'.
The merged object:
objectMerged <- mergeInteractionSet(object1, object2)
SummarizedExperiment::colData(objectMerged)
#> DataFrame with 2 rows and 2 columns
#> condition replicate
#> <numeric> <numeric>
#> 1 1 1
#> 2 2 1
head(SummarizedExperiment::assay(objectMerged))
#> [,1] [,2]
#> [1,] 79 79
#> [2,] 22 22
#> [3,] 3 3
#> [4,] 1 1
#> [5,] 1 1
#> [6,] 2 2
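The idea of combining bin pairs and filling one assay column per sample can be sketched with plain data frames. This is only a conceptual illustration in base R, not the implementation of `mergeInteractionSet`: the tables `set1` and `set2` and their values are made up, and leaving bin pairs that are absent from one table as `NA` is an assumption of this sketch, since the value the package uses in that case is not stated here.

```r
# Two made-up samples, each listing bin pairs (by start position) and a count
set1 <- data.frame(
    start1 = c(5000001L, 5000001L),
    start2 = c(5000001L, 10000001L),
    count = c(79, 22)
)
set2 <- data.frame(
    start1 = c(5000001L, 10000001L),
    start2 = c(5000001L, 10000001L),
    count = c(80, 3)
)
# Outer join on the bin pair: the union of bin pairs becomes the rows,
# and each sample contributes one count column.
merged <- merge(
    set1, set2,
    by = c("start1", "start2"),
    all = TRUE,
    suffixes = c(".1", ".2")
)
merged
```

Bin pairs found in only one sample appear once in the result, with a missing value in the other sample's column.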
This package was developed using biocthis.
R session information.
#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.5.0 RC (2025-04-04 r88126)
#> os Ubuntu 24.04.2 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate C
#> ctype en_US.UTF-8
#> tz America/New_York
#> date 2025-04-15
#> pandoc 2.7.3 @ /usr/bin/ (via rmarkdown)
#> quarto 1.5.57 @ /usr/local/bin/quarto
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> abind 1.4-8 2024-09-12 [2] CRAN (R 4.5.0)
#> Biobase 2.68.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> BiocGenerics 0.54.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> BiocManager 1.30.25 2024-08-28 [2] CRAN (R 4.5.0)
#> BiocStyle * 2.36.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> bookdown 0.43 2025-04-15 [2] CRAN (R 4.5.0)
#> bslib 0.9.0 2025-01-30 [2] CRAN (R 4.5.0)
#> cachem 1.1.0 2024-05-16 [2] CRAN (R 4.5.0)
#> cli 3.6.4 2025-02-13 [2] CRAN (R 4.5.0)
#> crayon 1.5.3 2024-06-20 [2] CRAN (R 4.5.0)
#> data.table 1.17.0 2025-02-22 [2] CRAN (R 4.5.0)
#> DelayedArray 0.34.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> digest 0.6.37 2024-08-19 [2] CRAN (R 4.5.0)
#> evaluate 1.0.3 2025-01-10 [2] CRAN (R 4.5.0)
#> fastmap 1.2.0 2024-05-15 [2] CRAN (R 4.5.0)
#> generics 0.1.3 2022-07-05 [2] CRAN (R 4.5.0)
#> GenomeInfoDb 1.44.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> GenomeInfoDbData 1.2.14 2025-04-10 [2] Bioconductor
#> GenomicRanges 1.60.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> gtools 3.9.5 2023-11-20 [2] CRAN (R 4.5.0)
#> HiCParser * 1.0.0 2025-04-15 [1] Bioconductor 3.21 (R 4.5.0)
#> htmltools 0.5.8.1 2024-04-04 [2] CRAN (R 4.5.0)
#> httr 1.4.7 2023-08-15 [2] CRAN (R 4.5.0)
#> InteractionSet 1.36.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> IRanges 2.42.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> jquerylib 0.1.4 2021-04-26 [2] CRAN (R 4.5.0)
#> jsonlite 2.0.0 2025-03-27 [2] CRAN (R 4.5.0)
#> knitr 1.50 2025-03-16 [2] CRAN (R 4.5.0)
#> lattice 0.22-7 2025-04-02 [3] CRAN (R 4.5.0)
#> lifecycle 1.0.4 2023-11-07 [2] CRAN (R 4.5.0)
#> Matrix 1.7-3 2025-03-11 [3] CRAN (R 4.5.0)
#> MatrixGenerics 1.20.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> matrixStats 1.5.0 2025-01-07 [2] CRAN (R 4.5.0)
#> pbapply 1.7-2 2023-06-27 [2] CRAN (R 4.5.0)
#> R6 2.6.1 2025-02-15 [2] CRAN (R 4.5.0)
#> Rcpp 1.0.14 2025-01-12 [2] CRAN (R 4.5.0)
#> rhdf5 2.52.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> rhdf5filters 1.20.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> Rhdf5lib 1.30.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> rlang 1.1.6 2025-04-11 [2] CRAN (R 4.5.0)
#> rmarkdown 2.29 2024-11-04 [2] CRAN (R 4.5.0)
#> S4Arrays 1.8.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> S4Vectors 0.46.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> sass 0.4.10 2025-04-11 [2] CRAN (R 4.5.0)
#> sessioninfo * 1.2.3 2025-02-05 [2] CRAN (R 4.5.0)
#> SparseArray 1.8.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> SummarizedExperiment 1.38.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> UCSC.utils 1.4.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> xfun 0.52 2025-04-02 [2] CRAN (R 4.5.0)
#> XVector 0.48.0 2025-04-15 [2] Bioconductor 3.21 (R 4.5.0)
#> yaml 2.3.10 2024-07-26 [2] CRAN (R 4.5.0)
#>
#> [1] /tmp/RtmpvTvvxB/Rinst1c117c190ee654
#> [2] /home/biocbuild/bbs-3.21-bioc/R/site-library
#> [3] /home/biocbuild/bbs-3.21-bioc/R/library
#> * ── Packages attached to the search path.
#>
#> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Lun ATL, Perry M and Ing-Simmons E (2016). Infrastructure for genomic interactions: Bioconductor classes for Hi-C, ChIA-PET and related experiments. F1000Res. 5, 950