Package htsjdk.samtools.cram.structure
Class Slice
- java.lang.Object
-
- htsjdk.samtools.cram.structure.Slice
-
public class Slice extends Object
A CRAM slice is a logical construct that is just a subset of the blocks in a Container. NOTE: Every Slice has a reference context (single-reference (mapped), multi-reference, or unmapped), depending on the records it contains. Single-reference mapped doesn't mean that the records are necessarily mapped (that is, that their getMappedRead flag is true), only that the records in that slice are PLACED on the corresponding reference contig.
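The relationship between placed records and the slice's reference context can be illustrated with a small sketch. The types below (`ReferenceContextSketch`, `Rec`, `Kind`) are hypothetical stand-ins, not htsjdk classes, and the rule that a mix of placed and unplaced records yields a multi-reference context is an assumption of this sketch rather than a statement of the library's behavior.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in types for illustration only; these are not htsjdk classes.
final class ReferenceContextSketch {
    enum Kind { SINGLE_REFERENCE, MULTIPLE_REFERENCE, UNMAPPED_UNPLACED }

    // Minimal record model: where the record is placed, and its own unmapped flag.
    static final class Rec {
        final int referenceIndex;
        final boolean placed;
        final boolean unmappedFlag;

        Rec(int referenceIndex, boolean placed, boolean unmappedFlag) {
            this.referenceIndex = referenceIndex;
            this.placed = placed;
            this.unmappedFlag = unmappedFlag;
        }
    }

    private static final int UNPLACED = -1;

    // Derive the context from placement only; the unmapped flag is deliberately ignored.
    static Kind classify(List<Rec> records) {
        if (records.isEmpty()) {
            throw new IllegalArgumentException("a slice needs at least one record");
        }
        Set<Integer> contexts = new HashSet<>();
        for (Rec r : records) {
            contexts.add(r.placed ? r.referenceIndex : UNPLACED);
        }
        if (contexts.size() > 1) {
            return Kind.MULTIPLE_REFERENCE; // assumption: any mix is treated as multi-reference
        }
        return contexts.contains(UNPLACED) ? Kind.UNMAPPED_UNPLACED : Kind.SINGLE_REFERENCE;
    }
}
```

In this sketch, a record that is placed on a contig but carries the unmapped flag still counts toward a single-reference context, which mirrors the distinction between "mapped" and "placed" drawn in the note above.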
-
-
Field Summary
Fields
Modifier and Type Field Description
static int
EMBEDDED_REFERENCE_ABSENT_CONTENT_ID
static int
UNINITIALIZED_INDEXING_PARAMETER
-
Constructor Summary
Constructors
Constructor Description
Slice(CRAMVersion cramVersion, CompressionHeader compressionHeader, InputStream inputStream, long containerByteOffset)
Create a slice by reading a serialized Slice from an input stream.
Slice(List<CRAMCompressionRecord> records, CompressionHeader compressionHeader, long containerByteOffset, long globalRecordCounter)
Create a single Slice from CRAM Compression Records and a Compression Header.
-
Method Summary
Modifier and Type Method Description
ArrayList<CRAMCompressionRecord>
deserializeCRAMRecords(CompressorCache compressorCache, ValidationStringency validationStringency)
Reads and decodes the underlying blocks and returns a list of CRAMCompressionRecord.
AlignmentContext
getAlignmentContext()
List<BAIEntry>
getBAIEntries(CompressorCache compressorCache)
Generate a BAIEntry Index entry from this Slice and other container parameters, splitting Multiple Reference slices into constituent reference sequence entries.
long
getBaseCount()
int
getByteOffsetOfSliceHeaderBlock()
The Slice's offset in bytes from the beginning of the Container's Compression Header (or the end of the Container Header), equal to the corresponding entry in ContainerHeader.getLandmarks(). Used by BAI and CRAI indexing.
int
getByteSizeOfSliceBlocks()
The Slice's size in bytes. Used by CRAI indexing only.
CompressionHeader
getCompressionHeader()
List<Integer>
getContentIDs()
List<CRAIEntry>
getCRAIEntries(CompressorCache compressorCache)
Generate a CRAI Index entry from this Slice and other container parameters, splitting Multiple Reference slices into constituent reference sequence entries.
Block
getEmbeddedReferenceBlock()
Return the embedded reference block, if any.
int
getEmbeddedReferenceContentID()
Get the content ID of the embedded reference block.
long
getGlobalRecordCounter()
Map<ReferenceContext,AlignmentSpan>
getMultiRefAlignmentSpans(CompressorCache compressorCache, ValidationStringency validationStringency)
Uses a Multiple Reference Slice Alignment Reader to determine the reference spans of a MULTI_REF Slice.
int
getNumberOfBlocks()
int
getNumberOfRecords()
byte[]
getReferenceMD5()
SliceBlocks
getSliceBlocks()
Block
getSliceHeaderBlock()
SAMBinaryTagAndValue
getSliceTags()
void
normalizeCRAMRecords(List<CRAMCompressionRecord> cramCompressionRecords, CRAMReferenceRegion cramReferenceRegion)
Normalize a list of CRAMCompressionRecord that have been read in from a CRAM stream.
void
setAttribute(String tag, Object value)
Set a value for the tag.
void
setByteOffsetOfSliceHeaderBlock(int byteOffsetOfSliceHeaderBlock)
void
setByteSizeOfSliceBlocks(int byteSizeOfSliceBlocks)
void
setEmbeddedReferenceBlock(Block embeddedReferenceBlock)
void
setEmbeddedReferenceContentID(int embeddedReferenceBlockContentID)
Set the content ID of the embedded reference block.
void
setLandmarkIndex(int landmarkIndex)
void
setReferenceMD5(byte[] ref)
String
toString()
void
write(CRAMVersion cramVersion, OutputStream outputStream)
-
-
-
Field Detail
-
UNINITIALIZED_INDEXING_PARAMETER
public static final int UNINITIALIZED_INDEXING_PARAMETER
- See Also:
- Constant Field Values
-
EMBEDDED_REFERENCE_ABSENT_CONTENT_ID
public static final int EMBEDDED_REFERENCE_ABSENT_CONTENT_ID
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
Slice
public Slice(CRAMVersion cramVersion, CompressionHeader compressionHeader, InputStream inputStream, long containerByteOffset)
Create a slice by reading a serialized Slice from an input stream.
- Parameters:
cramVersion - the version of the CRAM stream being read
compressionHeader - the compression header for the container in which the Slice resides
inputStream - the input stream to be read
containerByteOffset - the stream byte offset of the start of the container in which this Slice resides
-
Slice
public Slice(List<CRAMCompressionRecord> records, CompressionHeader compressionHeader, long containerByteOffset, long globalRecordCounter)
Create a single Slice from CRAM Compression Records and a Compression Header. The caller is responsible for appropriate subdivision of records into containers and slices (see ContainerFactory).
- Parameters:
records - input CRAM Compression Records
compressionHeader - the enclosing Container's Compression Header
containerByteOffset -
globalRecordCounter -
- See Also:
CRAMCompressionRecord.isPlaced(), ReferenceContextType
-
-
Method Detail
-
getSliceHeaderBlock
public Block getSliceHeaderBlock()
-
getAlignmentContext
public AlignmentContext getAlignmentContext()
-
getSliceBlocks
public SliceBlocks getSliceBlocks()
-
getNumberOfRecords
public int getNumberOfRecords()
-
getGlobalRecordCounter
public long getGlobalRecordCounter()
-
getNumberOfBlocks
public int getNumberOfBlocks()
- Returns:
- the number of blocks as defined by the CRAM spec: 1 for the core block plus the number of external blocks (the slice header block is not included).
-
getReferenceMD5
public byte[] getReferenceMD5()
-
getByteOffsetOfSliceHeaderBlock
public int getByteOffsetOfSliceHeaderBlock()
The Slice's offset in bytes from the beginning of the Container's Compression Header (or the end of the Container Header), equal to the corresponding entry in ContainerHeader.getLandmarks(). Used by BAI and CRAI indexing.
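Because the offset is relative to the end of the Container Header, turning it into an absolute stream position requires the container's start offset and the header's size. The arithmetic can be sketched as follows; `SliceOffsetSketch`, `absoluteSliceOffset`, and the `containerHeaderByteSize` parameter are hypothetical names introduced for illustration and are not part of the htsjdk API.

```java
// Hypothetical helper for illustration only; not part of htsjdk.
final class SliceOffsetSketch {
    /**
     * Absolute stream offset of a slice header block, assuming landmarks are
     * byte offsets measured from the end of the container header.
     */
    static long absoluteSliceOffset(long containerByteOffset,
                                    int containerHeaderByteSize,
                                    int[] landmarks,
                                    int landmarkIndex) {
        if (landmarkIndex < 0 || landmarkIndex >= landmarks.length) {
            throw new IllegalArgumentException("no landmark at index " + landmarkIndex);
        }
        // container start + header size gives the end of the container header;
        // the landmark is then added as a relative offset.
        return containerByteOffset + containerHeaderByteSize + landmarks[landmarkIndex];
    }
}
```

For example, with a container starting at byte 1000, a 20-byte container header, and landmarks {150, 4000}, the second slice would start at byte 1000 + 20 + 4000 = 5020 under these assumptions.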
-
setByteOffsetOfSliceHeaderBlock
public void setByteOffsetOfSliceHeaderBlock(int byteOffsetOfSliceHeaderBlock)
-
getByteSizeOfSliceBlocks
public int getByteSizeOfSliceBlocks()
The Slice's size in bytes. Used by CRAI indexing only.
-
setByteSizeOfSliceBlocks
public void setByteSizeOfSliceBlocks(int byteSizeOfSliceBlocks)
-
setLandmarkIndex
public void setLandmarkIndex(int landmarkIndex)
-
getBaseCount
public long getBaseCount()
-
getSliceTags
public SAMBinaryTagAndValue getSliceTags()
-
setEmbeddedReferenceContentID
public void setEmbeddedReferenceContentID(int embeddedReferenceBlockContentID)
Set the content ID of the embedded reference block. Per the CRAM spec, the value can be -1 (EMBEDDED_REFERENCE_ABSENT_CONTENT_ID) to indicate no embedded reference block is present. If the reference block content ID already has a non-EMBEDDED_REFERENCE_ABSENT_CONTENT_ID value, it cannot be reset. If the embedded reference block has already been set, the provided reference block content ID must agree with the content ID of the existing block.
- Parameters:
embeddedReferenceBlockContentID -
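The two constraints described for this setter can be sketched with a small stand-in class. `EmbeddedReferenceIdSketch` is hypothetical and is not htsjdk's Slice; the exception type is chosen for illustration, and allowing the same value to be set again is an assumption of this sketch.

```java
// Hypothetical stand-in for illustration only; this is not htsjdk's Slice.
final class EmbeddedReferenceIdSketch {
    static final int ABSENT_CONTENT_ID = -1;

    private int contentId = ABSENT_CONTENT_ID;
    // Content ID of an already-attached embedded reference block, or null if none.
    private final Integer attachedBlockContentId;

    EmbeddedReferenceIdSketch(Integer attachedBlockContentId) {
        this.attachedBlockContentId = attachedBlockContentId;
    }

    void setContentId(int newId) {
        // Rule 1: once a non-absent ID is set, it cannot be changed to a different value.
        if (contentId != ABSENT_CONTENT_ID && contentId != newId) {
            throw new IllegalStateException(
                "content ID already set to " + contentId + ", cannot reset to " + newId);
        }
        // Rule 2: if a block is already attached, the ID must agree with that block's ID.
        if (attachedBlockContentId != null && attachedBlockContentId != newId) {
            throw new IllegalStateException(
                "content ID " + newId + " disagrees with attached block ID " + attachedBlockContentId);
        }
        contentId = newId;
    }

    int getContentId() {
        return contentId;
    }
}
```

Checking both rules before assigning keeps the object unchanged when a call is rejected.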
-
getEmbeddedReferenceContentID
public int getEmbeddedReferenceContentID()
Get the content ID of the embedded reference block. Per the CRAM spec, the value can be EMBEDDED_REFERENCE_ABSENT_CONTENT_ID (-1) to indicate no embedded reference block is present.
- Returns:
- id of embedded reference block if present, otherwise EMBEDDED_REFERENCE_ABSENT_CONTENT_ID
-
setEmbeddedReferenceBlock
public void setEmbeddedReferenceBlock(Block embeddedReferenceBlock)
-
getEmbeddedReferenceBlock
public Block getEmbeddedReferenceBlock()
Return the embedded reference block, if any.
- Returns:
- embedded reference block. May be null.
-
getCompressionHeader
public CompressionHeader getCompressionHeader()
-
deserializeCRAMRecords
public ArrayList<CRAMCompressionRecord> deserializeCRAMRecords(CompressorCache compressorCache, ValidationStringency validationStringency)
Reads and decodes the underlying blocks and returns a list of CRAMCompressionRecord. This isn't done initially when the blocks are read from the underlying stream, since there are cases where we want to iterate through containers or slices and consume the underlying blocks without paying the price of decoding the records (e.g., during indexing, or when satisfying index queries). The CRAMCompressionRecords returned are not normalized (read bases, quality scores and mates have not been resolved). See normalizeCRAMRecords(java.util.List<htsjdk.samtools.cram.structure.CRAMCompressionRecord>, htsjdk.samtools.cram.build.CRAMReferenceRegion) for more information about normalization.
- Parameters:
compressorCache - cached compressor objects to use to decode streams
validationStringency - validation stringency to use
- Returns:
- list of raw (not normalized) CRAMCompressionRecord for this Slice (see normalizeCRAMRecords(java.util.List<htsjdk.samtools.cram.structure.CRAMCompressionRecord>, htsjdk.samtools.cram.build.CRAMReferenceRegion))
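The motivation for deferring decoding can be illustrated with a generic lazy-decode pattern: cheap metadata stays available without touching the payload, and decoding happens only on request. `LazyDecodeSketch` below is a hypothetical stand-in that decodes newline-separated text; it does not reproduce the CRAM block format or htsjdk's implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; it does not model real CRAM blocks.
final class LazyDecodeSketch {
    private final byte[] rawBlock;
    private int decodeCalls = 0;

    LazyDecodeSketch(byte[] rawBlock) {
        this.rawBlock = rawBlock.clone();
    }

    // Cheap metadata, available without decoding (the kind of value an indexer needs).
    int byteSize() {
        return rawBlock.length;
    }

    // Expensive step, performed only when the caller actually wants records.
    List<String> decodeRecords() {
        decodeCalls++;
        List<String> records = new ArrayList<>();
        for (String line : new String(rawBlock, StandardCharsets.UTF_8).split("\n")) {
            if (!line.isEmpty()) {
                records.add(line);
            }
        }
        return records;
    }

    int decodeCalls() {
        return decodeCalls;
    }
}
```

An indexing pass over many such objects could call only `byteSize()`, leaving `decodeCalls()` at zero, while a reader that needs the records would call `decodeRecords()`.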
-
normalizeCRAMRecords
public void normalizeCRAMRecords(List<CRAMCompressionRecord> cramCompressionRecords, CRAMReferenceRegion cramReferenceRegion)
Normalize a list of CRAMCompressionRecord that have been read in from a CRAM stream. Normalization converts raw CRAM records to a state suitable for conversion to SAMRecords, resolving read bases against the reference, as well as quality scores and mates. The records in this list being normalized should be the records from a Slice, not an entire Container, since the relative positions of mate records are determined relative to the Slice (downstream) offsets. NOTE: This mutates (normalizes) the CRAM records in place.
- Parameters:
cramCompressionRecords - CRAMCompressionRecords to normalize
cramReferenceRegion - the reference region for this slice
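Why the list must hold exactly one Slice's records can be seen from a sketch of mate resolution by downstream offset. The class `MateLinkSketch`, its `Rec` type, and the convention that the offset counts the records skipped between a record and its mate are assumptions made for illustration; this is not htsjdk's normalization code, and it covers only mate linking, not bases or quality scores.

```java
import java.util.List;

// Hypothetical stand-in for illustration only; not htsjdk's normalization logic.
final class MateLinkSketch {
    static final class Rec {
        final String name;
        // Assumed convention: number of records skipped before the downstream mate,
        // or -1 when the record has no downstream mate in this slice.
        final int recordsToNextFragment;
        Rec next;
        Rec previous;

        Rec(String name, int recordsToNextFragment) {
            this.name = name;
            this.recordsToNextFragment = recordsToNextFragment;
        }
    }

    // Links mates in place; indices are positions within this one slice's list.
    static void linkMates(List<Rec> sliceRecords) {
        for (int i = 0; i < sliceRecords.size(); i++) {
            Rec rec = sliceRecords.get(i);
            if (rec.recordsToNextFragment < 0) {
                continue;
            }
            int mateIndex = i + rec.recordsToNextFragment + 1;
            if (mateIndex >= sliceRecords.size()) {
                throw new IllegalStateException("mate of " + rec.name + " lies outside this slice");
            }
            Rec mate = sliceRecords.get(mateIndex);
            rec.next = mate;
            mate.previous = rec;
        }
    }
}
```

Since the offsets index into the slice's own list, passing records from several slices at once would shift the positions and link the wrong mates.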
-
write
public void write(CRAMVersion cramVersion, OutputStream outputStream)
-
setReferenceMD5
public void setReferenceMD5(byte[] ref)
-
setAttribute
public void setAttribute(String tag, Object value)
Set a value for the tag.
- Parameters:
tag - tag ID as a short integer as returned by SAMTag.makeBinaryTag(String)
value - tag value
-
getMultiRefAlignmentSpans
public Map<ReferenceContext,AlignmentSpan> getMultiRefAlignmentSpans(CompressorCache compressorCache, ValidationStringency validationStringency)
Uses a Multiple Reference Slice Alignment Reader to determine the reference spans of a MULTI_REF Slice. Used for creating CRAI/BAI index entries.
- Parameters:
validationStringency - how strict to be when reading CRAM records
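The idea of splitting a multi-reference slice into per-reference spans can be sketched as a grouping step over placed records. `MultiRefSpanSketch`, `Placed`, and `Span` are hypothetical stand-ins and are much simpler than htsjdk's ReferenceContext and AlignmentSpan; unplaced records and mapped/unmapped counts are left out of this sketch.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in types for illustration only; not htsjdk's AlignmentSpan.
final class MultiRefSpanSketch {
    static final class Placed {
        final int referenceIndex;
        final int start; // 1-based inclusive alignment start
        final int end;   // 1-based inclusive alignment end

        Placed(int referenceIndex, int start, int end) {
            this.referenceIndex = referenceIndex;
            this.start = start;
            this.end = end;
        }
    }

    static final class Span {
        int start;
        int end;
        int count;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
            this.count = 1;
        }
    }

    // One span per reference index, covering every record placed on that reference.
    static Map<Integer, Span> spansByReference(List<Placed> records) {
        Map<Integer, Span> spans = new TreeMap<>();
        for (Placed rec : records) {
            Span span = spans.get(rec.referenceIndex);
            if (span == null) {
                spans.put(rec.referenceIndex, new Span(rec.start, rec.end));
            } else {
                span.start = Math.min(span.start, rec.start);
                span.end = Math.max(span.end, rec.end);
                span.count++;
            }
        }
        return spans;
    }
}
```

Each map entry then corresponds to one index entry for one reference sequence.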
-
getCRAIEntries
public List<CRAIEntry> getCRAIEntries(CompressorCache compressorCache)
Generate a CRAI Index entry from this Slice and other container parameters, splitting Multiple Reference slices into constituent reference sequence entries.
- Returns:
- a list of CRAI Index Entries derived from this Slice
-
getBAIEntries
public List<BAIEntry> getBAIEntries(CompressorCache compressorCache)
Generate a BAIEntry Index entry from this Slice and other container parameters, splitting Multiple Reference slices into constituent reference sequence entries.
- Returns:
- a list of BAIEntry Index Entries derived from this Slice
-
-