Class AbstractArrowResultChunk

java.lang.Object
com.databricks.jdbc.api.impl.arrow.AbstractArrowResultChunk
Direct Known Subclasses:
ArrowResultChunk, ArrowResultChunkV2

public abstract class AbstractArrowResultChunk extends Object
An abstract class that represents a chunk of query result.

This class provides methods for downloading, processing, and releasing the data in the chunk. It also manages the state of the chunk and provides access to the data as Arrow record batches.

  • Field Details

    • SECONDS_BUFFER_FOR_EXPIRY

      protected static final Integer SECONDS_BUFFER_FOR_EXPIRY
    • numRows

      protected final long numRows
    • rowOffset

      protected final long rowOffset
    • chunkIndex

      protected final long chunkIndex
    • statementId

      protected final StatementId statementId
    • rootAllocator

      protected final org.apache.arrow.memory.BufferAllocator rootAllocator
    • chunkReadyFuture

      protected final CompletableFuture<Void> chunkReadyFuture
      Future to track when the chunk becomes ready for consumption. This includes both the download and processing phases. The state of the Future is updated by the ChunkDownloadTask and indicates when the chunk's data is fully processed and available for use.
    • stateMachine

      protected final ArrowResultChunkStateMachine stateMachine
    • recordBatchList

      protected List<List<org.apache.arrow.vector.ValueVector>> recordBatchList
    • expiryTime

      protected Instant expiryTime
    • errorMessage

      protected String errorMessage
    • arrowMetadata

      protected List<String> arrowMetadata
    • chunkReadyTimeoutSeconds

      protected int chunkReadyTimeoutSeconds
  • Constructor Details

    • AbstractArrowResultChunk

      protected AbstractArrowResultChunk(long numRows, long rowOffset, long chunkIndex, StatementId statementId, ChunkStatus initialStatus, ExternalLink chunkLink, Instant expiryTime, int chunkReadyTimeoutSeconds)
  • Method Details

    • getChunkIndex

      public Long getChunkIndex()
      Returns the index of this chunk.
      Returns:
      chunk index
    • isChunkLinkInvalid

      public boolean isChunkLinkInvalid()
      Checks if the chunk link is invalid or expired.
      Returns:
      true if link is invalid, false otherwise
    • releaseChunk

      public boolean releaseChunk()
      Releases all resources associated with this chunk.
      Returns:
      true if chunk was released, false if it was already released
    • setChunkLink

      public void setChunkLink(ExternalLink chunk)
      Sets the external link details for this chunk.
      Parameters:
      chunk - the external link information
    • getStatus

      public ChunkStatus getStatus()
      Returns the current status of the chunk.
      Returns:
      current ChunkStatus
    • downloadData

      protected abstract void downloadData(IDatabricksHttpClient httpClient, CompressionCodec compressionCodec, double speedThreshold) throws DatabricksParsingException, IOException
      Downloads and initializes data for this chunk using the provided HTTP client and compression codec.
      Parameters:
      httpClient - the HTTP client to use for downloading
      compressionCodec - the compression codec to use for decompression
      speedThreshold - the minimum expected download speed in MB/s for logging warnings
      Throws:
      DatabricksParsingException - if there is an error parsing the data
      IOException - if there is an error downloading or reading the data
    • handleFailure

      protected abstract void handleFailure(Exception exception, ChunkStatus failedStatus) throws DatabricksParsingException
      Handles a failure during the download or processing of this chunk.
      Throws:
      DatabricksParsingException
    • getRecordBatchCountInChunk

      protected int getRecordBatchCountInChunk()
      Returns the number of record batches in the chunk.
      Returns:
      number of record batches
    • getRecordBatchList

      protected List<List<org.apache.arrow.vector.ValueVector>> getRecordBatchList()
      Returns the list of record batches, where each record batch is a list of value vectors.
      Returns:
      List of record batches
    • getNumRows

      protected long getNumRows()
      Returns the total number of rows in the chunk.
      Returns:
      number of rows
    • getColumnVector

      protected org.apache.arrow.vector.ValueVector getColumnVector(int recordBatchIndex, int columnIndex)
      Returns the value vector for a specific record batch and column.
      Parameters:
      recordBatchIndex - index of the record batch
      columnIndex - index of the column
      Returns:
      ValueVector for the specified position
    • setStatus

      protected void setStatus(ChunkStatus targetStatus)
      Updates the status of the chunk.
      Parameters:
      targetStatus - new status to set
    • getChunkIterator

      protected ArrowResultChunkIterator getChunkIterator()
      Returns an iterator for traversing the rows in this chunk.
      Returns:
      ArrowResultChunkIterator for this chunk
    • getChunkReadyFuture

      protected CompletableFuture<Void> getChunkReadyFuture()
    • waitForChunkReady

      protected void waitForChunkReady() throws ExecutionException, InterruptedException, TimeoutException
      Waits for the chunk to be ready for consumption.
      Throws:
      ExecutionException - if the chunk download or processing throws an exception
      InterruptedException - if the thread is interrupted while waiting
      TimeoutException - if the chunk is not ready within the timeout
    • initializeData

      protected void initializeData(InputStream inputStream) throws DatabricksSQLException, IOException
      Decompresses the given InputStream and initializes recordBatchList from decompressed stream.
      Parameters:
      inputStream - the input stream to decompress
      Throws:
      DatabricksSQLException - if decompression fails
      IOException - if reading from the stream fails
    • getArrowMetadata

      protected List<String> getArrowMetadata()