Package com.databricks.jdbc.common.util
Class ArrowUtil
- java.lang.Object
  - com.databricks.jdbc.common.util.ArrowUtil
public final class ArrowUtil extends Object
Utility class for Arrow operations. Provides methods for:
- Converting Thrift/Hive schemas to Arrow schemas and serialization
- Creating Arrow IPC byte streams from Thrift responses
- Processing Arrow batches with decompression
This consolidates Arrow handling logic used by both streaming and lazy inline Arrow result handlers.
-
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| static org.apache.arrow.vector.types.pojo.Field | columnDescToArrowField(TColumnDesc columnDesc) | Creates an Arrow Field from a Thrift column descriptor. |
| static ByteArrayInputStream | createArrowByteStream(byte[] cachedSchema, TFetchResultsResp response, Class<?> callerClass) | Creates a ByteArrayInputStream containing Arrow IPC data from the response. |
| static List<ColumnInfo> | getColumnInfoList(TGetResultSetMetadataResp resultManifest) | Extracts column information from Thrift result set metadata. |
| static byte[] | getSerializedSchema(TGetResultSetMetadataResp metadata) | Gets the serialized Arrow schema from Thrift metadata. |
| static long | getTotalRowsInResponse(TFetchResultsResp response) | Gets the total row count from all Arrow batches in the response. |
| static org.apache.arrow.vector.types.pojo.Schema | hiveSchemaToArrowSchema(TTableSchema hiveSchema) | Converts a Hive TTableSchema to an Arrow Schema. |
-
-
-
Method Detail
-
getSerializedSchema
public static byte[] getSerializedSchema(TGetResultSetMetadataResp metadata) throws DatabricksParsingException
Gets the serialized Arrow schema from Thrift metadata. If the metadata contains a pre-serialized Arrow schema, it is returned directly. Otherwise, the Hive schema is converted to Arrow format and serialized.
- Parameters:
metadata - The Thrift result set metadata
- Returns:
The serialized Arrow schema bytes
- Throws:
DatabricksParsingException - if schema conversion or serialization fails
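The fallback described above (use the pre-serialized schema when present, otherwise convert and serialize) can be illustrated with a minimal sketch. `MetadataSketch`, `SchemaFallbackSketch`, and `convertAndSerialize` are hypothetical stand-ins introduced only for this example; they are not the driver's Thrift types, and the placeholder conversion does not produce real Arrow bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical stand-in for the Thrift metadata; not the driver's real type. */
final class MetadataSketch {
    final byte[] arrowSchema;        // pre-serialized schema, or null when absent
    final List<String> hiveColumns;  // simplified stand-in for the Hive schema

    MetadataSketch(byte[] arrowSchema, List<String> hiveColumns) {
        this.arrowSchema = arrowSchema;
        this.hiveColumns = hiveColumns;
    }
}

final class SchemaFallbackSketch {
    /** Returns the pre-serialized schema if present; otherwise converts and serializes. */
    static byte[] serializedSchema(MetadataSketch metadata) {
        if (metadata.arrowSchema != null) {
            return metadata.arrowSchema; // returned directly, no conversion needed
        }
        return convertAndSerialize(metadata.hiveColumns);
    }

    /** Placeholder conversion (assumption): real code would build and serialize an Arrow Schema. */
    static byte[] convertAndSerialize(List<String> hiveColumns) {
        return String.join(",", hiveColumns).getBytes(StandardCharsets.UTF_8);
    }
}
```

The point of the sketch is the ordering of the two branches: the conversion path runs only when no pre-serialized schema is available.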
-
hiveSchemaToArrowSchema
public static org.apache.arrow.vector.types.pojo.Schema hiveSchemaToArrowSchema(TTableSchema hiveSchema) throws DatabricksParsingException
Converts a Hive TTableSchema to an Arrow Schema.
- Parameters:
hiveSchema - The Hive table schema from Thrift
- Returns:
The equivalent Arrow schema
- Throws:
DatabricksParsingException - if conversion fails
-
columnDescToArrowField
public static org.apache.arrow.vector.types.pojo.Field columnDescToArrowField(TColumnDesc columnDesc) throws SQLException
Creates an Arrow Field from a Thrift column descriptor.
- Parameters:
columnDesc - The Thrift column descriptor
- Returns:
The equivalent Arrow field
- Throws:
SQLException - if type mapping fails
-
createArrowByteStream
public static ByteArrayInputStream createArrowByteStream(byte[] cachedSchema, TFetchResultsResp response, Class<?> callerClass) throws DatabricksParsingException
Creates a ByteArrayInputStream containing Arrow IPC data from the response. This method combines the cached schema with decompressed Arrow batches to create a complete Arrow IPC stream that can be parsed by Arrow readers.
- Parameters:
cachedSchema - The serialized Arrow schema bytes (should be cached from the first response)
response - The Thrift fetch response containing Arrow batches
callerClass - The calling class for logging context
- Returns:
ByteArrayInputStream containing the Arrow IPC data
- Throws:
DatabricksParsingException - if processing fails
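The description above amounts to writing the schema bytes first and then each decompressed batch into one buffer. The following is a rough sketch of that assembly using only the JDK. `ArrowStreamSketch`, `assemble`, and the `decompress` parameter are hypothetical names for illustration; the actual method takes a `TFetchResultsResp`, and its decompression codec is not specified here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.List;
import java.util.function.UnaryOperator;

final class ArrowStreamSketch {
    /**
     * Concatenates the cached schema bytes with each decompressed batch.
     * The decompress function is an assumption: it stands in for whatever
     * codec the response actually uses.
     */
    static ByteArrayInputStream assemble(
            byte[] cachedSchema, List<byte[]> batches, UnaryOperator<byte[]> decompress) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // The schema message must come first so a reader can interpret the batches.
        out.write(cachedSchema, 0, cachedSchema.length);
        for (byte[] batch : batches) {
            byte[] raw = decompress.apply(batch);
            out.write(raw, 0, raw.length);
        }
        return new ByteArrayInputStream(out.toByteArray());
    }
}
```

Passing the decompression step in as a function keeps the sketch independent of any particular codec; for uncompressed batches the identity function `b -> b` is enough.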
-
getTotalRowsInResponse
public static long getTotalRowsInResponse(TFetchResultsResp response)
Gets the total row count from all Arrow batches in the response.
- Parameters:
response - The Thrift fetch response
- Returns:
The total number of rows across all batches
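As a small illustration of the aggregation, the sketch below sums per-batch row counts into a `long`. `RowCountSketch` and `totalRows` are hypothetical, the list of counts stands in for the batches carried by the Thrift response, and treating a missing list as zero rows is an assumption rather than documented behavior.

```java
import java.util.List;

final class RowCountSketch {
    /** Sums per-batch row counts; a null list is treated as zero rows (assumption). */
    static long totalRows(List<Long> batchRowCounts) {
        long total = 0L;
        if (batchRowCounts == null) {
            return total;
        }
        for (long rows : batchRowCounts) {
            total += rows; // accumulate in a long so large results do not overflow an int
        }
        return total;
    }
}
```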
-
getColumnInfoList
public static List<ColumnInfo> getColumnInfoList(TGetResultSetMetadataResp resultManifest) throws DatabricksSQLException
Extracts column information from Thrift result set metadata. Converts each column descriptor in the Thrift schema to a ColumnInfo object.
- Parameters:
resultManifest - The Thrift result set metadata containing schema information
- Returns:
A list of ColumnInfo objects, or an empty list if the schema is null
- Throws:
DatabricksSQLException
-
-