public class AvroUtils extends Object
| Modifier and Type | Class and Description |
|---|---|
static class |
AvroUtils.AvroPathFilter |
| Modifier and Type | Field and Description |
|---|---|
static String |
FIELD_LOCATION_DELIMITER |
| Constructor and Description |
|---|
AvroUtils() |
| Modifier and Type | Method and Description |
|---|---|
static GenericRecord |
convertRecordSchema(GenericRecord record,
Schema newSchema)
Change the schema of an Avro record.
|
static Schema |
getDirectorySchema(Path directory,
Configuration conf,
boolean latest)
Get the latest avro schema for a directory
|
static Schema |
getDirectorySchema(Path directory,
FileSystem fs,
boolean latest)
Get the latest avro schema for a directory
|
static Optional<Schema.Field> |
getField(Schema schema,
String fieldLocation)
Given a GenericRecord, this method will return the field specified by the path parameter.
|
static Optional<Schema> |
getFieldSchema(Schema schema,
String fieldLocation)
Given a GenericRecord, this method will return the schema of the field specified by the path parameter.
|
static Optional<Object> |
getFieldValue(GenericRecord record,
String fieldLocation)
Given a GenericRecord, this method will return the field specified by the path parameter.
|
static Map<String,Object> |
getMultiFieldValue(GenericRecord record,
String fieldLocation) |
static Schema |
getSchemaFromDataFile(Path dataFile,
FileSystem fs)
Get Avro schema from an Avro data file.
|
static Schema |
nullifyFieldsForSchemaMerge(Schema oldSchema,
Schema newSchema)
Merge oldSchema and newSchame.
|
static Schema |
parseSchemaFromFile(Path filePath,
FileSystem fs)
Parse Avro schema from a schema file.
|
static byte[] |
recordToByteArray(GenericRecord record)
Convert a GenericRecord to a byte array.
|
static Optional<Schema> |
removeUncomparableFields(Schema schema)
Remove map, array, enum fields, as well as union fields that contain map, array or enum,
from an Avro schema.
|
static Path |
serializeAsPath(GenericRecord record,
boolean includeFieldNames,
boolean replacePathSeparators)
Serialize a generic record as a relative
Path. |
static GenericRecord |
slowDeserializeGenericRecord(byte[] serializedRecord,
Schema schema)
Deserialize a
GenericRecord from a byte array. |
static Schema |
switchName(Schema schema,
String newName)
Copies the input
Schema but changes the schema name. |
static Map<String,String> |
toStringMap(Object map)
Given a map: key -> value, return a map: key.toString() -> value.toString().
|
static void |
writeSchemaToFile(Schema schema,
Path filePath,
FileSystem fs,
boolean overwrite) |
static void |
writeSchemaToFile(Schema schema,
Path filePath,
FileSystem fs,
boolean overwrite,
FsPermission perm) |
public static final String FIELD_LOCATION_DELIMITER
public static Optional<Schema> getFieldSchema(Schema schema, String fieldLocation)
schema - is the record to retrieve the schema fromfieldLocation - is the location of the fieldpublic static Optional<Schema.Field> getField(Schema schema, String fieldLocation)
schema - is the record to retrieve the schema fromfieldLocation - is the location of the fieldpublic static Optional<Object> getFieldValue(GenericRecord record, String fieldLocation)
record - is the record to retrieve the field fromfieldLocation - is the location of the fieldpublic static Map<String,Object> getMultiFieldValue(GenericRecord record, String fieldLocation)
public static Map<String,String> toStringMap(Object map)
Utf8. This method helps to restore the original string map objectmap - a map objectpublic static GenericRecord convertRecordSchema(GenericRecord record, Schema newSchema) throws IOException
record - The Avro record whose schema is to be changed.newSchema - The target schema. It must be compatible as reader schema with record.getSchema() as writer schema.IOException - if conversion failed.public static byte[] recordToByteArray(GenericRecord record) throws IOException
IOExceptionpublic static Schema getSchemaFromDataFile(Path dataFile, FileSystem fs) throws IOException
IOExceptionpublic static Schema parseSchemaFromFile(Path filePath, FileSystem fs) throws IOException
IOExceptionpublic static void writeSchemaToFile(Schema schema, Path filePath, FileSystem fs, boolean overwrite) throws IOException
IOExceptionpublic static void writeSchemaToFile(Schema schema, Path filePath, FileSystem fs, boolean overwrite, FsPermission perm) throws IOException
IOExceptionpublic static Schema getDirectorySchema(Path directory, FileSystem fs, boolean latest) throws IOException
directory - the input dir that contains avro filesfs - the FileSystem for the given directory.latest - true to return latest schema, false to return oldest schemaIOExceptionpublic static Schema getDirectorySchema(Path directory, Configuration conf, boolean latest) throws IOException
directory - the input dir that contains avro filesconf - configurationlatest - true to return latest schema, false to return oldest schemaIOExceptionpublic static Schema nullifyFieldsForSchemaMerge(Schema oldSchema, Schema newSchema)
oldSchema - newSchema - public static Optional<Schema> removeUncomparableFields(Schema schema)
public static Schema switchName(Schema schema, String newName)
Schema but changes the schema name.public static Path serializeAsPath(GenericRecord record, boolean includeFieldNames, boolean replacePathSeparators)
Path. Useful for converting GenericRecord type keys
into file system locations. For example {field1=v1, field2=v2} returns field1=v1/field2=v2 if includeFieldNames
is true, or v1/v2 if it is false. Illegal HDFS tokens such as ':' and '\\' will be replaced with '_'.
Additionally, parameter replacePathSeparators controls whether to replace path separators ('/') with '_'.record - GenericRecord to serialize.includeFieldNames - If true, each token in the path will be of the form key=value, otherwise, only the value
will be included.replacePathSeparators - If true, path separators ('/') in each token will be replaced with '_'.public static GenericRecord slowDeserializeGenericRecord(byte[] serializedRecord, Schema schema) throws IOException
GenericRecord from a byte array. This method is not intended for high performance.IOException