GH-891: Add ExtensionTypeWriterFactory to TransferPair #9
What's Changed
This PR simplifies extension type writer creation by moving from a factory-based pattern to a type-based pattern. Instead of passing ExtensionTypeWriterFactory instances through multiple API layers, extension types now provide their own writers via a new getNewFieldWriter() method on ArrowType.ExtensionType.
Added getNewFieldWriter(ValueVector) abstract method to ArrowType.ExtensionType
Removed ExtensionTypeWriterFactory interface and all implementations
Removed factory parameters from ComplexCopier, PromotableWriter, and TransferPair APIs
Updated UnionWriter to support extension types (previously threw UnsupportedOperationException)
Simplified extension type implementations (UuidType, OpaqueType)
The factory pattern didn't scale well. Each new extension type required creating a separate factory class and passing it through multiple API layers. This was especially painful for external developers who had to maintain two classes per extension type and manage factory parameters everywhere.
The new approach follows the same pattern as MinorType, where each type knows how to create its own writer. This reduces boilerplate, simplifies the API, and makes it easier to implement custom extension types outside arrow-java.
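The type-owned writer pattern described above can be sketched without any Arrow dependency. Everything in this block (SketchVector, SketchWriter, SketchExtensionType, SketchUuidType, writeValue) is a hypothetical stand-in invented for illustration and is not part of the arrow-java API. Only the method name getNewFieldWriter mirrors the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class TypeOwnedWriterSketch {

  /** Hypothetical stand-in for a vector: it only records written values. */
  static class SketchVector {
    final List<Object> values = new ArrayList<>();
  }

  /** Hypothetical stand-in for a field writer. */
  interface SketchWriter {
    void write(Object value);
  }

  /** Hypothetical stand-in for an extension type that creates its own writer. */
  abstract static class SketchExtensionType {
    abstract SketchWriter getNewFieldWriter(SketchVector vector);
  }

  /** One class per extension type; no separate factory class is needed. */
  static final class SketchUuidType extends SketchExtensionType {
    static final SketchUuidType INSTANCE = new SketchUuidType();

    @Override
    SketchWriter getNewFieldWriter(SketchVector vector) {
      return value -> {
        if (!(value instanceof UUID)) {
          throw new IllegalArgumentException("expected a UUID, got " + value);
        }
        vector.values.add(value);
      };
    }
  }

  /** Generic code needs only the type; no factory is passed through. */
  static void writeValue(SketchExtensionType type, SketchVector vector, Object value) {
    type.getNewFieldWriter(vector).write(value);
  }

  public static void main(String[] args) {
    SketchVector vector = new SketchVector();
    writeValue(SketchUuidType.INSTANCE, vector, new UUID(1L, 2L));
    System.out.println(vector.values);
  }
}
```

In this sketch, adding another extension type means adding one subclass, and writeValue stays unchanged.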
Breaking Changes
ExtensionTypeWriterFactory has been removed
Extension types must now implement the getNewFieldWriter(ValueVector vector) method
ExtensionHolders must implement type(), which returns the ExtensionType for that Holder
(Writers are obtained directly from the extension type, not from a factory)
Migration Guide
Extension types must now implement the getNewFieldWriter(ValueVector vector) method
public class UuidType extends ExtensionType {
  ...
  @Override
  public FieldWriter getNewFieldWriter(ValueVector vector) {
    return new UuidWriterImpl((UuidVector) vector);
  }
  ...
}
ExtensionHolders must implement type(), which returns the ExtensionType for that Holder
public class UuidHolder extends ExtensionHolder {
  ...
  @Override
  public ArrowType type() {
    return UuidType.INSTANCE;
  }
}
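One way to see why the holder needs type(): code that receives only a holder can look up the matching writer from the holder itself. The sketch below is a simplified model under that assumption. All names (SketchHolder, SketchUuidHolder, SketchExtensionType, and this copyAsValue helper with its signature) are hypothetical and do not reproduce the arrow-java signatures.

```java
import java.util.ArrayList;
import java.util.List;

public class HolderTypeSketch {

  /** Hypothetical writer that consumes a holder. */
  interface SketchWriter {
    void write(SketchHolder holder);
  }

  /** Hypothetical extension type that creates a writer for a destination. */
  abstract static class SketchExtensionType {
    abstract SketchWriter getNewFieldWriter(List<String> destination);
  }

  /** Hypothetical holder base class: every holder reports its own type. */
  abstract static class SketchHolder {
    abstract SketchExtensionType type();
  }

  static final class SketchUuidType extends SketchExtensionType {
    static final SketchUuidType INSTANCE = new SketchUuidType();

    @Override
    SketchWriter getNewFieldWriter(List<String> destination) {
      // The cast is safe here because only SketchUuidHolder returns this type.
      return holder -> destination.add(((SketchUuidHolder) holder).value);
    }
  }

  static final class SketchUuidHolder extends SketchHolder {
    String value;

    @Override
    SketchExtensionType type() {
      return SketchUuidType.INSTANCE;
    }
  }

  /** Resolves the writer from the holder's type; no factory argument. */
  static void copyAsValue(SketchHolder holder, List<String> destination) {
    holder.type().getNewFieldWriter(destination).write(holder);
  }

  public static void main(String[] args) {
    List<String> destination = new ArrayList<>();
    SketchUuidHolder holder = new SketchUuidHolder();
    holder.value = "example-value";
    copyAsValue(holder, destination);
    System.out.println(destination);
  }
}
```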
How to use Extension Writers?
Before:
writer.extension(UuidType.INSTANCE);
writer.addExtensionTypeWriterFactory(extensionTypeWriterFactory);
writer.writeExtension(value);
After:
writer.extension(UuidType.INSTANCE);
writer.writeExtension(value);
Also, callers of copyAsValue no longer need to provide the factory.
Closes apache#891.