@InterfaceStability.Evolving
public interface InputPartition<T>
extends java.io.Serializable
An input partition returned by DataSourceReader.planInputPartitions(). It is
responsible for creating the actual data reader of one RDD partition.
The relationship between InputPartition and InputPartitionReader
is similar to the relationship between Iterable and Iterator.
Note that InputPartitions are serialized and sent to executors, and the
InputPartitionReaders are then created on the executors to do the actual reading.
InputPartition must therefore be serializable, while InputPartitionReader does not
need to be.

| Modifier and Type | Method and Description |
|---|---|
| InputPartitionReader<T> | createPartitionReader(): Returns an input partition reader to do the actual reading work. |
| default String[] | preferredLocations(): The preferred locations where the input partition reader returned by this partition can run faster. Spark does not guarantee that the reader will run on these locations. |
default String[] preferredLocations()
InputPartitionReader<T> createPartitionReader()
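
The serialize-then-read flow described above can be sketched in plain Java. This is a minimal illustration, not Spark code: the two nested interfaces are local stand-ins that only mirror the shape of InputPartition and InputPartitionReader so the example runs without Spark on the classpath, and RangePartition, RangeReader, roundTrip and readAll are hypothetical names introduced here. The round trip through Java serialization stands in for shipping the partition to an executor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class InputPartitionSketch {

    // Local stand-in (assumption) mirroring the reader's shape: advance, fetch, close.
    interface InputPartitionReader<T> extends Closeable {
        boolean next() throws IOException;
        T get();
    }

    // Local stand-in (assumption) mirroring the two methods documented above.
    interface InputPartition<T> extends Serializable {
        default String[] preferredLocations() {
            return new String[0]; // no locality preference by default
        }
        InputPartitionReader<T> createPartitionReader();
    }

    // Hypothetical partition: holds only the small, serializable description of the work.
    static final class RangePartition implements InputPartition<Integer> {
        private static final long serialVersionUID = 1L;
        private final int start;
        private final int end;

        RangePartition(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public InputPartitionReader<Integer> createPartitionReader() {
            return new RangeReader(start, end);
        }
    }

    // Hypothetical reader: holds the mutable cursor and is never serialized.
    static final class RangeReader implements InputPartitionReader<Integer> {
        private final int end;
        private int current;

        RangeReader(int start, int end) {
            this.current = start - 1;
            this.end = end;
        }

        @Override
        public boolean next() {
            current++;
            return current < end;
        }

        @Override
        public Integer get() {
            return current;
        }

        @Override
        public void close() {
            // nothing to release in this sketch
        }
    }

    // Simulates shipping a partition to an executor via Java serialization.
    static <T> InputPartition<T> roundTrip(InputPartition<T> partition)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(partition);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            InputPartition<T> copy = (InputPartition<T>) in.readObject();
            return copy;
        }
    }

    // "Executor side": create the reader from the partition and drain it.
    static <T> List<T> readAll(InputPartition<T> partition) throws IOException {
        List<T> rows = new ArrayList<>();
        try (InputPartitionReader<T> reader = partition.createPartitionReader()) {
            while (reader.next()) {
                rows.add(reader.get());
            }
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        InputPartition<Integer> shipped = roundTrip(new RangePartition(3, 6));
        System.out.println(readAll(shipped));                     // prints [3, 4, 5]
        System.out.println(shipped.preferredLocations().length);  // prints 0
    }
}
```

The split follows the Iterable and Iterator analogy: the partition carries only immutable, serializable state, while the reader owns the cursor and any resources, which is why only the partition needs to be serializable.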