docs/datasources/FileIndex.md
+1 -1 (lines changed: 1 addition & 1 deletion)
@@ -59,7 +59,7 @@ Used when:
* `DataSource` is requested to [getOrInferFileFormatSchema](../DataSource.md#getOrInferFileFormatSchema) and [resolve a FileFormat-based relation](../DataSource.md#resolveRelation)
* `FallBackFileSourceV2` logical resolution rule is executed
* [FileScanBuilder](FileScanBuilder.md) is created
-* `FileTable` is requested for [dataSchema](../connector/FileTable.md#dataSchema) and [partitioning](../connector/FileTable.md#partitioning)
+* `FileTable` is requested for [dataSchema](FileTable.md#dataSchema) and [partitioning](FileTable.md#partitioning)
docs/datasources/FileTable.md
+54 -28 (lines changed: 54 additions & 28 deletions)
@@ -1,28 +1,28 @@
# FileTable

-`FileTable` is an [extension](#contract) of the [Table](Table.md) abstraction for [file-backed tables](#implementations) with support for [read](SupportsRead.md) and [write](SupportsWrite.md).
+`FileTable` is an [extension](#contract) of the [Table](../connector/Table.md) abstraction for [file-based tables](#implementations) with support for [read](../connector/SupportsRead.md) and [write](../connector/SupportsWrite.md).
Used when `FallBackFileSourceV2` extended resolution rule is executed (to resolve an `InsertIntoStatement` with a [DataSourceV2Relation](../logical-operators/DataSourceV2Relation.md) with a `FileTable`)

-### <span id="formatName"> formatName
+### <span id="formatName"> Format Name

```scala
formatName: String
```

Name of the file table (_format_)

-### <span id="inferSchema"> inferSchema
+### <span id="inferSchema"> Schema Inference

```scala
inferSchema(
@@ -53,7 +53,7 @@ Default: All [DataType](../types/DataType.md)s are supported by default
`dataSchema` is a Scala **lazy value** to guarantee that the code to initialize it is executed once only (when accessed for the first time) and cached afterwards.

+---
+
`dataSchema` is used when:

* `FileTable` is requested for a [schema](#schema)
* _others_ (in [FileTables](#implementations))

-## fileIndex
+## <span id="partitioning"> Partitioning

```scala
-fileIndex: PartitioningAwareFileIndex
+partitioning: Array[Transform]
```

-`fileIndex`...FIXME
+`partitioning` is part of the [Table](../connector/Table.md#partitioning) abstraction.
+
+---

-`fileIndex` is used when...FIXME
+`partitioning`...FIXME

-## partitioning
+## <span id="properties"> Properties

```scala
-partitioning: Array[Transform]
+properties: util.Map[String, String]
```

-`partitioning`...FIXME
+`properties` is part of the [Table](../connector/Table.md#properties) abstraction.

-`partitioning` is part of the [Table](Table.md#partitioning) abstraction.
+---

-## properties
+`properties` returns the [options](#options).
+
+## <span id="schema"> Table Schema

```scala
-properties: util.Map[String, String]
+schema: StructType
```

-`properties` is simply the [options](#options).
+`schema` is part of the [Table](../connector/Table.md#schema) abstraction.

-`properties` is part of the [Table](Table.md#properties) abstraction.
`fileIndex` is a Scala **lazy value** to guarantee that the code to initialize it is executed once only (when accessed for the first time) and the computed value never changes afterwards.
+
+Learn more in the [Scala Language Specification]({{ scala.spec }}/05-classes-and-objects.html#lazy).
+
+`fileIndex` creates one of the following [PartitioningAwareFileIndex](PartitioningAwareFileIndex.md)s:
+
+* `MetadataLogFileIndex` when reading from the results of a streaming query
+* [InMemoryFileIndex](InMemoryFileIndex.md)
+
+---
+
+`fileIndex` is used when:

-`schema` is part of the [Table](Table.md#schema) abstraction.
+* [FileTable](FileTable.md#implementations)s are requested for [FileScanBuilder](FileScanBuilder.md#fileIndex)s
+* `Dataset` is requested for the [inputFiles](../Dataset.md#inputFiles)
+* `CacheManager` is requested to [lookupAndRefresh](../CacheManager.md#lookupAndRefresh)
+* `FallBackFileSourceV2` is created
+* `FileTable` is requested to [dataSchema](#dataSchema), [schema](#schema), [partitioning](#partitioning)
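
Editorial note: the lazy-value behaviour that this file describes for `dataSchema` and `fileIndex` can be illustrated with a minimal sketch. It is not Spark code: the `LazyValSketch` object, the `Table` class, and the `initCount` counter are hypothetical names introduced only for illustration, and the schema is modelled as a plain `Seq[String]`.

```scala
// Hypothetical stand-ins for illustration only; none of these names come from Spark.
object LazyValSketch {
  // Counts how many times the lazy initializer actually runs.
  var initCount: Int = 0

  class Table {
    // The block runs on first access only; later accesses reuse the cached value.
    lazy val dataSchema: Seq[String] = {
      initCount += 1
      Seq("id", "name")
    }
  }

  def main(args: Array[String]): Unit = {
    val table = new Table
    val before = initCount      // creating the table does not run the initializer
    table.dataSchema            // first access: initializer runs
    table.dataSchema            // second access: cached value is returned
    println(s"initializer runs: ${initCount - before}")
  }
}
```

Accessing `dataSchema` twice on the same instance increments the counter once, which is the "executed once only and cached afterwards" property the text refers to.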
docs/datasources/InMemoryFileIndex.md
+1 -1 (lines changed: 1 addition & 1 deletion)
@@ -21,7 +21,7 @@ While being created, `InMemoryFileIndex` [refresh0](#refresh0).
* `HiveMetastoreCatalog` is requested to [inferIfNeeded](../hive/HiveMetastoreCatalog.md#inferIfNeeded)
* `CatalogFileIndex` is requested for the [partitions by the given predicate expressions](CatalogFileIndex.md#filterPartitions) for a non-partitioned Hive table
* `DataSource` is requested to [createInMemoryFileIndex](../DataSource.md#createInMemoryFileIndex)
-* `FileTable` is requested for a [PartitioningAwareFileIndex](../connector/FileTable.md#fileIndex)
+* `FileTable` is requested for a [PartitioningAwareFileIndex](FileTable.md#fileIndex)
docs/datasources/PartitioningAwareFileIndex.md
+1 -1 (lines changed: 1 addition & 1 deletion)
@@ -61,7 +61,7 @@ allFiles(): Seq[FileStatus]

* `DataSource` is requested to [getOrInferFileFormatSchema](../DataSource.md#getOrInferFileFormatSchema) and [resolveRelation](../DataSource.md#resolveRelation)
* `PartitioningAwareFileIndex` is requested for [files matching filters](#listFiles), [input files](#inputFiles), and [size](#sizeInBytes)
-* `FileTable` is requested for a [data schema](../connector/FileTable.md#dataSchema)
+* `FileTable` is requested for a [data schema](FileTable.md#dataSchema)
`newScanBuilder` creates a [ParquetScanBuilder](ParquetScanBuilder.md) (with the [fileIndex](../../connector/FileTable.md#fileIndex), the [schema](../../connector/FileTable.md#schema) and the [dataSchema](../../connector/FileTable.md#dataSchema)).
+`newScanBuilder` is part of the [FileTable](../FileTable.md#newScanBuilder) abstraction.
+
+---

-`newScanBuilder` is part of the [FileTable](../../connector/FileTable.md#newScanBuilder) abstraction.
+`newScanBuilder` creates a [ParquetScanBuilder](ParquetScanBuilder.md) with the following:
+
+* [fileIndex](../FileTable.md#fileIndex)
+* [schema](../FileTable.md#schema)
+* [dataSchema](../FileTable.md#dataSchema)
+* [options](#options)

## <span id="newWriteBuilder"> newWriteBuilder

@@ -56,6 +71,26 @@ newWriteBuilder(
  info: LogicalWriteInfo): WriteBuilder
```

-`newWriteBuilder` creates a [WriteBuilder](../../connector/WriteBuilder.md) with [build](../../connector/WriteBuilder.md#build) that, when executed, creates a [ParquetWrite](ParquetWrite.md).
+`newWriteBuilder` is part of the [FileTable](../FileTable.md#newWriteBuilder) abstraction.
+
+---
+
+`newWriteBuilder` creates a [WriteBuilder](../../connector/WriteBuilder.md) that creates a [ParquetWrite](ParquetWrite.md) (when requested to [build a Write](../../connector/WriteBuilder.md#build)).
+
+## <span id="supportsDataType"> supportsDataType
+
+```scala
+supportsDataType(
+  dataType: DataType): Boolean
+```
+
+`supportsDataType` is part of the [FileTable](../FileTable.md#supportsDataType) abstraction.
+
+---
+
+`supportsDataType` supports all [AtomicType](../../types/AtomicType.md)s and the following complex [DataType](../../types/DataType.md)s with `AtomicType`s:

-`newWriteBuilder` is part of the [FileTable](../../connector/FileTable.md#newWriteBuilder) abstraction.
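
Editorial note: the added `supportsDataType` text says that atomic types are supported, along with complex types built from them, but the list of complex types is cut off in this view. The sketch below shows one way such a recursive check could look. It is an assumption-laden illustration rather than Spark's implementation: the type hierarchy is a toy model defined here (not `org.apache.spark.sql.types`), and the choice of array, map, and struct as the complex types is assumed for the example.

```scala
// Toy type model for illustration only; these are NOT Spark's classes.
object SupportsDataTypeSketch {
  sealed trait DataType
  sealed trait AtomicType extends DataType
  case object IntegerType extends AtomicType
  case object StringType extends AtomicType
  case object NullType extends DataType // deliberately non-atomic in this toy model
  final case class ArrayType(elementType: DataType) extends DataType
  final case class MapType(keyType: DataType, valueType: DataType) extends DataType
  final case class StructType(fields: Seq[(String, DataType)]) extends DataType

  // A type is supported if it is atomic, or a complex type whose parts are all supported.
  def supportsDataType(dataType: DataType): Boolean = dataType match {
    case _: AtomicType          => true
    case ArrayType(elementType) => supportsDataType(elementType)
    case MapType(key, value)    => supportsDataType(key) && supportsDataType(value)
    case StructType(fields)     => fields.forall { case (_, tpe) => supportsDataType(tpe) }
    case _                      => false
  }
}
```

In this model, `ArrayType(StringType)` is accepted, while `MapType(StringType, NullType)` is rejected because one of its components is not atomic.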