TokenTextSplitter
| Property | Pattern | Type | Deprecated | Definition | Title/Description |
|---|---|---|---|---|---|
| + implementation | No | const | No | - | Implementation |
| - chunk_size | No | integer | No | - | Chunk Size |
| - chunk_overlap | No | integer | No | - | Chunk Overlap |
| - keep_separator | No | boolean | No | - | Keep Separator |
| - strip_whitespace | No | boolean | No | - | Strip Whitespace |
| - encoding_name | No | string | No | - | Encoding Name |
| - model | No | string | No | - | Model |
| - allowed_special | No | Combination | No | - | Allowed Special |
| - disallowed_special | No | Combination | No | - | Disallowed Special |
1. Property implementation
Title: Implementation
| Type | const |
| Required | Yes |
Specific value: "TokenTextSplitter"
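
Because `implementation` is the only required property, a minimal configuration needs just that one key, and every other property falls back to its documented default. The sketch below collects those defaults into a single object for orientation. The names `DEFAULTS` and `resolve_config` are hypothetical helpers for illustration, not part of any library.

```python
import json

# Defaults as listed in the property tables of this section.
# DEFAULTS and resolve_config are hypothetical names used only for illustration.
DEFAULTS = {
    "chunk_size": 4000,
    "chunk_overlap": 200,
    "keep_separator": False,
    "strip_whitespace": True,
    "encoding_name": "gpt2",
    "model": None,
    "allowed_special": [],
    "disallowed_special": "all",
}


def resolve_config(user_config):
    """Merge a user-supplied config over the documented defaults."""
    if user_config.get("implementation") != "TokenTextSplitter":
        raise ValueError('implementation must be the constant "TokenTextSplitter"')
    return {**DEFAULTS, **user_config}


# Minimal config: only the required constant is given.
minimal = {"implementation": "TokenTextSplitter"}
resolved = resolve_config(minimal)
print(json.dumps(resolved, indent=2))
```

Overriding a single value, for example `{"implementation": "TokenTextSplitter", "chunk_size": 512}`, leaves the remaining defaults in place.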
2. Property chunk_size
Title: Chunk Size
| Type | integer |
| Required | No |
| Default | 4000 |
Description: Maximum size of each returned chunk, measured in tokens
3. Property chunk_overlap
Title: Chunk Overlap
| Type | integer |
| Required | No |
| Default | 200 |
Description: Number of tokens shared between consecutive chunks
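
To see how `chunk_size` and `chunk_overlap` interact, the sketch below slides a window over a token list. It assumes both values are counted in tokens, as the splitter's name suggests, and it uses a toy whitespace tokenizer in place of a real model tokenizer. The function `split_tokens` is a hypothetical illustration, not the splitter's actual implementation.

```python
def split_tokens(tokens, chunk_size=4000, chunk_overlap=200):
    """Split a token list into windows of at most chunk_size tokens.

    Each window starts (chunk_size - chunk_overlap) tokens after the
    previous one, so neighbouring windows share chunk_overlap tokens.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the input
        start += step
    return chunks


# Toy tokenizer (assumption): one token per whitespace-separated word.
tokens = "a b c d e f g h i j".split()
chunks = split_tokens(tokens, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

With `chunk_size=4` and `chunk_overlap=1`, each window advances by three tokens, so the last token of one chunk reappears as the first token of the next.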
4. Property keep_separator
Title: Keep Separator
| Type | boolean |
| Required | No |
| Default | false |
Description: Whether to keep the separator in the chunks
5. Property strip_whitespace
Title: Strip Whitespace
| Type | boolean |
| Required | No |
| Default | true |
Description: If true, strips leading and trailing whitespace from every document
6. Property encoding_name
Title: Encoding Name
| Type | string |
| Required | No |
| Default | "gpt2" |
Description: Name of the tokenizer encoding used to split text into tokens
7. Property model
Title: Model
| Type | string |
| Required | No |
| Default | null |
Description: Model name, used to look up the matching tokenizer encoding
8. Property allowed_special
Title: Allowed Special
| Type | combining |
| Required | No |
| Additional properties | [Any type: allowed] |
| Default | [] |
Description: Allowed special tokens
8.1. Property item 0
| Type | const |
| Required | No |
Specific value: "all"
8.2. Property item 1
| Type | array of string |
| Required | No |
| Array restrictions | |
|---|---|
| Min items | N/A |
| Max items | N/A |
| Items unicity | False |
| Additional items | False |
| Tuple validation | See below |

| Each item of this array must be | Description |
|---|---|
| item 1 items | - |
8.2.1. item 1 items
| Type | string |
| Required | No |
9. Property disallowed_special
Title: Disallowed Special
| Type | combining |
| Required | No |
| Additional properties | [Any type: allowed] |
| Default | "all" |
Description: Disallowed special tokens
9.1. Property item 0
| Type | const |
| Required | No |
Specific value: "all"
9.2. Property item 1
| Type | array of string |
| Required | No |
| Array restrictions | |
|---|---|
| Min items | N/A |
| Max items | N/A |
| Items unicity | False |
| Additional items | False |
| Tuple validation | See below |

| Each item of this array must be | Description |
|---|---|
| item 1 items | - |
9.2.1. item 1 items
| Type | string |
| Required | No |
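
Both `allowed_special` and `disallowed_special` accept either the constant `"all"` or an array of strings. A minimal sketch of that check follows. The function name `is_valid_special` is hypothetical, and the example token `<|endoftext|>` is only an illustrative value.

```python
def is_valid_special(value):
    """Return True if value is the constant "all" or a list of strings."""
    if isinstance(value, str):
        return value == "all"
    return isinstance(value, list) and all(isinstance(item, str) for item in value)


# Documented defaults: [] for allowed_special, "all" for disallowed_special.
examples = ["all", [], ["<|endoftext|>"], "none", [1, 2], None]
for example in examples:
    print(repr(example), is_valid_special(example))
```

An empty array is valid because the schema places no minimum on the number of items.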