---

title: create_chunks
slug: /reference/python/agents/search/document-processor/create-chunks
description: Split document content into chunks using the configured chunking strategy.
max-toc-depth: 3
---

Split document content into chunks using the configured chunking strategy.
Each chunk includes metadata about its source file, section, and position
within the original document.

<Note>
  The `content` parameter should be the actual text content of the document,
  not a file path. Use the appropriate extraction method first for binary formats.
</Note>

## **Parameters**

<ParamField path="content" type="str" required={true} toc={true}>
  Document text content to chunk.
</ParamField>

<ParamField path="filename" type="str" required={true} toc={true}>
  Name of the source file, used for metadata in each chunk.
</ParamField>

<ParamField path="file_type" type="str" required={true} toc={true}>
  File extension or type (e.g., `"md"`, `"py"`, `"txt"`).
</ParamField>

## **Returns**

`list[dict]` -- A list of chunk dictionaries, each containing:

* `content` (str) -- the chunk text
* `filename` (str) -- source filename
* `section` (str | None) -- section name or hierarchy path
* `start_line` (int | None) -- starting line number in the source
* `end_line` (int | None) -- ending line number in the source
* `metadata` (dict) -- additional metadata (file type, word count, chunk method, etc.)
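The shape of a single returned chunk can be sketched as follows. The values here are illustrative, and the exact keys inside `metadata` depend on the configured chunking strategy:

```python
# Illustrative chunk dictionary (hypothetical values; metadata keys
# vary by chunking strategy and file type).
chunk = {
    "content": "## Installation\n\nInstall the package with pip.",
    "filename": "README.md",
    "section": "Installation",
    "start_line": 12,
    "end_line": 14,
    "metadata": {
        "file_type": "md",
        "word_count": 6,
        "chunk_method": "paragraph",
    },
}
```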

## **Example**

```python {8}
from signalwire.search import DocumentProcessor

processor = DocumentProcessor(chunking_strategy="paragraph")

with open("README.md") as f:
    content = f.read()

chunks = processor.create_chunks(content, "README.md", "md")
for chunk in chunks:
    print(f"[{chunk['section']}] {len(chunk['content'])} chars")
```