Coverage for src/qdrant_loader/core/chunking/strategy/html/__init__.py: 100%

5 statements  

coverage.py v7.10.6, created at 2025-09-11 07:21 +0000

"""HTML strategy package with modular components for HTML document chunking.

This package contains HTML-specific implementations of the chunking strategy components:
- HTMLDocumentParser: Parses HTML DOM structure and semantic elements
- HTMLSectionSplitter: Intelligently splits HTML content based on semantic boundaries
- HTMLMetadataExtractor: Extracts HTML-specific metadata (DOM paths, accessibility, etc.)
- HTMLChunkProcessor: Creates HTML chunk documents with enhanced metadata
"""

from .html_chunk_processor import HTMLChunkProcessor
from .html_document_parser import HTMLDocumentParser
from .html_metadata_extractor import HTMLMetadataExtractor
from .html_section_splitter import HTMLSectionSplitter

__all__ = [
    "HTMLDocumentParser",
    "HTMLSectionSplitter",
    "HTMLMetadataExtractor",
    "HTMLChunkProcessor",
]
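
The `__all__` list above determines which names a wildcard import of the package exposes. The following is a minimal sketch of that behavior, not the package itself: `demo_html_strategy` is a hypothetical in-memory module, the four classes are empty placeholders standing in for the real implementations, and `InternalHelper` is an invented name used only to show a public-looking name being left out.

```python
import sys
import types

# Hypothetical stand-in module; the class bodies are placeholders,
# not the real qdrant_loader implementations.
demo = types.ModuleType("demo_html_strategy")
exec(
    '''
class HTMLDocumentParser: ...
class HTMLSectionSplitter: ...
class HTMLMetadataExtractor: ...
class HTMLChunkProcessor: ...
class InternalHelper: ...  # invented name, deliberately absent from __all__

__all__ = [
    "HTMLDocumentParser",
    "HTMLSectionSplitter",
    "HTMLMetadataExtractor",
    "HTMLChunkProcessor",
]
''',
    demo.__dict__,
)
# Register the module so the import statement below can find it.
sys.modules["demo_html_strategy"] = demo

# A wildcard import copies only the names listed in __all__.
namespace = {}
exec("from demo_html_strategy import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

Here `InternalHelper` exists on the module but is not copied by the wildcard import, because `__all__` does not list it. Explicit imports such as `from demo_html_strategy import InternalHelper` still work, since `__all__` only affects the `import *` form.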