Qore jni Module 2.6.0

Introduction to the WordDataProvider Module

The WordDataProvider module provides data providers for reading and writing Microsoft Word documents through the DataProvider API. It supports the modern .docx format (Word 2007 and later) using the XWPF component of Apache POI.

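The examples on this page use the following classes provided by this module:

  • WordReadDataProvider: reads paragraphs or table rows from a Word document
  • WordWriteDataProvider: writes paragraphs or table rows to a Word document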

Reading Word Documents

Reading Paragraphs

%requires WordDataProvider
# Read paragraphs from a file path
WordReadDataProvider dp("document.docx", {"content_type": "paragraphs"});
list<hash<auto>> paragraphs = map $1, dp.searchRecords();
# Each record has:
# - "text": The paragraph text content
# - "style": The paragraph style (e.g., "Normal", "Heading1")
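
Because each record carries both "text" and "style", the paragraph records can be filtered by style. The following is a minimal sketch that collects the text of all heading paragraphs; it assumes the script runs with %new-style and that heading styles are reported with names starting with "Heading", as in the example values above.

```qore
%new-style
%requires WordDataProvider

# Read all paragraphs and keep only the text of those with a heading style
WordReadDataProvider dp("document.docx", {"content_type": "paragraphs"});
list<auto> headings = map $1.text, dp.searchRecords(), $1.style =~ /^Heading/;

# Print a simple outline, one heading per line
foreach auto heading in (headings) {
    printf("%s\n", heading);
}
```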

Reading Tables

%requires WordDataProvider
# Read table data from a document
WordReadDataProvider dp("document.docx", {
    "content_type": "table",
    "table_index": 0,     # First table (0-based index)
    "header_row": True,   # First row contains headers
});
list<hash<auto>> rows = map $1, dp.searchRecords();
# Read with explicit headers
WordReadDataProvider dp2("document.docx", {
    "content_type": "table",
    "headers": ("Name", "Department", "Salary"),
});
list<hash<auto>> rows2 = map $1, dp2.searchRecords();
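
Rows can also be processed one at a time rather than collected into a list first. This sketch assumes the first table has "Name" and "Department" header cells (hypothetical column names borrowed from the writing examples), so that each row is returned as a hash keyed by those headers.

```qore
%new-style
%requires WordDataProvider

WordReadDataProvider dp("document.docx", {
    "content_type": "table",
    "table_index": 0,
    "header_row": True,
});

# Iterate over the record iterator directly instead of building a list
foreach hash<auto> row in (dp.searchRecords()) {
    printf("%s (%s)\n", row.Name, row.Department);
}
```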

Read Options

  • path: Path to the Word document
  • stream: Input stream for Word data
  • data: Binary Word document data
  • content_type: "paragraphs" (default) or "table"
  • table_index: Index of table to read (0-based, for table mode)
  • header_row: If True, first row of table contains headers
  • headers: List of header names to use
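
As an illustration of how these options combine, the sketch below wraps the table options in a small function. The function read_table() is a hypothetical helper and not part of the module; it only uses the constructor form and the options shown above.

```qore
%new-style
%requires WordDataProvider

# Hypothetical helper: return all rows of the table at the given 0-based index,
# treating the first row of the table as the header row
list<hash<auto>> sub read_table(string path, int index) {
    WordReadDataProvider dp(path, {
        "content_type": "table",
        "table_index": index,
        "header_row": True,
    });
    return map $1, dp.searchRecords();
}

# Read the second table in the document
list<hash<auto>> rows = read_table("document.docx", 1);
printf("read %d row(s)\n", rows.size());
```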

Writing Word Documents

Writing Paragraphs

%requires WordDataProvider
# Write paragraphs to a file
WordWriteDataProvider dp("output.docx", {
    "content_type": "paragraphs",
    "title": "My Document",
});
dp.createRecord({"text": "First paragraph content", "style": "Normal"});
dp.createRecord({"text": "Section Heading", "style": "Heading1"});
dp.createRecord({"text": "More content here.", "style": "Normal"});
dp.commit(); # Write the file
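
The same calls can be driven from structured data. The following sketch assumes a hypothetical list of sections, each with a heading and a list of body paragraphs, and writes one record per paragraph before committing once at the end.

```qore
%new-style
%requires WordDataProvider

# Hypothetical input data: each section has a heading and body paragraphs
list<auto> sections = (
    {"heading": "Overview", "body": ("First paragraph.", "Second paragraph.")},
    {"heading": "Details", "body": ("More content here.", "Closing remarks.")},
);

WordWriteDataProvider dp("report.docx", {
    "content_type": "paragraphs",
    "title": "Generated Report",
});

foreach hash<auto> section in (sections) {
    # One heading record per section, followed by its body paragraphs
    dp.createRecord({"text": section.heading, "style": "Heading2"});
    foreach string paragraph in (section.body) {
        dp.createRecord({"text": paragraph, "style": "Normal"});
    }
}
dp.commit(); # Write the file
```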

Writing Tables

%requires WordDataProvider
# Write table data to a file
WordWriteDataProvider dp("output.docx", {
    "content_type": "table",
    "headers": ("Name", "Department", "Salary"),
    "title": "Employee Directory",
});
dp.createRecord({"Name": "Alice Smith", "Department": "Engineering", "Salary": "75000"});
dp.createRecord({"Name": "Bob Johnson", "Department": "Marketing", "Salary": "65000"});
dp.commit(); # Write the file
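
When the rows already exist as a list of hashes, the header list can be derived from the data instead of being typed out. This is a sketch under the assumption that every record has the same keys as the first one; Qore hashes preserve insertion order, so the columns appear in the order the keys were defined.

```qore
%new-style
%requires WordDataProvider

# Hypothetical input data with identical keys in every record
list<auto> employees = (
    {"Name": "Alice Smith", "Department": "Engineering", "Salary": "75000"},
    {"Name": "Bob Johnson", "Department": "Marketing", "Salary": "65000"},
);

WordWriteDataProvider dp("employees.docx", {
    "content_type": "table",
    # Use the keys of the first record as the column headers
    "headers": keys employees[0],
});

foreach hash<auto> employee in (employees) {
    dp.createRecord(employee);
}
dp.commit(); # Write the file
```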

Writing to Binary

%requires WordDataProvider
# Create document in memory
WordWriteDataProvider dp({
    "content_type": "paragraphs",
    "title": "In-Memory Document",
});
dp.createRecord({"text": "Some content"});
# Get binary data instead of writing to file
binary data = dp.getData();
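
The binary value returned by getData() can be handled like any other Qore binary, for example sent over a network connection or stored later. The sketch below reports its size and saves it with the standard Qore File class; the output file name is an arbitrary example.

```qore
%new-style
%requires WordDataProvider

# Build the document in memory, as in the example above
WordWriteDataProvider dp({
    "content_type": "paragraphs",
    "title": "In-Memory Document",
});
dp.createRecord({"text": "Some content"});
binary docx = dp.getData();
printf("generated %d bytes\n", docx.size());

# Persist the bytes with the File class; open2() throws an exception on failure
File f();
f.open2("in-memory-copy.docx", O_CREAT | O_WRONLY | O_TRUNC);
f.write(docx);
f.close();
```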

Write Options

  • path: Output file path
  • stream: Output stream
  • content_type: "paragraphs" (default) or "table"
  • headers: List of column headers (for table mode)
  • title: Optional document title (added as Heading1)

Paragraph Styles

When writing paragraphs, the following style names are supported:

  • "Normal": Regular paragraph text
  • "Heading1": Main heading (bold, 16pt)
  • "Heading2": Secondary heading (bold, 14pt)
  • "Heading3": Tertiary heading (bold, 12pt)

Error Handling

Common exceptions:

  • WORD-READ-OPTION-ERROR: Invalid read options or option conflicts
  • WORD-WRITE-OPTION-ERROR: Invalid write options or option conflicts
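
These exceptions can be handled with a standard Qore try/catch block by checking the "err" key of the exception hash. This is a minimal sketch; which option combinations actually raise the error is not specified here, so the example simply wraps an ordinary table read.

```qore
%new-style
%requires WordDataProvider

try {
    WordReadDataProvider dp("document.docx", {
        "content_type": "table",
        "table_index": 0,
        "header_row": True,
    });
    list<hash<auto>> rows = map $1, dp.searchRecords();
    printf("read %d row(s)\n", rows.size());
} catch (hash<ExceptionInfo> ex) {
    if (ex.err == "WORD-READ-OPTION-ERROR") {
        # Report invalid or conflicting read options
        printf("invalid read options: %s\n", ex.desc);
    } else {
        # Let all other errors propagate to the caller
        rethrow;
    }
}
```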

Release Notes

WordDataProvider v1.0

  • initial release of the module
  • support for reading Word documents (.docx format)
  • support for writing Word documents (.docx format)
  • paragraph and table read/write modes
  • header row detection for tables
  • binary data input/output support