Qore CsvUtil Module Reference  1.4
CsvUtil Module

Introduction to the CsvUtil Module

The CsvUtil module provides functionality for parsing CSV-like files.

To use this module, add "%requires CsvUtil" to your code.

All the public symbols in the module are defined in the CsvUtil namespace.

Currently the module provides the following classes:

Note that the CsvFileIterator class can be used to parse arbitrary text files; the field separator character can be specified in the constructor, as well as the quote character and end of line sequence. See the constructor documentation for more information.
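
As a rough illustration of the note above, the following sketch reads a semicolon-separated text file that uses single quotes around fields. The file name is hypothetical, and the option names "separator", "quote", and "eol" are assumptions that this introduction does not confirm; check the CsvFileIterator constructor documentation for the exact names.

```qore
#!/usr/bin/env qore
%new-style
%requires CsvUtil
# assumed option names: "separator", "quote", "eol" (verify against the constructor documentation)
# "example-file.txt" is a hypothetical semicolon-separated input file
CsvFileIterator i("example-file.txt", ("separator": ";", "quote": "'", "eol": "\n"));
while (i.next())
    printf("%d: %y\n", i.index(), i.getValue());
```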

Examples:

#!/usr/bin/env qore
%new-style
%requires CsvUtil
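# read each record from the input file and copy it to a new CSV file
CsvFileIterator i("example-file.csv");
CsvFileWriter writer("example-file-copy.csv");
while (i.next()) {
    # print the record index and the record as a hash
    printf("%d: %y\n", i.index(), i.getValue());
    writer.writeLine(i.getValue());
}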

If "example-file.csv" is:

UK,1234567890,"Sony, Xperia S",31052012
UK,1234567891,"Sony, Xperia S",31052012
UK,1234567892,"Sony, Xperia S",31052012
UK,1234567893,"Sony, Xperia S",31052012

The data is read verbatim: each value is returned as a string, and header names are generated numerically. The output is:

1: {0: "UK", 1: "1234567890", 2: "Sony, Xperia S", 3: "31052012"}
2: {0: "UK", 1: "1234567891", 2: "Sony, Xperia S", 3: "31052012"}
3: {0: "UK", 1: "1234567892", 2: "Sony, Xperia S", 3: "31052012"}
4: {0: "UK", 1: "1234567893", 2: "Sony, Xperia S", 3: "31052012"}

In addition, "example-file-copy.csv" will contain the data from the original file, formatted as CSV.

If header names are provided and field types are specified, the output looks different:

#!/usr/bin/env qore
%new-style
%requires CsvUtil
CsvFileIterator i("example-file.csv", ("headers": ("cc", "serno", "desc", "received"), "fields": ("serno": "int", "received": ("type": "date", "format": "DDMMYYYY"))));
while (i.next())
    printf("%d: %y\n", i.index(), i.getValue());

Now the hash keys in each record returned are those given in the constructor, and the fields "serno" and "received" are given other data types; this produces:

1: {cc: "UK", serno: 1234567890, desc: "Sony, Xperia S", received: 2012-05-31 00:00:00 Thu +02:00 (CEST)}
2: {cc: "UK", serno: 1234567891, desc: "Sony, Xperia S", received: 2012-05-31 00:00:00 Thu +02:00 (CEST)}
3: {cc: "UK", serno: 1234567892, desc: "Sony, Xperia S", received: 2012-05-31 00:00:00 Thu +02:00 (CEST)}
4: {cc: "UK", serno: 1234567893, desc: "Sony, Xperia S", received: 2012-05-31 00:00:00 Thu +02:00 (CEST)}

Use the "header-lines" and "header-names" options to automatically read the header names from the file if present. Use the "fields" option to describe the fields and perform transformations on the data read. For more information, see the CsvFileIterator class.
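
A minimal sketch of the header options, assuming a hypothetical input file "example-file-with-header.csv" whose first line is cc,serno,desc,received. The value types shown here (an integer line count for "header-lines" and a boolean for "header-names") are assumptions; the CsvFileIterator class documentation is authoritative.

```qore
#!/usr/bin/env qore
%new-style
%requires CsvUtil
# hypothetical input file whose first line is: cc,serno,desc,received
# assumption: "header-lines" takes the number of header lines and "header-names" takes a boolean
CsvFileIterator i("example-file-with-header.csv", ("header-lines": 1, "header-names": True, "fields": ("serno": "int", "received": ("type": "date", "format": "DDMMYYYY"))));
while (i.next())
    printf("%d: %y\n", i.index(), i.getValue());
```

With the header names taken from the file, the "headers" option used in the earlier example should not be needed.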

Release Notes

Version 1.4

  • fixed the "format" field option when used with "*date" field types
  • implemented the "tolwr" parser option
  • changed the default field type when parsing and generating CSV files from "string" to "*string"
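
The options touched by this release might be combined as in the following sketch. The input file is hypothetical, and the behavior described in the comments is an assumption rather than something stated in these notes: "tolwr" is taken to be a boolean that converts header names read from the file to lower case, and "*date" is taken to accept empty values as well as dates.

```qore
#!/usr/bin/env qore
%new-style
%requires CsvUtil
# hypothetical input file whose first line is: CC,SERNO,DESC,RECEIVED
# assumption: "tolwr" is a boolean that lower-cases header names read from the file
# assumption: "*date" accepts empty values in addition to dates in the given format
CsvFileIterator i("example-file-with-header.csv", ("header-lines": 1, "header-names": True, "tolwr": True, "fields": ("received": ("type": "*date", "format": "DDMMYYYY"))));
while (i.next())
    printf("%d: %y\n", i.index(), i.getValue());
```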

Version 1.3

  • added the "write-headers" option to CsvUtil::AbstractCsvWriter and subclasses to enable headers to be suppressed
  • added the "optimal-quotes" option to CsvUtil::AbstractCsvWriter and subclasses to enable more efficient CSV output (now the default output option); to revert to the previous behavior (where all fields are quoted regardless of data type or content), set this option to False in the constructor
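
As a sketch of how the two writer options above might be used, the following copies a file with header output suppressed and every field quoted. It assumes that CsvFileWriter accepts an option hash as its second constructor argument, in the same way as CsvFileIterator; the writer class documentation should be checked for the exact signature.

```qore
#!/usr/bin/env qore
%new-style
%requires CsvUtil
CsvFileIterator i("example-file.csv");
# assumption: options are passed as a hash in the second constructor argument
# "write-headers": False suppresses header output; "optimal-quotes": False quotes every field
CsvFileWriter writer("example-file-copy.csv", ("write-headers": False, "optimal-quotes": False));
while (i.next())
    writer.writeLine(i.getValue());
```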

Version 1.2

Version 1.1

Version 1.0

  • initial version of module