TableChopper (read ascii file)


The TableChopper is a script block that uses the str StorageManager and a set of GeoDMS functions to read data from a delimited ASCII file.

Delimited ASCII files are often csv files. The gdal.vect StorageManager is the advised way to read these files. Use the TableChopper for very large files or for files with separators other than a comma or semicolon.
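For comparison, reading a comma- or semicolon-separated file with gdal.vect could look roughly like the sketch below. The item name csvdata is hypothetical and the file path is borrowed from the example on this page; the fields of a csv file are then typically available as string attributes of the domain unit.

```
// Sketch only: csvdata is a hypothetical item name
unit<uint32> csvdata
:  StorageName     = "%projdir%/data/TableChopper.csv"
,  StorageType     = "gdal.vect"
,  StorageReadOnly = "True";
```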

To write data to delimited ASCII files, use the TableComposer.

example

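container TableChopper
{
   // File, domain unit and separator; case parameters when used as a template
   parameter<string> filename       := '%projdir%/data/TableChopper.csv';
   unit<uint8>       domain         : nrofrows = 5;
   parameter<string> fieldseparator := ';';

   // Full contents of the file, read as one string
   parameter<string> filedata
   :  StorageType = "str"
   ,  StorageName = "=filename";

   // First line of the file, holding the field names
   parameter<string> headerline := readLines(filedata, void, 0);

   // One element per field: number of separators in the header + 1
   unit<uint32> field := Range(uint32, 0, strcount(headerline, fieldseparator) + 1)
   {
      attribute<string> name := ReadArray(headerline, field, string, 0);
   }

   // Remaining lines, read from the position where the header line ends
   attribute<string> bodylines (domain) := readLines(filedata, domain, headerline/ReadPos);

   // One string attribute per field; each field is read from the ReadPos of the previous field
   container data := for_each_nedv(
       field/name
      ,'ReadElems(
          bodylines
         ,string
         ,'+ MakeDefined(field/name[id(field)-1] + '/ReadPos','const(0, domain)')+'
       )'
      ,domain
      ,string
   );
}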

explanation

  • The filename parameter refers to the ASCII file being read.
  • The domain unit, configured here as uint8, is the domain unit of the resulting attributes. In the example it has a cardinality of 5, the number of body lines that are read.
  • The fieldseparator parameter configures the separator used in the ASCII file between the fields.

When the TableChopper is used as a template to read multiple files, these first three items are often used as case parameters.
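A minimal sketch of such a template is given below. The names TableChopperT, sources, regions and prices, the file paths and the row counts are hypothetical, and the domain unit is assumed to be uint32 here to allow for larger files; the body of the template would hold the same items as the example above.

```
// Hypothetical source descriptions, one container per file
container sources
{
   parameter<string> separator := ';';

   container regions
   {
      parameter<string> filename := '%projdir%/data/regions.csv';
      unit<uint32>      domain   : nrofrows = 12;
   }
   container prices
   {
      parameter<string> filename := '%projdir%/data/prices.csv';
      unit<uint32>      domain   : nrofrows = 250;
   }
}

template TableChopperT
{
   // begin case parameters
   parameter<string> filename;
   unit<uint32>      domain;
   parameter<string> fieldseparator;
   // end case parameters

   // same items as in the example: filedata, headerline, field, bodylines and data
}

// One instantiation per file
container regions := TableChopperT(sources/regions/filename, sources/regions/domain, sources/separator);
container prices  := TableChopperT(sources/prices/filename,  sources/prices/domain,  sources/separator);
```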

  • The filedata string parameter holds all the data from the ASCII file. It is read with the Str StorageManager.
  • The headerline parameter reads the first line from the filedata parameter.
  • The domain unit field has one element per field in the file and one subitem, name. This name attribute contains the names of the fields, read from the header of the ASCII file.
  • The bodylines attribute reads the other (non-header) lines from the filedata parameter.

The resulting data container contains a subitem for each field. The bodylines attribute is split into the separate values per field. The resulting items are all string attributes. Use conversion functions to convert the string values to the desired values units / value types.
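As an illustration of such conversions, the sketch below could be added inside the TableChopper container. It assumes the file has fields named nr_inhabitants and price; these field names, the euro unit and the converted container are hypothetical and not part of the example file.

```
// Hypothetical values unit for the converted prices
unit<float32> euro := BaseUnit('euro', float32);

container converted
{
   // Conversion to a value type
   attribute<uint32> nr_inhabitants (domain) := uint32(data/nr_inhabitants);

   // Conversion to a values unit
   attribute<euro>   price          (domain) := value(data/price, euro);
}
```

String values that cannot be interpreted as a number are expected to result in null values, so it is worth checking the converted attributes against the source strings.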