Class TLcdCSVModelDecoder

java.lang.Object
com.luciad.format.csv.TLcdCSVModelDecoder
All Implemented Interfaces:
ILcdInputStreamFactoryCapable, ILcdModelDecoder

@LcdService(service=ILcdModelDecoder.class, priority=20000)
public final class TLcdCSVModelDecoder
extends Object
implements ILcdModelDecoder, ILcdInputStreamFactoryCapable
This model decoder decodes data in character-separated files.

Input files

File | Required | Entry point | Description
*.csv, *.tsv | x | x | File containing the data as character-separated values. The accepted extensions can be configured using the setExtensions(String[]) method.
*.csvt | | | File specifying the data type for each column. See the GDAL documentation and the GeoCSV specification for more information about this file. In addition to the types in the specification, a 'GeoJson' (case-insensitive) or a 'coordinates' (case-insensitive) data type can be specified.
*.cpg | | | A plain text file that identifies the character encoding of the .csv file. The file can contain a code page number or the name of a charset.

When the *.csvt file is present, it will be used to determine which columns contain the geometry information when using the decode(String) method. If such a file is not present, the decoder will try to auto-detect those columns based on the name of the column.

A *.csvt file cannot express every setting (for example, the number of header rows to skip). You can specify all settings by creating a TLcdCSVDataSource and passing it to the decodeSource(ILcdDataSource) method.
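
The *.cpg file described above can hold either a code page number or a charset name. The following is a minimal sketch of how such content could be mapped to a java.nio.charset.Charset using only the JDK. It is not the decoder's actual implementation: the class name CpgCharsetSketch, the candidate charset names tried for a numeric code page, and the fallback to UTF-8 for unrecognized content are all assumptions made for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the LuciadLightspeed API.
final class CpgCharsetSketch {

  // Maps the text content of a .cpg file to a Charset.
  // Assumption: unknown or empty content falls back to UTF-8.
  static Charset resolve(String cpgContent) {
    if (cpgContent == null) {
      return StandardCharsets.UTF_8;
    }
    String value = cpgContent.trim();
    if (value.isEmpty()) {
      return StandardCharsets.UTF_8;
    }
    if (value.chars().allMatch(Character::isDigit)) {
      // Code page 65001 is the Windows identifier for UTF-8.
      if (value.equals("65001")) {
        return StandardCharsets.UTF_8;
      }
      // Assumption: try a few common JDK naming schemes for code pages.
      for (String candidate : new String[] {"windows-" + value, "Cp" + value, "IBM" + value}) {
        if (isSupported(candidate)) {
          return Charset.forName(candidate);
        }
      }
      return StandardCharsets.UTF_8;
    }
    return isSupported(value) ? Charset.forName(value) : StandardCharsets.UTF_8;
  }

  // Charset.isSupported throws for syntactically illegal names; treat those as unsupported.
  private static boolean isSupported(String name) {
    try {
      return Charset.isSupported(name);
    } catch (IllegalArgumentException e) {
      return false;
    }
  }
}
```

For example, CpgCharsetSketch.resolve("ISO-8859-1") yields the ISO-8859-1 charset, and content that is not a legal charset name yields UTF-8 under the stated fallback assumption.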

Model reference

The model reference is obtained from an ILcdModelReferenceDecoder. The default reference decoder set on this model decoder is based on all model reference decoders annotated with the LcdService annotation. If this fails, the decoder's default model reference is returned. Unless set by the user, the default model reference is a WGS84 model reference.

Supported files

When using the decode(String) method, the source file can only be decoded when certain settings can be auto-detected, or match the default settings. The following list details what settings can be auto-detected, and which defaults are assumed:
  • First row of the csv file contains the column names, rest of the file is the data. If the first line contains a valid WKT or GeoJson string, the first line will be considered as data.
  • Separator: an attempt to auto-detect the separator is done by scanning the first line of the file. If the separator cannot be determined, a "," is assumed.
  • The column types will be determined by parsing the corresponding .csvt file. When no such file is available, the geometry columns will be detected by scanning the first lines of the source file. If the first line contains the column names, the geometry columns are determined based on the column names:
    • WKT geometries: if a column name starting with "WKT" is found (case-insensitive), it is assumed to contain WKT encoded geometries.
    • GeoJson geometries: if a column name starting with "geojson" is found (case-insensitive), it is assumed to contain GeoJson encoded geometries.
    • Point geometries: the following column names are recognized:
      • XY coordinates: a column named "coordinates"
      • X coordinate: a column named "x", or a column name containing any of the following: longitude, long, lon, easting, east (case insensitive)
      • Y coordinate: a column named "y", or a column name containing any of the following: latitude, lat, northing, north (case insensitive)
      • Z coordinate: a column named "z", or a column name containing any of the following: altitude, height, elevation (case insensitive)
    If that is not sufficient to find the geometry, the first data line will be scanned for a valid WKT, GeoJson or coordinates string.
  • If there is a column named "id" (case insensitive), it is assumed to be a unique identifier.
  • The encoding is determined by parsing the corresponding optional .cpg file. When no such file is present, the CSV file is assumed to be encoded as UTF-8.
If your csv file does not match these settings, you need to use the decodeSource(ILcdDataSource) method with a TLcdCSVDataSource which specifies the settings for your csv file.
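
The column-name rules listed above can be sketched in plain Java as follows. This is an illustration only, not the decoder's code: the class CsvColumnRoleSketch, the Role enum, the order in which the rules are checked, and the case-insensitive handling of the exact names "x", "y", "z" and "coordinates" are assumptions. A name that matches several rules (for example one containing both "east" and "north") is resolved here by the first matching rule.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the LuciadLightspeed API.
final class CsvColumnRoleSketch {

  enum Role { WKT, GEOJSON, XY, X, Y, Z, ID, ATTRIBUTE }

  // Substrings taken from the documented list of recognized column names.
  private static final List<String> X_HINTS = List.of("longitude", "long", "lon", "easting", "east");
  private static final List<String> Y_HINTS = List.of("latitude", "lat", "northing", "north");
  private static final List<String> Z_HINTS = List.of("altitude", "height", "elevation");

  // Classifies a header cell by name only; the cell content is not inspected.
  static Role classify(String columnName) {
    String name = columnName.trim().toLowerCase(Locale.ROOT);
    if (name.startsWith("wkt")) return Role.WKT;
    if (name.startsWith("geojson")) return Role.GEOJSON;
    if (name.equals("coordinates")) return Role.XY;
    if (name.equals("id")) return Role.ID;
    // Assumption: X is checked before Y, and Y before Z.
    if (name.equals("x") || containsAny(name, X_HINTS)) return Role.X;
    if (name.equals("y") || containsAny(name, Y_HINTS)) return Role.Y;
    if (name.equals("z") || containsAny(name, Z_HINTS)) return Role.Z;
    return Role.ATTRIBUTE;
  }

  private static boolean containsAny(String name, List<String> hints) {
    for (String hint : hints) {
      if (name.contains(hint)) return true;
    }
    return false;
  }
}
```

Because the point rules match on substrings, an unrelated column such as "plate" would be classified as a Y coordinate by this sketch. Files with such column names are a case where an explicit TLcdCSVDataSource is the safer route.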

Supported file transfer protocols

  • This model decoder supports all transfer protocols that are supported by the InputStreamFactory of this decoder.

Model structure

Model descriptor

Model elements

The model elements implement ILcdDataObject. The geometries can be retrieved using the ALcdShape.fromDomainObject(Object) method.

All other property fields of the data object will have a type as specified in the *.csvt file. If the file is not present, all fields will be considered Strings.

If one of the columns represents an ID, you can use the TLcdCSVDataSource.Builder.id(int) method to specify this column. The property corresponding to this column will be annotated with a TLcdPrimaryKeyAnnotation.

Sample code


 ILcdModelDecoder decoder = new TLcdCSVModelDecoder();
 ILcdModel model = decoder.decode("world.csv");
 

Thread safety

  • The decoding of models is not thread-safe.
  • The decoded models are thread-safe for read access.
Since:
2018.0
  • Constructor Details

    • TLcdCSVModelDecoder

      public TLcdCSVModelDecoder()
  • Method Details

    • getDisplayName

      public String getDisplayName()
      Description copied from interface: ILcdModelDecoder
      Returns a short, displayable name for the format that is decoded by this ILcdModelDecoder.
      Specified by:
      getDisplayName in interface ILcdModelDecoder
      Returns:
      the displayable name of this ILcdModelDecoder.
    • canDecodeSource

      public boolean canDecodeSource(String aSource)
      Checks whether this model decoder can decode the specified data source. It is acceptable for this method to return true for a source name while decode throws an exception for that same source name.

      For performance reasons, we strongly recommend that this will only be a simple test. For example: check the file extension of a file, but not that the file exists or contains expected content.

      Specified by:
      canDecodeSource in interface ILcdModelDecoder
      Parameters:
      aSource - the data source to be verified; typically a file name or a URL.
      Returns:
      true if the file extension is contained in getExtensions()
    • decode

      public ILcdModel decode(String aSourceName) throws IOException
      Description copied from interface: ILcdModelDecoder
      Creates a new model from the given data source.
      Specified by:
      decode in interface ILcdModelDecoder
      Parameters:
      aSourceName - the data source to be decoded; typically a file name or a URL.
      Returns:
      A model containing the decoded data. While null is allowed, implementors are advised to throw an error instead.
      Throws:
      IOException - for any exceptions caused by IO problems or invalid data. Since decoding invalid data almost always results in RuntimeExceptions (NullPointerException, IndexOutOfBoundsException, IllegalArgumentException, ...) in unexpected places, implementations are advised to catch RuntimeExceptions in their decode() method and wrap them in an IOException, as illustrated in the code snippet below.
      
         public ILcdModel decode( String aSourceName ) throws IOException {
            try (InputStream input = fInputStreamFactory.createInputStream(aSourceName)) {
               // Perform decoding ...
            } catch (RuntimeException e) {
               throw new IOException(e);
            }
         }
       
    • canDecodeSource

      public boolean canDecodeSource(ILcdDataSource aDataSource)
      Description copied from interface: ILcdModelDecoder

      Checks whether this model decoder can decode the data source(s), identified by the passed ILcdDataSource.

      For performance reasons, we strongly recommend that this will only be a simple test. For example: check the instance class of aDataSource, or check the file extension if it is a TLcdDataSource.

      The default implementation of this method will check if the given ILcdDataSource is a TLcdDataSource. If not, this method returns false. Otherwise, it delegates the source to the ILcdModelDecoder.canDecodeSource(String) method.

      Specified by:
      canDecodeSource in interface ILcdModelDecoder
      Parameters:
      aDataSource - the ILcdDataSource to be verified.
      Returns:
      true if this decoder can likely decode the data specified by aDataSource, false otherwise.
    • decodeSource

      public ILcdModel decodeSource(ILcdDataSource aDataSource) throws IOException
      Description copied from interface: ILcdModelDecoder

      Creates a new model from the given data source.

      By default, this method:

      Specified by:
      decodeSource in interface ILcdModelDecoder
      Parameters:
      aDataSource - the ILcdDataSource to be decoded.
      Returns:
      a model containing the decoded data. While null is allowed, implementors are advised to throw an error instead.
      Throws:
      IOException - for any exceptions caused by IO problems or invalid data. Since decoding invalid data almost always results in RuntimeExceptions (NullPointerException, IndexOutOfBoundsException, IllegalArgumentException, ...) in unexpected places, implementations are advised to catch RuntimeExceptions in their decode() method and wrap them in an IOException, as illustrated in the code snippet below.
      
       public ILcdModel decodeSource(ILcdDataSource aDataSource) throws IOException {
         try {
           // Perform decoding ...
         } catch (RuntimeException e) {
           throw new IOException(e);
         }
       }
       
    • setDefaultModelReference

      public void setDefaultModelReference(ILcdModelReference aDefaultModelReference)
      Sets the model reference to be used for models when the model reference decoder is set to null.
      Parameters:
      aDefaultModelReference - the model reference to be used for models when the model reference decoder is set to null.
    • getDefaultModelReference

      public ILcdModelReference getDefaultModelReference()
      Returns the model reference to be used for models when the model reference decoder is set to null.
      Returns:
      the model reference to be used for models when the model reference decoder is set to null.
    • getModelReferenceDecoder

      public ILcdModelReferenceDecoder getModelReferenceDecoder()
      Returns the decoder used to produce model references.
      Returns:
      the decoder to produce model references.
    • setModelReferenceDecoder

      public void setModelReferenceDecoder(ILcdModelReferenceDecoder aModelReferenceDecoder)
      Sets the decoder to use to produce model references for models created with this decoder.
      Parameters:
      aModelReferenceDecoder - the decoder to use to produce model references for models created with this decoder.
    • getInputStreamFactory

      public ILcdInputStreamFactory getInputStreamFactory()
      Description copied from interface: ILcdInputStreamFactoryCapable
      Returns the input stream factory that is used.
      Specified by:
      getInputStreamFactory in interface ILcdInputStreamFactoryCapable
      Returns:
      the input stream factory that is used.
    • setInputStreamFactory

      public void setInputStreamFactory(ILcdInputStreamFactory aInputStreamFactory)
      Description copied from interface: ILcdInputStreamFactoryCapable
      Sets the input stream factory to be used.
      Specified by:
      setInputStreamFactory in interface ILcdInputStreamFactoryCapable
      Parameters:
      aInputStreamFactory - the input stream factory to be used.
    • getExtensions

      public String[] getExtensions()
      Returns the file extensions that the canDecodeSource(String) method uses to determine whether to accept a file.
      Returns:
      the file extensions that the canDecodeSource(String) method uses to determine whether to accept a file.
    • setExtensions

      public void setExtensions(String[] aExtensions)
      Sets the file extensions that the canDecodeSource(String) method uses to determine whether to accept a file.
      Parameters:
      aExtensions - The extensions to accept
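
An extension-based acceptance test of the kind canDecodeSource(String) is documented to perform can be sketched as below. This is a sketch under assumptions, not the decoder's implementation: the class ExtensionCheckSketch is hypothetical, extensions are assumed to be given without a leading dot, and the comparison is assumed to be case-insensitive.

```java
import java.util.Locale;

// Hypothetical helper, not part of the LuciadLightspeed API.
final class ExtensionCheckSketch {

  // Returns true when the text after the last '.' in the source name
  // equals one of the accepted extensions. No file access is performed,
  // in line with the advice to keep this check cheap.
  static boolean hasAcceptedExtension(String source, String[] extensions) {
    if (source == null || extensions == null) {
      return false;
    }
    int dot = source.lastIndexOf('.');
    if (dot < 0 || dot == source.length() - 1) {
      return false; // no extension at all
    }
    String extension = source.substring(dot + 1).toLowerCase(Locale.ROOT);
    for (String accepted : extensions) {
      if (accepted != null && extension.equals(accepted.toLowerCase(Locale.ROOT))) {
        return true;
      }
    }
    return false;
  }
}
```

With the extensions {"csv", "tsv"}, the sketch accepts "world.csv" and "data/points.TSV" and rejects "world.shp".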
    • getDateFormat

      public DateFormat getDateFormat()
      Returns the date format used for parsing TLcdCSVDataSource.ColumnType.DATE_TIME columns. The default is a SimpleDateFormat with pattern "yyyy-MM-dd HH:mm:ssXXX".
      Returns:
      the used format
      Since:
      2020.1
    • setDateFormat

      public void setDateFormat(DateFormat aDateFormat)
      Configures the date format to use for parsing TLcdCSVDataSource.ColumnType.DATE_TIME columns.
      Parameters:
      aDateFormat - the format to use
      Since:
      2020.1
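
To see which cell values the default pattern "yyyy-MM-dd HH:mm:ssXXX" accepts, the following JDK-only sketch parses a value with an equivalent SimpleDateFormat. The class DateTimeColumnSketch and the method parseOrNull are hypothetical names, and the use of Locale.ROOT and the null return for unparseable values are choices made for this illustration; how the decoder itself treats unparseable cells is not specified here.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper, not part of the LuciadLightspeed API.
final class DateTimeColumnSketch {

  // The default pattern documented for getDateFormat().
  static final String DEFAULT_PATTERN = "yyyy-MM-dd HH:mm:ssXXX";

  // Parses a cell value such as "2020-06-01 12:30:00+02:00".
  // Returns null when the value does not match the pattern.
  static Date parseOrNull(String cell) {
    // SimpleDateFormat is not thread-safe, so a new instance is created per call.
    DateFormat format = new SimpleDateFormat(DEFAULT_PATTERN, Locale.ROOT);
    try {
      return format.parse(cell);
    } catch (ParseException e) {
      return null;
    }
  }
}
```

The XXX part of the pattern requires an ISO 8601 time zone such as "+02:00" or "Z", so a value like "2020-06-01" or "2020-06-01 12:30:00" without a zone does not parse. For files that use another layout, a matching DateFormat can be supplied through setDateFormat(DateFormat).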