This article explains how you can make LuciadLightspeed decode data from Amazon S3. It helps you understand:
Accessing data using the Amazon Web Services (AWS) SDK
Updating the AWS SDK
The expected dataset structure on S3
Accessing data using the AWS SDK
As explained in Working with models, LuciadLightspeed accesses data by decoding the data into a model. The decoding is the responsibility of an ILcdModelDecoder. Different model decoders support different transfer protocols for the data they decode. However, many model decoders use a common abstraction to access the data: an ILcdInputStreamFactory. The model decoder usually indicates this by implementing the ILcdInputStreamFactoryCapable interface.
The default implementation of ILcdInputStreamFactory, TLcdInputStreamFactory, supports file system and HTTP(S) file transfers. It knows nothing about connecting to AWS, though. In particular, it doesn’t know how to authenticate or how to perform partial object transfers.
LuciadLightspeed also provides other implementations of ILcdInputStreamFactory. One such implementation supports accessing an Amazon S3 bucket using s3:// URLs; let’s call it S3InputStreamFactory. It’s implemented using the S3Client API from the AWS SDK for Java 2.x. To use S3InputStreamFactory, you must configure the environment in which you’re using LuciadLightspeed with the appropriate credentials and region settings for the S3Client. ILcdInputStreamFactoryCapable model decoders can then decode data when you give them an s3://-style URL as the source name.
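To make the shape of an s3://-style source name concrete, the following minimal sketch splits such a name into a bucket and an object key. The class S3SourceName and the bucket and key names are hypothetical and only serve as illustration; this is not the code that S3InputStreamFactory uses internally.

```java
/** Hypothetical helper: splits an s3:// source name into bucket and object key. */
final class S3SourceName {
    private static final String SCHEME = "s3://";

    final String bucket;
    final String key;

    private S3SourceName(String bucket, String key) {
        this.bucket = bucket;
        this.key = key;
    }

    /** Returns true if the source name starts with the s3:// scheme (case-insensitive). */
    static boolean isS3(String sourceName) {
        return sourceName != null
                && sourceName.regionMatches(true, 0, SCHEME, 0, SCHEME.length());
    }

    /** Parses s3://bucket/key; everything after the first slash is the object key. */
    static S3SourceName parse(String sourceName) {
        if (!isS3(sourceName)) {
            throw new IllegalArgumentException("Not an s3:// source name: " + sourceName);
        }
        String rest = sourceName.substring(SCHEME.length());
        int slash = rest.indexOf('/');
        // Reject names without a bucket or without an object key.
        if (slash <= 0 || slash == rest.length() - 1) {
            throw new IllegalArgumentException("Expected s3://bucket/key: " + sourceName);
        }
        return new S3SourceName(rest.substring(0, slash), rest.substring(slash + 1));
    }
}
```

With this sketch, S3SourceName.parse("s3://my-bucket/imagery/world.tif") yields the bucket my-bucket and the key imagery/world.tif.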
All LuciadLightspeed model decoders that use an ILcdInputStreamFactory to access the data are initially configured with a composite input stream factory. It’s backed by all the implementations that the Java service loader makes available. The service loader makes both the default implementation and S3InputStreamFactory available by default.
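The composite idea can be sketched with plain Java. The interface SourceStreamFactory, its canHandle method, and the two stub classes below are hypothetical stand-ins, not the LuciadLightspeed API. The sketch only assumes that a composite asks each delegate in turn whether it can handle a source name; in a real application, the delegate list could be filled from java.util.ServiceLoader.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an input stream factory; not the LuciadLightspeed interface. */
interface SourceStreamFactory {
    boolean canHandle(String sourceName);
    InputStream open(String sourceName) throws IOException;
}

/** Delegates to the first registered factory that claims it can handle the source name. */
final class CompositeStreamFactory implements SourceStreamFactory {
    private final List<SourceStreamFactory> delegates;

    CompositeStreamFactory(List<? extends SourceStreamFactory> delegates) {
        this.delegates = new ArrayList<>(delegates);
    }

    /** Returns the first delegate that can handle the source name, or null if none can. */
    SourceStreamFactory select(String sourceName) {
        for (SourceStreamFactory delegate : delegates) {
            if (delegate.canHandle(sourceName)) {
                return delegate;
            }
        }
        return null;
    }

    @Override
    public boolean canHandle(String sourceName) {
        return select(sourceName) != null;
    }

    @Override
    public InputStream open(String sourceName) throws IOException {
        SourceStreamFactory delegate = select(sourceName);
        if (delegate == null) {
            throw new IOException("No input stream factory for " + sourceName);
        }
        return delegate.open(sourceName);
    }
}

/** Stub factory for illustration: handles source names with a given prefix. */
final class PrefixStubFactory implements SourceStreamFactory {
    private final String prefix;
    private final String label;

    PrefixStubFactory(String prefix, String label) {
        this.prefix = prefix;
        this.label = label;
    }

    @Override
    public boolean canHandle(String sourceName) {
        return sourceName.startsWith(prefix);
    }

    @Override
    public InputStream open(String sourceName) {
        // Returns the label as stream content instead of real data.
        return new ByteArrayInputStream(label.getBytes(StandardCharsets.UTF_8));
    }
}
```

In a design like this one, two delegates that both claim s3:// source names would make the outcome depend on the order of the delegate list.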
Implementing your own S3InputStreamFactory
If the internal S3InputStreamFactory doesn’t satisfy your needs, you can replace it with your own implementation. To make this easier, its entire implementation is also available as sample code. If you modify the sample code and add it to your application’s classpath, the composite input stream factory used by the model decoders picks it up.
When you implement your own version of the S3InputStreamFactory, we recommend that you remove lcd_aws.jar from the classpath. Otherwise, you end up with two different input stream factories that both state that they can handle S3 URLs.
For LuciadFusion, you also need to remove platform/lcd_fusionplatform_resources_aws.jar from the classpath, and include a customized version of the S3ResourceConnector, which is available in the sample code. Make sure that this customized version knows how to retrieve the S3Client from the input stream factory, and include its package in the Spring scan packages, for example by specifying it in the fusion.config.additionalScanPackages VM parameter.
Updating the AWS SDK
We provide a version of the AWS SDK for Java 2.x library with LuciadLightspeed. We tested it to confirm that it works for us. Even so, Amazon steadily improves their library and often provides updated binaries. They do so to improve and expand the functionality, but also to mitigate any security concerns.
To prevent compatibility issues, we generally discourage updating LuciadLightspeed dependencies on your own. In this case, though, LuciadLightspeed uses only basic functionality from the AWS SDK library. In addition, all the code that interacts with it is available in the samples, so you can fix any incompatibilities arising from an update yourself.
For those reasons, we encourage you to use the latest compatible version of the AWS SDK for Java 2.x.
Expected dataset structure on S3
In general, model decoders assume that they can find files related to the entry point file using filename substitution. When you store the files for datasets that consist of multiple files on S3, you should preserve the directory layout of the dataset in the object keys.
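The following minimal sketch illustrates filename substitution on an S3 source name. It uses the companion files of an ESRI Shapefile (.dbf, .shx, .prj) as an example. The class RelatedFiles and the bucket name are hypothetical, and the related files a decoder actually looks for depend on the data format.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: derives related source names by swapping the file extension. */
final class RelatedFiles {
    private RelatedFiles() {
    }

    /** Replaces the extension of the entry point with each of the given extensions. */
    static List<String> siblings(String entryPoint, String... extensions) {
        int slash = entryPoint.lastIndexOf('/');
        int dot = entryPoint.lastIndexOf('.');
        // Only treat the dot as an extension separator if it is in the last path segment.
        String base = dot > slash ? entryPoint.substring(0, dot) : entryPoint;
        List<String> result = new ArrayList<>();
        for (String extension : extensions) {
            result.add(base + "." + extension);
        }
        return result;
    }
}
```

For the entry point s3://my-bucket/data/roads.shp, the call RelatedFiles.siblings(entryPoint, "dbf", "shx", "prj") produces names under the same s3://my-bucket/data/ location. The substitution can only find those objects if they were stored with matching keys.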
Table 1, “Deriving S3 object keys from the dataset layout” shows an example of preserving the directory layout.
Note that it shows S3 objects for the files only, not for the directories.
You can replace <some prefix> with any string ending in /, or the empty string.
To refer to this dataset, you use the entry point file location
|Dataset layout||S3 Object Key|
When recursively copying directories, the AWS CLI tool follows this convention.
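As an illustration of this convention, the sketch below derives an object key from a file's location relative to the dataset root. The class S3KeyLayout, the prefix, and the file names are hypothetical. The sketch assumes that keys are built from the path relative to the dataset root, joined with /, and that only files are mapped, not directories.

```java
import java.nio.file.Path;

/** Hypothetical helper: maps a file inside a dataset directory to an S3 object key. */
final class S3KeyLayout {
    private S3KeyLayout() {
    }

    /**
     * Builds the object key for a file, preserving its path relative to the dataset root.
     * The prefix must be empty or end in '/'.
     */
    static String objectKey(String prefix, Path datasetRoot, Path file) {
        if (!prefix.isEmpty() && !prefix.endsWith("/")) {
            throw new IllegalArgumentException("Prefix must be empty or end in '/': " + prefix);
        }
        Path relative = datasetRoot.relativize(file);
        StringBuilder key = new StringBuilder(prefix);
        // Join the name elements with '/' so the key is independent of the platform separator.
        for (int i = 0; i < relative.getNameCount(); i++) {
            if (i > 0) {
                key.append('/');
            }
            key.append(relative.getName(i).toString());
        }
        return key.toString();
    }
}
```

For a dataset root dataset and a file dataset/tiles/0_0.tif, the prefix projects/demo/ gives the key projects/demo/tiles/0_0.tif, and the empty prefix gives tiles/0_0.tif.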