Amazon Redshift Spectrum
This section describes how to use Redshift Spectrum to efficiently read data from Amazon S3.
Using Amazon Redshift Spectrum, you can query and retrieve structured and semistructured data from files in Amazon S3 without loading the data into Amazon Redshift tables. Redshift Spectrum queries use massive parallelism to run quickly against large datasets. Much of the processing occurs in the Redshift Spectrum layer, and most of the data remains in Amazon S3. Multiple clusters can concurrently query the same dataset in Amazon S3 without making a copy of the data for each cluster.
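As a rough illustration of querying Amazon S3 data in place, the following Python sketch assembles the three SQL statements that a minimal setup typically involves: an external schema, an external table that points at an S3 location, and a query against that table. The schema name, catalog database, IAM role ARN, bucket path, and column list are all hypothetical placeholders, and the sketch only builds the statement text. Running the statements requires a connection to a Redshift cluster and an IAM role with the appropriate permissions.

```python
def spectrum_statements(schema, catalog_db, iam_role_arn, table, s3_location):
    """Return SQL text for a minimal Redshift Spectrum setup.

    All argument values are supplied by the caller; nothing here
    connects to AWS or validates the names.
    """
    # External schema backed by a data catalog database.
    create_schema = (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}\n"
        f"FROM DATA CATALOG\n"
        f"DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )
    # External table: the data files stay in Amazon S3.
    # The column list is an assumed example layout.
    create_table = (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n"
        f"    salesid INTEGER,\n"
        f"    saletime TIMESTAMP,\n"
        f"    pricepaid DECIMAL(8,2)\n"
        f")\n"
        f"STORED AS PARQUET\n"
        f"LOCATION '{s3_location}';"
    )
    # Query the external table like any other table.
    query = f"SELECT COUNT(*) FROM {schema}.{table};"
    return [create_schema, create_table, query]


# Hypothetical values, for illustration only.
statements = spectrum_statements(
    schema="spectrum_demo",
    catalog_db="spectrum_demo_db",
    iam_role_arn="arn:aws:iam::123456789012:role/MySpectrumRole",
    table="sales",
    s3_location="s3://amzn-s3-demo-bucket/sales/",
)
for statement in statements:
    print(statement, end="\n\n")
```

The later topics in this section cover each of these pieces (IAM policies, external schemas, external tables, and data files) in detail.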
Topics
- Amazon Redshift Spectrum overview
- Getting started with Amazon Redshift Spectrum
- IAM policies for Amazon Redshift Spectrum
- Redshift Spectrum and AWS Lake Formation
- Data files for queries in Amazon Redshift Spectrum
- External schemas in Amazon Redshift Spectrum
- External tables for Redshift Spectrum
- Using Apache Iceberg tables with Amazon Redshift
- Amazon Redshift Spectrum query performance
- Data handling options
- Example: Performing correlated subqueries in Redshift Spectrum
- Metrics in Amazon Redshift Spectrum
- Query troubleshooting in Amazon Redshift Spectrum
- Tutorial: Querying nested data with Amazon Redshift Spectrum