©2019 by Raghavendra Kambhampati

How to analyze data stored in Amazon S3 with Amazon Athena?

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena can query CSV data directly. However, storing the same data in Parquet, a columnar file format, significantly reduces the time and cost of querying it, because Athena reads only the columns a query references and bills by the amount of data scanned.
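To make the cost argument concrete, here is a minimal Python sketch that turns a number of scanned bytes into an estimated charge. The price of 5 USD per TB, the 10 MB per-query minimum, and the two scan sizes are assumptions for illustration, not figures from this walkthrough; check the current Athena pricing page for your region.

```python
# Assumed pricing parameters (not taken from this article).
PRICE_PER_TB_USD = 5.0
MIN_BYTES_PER_QUERY = 10 * 1024**2   # assumed 10 MB minimum billed per query
BYTES_PER_TB = 1024**4               # assumed binary terabyte


def estimate_query_cost(bytes_scanned: int) -> float:
    """Estimate the cost in USD of one query from the bytes it scanned."""
    billed = max(bytes_scanned, MIN_BYTES_PER_QUERY)
    return billed / BYTES_PER_TB * PRICE_PER_TB_USD


# Hypothetical scan sizes for the same query on each format.
csv_scanned = 24 * 1024**3      # CSV: every row is read in full
parquet_scanned = 2 * 1024**3   # Parquet: only the referenced columns are read

print(f"CSV:     ${estimate_query_cost(csv_scanned):.4f}")
print(f"Parquet: ${estimate_query_cost(parquet_scanned):.4f}")
```

The point of the sketch is only the proportionality: whatever the actual price, the charge scales with bytes scanned, so a format that lets Athena skip unused columns lowers the bill.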

To use AWS Glue with Amazon Athena, you must upgrade your Athena data catalog to the AWS Glue Data Catalog.

Open the AWS Management Console for Athena. The Query Editor displays both tables in the tpc-h database.
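The same table list can also be read programmatically from the Glue Data Catalog. The sketch below is one possible way to do it with a boto3 Glue client; the function name list_tables is hypothetical, and it assumes the database is registered in Glue under the name tpc-h.

```python
from typing import List, Tuple


def list_tables(glue_client, database: str) -> List[Tuple[str, str]]:
    """Return (table name, input format) pairs for every table in a Glue database."""
    tables = []
    kwargs = {"DatabaseName": database}
    while True:
        page = glue_client.get_tables(**kwargs)
        for table in page["TableList"]:
            # The input format class hints at whether the table is text/CSV or Parquet.
            fmt = table.get("StorageDescriptor", {}).get("InputFormat", "unknown")
            tables.append((table["Name"], fmt))
        token = page.get("NextToken")
        if not token:
            return tables
        kwargs["NextToken"] = token  # fetch the next page of results


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   for name, fmt in list_tables(boto3.client("glue"), "tpc-h"):
#       print(name, fmt)
```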

Select the tpc-h database; the table list populates with all the CSV and Parquet tables. Here we select the customer table in CSV format and in Parquet format to see how the query behaves on each.
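The comparison can also be scripted. The following sketch runs the same query against each table through a boto3 Athena client and collects the bytes scanned and engine time that Athena reports. The table names customer_csv and customer_parquet, the column c_mktsegment (from the standard TPC-H schema), and the S3 output location are assumptions; substitute the names shown in your own catalog.

```python
import time


def run_query(athena_client, sql, database, output_location, poll_seconds=1.0):
    """Run one Athena query and return its final QueryExecution record."""
    started = athena_client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]
    while True:
        response = athena_client.get_query_execution(QueryExecutionId=query_id)
        execution = response["QueryExecution"]
        if execution["Status"]["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return execution
        time.sleep(poll_seconds)  # still queued or running


def compare_formats(athena_client, database, output_location, tables, poll_seconds=1.0):
    """Run the same query on each table; return {table: (bytes scanned, engine ms)}."""
    results = {}
    for table in tables:
        # Assumed column name; the query touches a single column on purpose.
        sql = f'SELECT c_mktsegment, count(*) FROM "{table}" GROUP BY c_mktsegment'
        execution = run_query(athena_client, sql, database, output_location, poll_seconds)
        if execution["Status"]["State"] != "SUCCEEDED":
            raise RuntimeError(f"query on {table} did not succeed")
        stats = execution["Statistics"]
        results[table] = (stats["DataScannedInBytes"], stats["EngineExecutionTimeInMillis"])
    return results


# Usage (requires boto3 and AWS credentials; bucket name is a placeholder):
#   import boto3
#   print(compare_formats(boto3.client("athena"), "tpc-h",
#                         "s3://example-bucket/athena-results/",
#                         ["customer_csv", "customer_parquet"]))
```

Because the query references one column, the Parquet table should report far fewer bytes scanned than the CSV table, which has to be read row by row.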

Select the customer table in Parquet format