Setting up Athena as a data source
Setting up Athena as your datasource requires you set up the proper permissions within AWS for Growthbook to access and then provide the correct credentials to Growthbook make use of those permissions. There are also optional connection data that will help Growthbook create the correct default sql to analyze your data.
Setting up Permissions in AWS
Unlike other database engines with their own user management system, Athena uses IAM for authentication.
Growthbook Cloud
We recommend creating a new IAM user with readonly permissions for GrowthBook.
The managed AWSQuicksightAthenaAccess is a good starting point. You will also need to give it permission to read from the s3 tables that hold your event data, by taking a modified version of AmazonS3ReadOnlyAccess
policy whose resources are confined to only those tables that hold your event data. For example with the following policy after replacing the <BUCKET NAME>
:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:Get*",
"s3:List*",
"s3-object-lambda:Get*",
"s3-object-lambda:List*"
],
"Resource": [
"arn:aws:s3:::<BUCKET NAME>*"
],
}
]
}
Ideally that bucket should only have event data that growthbook needs to calculated its metrics and no other data. You can futher restrict the resources as your security policy requires.
Afterwards your IAM user's permission page might look like:
Self hosted
We recommend creating a new IAM role with the same permissions as for Growthbook Cloud. This role can then be attached to the ec2 instance that is running Growthbook.
Providing Credentials to Growthbook
Growthbook Cloud
If you are using Growthbook Cloud you will need to create an IAM user that has the above policies attached - either directly or through a group.
You must then create an access key by clicking on Security Credentials then "Create Access Key". You can then choose "Third-party service". It will warn you that this is not best practice, but unfortunately this is the only way to give Growthbook access at the moment. We are working on other ways to connect in the future. You can then confirm and click next. You can add a tag if you like and then press "Create access key". You should see then see following screen:
In another browser tab open up Growthbook Data Sources tab and choose your event tracker. Select Athena as your data source type. You can then copy the Access Key and Secret Access Key from the AWS browser tab to their corresponding fields.
Self hosting
If you are self hosting then in addition to the method above you can also pass the credentials in via environmental variables or part of the instance metadata. You can select which method you want in the Authentication Method
field.
Remaining Configuration
AWS Region
- This should be the AWS Region your Athena database is in. From the AWS console you would see it on the right side of the search bar on the top of the screen next to the account name.
Workgroup
- This is the workgroup within Athena.
Default Catalog
- This is that catalog where the event data lives.
Default Database
- This is the database where the event data lives.
S3 Results URL
- This is the s3 URL where the results to Athena queries get saved. When setting up Athena for the S3 results url, we recommend naming your bucket with the prefix aws-athena-query-results-
as the AWSQuicksightAthenaAccess gives permission to write to any bucket with this prefix. If Growthbook warns you that it can not write to an s3 location other than the one you select here it is most likely because you have set the workgroup to override client side settings. If that is the case you would either need to change that setting or add the permissions for growthbook to also write to the s3 results url saved there.