How are data sources used in Terraform?
Good examples up there!
The main differences between a Terraform data source, resource, and variable are:
Resource: Provisions resources/infra on our platform. Create, update, and delete!
Variable: Provides predefined values as variables in our IaC. Used by resources for provisioning.
Data Source: Fetches values from our infra/provider and provides data for our resources to provision infra.
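To make the three roles concrete, here is a minimal sketch that uses one of each. The names bucket_prefix, current, and logs are made up for illustration, and it assumes the AWS provider is already configured:

```hcl
# Variable: a predefined value we supply ourselves (hypothetical name and default).
variable "bucket_prefix" {
  default = "app-logs"
}

# Data source: read-only lookup of the AWS account the provider is running as.
data "aws_caller_identity" "current" {}

# Resource: the only block here that actually creates, updates, or deletes anything.
resource "aws_s3_bucket" "logs" {
  bucket = "${var.bucket_prefix}-${data.aws_caller_identity.current.account_id}"
}
```

The variable is something we type in, the data source is something Terraform looks up, and the resource consumes both.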
Examples are well explained above :)
Data sources provide information about entities that are not managed by the current Terraform configuration.
This may include:
- Configuration data from Consul
- Information about the state of manually-configured infrastructure components
In other words, data sources are read-only views into the state of pre-existing components external to our configuration.
Once you have defined a data source, you can use the data elsewhere in your Terraform configuration.
For example, let's suppose we want to create a Terraform configuration for a new AWS EC2 instance. We want to use an AMI that was created and uploaded by a Jenkins job using the AWS CLI and is not managed by Terraform. As part of the configuration for our Jenkins job, this AMI will always have a name with the prefix app-.
In this case, we can use the aws_ami data source to obtain information about the most recent AMI that has the name prefix app-.
data "aws_ami" "app_ami" {
  most_recent = true

  filter {
    name   = "name"
    values = ["app-*"]
  }
}
Data sources export attributes, just like resources do. We can interpolate these attributes using the syntax data.TYPE.NAME.ATTR. In our example, we can interpolate the value of the AMI ID as data.aws_ami.app_ami.id and pass it as the ami argument for our aws_instance resource.
resource "aws_instance" "app" {
  ami           = "${data.aws_ami.app_ami.id}"
  instance_type = "t2.micro"
}
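If you are on Terraform 0.12 or later, the same pair of blocks can be written without the "${...}" wrapper, since attribute references work as plain expressions there. The sketch below also adds owners = ["self"], which is my assumption that the Jenkins job pushes the AMI into the same AWS account; some AWS provider versions require an owners argument, so check the one you have installed:

```hcl
# Look up the newest AMI named app-* (assumed to live in our own account).
data "aws_ami" "app_ami" {
  most_recent = true
  owners      = ["self"] # assumption: the AMI is owned by this account

  filter {
    name   = "name"
    values = ["app-*"]
  }
}

# Terraform 0.12+ style: reference the attribute directly, no "${...}" needed.
resource "aws_instance" "app" {
  ami           = data.aws_ami.app_ami.id
  instance_type = "t2.micro"
}
```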
Data sources are most powerful when retrieving information about dynamic entities - those whose properties change value often. For example, the next time Terraform fetches data for our aws_ami data source, the value of the exported attributes may be different (we might have built and pushed a new AMI).
Variables are used for static values, those that rarely change, such as your access and secret keys, or a standard list of sudoers for your servers.
Data sources can be used for a number of reasons, but their goal is always to do some work (usually a lookup) and then give you data back.
Let's take the example from their documentation:
# Find the latest available AMI that is tagged with Component = web
data "aws_ami" "web" {
  filter {
    name   = "state"
    values = ["available"]
  }

  filter {
    name   = "tag:Component"
    values = ["web"]
  }

  most_recent = true
}
This uses the aws_ami data source, which is different from a resource! It just gives you information and does not create anything. This example in particular will call the describe-images AWS API, pass in a few --filter options as specified, and return an object that you can get information from. Take a look at these attributes!
- name
- owner_id
- description
- image_id
... The list goes on. This is really useful if, say, I always want to pull the latest AMI matching some tags and keep a launch configuration up to date with it. I could use this data source rather than having to update a variable or hard-code the ID every time.
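As a rough sketch of that launch configuration idea, something like the following should work alongside the aws_ami "web" data source from the example above. The resource name, name_prefix, and instance type are placeholders I made up, and the syntax assumes Terraform 0.12 or later:

```hcl
# Launch configuration that always points at whatever AMI the data source finds.
resource "aws_launch_configuration" "web" {
  name_prefix   = "web-" # hypothetical prefix; Terraform appends a unique suffix
  image_id      = data.aws_ami.web.id
  instance_type = "t2.micro" # placeholder size

  # Launch configurations cannot be modified in place, so create the
  # replacement before destroying the old one when the AMI ID changes.
  lifecycle {
    create_before_destroy = true
  }
}
```

When a newer matching AMI shows up, the next plan should show the launch configuration being replaced, with no variable or hard-coded ID to touch.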
Data sources can be used for other reasons as well; one of my favorites is the template provider.
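For a taste of it, here is a tiny sketch using the template provider's template_file data source. The names greeting and name are invented for the example, and the vars = { ... } form assumes Terraform 0.12 or later. Note that the template provider has since been archived, and newer Terraform versions offer the built-in templatefile function for the same job:

```hcl
# Renders a string template; nothing is created in any cloud.
data "template_file" "greeting" {
  # $${...} escapes the interpolation so the template provider sees ${name}
  # instead of Terraform trying to resolve it first.
  template = "Hello, $${name}!"

  vars = {
    name = "world" # hypothetical value
  }
}

# Expose the rendered result so it shows up after terraform apply.
output "greeting" {
  value = data.template_file.greeting.rendered
}
```

Like aws_ami, it does some work and hands back data through an exported attribute, here called rendered.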
Good luck!