Commit Graph

12 Commits

Steven Clark
e3f09b8c6d Update licensing across various source files - 1.13 (#24675)
* Fix licensing on various files

* Update packaging to use BUSL-1.1

* Update offset within config_test_helpers.go

 - Fix a test the same way it's been fixed on main/1.15
2024-01-08 12:24:57 -05:00
Ryan Cragun
4af9178d7e enos: fix licensing on backported files (#24163)
Signed-off-by: Ryan Cragun <me@ryan.ec>
2023-11-16 12:59:51 -07:00
hc-github-team-secure-vault-core
8a5e6fcc4e backport of commit a46def288f (#23868)
Co-authored-by: Ryan Cragun <me@ryan.ec>
2023-10-26 21:32:45 +00:00
Ryan Cragun
db1c24d904 test: wait for nc to be listening before enabling auditor (#23142) (#23151)
Rather than assuming a short sleep will work, we instead wait until netcat is listening on the socket. We've also configured the netcat listener to persist after the first connection, which allows both Vault and our checks to connect without the process closing.
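
As an illustration only, here is a minimal Go sketch of the same wait-for-listener idea, replacing a fixed sleep with a positive readiness check; the address and timeout are placeholders, not the values the scenario uses:

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// waitForListener dials addr repeatedly until something accepts the
// connection or the deadline passes, instead of assuming a sleep is enough.
func waitForListener(addr string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		conn, err := net.DialTimeout("tcp", addr, time.Second)
		if err == nil {
			conn.Close() // the listener is up; we only needed the handshake
			return nil
		}
		time.Sleep(250 * time.Millisecond)
	}
	return fmt.Errorf("no listener on %s after %s", addr, timeout)
}

func main() {
	// Illustrative address; the real scenario checks the netcat socket.
	if err := waitForListener("127.0.0.1:9090", 30*time.Second); err != nil {
		panic(err)
	}
	fmt.Println("listener is ready")
}
```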

As we implemented this we also ran into AWS issues in us-east-1 and us-west-2, so we've changed our deploy regions until those issues are resolved.

Signed-off-by: Ryan Cragun <me@ryan.ec>
2023-09-18 15:10:12 -06:00
hc-github-team-secure-vault-core
737d25348f [QT-572][VAULT-17391] enos: use ec2 fleets for consul storage scenarios (#21400) (#21420)
Begin the process of migrating away from the "strongly encouraged not
to use"[0] EC2 spot fleet API to the more modern `ec2:CreateFleet`.
Unfortunately, the `instant` fleet type does not guarantee fulfillment
with either on-demand or spot instance types. We'll need to add a
feature similar to `wait_for_fulfillment` on the `spot_fleet_request`
resource[1] to `ec2_fleet` before we can rely on it.
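
For illustration, a rough Go sketch of what such a wait could look like, polling `ec2:DescribeFleets` via aws-sdk-go-v2 until the fleet reports a fulfilled activity status; the fleet ID and timings are placeholders, and this is not the enos implementation:

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/ec2"
	"github.com/aws/aws-sdk-go-v2/service/ec2/types"
)

// waitForFleetFulfillment polls DescribeFleets until the fleet's activity
// status is "fulfilled" or the context expires, mirroring what
// wait_for_fulfillment does for the spot_fleet_request resource.
func waitForFleetFulfillment(ctx context.Context, client *ec2.Client, fleetID string) error {
	ticker := time.NewTicker(5 * time.Second)
	defer ticker.Stop()
	for {
		out, err := client.DescribeFleets(ctx, &ec2.DescribeFleetsInput{
			FleetIds: []string{fleetID},
		})
		if err != nil {
			return err
		}
		if len(out.Fleets) == 1 && out.Fleets[0].ActivityStatus == types.FleetActivityStatusFulfilled {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
	defer cancel()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		panic(err)
	}
	// The fleet ID below is a placeholder.
	if err := waitForFleetFulfillment(ctx, ec2.NewFromConfig(cfg), "fleet-0123456789abcdef0"); err != nil {
		panic(err)
	}
	fmt.Println("fleet fulfilled")
}
```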

We also update the existing target fleets to support provisioning generic
targets. This has allowed us to remove our usage of `terraform-enos-aws-consul`
and replace it with a smaller `backend_consul` module in-repo.

We also remove `terraform-enos-aws-infra` and replace it with two smaller
in-repo modules, `ec2_info` and `create_vpc`. This has allowed us to simplify
the VPC resources we use for each scenario, which in turn lets us avoid
relying on flaky resources.

As part of this refactor we've also made it possible to provision
targets using different distro versions.

[0] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-best-practices.html#which-spot-request-method-to-use
[1] https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/spot_fleet_request#wait_for_fulfillment

* enos/consul: add `backend_consul` module that accepts target hosts.
* enos/target_ec2_spot_fleet: add support for consul networking.
* enos/target_ec2_spot_fleet: add support for customizing cluster tag
  key.
* enos/scenarios: create `target_ec2_fleet` which uses a more modern
  `ec2_fleet` API.
* enos/create_vpc: replace `terraform-enos-aws-infra` with smaller and
  simplified version. Flatten the networking to a single route on the
  default route table and a single subnet.
* enos/ec2_info: add a new module to give us useful EC2 information,
  including AMI IDs for various arch/distro/version combinations (see
  the lookup sketch after this list).
* enos/ci: update service user role to allow for managing ec2 fleets.
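
As a rough sketch of the kind of lookup `ec2_info` performs, in Go with aws-sdk-go-v2: the owner ID and name filter follow Canonical's published AMI conventions, but the exact filters the module uses are assumptions here.

```go
package main

import (
	"context"
	"fmt"
	"sort"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/ec2"
	"github.com/aws/aws-sdk-go-v2/service/ec2/types"
)

// latestUbuntuAMI returns the newest Ubuntu 22.04 AMI ID for the given
// architecture ("x86_64" or "arm64"). Owner 099720109477 is Canonical.
func latestUbuntuAMI(ctx context.Context, client *ec2.Client, arch string) (string, error) {
	out, err := client.DescribeImages(ctx, &ec2.DescribeImagesInput{
		Owners: []string{"099720109477"},
		Filters: []types.Filter{
			{Name: aws.String("name"), Values: []string{"ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-*-server-*"}},
			{Name: aws.String("architecture"), Values: []string{arch}},
			{Name: aws.String("state"), Values: []string{"available"}},
		},
	})
	if err != nil {
		return "", err
	}
	if len(out.Images) == 0 {
		return "", fmt.Errorf("no AMI found for %s", arch)
	}
	// CreationDate is ISO 8601, so a lexical sort is chronological.
	sort.Slice(out.Images, func(i, j int) bool {
		return aws.ToString(out.Images[i].CreationDate) > aws.ToString(out.Images[j].CreationDate)
	})
	return aws.ToString(out.Images[0].ImageId), nil
}

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		panic(err)
	}
	ami, err := latestUbuntuAMI(ctx, ec2.NewFromConfig(cfg), "arm64")
	if err != nil {
		panic(err)
	}
	fmt.Println("latest Ubuntu AMI:", ami)
}
```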

Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
2023-06-22 20:11:23 +00:00
hc-github-team-secure-vault-core
05ca6d0f39 Backport of [QT-525] and [QT-530] into release/1.13.x (#20158)
* [QT-525] enos: use spot instances for Vault targets (#20037)

The previous strategy for provisioning infrastructure targets was to use
the cheapest instances that could reliably perform as Vault cluster
nodes. With this change we introduce a new model for target node
infrastructure: we've replaced on-demand instances with a spot
fleet. While the spot price fluctuates based on dynamic pricing,
capacity, region, instance type, and platform, cost savings for our
most common combinations range from 20% to 70%.

This change only includes spot fleet targets for Vault clusters.
We'll be updating our Consul backend bidding in another PR.

* Create a new `vault_cluster` module that handles installing,
  configuring, initializing, and unsealing Vault clusters (see the
  init/unseal sketch after this list).
* Create a `target_ec2_instances` module that can provision a group of
  instances on-demand.
* Create a `target_ec2_spot_fleet` module that can bid on a fleet of
  spot instances.
* Extend every Enos scenario to utilize the spot fleet target acquisition
  strategy and the `vault_cluster` module.
* Update our Enos CI modules to handle both the `aws-nuke` permissions
  and the privileges to provision spot fleets.
* Only use us-east-1 and us-west-2 in our scenario matrices, as costs
  are lower than in us-west-1.
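
As an illustration of the initialize-and-unseal step the `vault_cluster` module automates, here is a hedged Go sketch using the Vault API client; the share and threshold values are placeholders, and the module's actual implementation may differ:

```go
package main

import (
	"fmt"

	vault "github.com/hashicorp/vault/api"
)

func main() {
	// DefaultConfig honors VAULT_ADDR; all values below are illustrative.
	client, err := vault.NewClient(vault.DefaultConfig())
	if err != nil {
		panic(err)
	}

	// Initialize the cluster and capture the unseal keys and root token.
	initResp, err := client.Sys().Init(&vault.InitRequest{
		SecretShares:    5,
		SecretThreshold: 3,
	})
	if err != nil {
		panic(err)
	}

	// Unseal by submitting key shares until the threshold is met.
	for _, key := range initResp.KeysB64 {
		status, err := client.Sys().Unseal(key)
		if err != nil {
			panic(err)
		}
		if !status.Sealed {
			break
		}
	}
	fmt.Println("root token:", initResp.RootToken)
}
```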

Signed-off-by: Ryan Cragun <me@ryan.ec>

* [QT-530] enos: allow-list all public IP addresses (#20304)

The security groups that allow access to remote machines in Enos
scenarios have been configured to only allow port 22 (SSH) from the
public IP address of the machine executing the Enos scenario. To achieve
this we previously utilized the `enos_environment.public_ip_address`
attribute. Sometime in mid-March we started seeing sporadic SSH i/o
timeout errors when attempting to execute Enos resources against SSH
transport targets. We've only ever seen this when communicating from
Azure-hosted runners to AWS-hosted machines.

While testing we confirmed that in some cases the public IP address
resolved using DNS over UDP4 against the Google and OpenDNS name servers
did not match the one resolved using the HTTPS/TCP IP address service
hosted by AWS. The Enos data source was implemented to attempt
resolution against a single name server and only fall back to the next
if the previous one returned no result. We'd then allow-list that single
IP address. That's a problem if we can resolve two different public IP
addresses depending on which endpoint we query.

This change utilizes the new `enos_environment.public_ip_addresses`
attribute and its changed behavior. Now the data source attempts to
resolve our public IP address via name servers hosted by Google,
OpenDNS, Cloudflare, and AWS. We then return the unique set of these IP
addresses and allow-list all of them in our security group. Our hope is
that this resolves the i/o timeout errors, which appear to be caused by
the security group black-holing our connections because the IP address
we resolved does not match the one we actually egress from.
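
A minimal Go sketch of the resolve-from-multiple-services-and-dedupe idea; the real data source also queries Google and Cloudflare, while this sketch shows only OpenDNS over DNS and AWS over HTTPS:

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"strings"
	"time"
)

// resolverAt returns a DNS resolver pinned to one name server, so we can
// compare answers from several independent services.
func resolverAt(server string) *net.Resolver {
	return &net.Resolver{
		PreferGo: true,
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 5 * time.Second}
			return d.DialContext(ctx, network, server)
		},
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)
	defer cancel()

	ips := map[string]struct{}{}

	// DNS lookup: OpenDNS answers myip.opendns.com with the caller's IP.
	if addrs, err := resolverAt("resolver1.opendns.com:53").LookupHost(ctx, "myip.opendns.com"); err == nil {
		for _, a := range addrs {
			ips[a] = struct{}{}
		}
	}

	// HTTPS lookup: AWS's check-ip service returns the IP as plain text.
	if resp, err := http.Get("https://checkip.amazonaws.com"); err == nil {
		body, _ := io.ReadAll(resp.Body)
		resp.Body.Close()
		if ip := strings.TrimSpace(string(body)); ip != "" {
			ips[ip] = struct{}{}
		}
	}

	// The unique set is what would be allow-listed in the security group.
	for ip := range ips {
		fmt.Println(ip)
	}
}
```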

Signed-off-by: Ryan Cragun <me@ryan.ec>

---------

Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
2023-04-23 17:12:59 -06:00
Mike Baum
39211b8772 [QT-441] Switch over to using new vault_ci AWS account for enos CI workflows (#18398) 2023-01-18 16:09:19 -05:00
Josh Brand
1c77309106 Add enterprise resources to CI cleanup (#18758) 2023-01-18 14:14:19 -05:00
Josh Brand
4768e74a35 Add automated CI account cleanup & monitoring (#18659)
This uses aws-nuke and awslimitchecker to monitor the new Vault CI account, cleaning up lingering resources and preventing resource quota exhaustion. aws-nuke scans all regions of the account for resources that Enos/Terraform didn't clean up and, if they don't match the exclusion criteria, deletes them every night. By default we exclude corp-sec-created resources, our own CI resources, and, when possible, anything created within the past 72 hours. Because this account is dedicated to CI, users should not expect resources to persist beyond this window without additional configuration.
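
The exclusion criteria described above could be expressed as a predicate like the following Go sketch; aws-nuke is actually configured through YAML filters, so this code, including its tag and naming conventions, is purely hypothetical:

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// resource is a hypothetical stand-in for what aws-nuke inspects; the real
// tool is driven by YAML filter configuration, not Go code.
type resource struct {
	Name      string
	Tags      map[string]string
	CreatedAt time.Time
}

// shouldDelete mirrors the exclusion criteria described above: keep
// corp-sec resources, our own CI resources, and anything under 72 hours old.
func shouldDelete(r resource, now time.Time) bool {
	if now.Sub(r.CreatedAt) < 72*time.Hour {
		return false
	}
	if _, ok := r.Tags["corp-sec"]; ok { // tag name is illustrative
		return false
	}
	if strings.HasPrefix(r.Name, "vault-ci-") { // naming convention is illustrative
		return false
	}
	return true
}

func main() {
	now := time.Now()
	fresh := resource{Name: "enos-target", CreatedAt: now.Add(-2 * time.Hour)}
	stale := resource{Name: "enos-target", CreatedAt: now.Add(-96 * time.Hour)}
	fmt.Println(shouldDelete(fresh, now), shouldDelete(stale, now)) // false true
}
```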
2023-01-11 17:24:08 -05:00
Mike Baum
85bf592dbc Add Enos CI account service quotas limit increase requests to bootstrapping (#18309) 2022-12-12 13:14:38 -05:00
Mike Baum
0dd2f742ce [QT-318] Add workflow dispatch trigger for bootstrap workflow, update ssh key name (#18174)
* Added a workflow dispatch trigger for the bootstrap workflow and updated the SSH key name
* Ensure the bootstrap workflow is only run for PRs that change the bootstrapping code
2022-12-02 14:29:20 -05:00
Mike Baum
7cd3aab085 [QT-318] Add Vault CI bootstrap scenarios (#17907) 2022-11-30 12:44:02 -05:00