My guide to AWS IAM

Posted: 14th November 2020

Disclaimer: I'm not an accredited AWS security expert or anything, this is based on spending the last few years wrestling AWS IAM policies into doing what I want. Your mileage may vary and I may be wholly wrong in any or all matters.

One more disclaimer: I'm assuming you know the basics of AWS IAM, i.e. I assume you know (at least generally) what action, resource and principal mean, and what the difference is between a resource policy (e.g. one attached to an S3 bucket) and an identity policy (e.g. one attached to a role).

Anyway, on to it...

An Intro

AWS' Identity and Access Management system is terrifying. Through a collection of 8211 actions[1] you can control any of AWS' 241 services[2]. Some services (like arsenal[3]) define a single action whereas EC2 has over 400, with the average service having 34. On top of that each service will have some number of resource types and conditions you can check (e.g. is that satellite tagged "ion-cannon"[4]). Oh, and there are at least 6 different places you can define these policies (here's a list of them[5]). Then, finally, you can assign, assume and pass permissions between identities.

All of that makes working with IAM pretty overwhelming.

I'm not going to say "but here's one simple trick to make it easy" because there isn't and it wouldn't. Instead I'll start by covering my general approach, then highlight some of the more horrible things I've dealt with/ranted about.

I'm going to focus on resource policies and identity policies here but the general ideas should be applicable to the more exotic forms too (although if you're using those you probably know a lot more than I do).

If you do want one tip though: learn to love the excitingly titled "actions, resources, and condition keys for AWS services" pages which list all the services, their actions, conditions and resources (surprise!)

General Approach

So how do I write a policy? The first step is a set of questions[6] which should help you define the boundary of least privilege.

  • Who do you want to grant permissions to?
    • For a resource policy this is the principal
    • For an identity policy this is the identity you attach the policy to (or inline it on)
    • Remember: "who" covers both humans and machines and may be shared depending on whether it's a role, group, user or policy object (although humans and machines probably shouldn't share policies).
  • What do you want to allow (or Deny)?
    • These are AWS' actions, you can find every action for every service somewhere in the previously mentioned actions and etc. etc. pages.
    • Start with the plain-English form (e.g. "read permissions on kinesis stream Y", "add a route table to a subnet") then see if you can find actions that ONLY do those things (e.g. kinesis:GetRecords and ec2:AssociateRouteTable[7])
  • Where are the resource(s) that you're granting permissions to?
    • For a resource policy this is what it's attached to
    • For an identity policy this is the resource term
    • It may be a specific resource (version 4 of this IAM role-policy) or a group of them (every EC2 instance in us-east-1)
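
Those three questions map fairly directly onto the fields of a policy statement. Here's a minimal sketch in Python of the "read permissions on kinesis stream Y" example as an identity policy. The stream name, account ID and region are made up, and the exact set of actions a reader needs is my assumption, so check it against the actions pages for your own case. The "who" doesn't appear in the document at all: for an identity policy it's whatever you attach this to.

```python
import json

# Hypothetical ARN for illustration only (made-up region, account and stream).
STREAM_ARN = "arn:aws:kinesis:eu-west-1:123456789012:stream/example-stream"


def read_only_stream_policy(stream_arn):
    """Build an identity policy for 'read permissions on one Kinesis stream'."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadOneStream",
                "Effect": "Allow",
                # What: only the actions assumed necessary to read records.
                "Action": [
                    "kinesis:DescribeStreamSummary",
                    "kinesis:GetShardIterator",
                    "kinesis:GetRecords",
                ],
                # Where: one specific stream rather than "*".
                "Resource": [stream_arn],
            }
        ],
    }


policy = read_only_stream_policy(STREAM_ARN)
print(json.dumps(policy, indent=2))
```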

With an idea of what's wanted now is a good time to look for example policies to crib from. Obviously AWS' docs are the best place to start (normally located by a search along the lines of "AWS [service] example IAM policy" because good luck navigating AWS' docs). I also try searching AWS' managed policies, these are often variations on the examples but not always and they may be exactly what you need.

Two potential gotchas with AWS' policies: firstly, the managed policies are pretty inconsistently named[8] so try some variations. More importantly, both AWS examples and managed policies are often very broad (lots of Resource: "*"s and Action: "ec2:*" etc.) so make sure you know what potential sins those mighty wildcards are hiding.

Another major thing to look for when sketching a policy is whether the service you want needs other services. For example, Kinesis often uses DynamoDB to implement read check-pointing while Elastic Beanstalk uses EC2, EC2-Autoscaling, Elastic Load Balancer and other services. Beyond checking the examples and reading the docs, figuring out these relations is often a matter of trial and error, so I start with the absolute minimum and work from there[9].

Hopefully this should give you a policy you can start testing. In addition to just running whatever program it is, I highly recommend playing around with the AWS Policy Simulator[10]. I won't go into detail on using it (docs are here) but there are a few important gotchas to it I wanted to flag:

  • You can only directly test identities (either a specific new or proposed policy or all of the policies attached to a user or role).
  • It only tests the exact interaction you ask it to test[11] which gets confusing with services that do work on your behalf as it may not be clear who is doing what, where.
  • The policy simulator will only simulate a resource's policy if you have permissions to access that policy AND you target that specific resource (and even then it still doesn't seem to always work for me).
  • If you use conditionals these can be set in different places (e.g. global values like whether MFA was used vs resource values such as instance state[12])
  • It is only a simulation, any results you get from it should be treated very carefully.
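
If you'd rather script the simulator than click through it, boto3 exposes the same thing as simulate_custom_policy on the IAM client. This is a rough sketch rather than a recommended workflow: the simulate helper is my own name, it takes the client as an argument so the function itself doesn't need AWS credentials, and it ignores context entries (i.e. condition values) entirely.

```python
import json


def simulate(iam_client, policy, actions, resources):
    """Ask the policy simulator about a proposed identity policy.

    Returns {(action, resource): decision}, where each decision is one of
    "allowed", "explicitDeny" or "implicitDeny".
    """
    response = iam_client.simulate_custom_policy(
        PolicyInputList=[json.dumps(policy)],
        ActionNames=actions,
        ResourceArns=resources,
    )
    return {
        (result["EvalActionName"], result["EvalResourceName"]): result["EvalDecision"]
        for result in response["EvaluationResults"]
    }


# Usage (needs boto3 installed and credentials that are allowed to call
# iam:SimulateCustomPolicy):
#   import boto3
#   decisions = simulate(
#       boto3.client("iam"),
#       policy,
#       ["s3:GetObject"],
#       ["arn:aws:s3:::my_corporate_bucket/report.csv"],
#   )
```

The same caveats as the console apply: it only answers the exact question you ask, and it's still only a simulation.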

And that's mostly it. One final tip (which doesn't really fit anywhere else) is to get a feel for how actions are named: [service]:[Verb][Resource]. As I cover below this allows you to write some pretty powerful policies using wild cards. Just beware: the verbs AWS services use are not consistent (e.g. would you like to Add, Create, Change, Set or simply Tag your tags on a resource?)

Horrible things

To start: not a horrible thing per se but very much a double edged sword: wildcards. Most people will know that they can be used as a suffix for actions and resources (e.g. ec2:* or "arn:aws:s3:::my_corporate_bucket/*") but you can also use them multiple times (e.g. ec2:*Vpc*). You can write some very powerful policies like this (especially things like ec2:Describe* for a read-only policy) but just remember that the lists of actions only expand so these may end up including things you didn't intend. This is especially true in large services like ec2, talking of which...
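
To get a feel for what a wildcard will sweep up, you can expand it against a list of actions before putting it in a policy. This is a sketch that approximates IAM's matching with Python's fnmatch: the action list is a tiny hand-picked sample (not the real 400+), and fnmatch also gives special meaning to square brackets, which IAM doesn't, so treat it as a rough check only.

```python
from fnmatch import fnmatchcase

# A small, hand-picked sample of EC2 actions (not the full list).
SAMPLE_ACTIONS = [
    "ec2:CreateVpc",
    "ec2:DeleteVpc",
    "ec2:DescribeVpcs",
    "ec2:CreateVpcEndpoint",
    "ec2:DescribeInstances",
    "ec2:RunInstances",
]


def matching_actions(pattern, actions):
    """Return the actions an IAM-style wildcard pattern would cover.

    Action names are not case-sensitive, so both sides are lowercased
    before matching.
    """
    lowered = pattern.lower()
    return [action for action in actions if fnmatchcase(action.lower(), lowered)]


print(matching_actions("ec2:*Vpc*", SAMPLE_ACTIONS))
print(matching_actions("ec2:Describe*", SAMPLE_ACTIONS))
```

Re-running something like this against a fresh action list every so often is one way to spot when a new action has quietly joined your wildcard.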

EC2. EC2 has (as previously mentioned) over 400 actions. These cover everything from RunInstances (unsurprisingly, run some compute[13]) to StartVpcEndpointServicePrivateDnsVerification (which, I think, starts the process that'll allow you to use your data-centre's DNS in your AWS VPC). In my opinion, this is too much, but I don't work at AWS so there's not much I can do about it[14]. I can warn you to be very careful. The ec2: namespace covers everything from the network up whilst also being the namespace you're most likely to need, even with unrelated things (want your lambda to talk to your private RDS? You'll have to grant ec2:CreateNetworkInterface to its execution role). There are some tricks to deal with this: for example tag based access policies (although see below for some more on this); using conditions will allow you to limit aspects of an action outside of its target resource (e.g. which VPC an instance is deployed to); using wildcards carefully to allow or deny sets of actions on classes of resource (e.g. deny ec2:*vpc* to stop anything on VPCs or their associated stuff, like VPC endpoints).

As I said above, tags can help with e.g. the huge array of actions in ec2. Tag (or in AWS parlance "attribute") based access policies are pretty cool (want only 'admins' to edit that instance? Tag it "admin-only"!) AWS have an entire tutorial dedicated to setting it up... I wouldn't. It's a useful tool for limited situations but there're some pretty huge caveats:

  • Not every service supports it
  • You'll need to make sure you have further policies in place so that tags are maintained (although AWS Organizations' tag policies do this)
  • Many actions can write tags as a side effect (e.g. ec2:RunInstances)
  • Actions and Conditions for interacting with tags are not consistent between services (would you like to CreateTags, AddTags or merely Tag?)
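
For what it's worth, here's roughly what the "admin-only" idea looks like as a policy you'd attach to non-admins, sketched as a Python dict. The tag key "admin-only" with value "true" is my own made-up convention, the list of denied actions is illustrative rather than complete, and the second statement is the "further policy" needed to stop people simply removing or rewriting the tag.

```python
import json

# Hypothetical tag convention: instances tagged admin-only=true are off limits.
TAG_KEY = "admin-only"
ANY_INSTANCE = "arn:aws:ec2:*:*:instance/*"

non_admin_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyChangesToAdminOnlyInstances",
            "Effect": "Deny",
            "Action": [
                "ec2:StartInstances",
                "ec2:StopInstances",
                "ec2:TerminateInstances",
            ],
            "Resource": ANY_INSTANCE,
            # Only applies when the target instance carries the tag.
            "Condition": {"StringEquals": {f"aws:ResourceTag/{TAG_KEY}": "true"}},
        },
        {
            "Sid": "DenyTouchingTheTagItself",
            "Effect": "Deny",
            "Action": ["ec2:CreateTags", "ec2:DeleteTags"],
            "Resource": ANY_INSTANCE,
            # Applies when the request tries to set or remove this tag key.
            "Condition": {"ForAnyValue:StringEquals": {"aws:TagKeys": [TAG_KEY]}},
        },
    ],
}

print(json.dumps(non_admin_policy, indent=2))
```

Note this only covers EC2's tagging actions; other services name theirs differently, which is exactly the last caveat above.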

Quick one now: Deny. This prevents the action, regardless of where amongst the policies it's specified. This is what you want: if you've explicitly denied something there's no way out, that's it: it's denied. The downside is that in, say, a legacy system, if there's one of these somewhere it can really trip you up. The policy sim can help as, instead of the usual "Implicitly denied", it will tell you that it matches a statement; then you just have to find it.
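
The rule itself is simple enough to sketch in a few lines. This is a heavily simplified toy evaluator, not how AWS actually implements it: it ignores conditions, principals, NotAction/NotResource, permission boundaries and everything else, and only shows that a matching Deny wins wherever it appears and that nothing matching means an implicit deny. The function names and example statements are mine.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """Policies allow a single string or a list; normalise to a list."""
    return [value] if isinstance(value, str) else list(value)


def _statement_matches(statement, action, resource):
    # Actions are matched case-insensitively, resource ARNs case-sensitively.
    action_ok = any(
        fnmatchcase(action.lower(), pattern.lower())
        for pattern in _as_list(statement["Action"])
    )
    resource_ok = any(
        fnmatchcase(resource, pattern)
        for pattern in _as_list(statement["Resource"])
    )
    return action_ok and resource_ok


def evaluate(statements, action, resource):
    """Return "explicitDeny", "allowed" or "implicitDeny"."""
    decision = "implicitDeny"
    for statement in statements:
        if not _statement_matches(statement, action, resource):
            continue
        if statement["Effect"] == "Deny":
            return "explicitDeny"  # a matching Deny always wins
        decision = "allowed"
    return decision


STATEMENTS = [
    {"Effect": "Allow", "Action": "s3:*", "Resource": "arn:aws:s3:::my_corporate_bucket/*"},
    {"Effect": "Deny", "Action": "s3:DeleteObject", "Resource": "*"},
]
OBJECT_ARN = "arn:aws:s3:::my_corporate_bucket/report.csv"

print(evaluate(STATEMENTS, "s3:GetObject", OBJECT_ARN))
print(evaluate(STATEMENTS, "s3:DeleteObject", OBJECT_ARN))
print(evaluate(STATEMENTS, "ec2:RunInstances", OBJECT_ARN))
```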

And now the final one: be careful with permissions in the iam: namespace. If you've read this far I'm sure this is obvious, but pay extra attention to any permissions you grant here. You can easily create a policy that lets someone attach AdministratorAccess to themselves, either directly (the action is AttachUserPolicy) or, for example, by being able to modify an inline policy on a role used as an instance profile.
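
A crude way to catch the obvious cases is to scan a policy for Allow statements whose action patterns cover the well-known "change what a principal can do" actions. This is a sketch only: the RISKY_IAM_ACTIONS list is my own short, incomplete selection, the function name is made up, and it looks at Action patterns alone (no resources, conditions or NotAction).

```python
from fnmatch import fnmatchcase

# An illustrative, incomplete list of IAM actions that can change what a
# principal is allowed to do.
RISKY_IAM_ACTIONS = [
    "iam:AttachUserPolicy",
    "iam:AttachRolePolicy",
    "iam:AttachGroupPolicy",
    "iam:PutUserPolicy",
    "iam:PutRolePolicy",
    "iam:PutGroupPolicy",
    "iam:CreatePolicyVersion",
    "iam:PassRole",
]


def risky_grants(policy):
    """Return the risky IAM actions covered by any Allow statement."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement is also valid
        statements = [statements]
    found = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        patterns = statement.get("Action", [])
        if isinstance(patterns, str):
            patterns = [patterns]
        for pattern in patterns:
            for action in RISKY_IAM_ACTIONS:
                if fnmatchcase(action.lower(), pattern.lower()):
                    found.add(action)
    return sorted(found)


example = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["iam:GetRole", "iam:Put*"], "Resource": "*"},
    ],
}
print(risky_grants(example))
```

An innocent looking iam:Put* is enough to cover the inline-policy route described above.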

[1]: when I started this post a month ago it was 8055 [back]

[2]: the numbers in this post are based on the output of this particular fork of aws-iam-reference. The exact figures don't really matter (they go out of date pretty much monthly) [back]

[3]: no, I don't know what it does. Apparently its full name is "Application Discovery Arsenal" and its only action is RegisterOnPremisesAgent so, who knows? [back]

[4]: groundstation is a real service, sadly you have to supply your own hardware. [back]

[5]: and some terrible attempts at Venn-diagrams [back]

[6]: These sort of follow the "five Ws", although there are only three here. I use the remaining two with colleagues though: "why" should always be asked and I treat "when" as "when can I take these away from you" (never is a reasonable answer but I want to know that). [back]

[7]: EC2 has a LOT of bits that should be separate services [back]

[8]: particularly in terms of "should this policy start with AWS, Amazon or just the service name". That being said the fuzzy matcher on the policy page isn't too bad so just try the service name. [back]

[9]: This is a VERY good case for using something like terraform to automate deployment and make sure all your resources match up and your json is correct. [back]

[10]: Sadly you need to sign in to AWS to use this but it doesn't cost anything [back]

[11]: This does mean that it has the annoying property that by the time you've figured out what (and how) to ask it your question you probably know the answer. [back]

[12]: This gets extra confusing when the condition is actually part of the request. Specifically the conditions aws:RequestTag/${TagKey} (e.g. if you want to set a tag on something) vs aws:ResourceTag/ (e.g. does this resource have this tag). I think this may apply elsewhere but I'm not sure. [back]

[13]: Not to be confused with StartInstances [back]

[14]: Except complain on the internet [back]