Spring Actuator Security, Part 2: Finding Actuators using Static Code Analysis with semgrep

In the first part of this series, we discussed the risks inherent in exposing the Actuator functionality of the Spring framework. If you haven't read that part yet, I recommend that you do so before reading this article.

In this article, we will discuss how we can detect exposed Spring Actuators in an application that you have source code access to. We will begin with manual steps, and then look at how you can automate the process using static security testing tools (dynamic testing will be covered in part 3 of the series).

Manually Looking for Exposed Actuators

The most basic method of finding dangerous actuators is to use... your eyeballs. If you have access to the application source code, you can look at the Spring configuration files to check if actuators are enabled, and how they are configured. Begin by checking the .properties file(s) (or the respective .yaml equivalents) where the Spring configuration is stored. Recall that the list of active actuators is controlled with the following key:

# Generic configuration for actuator endpoints, in this case
# activating two endpoints: health and prometheus
management.endpoints.web.exposure.include=health,prometheus

This setting controls which endpoints are exposed over the web. Individual endpoints can also be completely disabled by setting management.endpoint.$ACTUATOR.enabled=false. As a rule of thumb, I would recommend inspecting everything in the management category to see whether any dangerous endpoints are being activated and, if so, whether mitigations (authentication requirements, ...) are already in place.
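As a minimal sketch of the per-endpoint switch in YAML form: the snippet below assumes a Spring Boot 2.x or 3.x application, where the key for a single endpoint lives under management.endpoint (singular), and it uses heapdump and env purely as illustrative choices. Newer Spring Boot releases may prefer a different property for this, so check the reference documentation for your version.

```yaml
management:
  endpoint:
    # Disable the heapdump actuator entirely (illustrative choice)
    heapdump:
      enabled: false
  endpoints:
    web:
      exposure:
        include: health,prometheus
        # Explicitly keep env off the web, even if include changes later
        exclude: env
```

The exclude list takes precedence over include, so it can act as a safety net if someone later widens the include setting.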

(All examples in this article target the current version of Spring. Older versions may use a different configuration syntax or, in the case of Spring Boot 1.x, even expose all actuators by default. Adapt what you read to your version of Spring, if necessary.)

Automating the Search Using Semgrep

Checking the code manually isn't always feasible. Maybe you are part of a security team that is responsible for a large set of software repositories, or maybe you want to add a check for dangerous actuators to your CI, to ensure that they aren't inadvertently activated a few weeks down the line.

For these cases, let me introduce you to my favourite static code analysis tool: semgrep. It's a free Open Source tool that you can install and use right now. It only starts costing money if you want to use their dashboard to view the results, which is entirely optional; all code scanning runs on your device, and code is never uploaded to any servers. Stated briefly, semgrep searches for code matching specific patterns, taking the semantics of the code into account (hence, semantic grep). You can use it for security checks based on a large set of detection rules curated by the semgrep community, but where it really shines is when you start writing rules for your own use cases.

The Basic Case: All Actuators

Semgrep rules are fairly easy to wrap your head around, so let's build one for our example application from the previous part of this series. To be able to showcase some of the capabilities of semgrep, we'll be using the YAML configuration syntax for Spring. This is what a basic vulnerable Spring configuration could look like:

management:
  endpoints:
    web:
      exposure:
        # Activate all Actuators (this is a bad idea!)
        include: "*"
        # Alternative syntax would be:
        # include:
        #   - "*"

Semgrep patterns are also specified in YAML files. Here's an annotated semgrep rule to find this case, with explanations of what each component does.

# The rules file must begin with this top-level element, followed 
# by a list of rules
rules:
  # Every rule must have a unique ID
  - id: spring-actuator-fully-enabled
    # Now, we define which pattern we are looking for. In this
    # case, we want to match both of the possible syntaxes for
    # activating all actuators, so we use a pattern-either as a
    # top-level element. This tells semgrep that it should match
    # if at least one of the patterns specified below matches the
    # code.
    pattern-either:
      # The first pattern specifies the string syntax for the 
      # wildcard actuator setting.
      # Note the frequent use of ... in the rule. This tells 
      # semgrep "I don't care if there is other stuff here, 
      # keep searching until you find the next specified 
      # part". If we omitted them, semgrep would not match if 
      # any non-specified elements are in the YAML tree, even 
      # if the pattern we are looking for is *also* in there.
      - pattern: |
          management:
            ...
            endpoints:
              ...
              web:
                ...
                exposure:
                  ...
                  include: "*"
      # The second pattern matches the list syntax for the
      # wildcard actuator setting.
      - pattern: |
          management:
            ...
            endpoints:
              ...
              web:
                ...
                exposure:
                  ...
                  include:
                    ... 
                    - "*"
    # Specify the message that should be shown if the rule matches
    message: Spring Boot Actuator is fully enabled. This exposes
      sensitive endpoints such as /actuator/env, /actuator/logfile, 
      /actuator/heapdump and others. Unless you have Spring 
      Security enabled or another means to protect these endpoints, 
      this functionality is available without authentication, 
      causing a severe security risk.
    # Activating all actuators is dangerous, so we set the severity 
    # to ERROR. This means that it could potentially fail a build, 
    # if we ran this as part of a CI job.
    severity: ERROR
    # Tell semgrep that the rule is for the YAML language. This 
    # will automatically cause it to only be evaluated for YAML 
    # files.
    languages:
      - yaml

This is already quite a handy rule to quickly audit a large codebase for an obvious misconfiguration. To run it, save it in a file, and then run it with semgrep like this:

$ semgrep -c path/to/semgrep-rule.yaml .
Scanning 2 files with 2 yaml rules.
  100%|███████████████████████████████████████████████████████████████|2/2 tasks

Findings:

  src/main/resources/application.yml 
     src.main.resources.spring-actuator-fully-enabled
        Spring Boot Actuator is fully enabled. This exposes sensitive
        endpoints such as /actuator/env, /actuator/logfile,
        /actuator/heapdump and others. Unless you have Spring Security
        enabled or another means to protect these endpoints, this
        functionality is available without authentication, causing a severe
        security risk.

          3┆ management:
          4┆   endpoints:
          5┆     web:
          6┆       # base-path: /internal
          7┆       stuff:
          8┆         - "Nonsense"
          9┆       exposure:
         10┆         include: "*"
         11┆

Note that it outputs the entire matched area of the code for inspection. It also did not have any problems with the commented-out section, or with the other YAML keys I added to the config file. Pretty neat, and something that would be difficult to achieve with a regular grep call.
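If you want to run this rule in CI, a job along the following lines could work. This is only a sketch: it assumes GitHub Actions as the CI system and semgrep installed via pip, and it reuses the placeholder rule path from the command above. The --error flag asks semgrep to exit with a non-zero status when there are findings, which is what makes the job fail.

```yaml
# Hypothetical workflow file: .github/workflows/semgrep-actuators.yml
name: semgrep-actuator-check
on:
  pull_request:
  push:
    branches: [main]
jobs:
  semgrep:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install semgrep
        run: python3 -m pip install semgrep
      - name: Scan for exposed actuators
        # --error makes semgrep exit non-zero if the rule matches
        run: semgrep --config path/to/semgrep-rule.yaml --error .
```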

Matching Specific Actuators

The example above is already quite handy, but it only checks for the wildcard operator. Where semgrep really starts to shine is if we encounter a configuration like this:

management:
  endpoints:
    web:
      exposure:
        include:
          - "health"
          - "prometheus"
          - "logfile"
          - "env"
          - "heapdump"
          - "togglz"

Now, for the sake of argument, let's say you are fine with exposing your health endpoint (which can be reasonable in some situations), and you also find it acceptable to expose the Prometheus metrics (I wouldn't recommend it, but you do you). All other actuators should be disabled. But how can we check this using semgrep?

Fairly easily, it turns out. Semgrep has a powerful feature called "Metavariables", which allows you to pull specific parts of the code into a variable that you can then reuse in other parts of the rule. Normally, this would be used to track variables or function names while analyzing source code. However, we can also use it to pull out the list of activated actuators and match them against a list of known-accepted actuators. Here's an annotated rule that does this:

rules:
  - id: spring-actuator-dangerous-endpoints-enabled
    # This time, our top-level pattern operator is "patterns" 
    # (instead of pattern-either), which means that all patterns 
    # below need to match for the entire rule to produce a match.
    patterns:
      # First, we mostly reuse the existing pattern to get down to
      # the level of the activated actuators. However, once we
      # reach it, we pull each actuator into a metavariable called
      # $ACTUATOR. The rule will be evaluated once for every
      # actuator in the list, meaning that we can produce more than
      # one finding. Note the use of ... inside
      # include: [..., $ACTUATOR, ...]. This indicates that the
      # actuator can be in any location inside the list. If we
      # were to write [$ACTUATOR, ...], we would only match the
      # first in the list.
      - pattern: |
          management:
            ...
            endpoints:
              ...
              web:
                ...
                exposure:
                  ...
                  include: [..., $ACTUATOR, ...]
      # Now comes the magic part :)
      # The metavariable-comparison operator allows us to pull
      # in a metavariable and compare it using a Python comparison
      # expression. In this case, we explicitly cast the actuator
      # to a string, and then compare it to the list of actuators
      # we want to allow.
      # Note that the "not VAR in LIST" syntax is due to a bug in
      # semgrep that prevented the "VAR not in LIST" construction
      # at the time of writing. In general, the latter should work
      # as well, as soon as the bug is fixed.
      - metavariable-comparison:
          metavariable: $ACTUATOR
          comparison: not str($ACTUATOR) in ["health", "prometheus"]
    # We can also use the metavariable in the message, so that we 
    # can directly output the offending actuator in the logs.
    message: Spring Boot Actuator "$ACTUATOR" is enabled. Depending 
      on the actuator, this can pose a significant security risk. 
      Please double-check if the actuator is needed and properly 
      secured.
    severity: ERROR
    languages:
      - yaml

If we run this rule against the configuration shown above, we get the following result:

$ semgrep -c path/to/rules.yaml .
Scanning 2 files with 2 yaml rules.
  100%|███████████████████████████████████████████████████████████████|2/2 tasks

Findings:

  src/main/resources/application.yml 
     src.main.resources.spring-actuator-dangerous-endpoints-enabled
        Spring Boot Actuator ""en" is enabled. Depending on the actuator,
        this can pose a significant security risk. Please double-check if
        the actuator is needed and properly secured.

          3┆ management:
          4┆   endpoints:
          5┆     web:
          6┆       # base-path: /internal
          7┆       exposure:
          8┆         include:
          9┆           - "health"
         10┆           - "prometheus"
         11┆           - "logfile"
         12┆           - "env"
           [hid 3 additional lines, adjust with --max-lines-per-finding] 
     src.main.resources.spring-actuator-dangerous-endpoints-enabled
        Spring Boot Actuator ""heapdum" is enabled. Depending on the
        actuator, this can pose a significant security risk. Please double-
        check if the actuator is needed and properly secured.

          3┆ management:
          4┆   endpoints:
          5┆     web:
          6┆       # base-path: /internal
          7┆       exposure:
          8┆         include:
          9┆           - "health"
         10┆           - "prometheus"
         11┆           - "logfile"
         12┆           - "env"
           [hid 3 additional lines, adjust with --max-lines-per-finding] 
... two more findings like this, for ""logfil" and ""toggl" ...

You will note three things:

1. The output of the metavariable is a bit bugged - this is a known issue in semgrep at the time of writing, but it is purely cosmetic (internally, the matching works).
2. We get four matches, which is the expected number for the configuration file above - health and prometheus were ignored, as requested.
3. Although we defined the rule using the syntax include: [..., $ACTUATOR, ...], it still matched the syntax from the config file, as it knows that the inline list format and the "individual lines prefixed by a dash" syntax are equivalent. This semantic knowledge is what makes semgrep so powerful.
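To illustrate the last point: the configuration below uses the inline list syntax and is, as far as YAML is concerned, equivalent to the dash-prefixed list shown earlier. Going by the behaviour described above, the rule should report the same four actuators for it.

```yaml
management:
  endpoints:
    web:
      exposure:
        # Same six actuators as before, written as an inline list
        include: ["health", "prometheus", "logfile", "env", "heapdump", "togglz"]
```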

Excluding Findings

This is already quite a useful pattern. However, maybe your organization actually wants actuators to be active, and simply requires them to be running on a specific port or IP that is not exposed publicly. In that case, a configuration like this would be perfectly acceptable:

management:
  server:
    # Actuators only listen on localhost:8080
    address: 127.0.0.1
    port: 8080
  endpoints:
    web:
      exposure:
        include:
          - "health"
          - "prometheus"
          - "logfile"
          - "env"
          - "heapdump"
          - "togglz"

So, how can we tell semgrep "please find cases where actuators are active, but not if they are only listening on a specific IP"? Fairly easily, actually. Let's build a rule for that.

rules:
  - id: spring-actuator-dangerous-endpoints-enabled
    patterns:
      # Pattern identical to the previous example
      - pattern: |
          management:
            ...
            endpoints:
              ...
              web:
                ...
                exposure:
                  ...
                  include: [..., $ACTUATOR, ...]
      # Comparison identical to the previous example
      - metavariable-comparison:
          metavariable: $ACTUATOR
          comparison: not str($ACTUATOR) in ["health", "prometheus"]
      # We add a third pattern, with the pattern-not directive. 
      # This means that any patterns that match this are not 
      # considered for the results. In this example, we exclude 
      # cases where the address for the management server is 
      # explicitly set to 127.0.0.1.
      - pattern-not: |
          management:
            ...
            server:
              ...
              address: 127.0.0.1
    message: Spring Boot Actuator "$ACTUATOR" is enabled, and not
      bound to 127.0.0.1. Depending on the actuator, this can pose 
      a significant security risk. Please double-check if the 
      actuator is needed and properly secured. Company policy is to
      only allow actuators to listen on 127.0.0.1 - you can achieve
      this by setting management.server.address to 127.0.0.1.
    severity: ERROR
    languages:
      - yaml

Or maybe you don't actually care which IP the actuator is listening on, as long as a port and IP are explicitly set - maybe because the set of possible allowed IPs is too large, or maybe because you think that if a team is going to go to the trouble of setting these two values, they know what they are doing and don't need your handholding. In this case, simply change the pattern-not to the following:

- pattern-not: |
    management:
      ...
      server:
        ...
        port: $PORT
        address: $ADDRESS

In this case, you simply tell semgrep "hey, I don't actually care what the values of the two metavariables are, just make sure they are present". As a bonus, semgrep knows that the order of keys in YAML doesn't matter, so you don't have to worry about what happens if the two keys are in a different order in the config file - semgrep will find them.
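As a small illustration, a configuration like the following should be excluded by this pattern-not, going by the behaviour described above. The address and port are arbitrary values chosen for the example, and the two keys deliberately appear in the opposite order from the pattern.

```yaml
management:
  server:
    # Keys appear in the opposite order from the pattern-not
    address: 10.0.0.5
    port: 9001
  endpoints:
    web:
      exposure:
        include:
          - "health"
          - "env"
```

If either of the two keys is removed, the pattern-not should no longer apply, and the env actuator would be reported again.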

Conclusion

We could go on for quite a while in refining these patterns, but in the end, you will have to adapt them to your own situation. For example, maybe you are securing your Actuators using Spring Security, and simply need to check if the correct authentication requirements are configured. Or maybe you are exposing all active actuators, but have turned off most actuators that are enabled by default. All of these things can be checked with semgrep, if you know how to write the rules.

I would be remiss if I did not mention one limitation of semgrep: Currently, the semantic features of semgrep are not yet available for all programming languages. In particular, this is the reason why I worked with YAML configuration files for Spring in this article - going through the plain-text properties format would be a lot more annoying with semgrep, as you cannot use the implicit hierarchy that YAML gives you to structure your queries. You can still write semgrep rules against these files using the generic mode, which matches on plain text, but it is subject to some limitations. I recommend playing around with it to see if you can get your use case to work - the semgrep playground allows you to do so without installing anything on your device.
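As a rough sketch of what a generic-mode rule could look like for the wildcard case in a .properties file: the rule ID and the paths filter below are illustrative assumptions, and this is not necessarily how the rule contributed to the registry is written.

```yaml
rules:
  - id: spring-actuator-fully-enabled-properties
    # generic mode matches on plain text instead of a parsed syntax tree
    languages:
      - generic
    # Only evaluate the rule for .properties files
    paths:
      include:
        - "*.properties"
    pattern: management.endpoints.web.exposure.include=*
    message: Spring Boot Actuator is fully enabled in a .properties file.
    severity: ERROR
```

Because generic mode has no notion of the properties syntax, a rule like this may also flag commented-out lines, and it cannot follow a key hierarchy the way the YAML rules do. It is worth trying it against your own files in the playground before relying on it.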

I have contributed the rules from this article (written in a slightly nicer way that would have taken a bit longer to explain, so I opted to go with simpler rules for this article) to the semgrep rule registry, including a rule that uses the generic mode to find Actuator activations in .properties files. You can find the final rules in this pull request on GitHub.

And as a bonus: If you want to scan a large set of Git repositories with a set of semgrep rules, I wrote about how you can do this using the OWASP secureCodeBox, an Open Source project that I am an active contributor to.

This concludes this part of the Spring Actuator security series. I hope that you have gained some appreciation for the power and flexibility of using static code analysis to aid you in securing your systems. In the next part, we will discuss how to find actuators in deployed software using dynamic testing.