Operational enterprise environments are temperamental. Touch one thing, break another. Replace a server, break the interfaces to that server. Increase the security posture of the organization by changing an operational firewall? Well, we don’t want to think about that!
Wait. Actually, we do want to think about increasing the organization’s security posture.
This article focuses on protecting enterprises with outbound firewall rules. We’ll also explore network-based threat hunts, how netflow models can trigger Hunt alerts, and how the models provide valuable metrics for hunters.
- Firewalls and networks
- Deploying firewalls in new enterprises
- Deploying firewalls in operational environments
- 1. Monitor and capture netflows
- 2. Explicitly allow “active” netflows; explicitly deny all others
- 3. Refactor “active” netflows
- 4. Continue monitoring netflows (threat monitoring)
- Conclusion and after thoughts
Firewalls and networks
Firewalls are security devices that protect enterprises from uncontrolled network flows, in much the same way as dams protect towns from uncontrolled water flows. Most enterprises recognize firewalls as “inbound protection devices”. But firewalls are much more than inbound protection devices. Configured correctly, firewalls protect against unauthorized inbound traffic AND unauthorized outbound traffic.
What does this mean? Consider an adversary (possibly an insider) that has landed on your network. This is already a bad situation — something has happened that allowed the adversary to wind up on the network.
This is where your outbound firewall configuration comes in. Without outbound filtering, the adversary is able to exfiltrate your sensitive data without you even knowing. A properly configured firewall makes that harder: even though the adversary is already on the network, getting sensitive data out of the network becomes more difficult.
Define your network
Dealing with thousands of individual objects is a difficult task. When presented with thousands of individual objects, our minds work to categorize the objects.
Network objects are no different. Combining dozens of objects on a small network quickly becomes complex. Consider your home network. Probably pretty simple. You might have a half dozen cameras, an Internet-ready doorbell, WiFi keypad locks, a couple of computers between you and the family, several phones, a WiFi thermostat or two, printers, WiFi smart watches, a network-enabled refrigerator, and several other devices. Even in this “pretty simple” environment, simple means dozens of devices.
Dozens of devices potentially means at least dozens of Firewall rules. And every new device means reconfiguring the Firewall. This effort can become unwieldy quite quickly.
So how to proceed? First, recognize that this process is iterative. Each iteration is a brand new opportunity to refine the solution.
Grouping network objects based on “service”
Dealing with large numbers of diverse objects is difficult. It is much better to group objects that are “similar” or at least “similar enough”. When it comes to networks, shiny objects are not all created equal. One easy grouping of devices might be based on the “nature of network access”. For example, the groups might include:
(a) INTERNET ACCESS devices that need to initiate outbound connections to the Internet, while no Internet host needs to initiate connections to them. These devices include computers, laptops, and phones.
(b) INTERNET BLOCKED devices that do not need Internet access. They never need to communicate to the Internet, and the Internet never needs to initiate traffic to them. These devices include individual cameras that connect to a local DVR, WiFi enabled thermostats that are controlled only by phones that are on the network, and printers. Remember to consider that the devices will not be able to update themselves either, since they will not have direct access to the Internet. Creating a workflow for updating the devices is important, and usually handled by manual updates or by having a local server they’ll attach to that will allow updates.
(c) DMZ DEVICES that need to be controlled or accessed from the Internet. These devices require firewall routes from the Internet “into” your network. The devices might include a web server if you are locally hosting web sites. This class of device is typically deployed in DMZs (network demilitarized zones) and will not be covered in this short tutorial.
To summarize, a simple categorization or segmentation is (a) devices that can access the Internet, and (b) devices that do not access the Internet.
It is easy to argue that “This binary Yes/No, Open/Blocked network segmentation is insufficient!” And yes, that is an accurate statement. Build as many different groups of devices as you wish, and remember this is an iterative process. At some point you’ll need to get started.
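The two-group segmentation described above can be sketched in a few lines of Python. This is only an illustration: the device types, the mapping, and the choice to treat unknown device types as blocked are all assumptions, not a prescribed inventory format.

```python
from enum import Enum


class Group(Enum):
    INTERNET_ACCESS = "internet_access"    # group (a)
    INTERNET_BLOCKED = "internet_blocked"  # group (b)


# Hypothetical mapping from device type to group; adjust for your own network.
GROUP_BY_DEVICE_TYPE = {
    "computer": Group.INTERNET_ACCESS,
    "laptop": Group.INTERNET_ACCESS,
    "phone": Group.INTERNET_ACCESS,
    "camera": Group.INTERNET_BLOCKED,
    "thermostat": Group.INTERNET_BLOCKED,
    "printer": Group.INTERNET_BLOCKED,
}


def classify(device_type):
    """Return the group for a device type; unknown types default to blocked (an assumption)."""
    return GROUP_BY_DEVICE_TYPE.get(device_type, Group.INTERNET_BLOCKED)


def group_inventory(inventory):
    """Group a {device_name: device_type} inventory into lists of names per group."""
    groups = {group: [] for group in Group}
    for name, device_type in inventory.items():
        groups[classify(device_type)].append(name)
    return groups
```

With this shape, adding a third or fourth group in a later iteration only means extending the enum and the mapping.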
Deploying firewalls in new enterprises
Configuring firewalls in new environments is a much simpler task than configuring firewalls in operational environments. In a new environment, the firewall can start life with outbound connections set to Block All. Each new device, each new service, can be assessed for traffic requirements. For example, you know your employees need to access web sites? Open outbound TCP 80 and 443 for the workstation endpoint IPs. You know a server engineer needs to sftp to a remote server? Open outbound TCP 22 for that server IP.
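As a rough model of that Block All starting point, the sketch below evaluates an outbound connection against a short list of explicit allow rules and denies everything else. The subnets, the server address, and the `AllowRule` structure are hypothetical and do not correspond to any particular firewall product's syntax.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class AllowRule:
    source: str    # CIDR block of the endpoints the rule applies to
    protocol: str  # "tcp" or "udp"
    port: int      # destination port


# Hypothetical addressing: workstations in 10.0.10.0/24, one sftp server at 10.0.20.5.
RULES = [
    AllowRule("10.0.10.0/24", "tcp", 80),
    AllowRule("10.0.10.0/24", "tcp", 443),
    AllowRule("10.0.20.5/32", "tcp", 22),
]


def is_allowed(src_ip, protocol, dst_port, rules=RULES):
    """Return True only if an explicit allow rule matches; the default is Block All."""
    src = ip_address(src_ip)
    for rule in rules:
        if (
            src in ip_network(rule.source)
            and protocol == rule.protocol
            and dst_port == rule.port
        ):
            return True
    return False
```

Here a workstation at 10.0.10.7 can reach TCP 443 but not TCP 22, while the server at 10.0.20.5 can reach TCP 22 but not TCP 80.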
In the Groupings solution defined above, onboarding each new device requires that the device is categorized as either (a) Internet access necessary or (b) Internet access is blocked. It is quite valuable to have subcategories as well. For example, the workstation endpoints should not necessarily have 22 open. On the other hand, Server endpoints often do not have 80 & 443 open (you don’t want your Server engineer to browse potentially nefarious web sites and download malware).
One thing to remember is to create policies & processes for onboarding new devices. Each new device should be attached to a group that will allow the appropriate and reasonable amount of Internet traffic.
Deploying firewalls in operational environments
Operational environments require a bit more planning and diligence. The problem is that blocking all ports is going to break everything — suddenly, nothing will work.
The basis of this recommendation is: Make a plan! Whatever you are going to do, make sure you’ve developed a plan, and make sure the plan includes backout steps.
Here is an operational plan for changing firewall rules that can be adapted to most environments.
1. Monitor and capture netflows
Goal: Identify each (a) device that is communicating to the Internet, and (b) the remaining devices that have no need to access the Internet.
Understanding basic network metrics is the best place to start in protecting an existing environment with firewalls. Users are not impacted during the monitor and capture phase since traffic shaping does not occur during the monitor phase.
The monitor phase should continue for at least a month, more reasonably at least a quarter. The reason for this extended timeframe is to capture as much “known traffic” as practical. For example, vendor software updates are normally scheduled at least quarterly. By monitoring for at least a quarter, the capture will include vendor software update flows. Of note, Microsoft and other vendors release updates monthly on the well-known “Patch Tuesday”.
The monitor phase metrics result in two useful artifacts.
- First, ports that are not used during the monitor phase can be considered for blocking (explained in the next phase).
- Second, the netflows can be used during threat hunts. The capture gives hunters a model for “normal” traffic, which in turn lets them recognize “not normal” traffic.
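Both artifacts can be derived from a simple tally of the captured flows. The sketch below assumes the capture has already been exported into plain tuples; the record layout and the sample addresses are hypothetical, and a real netflow collector will have its own export format.

```python
from collections import Counter

# Hypothetical flow records: (src_ip, dst_ip, protocol, dst_port, byte_count).
FLOWS = [
    ("10.0.10.7", "203.0.113.10", "tcp", 443, 5200),
    ("10.0.10.8", "203.0.113.10", "tcp", 443, 1800),
    ("10.0.20.5", "198.51.100.4", "tcp", 22, 900),
    ("10.0.10.7", "203.0.113.11", "tcp", 80, 300),
]


def flows_per_port(flows):
    """Count observed outbound flows per (protocol, destination port)."""
    counts = Counter()
    for _src, _dst, protocol, dst_port, _byte_count in flows:
        counts[(protocol, dst_port)] += 1
    return counts


def unused_ports(flows, candidates):
    """Return the candidate (protocol, port) pairs that never appear in the capture."""
    seen = set(flows_per_port(flows))
    return sorted(set(candidates) - seen)
```

The counts serve as a first "normal traffic" model, and the unused list is the starting point for the blocking discussion in the next phase.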
Know that this step is not going to stop an existing bad actor that has already infiltrated your network. In fact, you aren’t even going to be made aware of a bad actor during this step.
Recognizing “not normal” traffic is a key to network threat hunting. During a threat hunt, the team is looking for anomalies, for traffic that doesn’t belong. If a “disallowed” netflow shows up in a capture, the netflow might be an indicator of compromise, a key sign of trouble that needs to be investigated by the threat hunt team.
To explore this a bit, network modeling is not “binary”. That is, it isn’t just the “disallow” list that is important to modeling netflows. Ports that wind up on the “allow” list should continue to be monitored for excess traffic. An artful threat hunt includes investigating abnormal traffic spikes. If a port model demonstrates a certain daily traffic volume, then suddenly experiences a traffic spike, the excess traffic should result in a Security Alert.
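One simple way to express "suddenly experiences a traffic spike" is to compare today's volume on a port against the mean and standard deviation of its recent history. This is a minimal sketch, not a tuned detector: the three-sigma threshold and the sample volumes are assumptions, and real traffic often needs seasonality-aware baselines.

```python
from statistics import mean, stdev


def is_spike(history, today, k=3.0):
    """Flag today's volume if it exceeds the historical mean by more than k standard deviations.

    history: daily traffic volumes (e.g. bytes) previously observed for one port.
    k: sensitivity threshold; 3.0 is an illustrative default, not a recommendation.
    """
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    baseline = mean(history)
    spread = stdev(history)
    return today > baseline + k * spread


# Hypothetical daily volumes (in MB) for one allowed port.
daily_mb = [100, 110, 95, 105, 90, 102, 98]
```

With the sample history above, a day at 400 MB is flagged while a day at 108 MB is not.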
2. Explicitly allow “active” netflows; explicitly deny all others
The second phase of tuning the outbound firewall rules is to only allow the “known active” ports. This is performed by explicitly Allowing netflows that were observed during Phase 1 Monitoring, and explicitly Denying all other flows.
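In code form, this phase amounts to turning the Phase 1 observations into an explicit allowlist and treating everything else as denied. The flow tuple layout and the addresses below are hypothetical, used only for illustration.

```python
def build_allowlist(observed_flows):
    """Turn observed (src_ip, protocol, dst_port) flows into an explicit allowlist."""
    return set(observed_flows)


def decide(allowlist, src_ip, protocol, dst_port):
    """Explicitly allow flows seen during monitoring; explicitly deny all others."""
    return "ALLOW" if (src_ip, protocol, dst_port) in allowlist else "DENY"


# Hypothetical flows observed during Phase 1 Monitoring.
observed = [
    ("10.0.10.7", "tcp", 443),
    ("10.0.10.7", "tcp", 443),  # duplicates collapse into a single entry
    ("10.0.20.5", "tcp", 22),
]
allowlist = build_allowlist(observed)
```

An allowlist this granular (one entry per source and port) is deliberately verbose; it is the raw material that later gets consolidated.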
Actively blocking flows in a previously open enterprise is likely to introduce issues. The team needs to have a plan and procedure ready to “unblock” required flows. This step of “Explicit block” should be delayed until the policies and procedures are available. Blocking netflows in large complex enterprises should be handled delicately since these environments may require flows that simply didn’t show up during the monitor phase.
For complex, poorly documented operational environments, it may be more reasonable to “alert on unused ports” instead of “block unused ports” during the early parts of the transition. However, at some point the phase of “explicit deny” must conclude with “block unused ports”.
Advanced organizations might consider replacing simple “blocks” with redirects. For organizations that actively threat hunt, redirecting an unallowed/unused flow to a honeypot can quickly alert the crew to call Hunt On! Unused ports are easily identified in the Netflow capture since the unused ports simply will not show up in the list. For example, if Port 3389 (a port associated with Remote Desktop Connection) doesn’t show up during the monitor phase, and the team knows that there are no reasonable and acceptable outbound remote desktop connections, then an advanced team might consider redirecting port 3389 to a honeypot. If any devices wind up landing on that honeypot, the hunt team needs to search for the rogue device and user.
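The three ways of handling a flow that is not on the allowlist (alert only, block, or redirect to a honeypot) can be modeled as a small decision function. The mode names, the returned action strings, and the rule that redirects only apply once blocking is enabled are assumptions made for this sketch.

```python
from enum import Enum


class Mode(Enum):
    ALERT_ONLY = "alert_only"  # early transition: let the flow through, raise an alert
    BLOCK = "block"            # end state: unlisted flows are denied


# Hypothetical set of unused ports chosen for honeypot redirection.
HONEYPOT_PORTS = {3389}


def handle_unlisted_flow(dst_port, mode, honeypot_ports=HONEYPOT_PORTS):
    """Decide what to do with an outbound flow that is not on the allowlist."""
    if mode is Mode.ALERT_ONLY:
        return "ALLOW_AND_ALERT"
    if dst_port in honeypot_ports:
        return "REDIRECT_TO_HONEYPOT"  # replaces a simple block for selected ports
    return "BLOCK"
```

Keeping the mode as a single switch makes the eventual move from alerting to blocking an explicit, reversible change, which fits the need for a backout plan.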
3. Refactor “active” netflows
Once the “known unused” ports have been handled successfully and the organization defaults to “Block” or “Redirect to Honeypot”, it is time to move on to refactoring the “active” netflows.
Refactoring reduces the firewall ruleset. If there are 150,000 endpoints in an environment, it is likely a good idea to distill those into different types of endpoints — for example, Workstations, Servers, Phones, and Cameras. The simplest refactoring will identify “all <specific types of> endpoints” allowed outbound traffic to “all destinations” over “listed ports”. For example, “<all Workstations> allowed outbound traffic to <all Internet destinations> over port 80 & 443”. However, this is just the beginning of this phase of tightening down the firewall.
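The simplest refactoring described above can be sketched as collapsing per-endpoint port allowances into one rule per endpoint type. The endpoint names and group labels are hypothetical, and taking the union of ports per group is one possible choice, not the only one.

```python
from collections import defaultdict


def refactor(per_endpoint_ports, endpoint_group):
    """Collapse per-endpoint allowed ports into one sorted port list per endpoint group.

    per_endpoint_ports: {endpoint_name: set of allowed destination ports}
    endpoint_group: {endpoint_name: group label such as "Workstations"}
    """
    group_ports = defaultdict(set)
    for endpoint, ports in per_endpoint_ports.items():
        group_ports[endpoint_group[endpoint]].update(ports)
    return {group: sorted(ports) for group, ports in group_ports.items()}


# Hypothetical per-endpoint rules produced by the explicit allow phase.
per_endpoint = {"ws-01": {80, 443}, "ws-02": {443}, "srv-01": {22}}
groups = {"ws-01": "Workstations", "ws-02": "Workstations", "srv-01": "Servers"}
```

Note that a union widens access slightly: in this example ws-02 gains port 80 because ws-01 used it. A stricter variant could take the intersection and review the leftovers by hand.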
In operational environments, refactoring operational ports is likely a multi-phased approach; one phase covering workstation endpoints; another phase covering servers; and several phases covering “other endpoints” like phones, cameras, and keypads/door entry systems. Eventually the firewall will have a collection of rules for many different types of endpoints.
For example, say that Ports 25, 465, and 587 show up in the “operational port” report. These ports are associated with SMTP (Simple Mail Transfer Protocol). While it is reasonable for a mail relay such as an Exchange server to communicate over these ports, it is less reasonable for a workstation/user endpoint to relay its own mail. The ruleset should Allow the Exchange server and Deny all other systems.
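The SMTP rule above reduces to a membership check on the source address. In this sketch the relay address is hypothetical, and the `NOT_SMTP` result simply means the decision belongs to some other rule.

```python
SMTP_PORTS = {25, 465, 587}

# Hypothetical address of the organization's mail relay (for example an Exchange server).
MAIL_RELAYS = {"10.0.20.25"}


def smtp_decision(src_ip, dst_port, relays=MAIL_RELAYS):
    """Allow outbound SMTP only from known mail relays; deny it from every other system."""
    if dst_port not in SMTP_PORTS:
        return "NOT_SMTP"  # out of scope for this rule
    return "ALLOW" if src_ip in relays else "DENY"
```

A workstation that suddenly hits the DENY branch is also worth a look from the hunt team, since endpoints relaying their own mail is a common sign of trouble.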
Example: Web traffic
Another example exists for web traffic over 80 and 443. While it may be reasonable to open web traffic for all endpoints, an adversary can use those allowed flows to exfiltrate data. One might ask: is it reasonable for a Server to contact web sites over 80 & 443, or only Workstation endpoints configured for user traffic? Even more so, is it appropriate for even the Workstation endpoints to communicate out directly, or is there a web proxy protecting the end users from visiting known malicious web sites?
4. Continue monitoring netflows (threat monitoring)
Threat hunters need data, and netflows are an invaluable form of data to a hunter. Continue monitoring netflows even after the firewalls have been normalized. The continuous monitoring provides data that is useful for computer network defenders and threat hunters. Identifying anomalies is a basis for alert generation, and an anomalous traffic volume is an event that should trigger an alert.
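One basic form of anomaly identification is a set difference between the baseline flows and the latest capture. The tuple layout and addresses below are hypothetical, and a production pipeline would likely also track volumes and timestamps.

```python
def new_flows(baseline, current):
    """Return flows present in the current capture but absent from the baseline model."""
    return sorted(set(current) - set(baseline))


# Hypothetical (src_ip, protocol, dst_port) flows.
baseline = [("10.0.10.7", "tcp", 443), ("10.0.20.5", "tcp", 22)]
current = [
    ("10.0.10.7", "tcp", 443),
    ("10.0.10.9", "tcp", 3389),  # never seen before: a candidate alert
]

alerts = new_flows(baseline, current)
```

Each entry in `alerts` is a flow the model has not seen before, which makes it a natural starting point for a hunt.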
Conclusion and after thoughts
Firewalls are “moderators to the real world”. They defend against inbound malicious traffic, and they defend against adversaries who are trying to exfiltrate data on outbound ports. Defending your precious sensitive data requires a fully operational bi-directional firewall.
Managing operational environments is a task in balancing many parts of a complex puzzle, from satisfying user demands, to enforcing security, to addressing C-level boardroom concerns. Changing underused firewalls in these operational environments can be perilous, yet managing them is necessary to properly protect the environment.
As always, Prior planning prevents poor performance, and this adage holds true for deploying Firewall changes in operational environments. Make a plan, and stick to it. But what happens if the plan has too many edge cases? If the need arises to deviate from the Firewall Protection Plan, change the plan itself and restart instead of deviating from the plan.