Detecting C2-Jittered Beacons with Frequency Analysis
Hello everyone,
Today we are going to learn about Frequency Analysis using Fourier, applied to Cyber Security. This tool will allow us to find patterns within our dataset, in a much easier way than doing it in the time domain.**
As last time, do not worry, I will leave a link to my GitHub at the very end under "References & More Useful Information" so you can copy everything if you want.
**Disclaimer!! Remember that the problem you are trying to solve might be slightly different than the one I am presenting, and maybe time-domain tools work best for your case. Do your own analysis before copy-pasting the code in the GitHub for optimal results.
Executive Summary
What is Frequency Analysis.. and why do we care?
It lets us take data counted in time "bins" (for example, connections per 10 minutes) and break it down by frequency. In other words, what is the frequency at which something happens? Is a C2 beacon phoning home every hour, every 24 hours, or every minute? It does not matter, as long as it is a periodic signal.
And that is what frequency analysis is all about—detecting periodic patterns.
Does that mean that everything we detect with our tool is going to be malicious? Nope. Remember, this is just a tool to detect periodic patterns. How you interpret the data is completely up to you and your use case, and this is where domain expertise will be needed.
Frequency Analysis
Fourier
- Fourier Series – Represents periodic functions as a sum of sines and cosines.
- Fourier Transform (FT) – Converts a time-domain signal into its frequency-domain representation.
- Fast Fourier Transform (FFT) – A computationally efficient algorithm to compute the Fourier Transform.
- Applications – Used in signal processing, image analysis, audio compression (MP3), medical imaging (MRI), and physics.
Detecting C2 Beacons with Fourier
Why C2 Beacons?
Because that is something periodic. I do not really care if the threat actor is pinging every millisecond (which would be very easy to detect since it creates a lot of noise) or every week to remain stealthy. Remember, we care about periodicity—so as long as it happens two or more times, it is a frequency we can detect.
You can absolutely use this tool to detect DDoS attempts, stealthy Password Spraying attempts, and many more.
What do we need to detect this?
- NumPy – Contains the fft module to perform the Fast Fourier Transform (FFT) on our data.
- Matplotlib – Used to create visualizations.
- Pandas – For mathematical operations and handling DataFrames with our data.
I have broken down the process to do this into 3 simple steps.
Step 1.
- Blue Graph: Your network traffic data.
- Red Graph: A C2 beacon hidden in your network. You are not expected to plot this as we are supposed to find it, but because I am faking the data for educational purposes, I am showing it there.
- Green Graph: The Blue and Red signals combined (with the beacon's amplitude exaggerated for learning purposes).
- Bottom Graph: The FFT of our Green signal, where the Red beacon hidden inside it shows up as an easily detectable big spike.
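The four graphs above can be approximated with a few lines of NumPy. This is a minimal sketch with fake data, not the code from the GitHub repository: the 10-minute sampling interval, the Poisson background, the beacon amplitude of 30, and the assumption that each check-in spans two samples are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

sample_seconds = 600              # assumption: one sample every 10 minutes
n_samples = 14 * 24 * 6           # 14 days of data
beacon_period = 6                 # 6 samples = 1 hour

# Blue: background traffic, faked as connection counts per interval
background = rng.poisson(lam=20, size=n_samples).astype(float)

# Red: hourly beacon, faked as a burst covering two consecutive samples
beacon = np.zeros(n_samples)
beacon[0::beacon_period] = 30.0   # amplitude exaggerated on purpose
beacon[1::beacon_period] = 30.0

# Green: what we would actually observe on the network
combined = background + beacon

# Bottom: FFT magnitude of the Green signal (mean removed so bin 0 is ~0)
spectrum = np.abs(np.fft.rfft(combined - combined.mean()))
freqs = np.fft.rfftfreq(n_samples, d=sample_seconds)   # in Hz

peak = int(np.argmax(spectrum))
peak_period_hours = 1.0 / freqs[peak] / 3600.0
print(f"Strongest spike at a period of {peak_period_hours:.2f} hours")
```

Plotting `freqs` against `spectrum` with Matplotlib gives the Bottom Graph. A pulse-shaped beacon also produces smaller spikes at multiples of the beacon frequency (harmonics), which is expected.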
But now, what if you want to do this with your own data? Let's continue with Steps 2 and 3.
Step 2.
This will allow you to build your own Green Graph, which is essential for identifying patterns.
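One way to build that Green signal from your own logs is to count events per fixed time interval with Pandas. The sketch below assumes a hypothetical export with a `timestamp` column and a `dest_ip` column; the column names, the sample rows, and the 10-minute bin size are placeholders, not the author's data.

```python
import pandas as pd

# Hypothetical log export: one row per outbound connection
events = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2024-01-01 00:01:10", "2024-01-01 00:03:45", "2024-01-01 00:12:00",
        "2024-01-01 00:31:30", "2024-01-01 00:58:05",
    ]),
    "dest_ip": ["203.0.113.7"] * 5,
})

# Count connections per 10-minute bin; empty bins become 0
counts = (
    events.set_index("timestamp")
          .resample("10min")
          .size()
)

# Evenly spaced array that can be handed to the FFT
signal = counts.to_numpy(dtype=float)
print(counts)
```

The important property is that the samples are evenly spaced in time, because the FFT assumes a constant sampling interval.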
Step 3.
Why is Sampling Important in Fourier Analysis?
Fourier analysis decomposes signals into their frequency components. When dealing with digital signals, sampling ensures that we can accurately analyze and reconstruct the signal in the frequency domain.
Nyquist-Shannon Sampling Theorem – This theorem states that to avoid loss of information (aliasing), the sampling rate must be at least twice the highest frequency present in the signal.
- If we sample too slowly (undersampling), high-frequency components will fold into lower frequencies, causing aliasing (distortion).
- If we sample above the Nyquist rate, we can, in theory, perfectly reconstruct the original signal.
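Aliasing is easy to reproduce with a clean sine wave. The following sketch (the function name and the chosen sampling intervals are just for illustration) samples a signal with a 1-hour period once every 15 minutes and once every 45 minutes, then reports the period of the strongest FFT spike in each case.

```python
import numpy as np

def dominant_period_hours(sample_seconds, n_samples, signal_period_seconds=3600.0):
    """Sample a pure sine wave and return the period of its strongest FFT spike."""
    t = np.arange(n_samples) * sample_seconds
    signal = np.sin(2 * np.pi * t / signal_period_seconds)
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(n_samples, d=sample_seconds)
    peak = int(np.argmax(spectrum[1:])) + 1   # skip the zero-frequency bin
    return 1.0 / freqs[peak] / 3600.0

# Both runs cover 180 hours of signal
well_sampled = dominant_period_hours(sample_seconds=900, n_samples=720)
undersampled = dominant_period_hours(sample_seconds=2700, n_samples=240)

print(f"15-minute sampling: spike at {well_sampled:.2f} hours")
print(f"45-minute sampling: spike at {undersampled:.2f} hours")
```

With 45-minute sampling the 1-hour signal is reported with a 3-hour period, so the spike is still there but it points at the wrong frequency.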
C2 beacon - 1h Periodicity
Beacon Frequency
- The beacon occurs once per hour = 1 cycle per 3600 seconds
- Frequency of the beacon signal: $f_{beacon} = \frac{1}{3600\ \text{s}} \approx 0.00028\ \text{Hz}$
Nyquist Sampling Rate
- To properly detect this beacon, the sampling rate must be at least: $f_s \geq 2 f_{beacon} = \frac{2}{3600\ \text{s}} \approx 0.00056\ \text{Hz}$
Convert to More Intuitive Units
- A sampling frequency of 0.00056 Hz means: $\frac{1}{f_s} = \frac{3600\ \text{s}}{2} = 1800$ seconds per sample = 30 minutes per sample
- So, you would need to take at least one sample every 30 minutes to reliably detect the beacon.
Choosing a Practical Sampling Rate
- Nyquist Rate (Minimum): 1 sample every 30 minutes
- Better Sampling (for accuracy): 1 sample every 10-15 minutes
- High-resolution detection: 1 sample every 5 minutes or less
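The same arithmetic works for any beacon period you want to hunt for. A small helper (the function name is mine, not from the original code) makes it easy to check whether your log granularity is fine enough.

```python
def max_sample_interval_minutes(beacon_period_seconds):
    """Longest sampling interval that still satisfies the Nyquist criterion."""
    beacon_freq_hz = 1.0 / beacon_period_seconds
    nyquist_rate_hz = 2.0 * beacon_freq_hz        # minimum sampling rate
    return 1.0 / nyquist_rate_hz / 60.0           # seconds per sample -> minutes

hourly = max_sample_interval_minutes(3600)        # beacon every hour
daily = max_sample_interval_minutes(24 * 3600)    # beacon every 24 hours

print(f"Hourly beacon: sample at least every {hourly:.0f} minutes")
print(f"Daily beacon:  sample at least every {daily:.0f} minutes")
```

Treat the result as an upper bound on the interval. In practice you want to sample noticeably faster than this, as the list above suggests.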
What if C2s have "Jitter"?
If you're still reading, congratulations! That means you are REALLY interested in finding beacons, so you’ve probably figured out that, so far... everything has been pretty theoretical.
In the real world, with real Threat Actors, they do not ping every hour exactly; they use "jitter." Jitter is a "randomness" in time and amplitude in their beacon signal, which translates to: "maybe now I ping back every 23 hours, or maybe every 25 hours, or maybe I ping back with a very strong signal, or a light one," at random intervals, making it stealthier.
- Blue Graph: Your network traffic data.
- Red Graph: A C2 beacon hidden in your network. This time, I have added some jitter in both time and amplitude. This beacon pings back every 20-29 hours.
- Green Graph: Blue + Red signals combined. The beacon amplitude is exaggerated for learning purposes.
- Bottom Graph: FFT of our Green signal. This is where we look for the spike left by the jittered Red beacon.
Now, we can see exactly the same thing, but this time, our Red signal (beacon) has jitter in terms of both amplitude and periodicity. It will ping back between 20 and 29 hours, and the amplitude also has a modifier to make it more random.
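A jittered beacon like this one can be faked as follows. This is a sketch under my own assumptions: hourly samples over 180 days, a Poisson background, gaps drawn uniformly from 20 to 29 hours, and amplitudes drawn uniformly between 20 and 40. The real code in the repository may draw its jitter differently.

```python
import numpy as np

rng = np.random.default_rng(7)
n_hours = 24 * 180                    # assumption: 180 days, one sample per hour

# Blue: fake background traffic
background = rng.poisson(lam=20, size=n_hours).astype(float)

# Red: beacon with jitter in both time and amplitude
beacon = np.zeros(n_hours)
intervals = []
t = 0
while t < n_hours:
    beacon[t] = rng.uniform(20.0, 40.0)   # amplitude jitter
    gap = int(rng.integers(20, 30))       # time jitter: 20..29 hours
    intervals.append(gap)
    t += gap

# Green and its FFT
combined = background + beacon
spectrum = np.abs(np.fft.rfft(combined - combined.mean()))
freqs = np.fft.rfftfreq(n_hours, d=1.0)   # cycles per hour

# Compare the energy around the expected period with the typical bin
periods = np.full_like(freqs, np.inf)
periods[1:] = 1.0 / freqs[1:]
in_band = (periods >= 22) & (periods <= 27)
band_ratio = spectrum[in_band].mean() / np.median(spectrum)
print(f"Mean magnitude in the 22-27 hour band is {band_ratio:.2f}x the median bin")
```

Instead of one sharp spike, the beacon's energy is smeared over a band of neighbouring frequencies, so looking at a band rather than a single bin is one way to make it visible again.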
Upon analysis, we can see that we still have a spike, but it's much more difficult to detect. It appears at the expected frequency, where the beacon should be.
Note that this example still has a beacon with significant amplitude compared to our original signal, and even with that, it is hard to find. So, what can we do?
Sliding Window
Why use a window? Jittered signals may have non-stationary components, meaning the periodic pattern can drift or change over time.
Wait, what? Why is it not better if we apply a Sliding Window technique?
For this example, I have created a window of 1 week that overlaps by 1 day and keeps moving. It effectively performs tiny FFTs within each window and then aggregates them at the end.
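A sliding-window FFT can be sketched in a few lines. I am assuming here that "aggregates" means averaging the magnitude spectra of all windows, and that a 1-week window overlapping by 1 day means a step of 6 days (144 hourly samples). Both are my reading of the description, not confirmed details.

```python
import numpy as np

def sliding_window_spectrum(signal, window, step, sample_spacing=1.0):
    """Average the FFT magnitude over overlapping windows of the signal."""
    signal = np.asarray(signal, dtype=float)
    spectra = []
    for start in range(0, len(signal) - window + 1, step):
        chunk = signal[start:start + window]
        spectra.append(np.abs(np.fft.rfft(chunk - chunk.mean())))
    if not spectra:
        raise ValueError("signal is shorter than one window")
    freqs = np.fft.rfftfreq(window, d=sample_spacing)
    return freqs, np.mean(spectra, axis=0)

# Usage on a clean test signal: 28 days of hourly samples, 24-hour period
hours = np.arange(24 * 28)
test_signal = np.sin(2 * np.pi * hours / 24)

freqs, avg_spectrum = sliding_window_spectrum(test_signal, window=168, step=144)
peak = int(np.argmax(avg_spectrum))
peak_period_hours = 1.0 / freqs[peak]
print(f"Strongest averaged spike at a period of {peak_period_hours:.1f} hours")
```

Note the trade-off: a shorter window follows a drifting beacon better, but it also has fewer frequency bins, so nearby periods become harder to tell apart.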
So why is this technique not necessarily better? Because it depends on what you're trying to find!
This test suggests that for my case, a window of 70 hours is optimal for detecting my beacon, which, in hindsight, sounds about right. Remember that my beacon was pinging back every 20-29 hours in a normal distribution fashion.
If I calculate manually, the minimum window size that ensures I always get at least 2, preferably 3, beacon samples inside each overlapping window is 3 days (72 hours), which is close to the 70 hours found by the test.
Compared with not using a window at all, this improves the spike magnitude by roughly 15-20%, helping to detect beacons more effectively in our example.
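A window-size search like this can be sketched as a simple loop over candidate sizes. The scoring used below (highest averaged spike divided by the median bin) and the 50% overlap are my own assumptions for illustration; the original test may score windows differently.

```python
import numpy as np

def averaged_spectrum(signal, window, step):
    """Mean FFT magnitude across overlapping windows (mean removed per window)."""
    spectra = []
    for start in range(0, len(signal) - window + 1, step):
        chunk = signal[start:start + window]
        spectra.append(np.abs(np.fft.rfft(chunk - chunk.mean())))
    return np.mean(spectra, axis=0)

def score_window_sizes(signal, candidates, overlap_fraction=0.5):
    """Return {window_size: peak-to-median ratio} for each candidate window."""
    signal = np.asarray(signal, dtype=float)
    scores = {}
    for window in candidates:
        if window > len(signal):
            continue                              # not enough data for this size
        step = max(1, int(window * (1 - overlap_fraction)))
        spectrum = averaged_spectrum(signal, window, step)[1:]   # drop bin 0
        scores[window] = float(spectrum.max() / np.median(spectrum))
    return scores

# Usage on fake data: noise plus a 24-hour pattern, hourly samples for 60 days
rng = np.random.default_rng(0)
hours = np.arange(24 * 60)
demo = rng.normal(0, 1, hours.size) + np.sin(2 * np.pi * hours / 24)

scores = score_window_sizes(demo, candidates=[48, 72, 96, 168])
best = max(scores, key=scores.get)
print(scores, "-> best window:", best)
```

Whatever score you pick, run the search on data where you know a beacon exists, otherwise the "optimal" window is only optimal for noise.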
Final Remarks
This has been a very fast overview of how to detect C2 beacons using Fourier, which honestly simplifies your life quite a lot when compared to trying to achieve the same in the time domain.
I haven’t gotten into fact-proving any of my claims or very important math demonstrations for you to fully understand this, but if you’re curious, I am linking some resources below.
All these results have been empirical through my testing, and they showcase that it’s not impossible to fight jitter; it just requires a more complex approach to do so.
Most of these concepts come from Signal Processing rather than Cybersecurity itself, but I hope I’ve demonstrated their value to you with this example. Of course, think big—C2s are just the beginning. There are plenty of interesting use cases out there where this tool will shine the most.
Lastly, this can be implemented either as a rule in your Jupyter Lab, if you have a detection-as-code pipeline, or as a threat hunting exercise, if you just export data every now and then. In the end we are looking for that big spike, so it's a matter of playing with those thresholds and analyzing the top 3 or top 5 biggest spikes on your FFT graph to figure out if it's malicious or not.
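Pulling the top spikes out of an FFT is straightforward. Here is a minimal sketch (the function name and return format are mine) that returns the k strongest spikes as period and magnitude pairs, ready for manual review or thresholding.

```python
import numpy as np

def top_spikes(signal, sample_seconds, k=5):
    """Return the k strongest FFT spikes as (period_hours, magnitude) tuples."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
    freqs = np.fft.rfftfreq(signal.size, d=sample_seconds)
    order = np.argsort(spectrum[1:])[::-1][:k] + 1    # skip bin 0, strongest first
    return [(1.0 / freqs[i] / 3600.0, float(spectrum[i])) for i in order]

# Usage on a clean test signal: 7 days sampled every 10 minutes,
# with a strong 1-hour pattern and a weaker 4-hour pattern
n = np.arange(7 * 24 * 6)
test_signal = 3.0 * np.sin(2 * np.pi * n / 6) + 1.0 * np.sin(2 * np.pi * n / 24)

spikes = top_spikes(test_signal, sample_seconds=600, k=2)
for period_hours, magnitude in spikes:
    print(f"period {period_hours:.2f} h, magnitude {magnitude:.0f}")
```

On real traffic, expect legitimate periodic activity (update checks, backups, monitoring agents) to show up in this list too, which is where the domain expertise comes in.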
References & More Useful Information
- My GitHub, full code explained in this post can be found - HERE
- First Fourier example, fake data and rfft python code - HERE
- Fourier with time and amplitude Jitter python code - HERE
- Fourier with Sliding Window Technique Applied, python code - HERE
- Fourier, Optimal Window Calculation, python code - HERE
- Pandas Tutorial - HERE
- Pandas Tutorial - Video format - HERE
- Fourier Additional Resources - Visual Explanation - HERE