Detecting C2-Jittered Beacons with Frequency Analysis

Hello everyone,

Today we are going to learn about Frequency Analysis using Fourier, applied to Cyber Security. This tool will allow us to find patterns within our dataset much more easily than we could in the time domain.

As last time, do not worry, I will leave a link to my GitHub at the very end under "References & More Useful Information" so you can copy everything if you want.

Disclaimer!! Remember that the problem you are trying to solve might be slightly different from the one I am presenting, and maybe time-domain tools work best for your case. For optimal results, do your own analysis before copy-pasting the code from the GitHub.

-----------------------------------------------------------------------------------------------------------------------------

Executive Summary

Frequency Analysis using Fourier for Detection Engineering & Threat Hunting. Detecting C2 Beacons with and without Jitter, a technical analysis.
-----------------------------------------------------------------------------------------------------------------------------

What is Frequency Analysis... and why do we care?

"In cybersecurity, frequency analysis is used to detect anomalies, uncover hidden patterns, and analyze encrypted or obfuscated data. It helps identify unusual network traffic, detect malware communication, and break weak encryption by studying the frequency of characters, packets, or access patterns. 

For example, in intrusion detection systems (IDS), analyzing packet frequency can reveal distributed denial-of-service (DDoS) attacks, while monitoring login attempts can help detect brute-force attacks.

Additionally, frequency analysis is used in cryptanalysis to break simple ciphers by analyzing letter or byte distributions. By leveraging statistical patterns, cybersecurity professionals can enhance threat detection and mitigate security risks."

It allows us to count "bins" of data and aggregate them by frequency. In other words, what is the frequency at which something happens? Is a C2 beacon phoning home every hour, every 24 hours, or every minute? It does not matter, as long as it is a periodic signal.

And that is what frequency analysis is all about—detecting periodic patterns.

Does that mean that everything we detect with our tool is going to be malicious? Nope. Remember, this is just a tool to detect periodic patterns. How you interpret the data is completely up to you and your use case, and this is where domain expertise will be needed.


Frequency Analysis

Fourier

We are very lucky, as it turns out that someone by the name of Joseph Fourier, a French mathematician and physicist, created the Fourier Transform, which breaks down complex signals into simpler sinusoidal components.


1. "Fournier" is a Spanish card maker, I personally always read it as "Fourier" - AI-Generated

    But wait… how is this related to cybersecurity? It is not yet related! But let me do a super quick intro on how this works.

    Key Concepts Related to Fourier:
    • Fourier Series – Represents periodic functions as a sum of sines and cosines.
    • Fourier Transform (FT) – Converts a time-domain signal into its frequency-domain representation.
    • Fast Fourier Transform (FFT) – A computationally efficient algorithm to compute the Fourier Transform.
    • Applications – Used in signal processing, image analysis, audio compression (MP3), medical imaging (MRI), and physics.
    We are interested in the Fast Fourier Transform, as the first two are just theoretical foundations of the third, and the last one is applied (mostly) to different fields.
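To make this concrete, here is a minimal sketch of what the FFT hands back, using NumPy's `np.fft.rfft` and `np.fft.rfftfreq`. The 5 Hz sine wave and the 100 Hz sampling rate are made-up values, chosen only so the result is easy to read.

```python
import numpy as np

fs = 100.0                                   # sampling rate in Hz (assumed toy value)
t = np.arange(200) / fs                      # 200 samples = 2 seconds
signal = np.sin(2 * np.pi * 5.0 * t)         # a pure 5 Hz sine wave

spectrum = np.abs(np.fft.rfft(signal))       # magnitude of each frequency bin
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)   # frequency in Hz of each bin

peak_freq = freqs[np.argmax(spectrum)]       # bin with the largest magnitude
print(f"Strongest frequency: {peak_freq:.1f} Hz")
```

The time-domain signal is 200 numbers that wiggle up and down; the frequency-domain view collapses that into a single dominant bin, which is the property we want to exploit.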

    Detecting C2 Beacons with Fourier

    Why C2 Beacons? 

    Because that is something periodic. I do not really care if the threat actor is pinging every millisecond (which would be very easy to detect since it creates a lot of noise) or every week to remain stealthy. Remember, we care about periodicity—so as long as it happens two or more times, it is a frequency we can detect.

    You can absolutely use this tool to detect DDoS attempts, stealthy Password Spraying attempts, and many more.

    What do we need to detect this?

    • NumPy – Contains the fft module to perform the Fast Fourier Transform (FFT) on our data.

    • Matplotlib – Used to create visualizations.

    • Pandas – For mathematical operations and handling DataFrames with our data.

    I have broken down the process to do this into 3 simple steps.

    Step 1.

    Collect any data you want to analyze for a C2 beacon. This could include NetFlow logs, proxy logs, firewall logs, WAF logs, raw data from your IDS, etc.

    I am providing an example here with some "fake data", plotted below.


    2. Fourier example, big spike on bottom graph is our C2 beacon

    • Blue Graph: Your network traffic data.
    • Red Graph: A C2 beacon hidden in your network. You are not expected to plot this as we are supposed to find it, but because I am faking the data for educational purposes, I am showing it there.
    • Green Graph: The Blue and Red signals combined (with the beacon's amplitude exaggerated for learning purposes).
    • Bottom Graph: The FFT of our Green signal. The beacon (Red) hidden inside the combined (Green) signal shows up as one big spike, which makes it easy to detect.
    The main purpose of this example was to illustrate beaconing activity, so the amplitude was magnified to make it easier to understand.
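If you want to reproduce something like the graphs above without the plotting, the sketch below builds the three signals and finds the spike numerically. It is not the script from my GitHub: the hourly sampling, the 30-day length, the Gaussian noise, and the sinusoidal 24-hour beacon are all assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)               # fixed seed so the sketch is repeatable
n_hours = 24 * 30                            # 30 days, one sample per hour (assumed)
t = np.arange(n_hours)

traffic = rng.normal(0.0, 1.0, n_hours)      # "Blue": stand-in for normal traffic
beacon = 3.0 * np.sin(2 * np.pi * t / 24.0)  # "Red": exaggerated 24-hour beacon
combined = traffic + beacon                  # "Green": what we would actually observe

# FFT of the combined signal; remove the mean so the 0 Hz bin does not dominate
spectrum = np.abs(np.fft.rfft(combined - combined.mean()))
freqs = np.fft.rfftfreq(n_hours, d=1.0)      # cycles per hour

peak = np.argmax(spectrum[1:]) + 1           # skip the 0 Hz bin
peak_period_hours = 1.0 / freqs[peak]
print(f"Strongest periodicity: every {peak_period_hours:.1f} hours")
```

Passing `freqs` and `spectrum` to Matplotlib's `plt.plot` would give a bottom graph like the one in the figure.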

    But now, what if you want to do this with your own data? Let's continue with Steps 2 and 3.

    Step 2.

    Whenever you get your logs, add them to a Pandas DataFrame and extract any attributes you consider necessary, such as timestamp, amplitude, destination of the connection, number of packets, etc.

    This will allow you to build your own Green Graph, which is essential for identifying patterns.
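As a rough sketch of Step 2, this is one way to turn raw connection records into an evenly spaced series with Pandas. The column names (`timestamp`, `dst`), the IP addresses, and the 10-minute bin size are hypothetical; your log source will have its own fields.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
start = pd.Timestamp("2024-01-01")

# Hypothetical log: one connection per hour to one destination (the "beacon")...
beacon_times = pd.date_range(start, periods=48, freq="60min")
# ...plus 200 background connections at random seconds over the same 48 hours
background_times = start + pd.to_timedelta(
    rng.integers(0, 48 * 3600, size=200), unit="s"
)

df = pd.DataFrame({
    "timestamp": beacon_times.append(background_times),
    "dst": ["203.0.113.10"] * 48 + ["198.51.100.7"] * 200,
})

# Count connections per 10-minute bin for a single destination
counts = (
    df[df["dst"] == "203.0.113.10"]
    .set_index("timestamp")
    .resample("10min")
    .size()
)
signal = counts.to_numpy(dtype=float)        # evenly sampled, ready for np.fft.rfft
print(counts.head())
```

Binning per destination (or per source and destination pair) keeps one noisy host from drowning out a quiet beacon, but that split is a design choice you should adapt to your own data.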

    Step 3.

    Your signal will need to be sampled.

    Sampling is the process of converting a continuous signal (such as an analog audio waveform) into a discrete signal by measuring its amplitude at regular intervals. These discrete points are called samples, and the rate at which they are taken is called the sampling rate (measured in Hertz, Hz). 

    Why is Sampling Important in Fourier Analysis?

    Fourier analysis decomposes signals into their frequency components. When dealing with digital signals, sampling ensures that we can accurately analyze and reconstruct the signal in the frequency domain. 

    Nyquist-Shannon Sampling Theorem – This theorem states that to avoid loss of information (aliasing), the sampling rate must be at least twice the highest frequency present in the signal.

    • If we sample too slowly (undersampling), high-frequency components will fold into lower frequencies, causing aliasing (distortion).
    • If we sample at or above the Nyquist rate, we can perfectly reconstruct the original signal.
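Aliasing is easier to believe once you see it. In this small sketch a beacon with a 60-minute period is sampled only every 45 minutes, which is slower than the 30 minutes Nyquist asks for. The values are chosen by me for illustration.

```python
import numpy as np

true_period_min = 60.0        # beacon phones home every hour
sample_spacing_min = 45.0     # too slow: we would need 30 minutes or less
n_samples = 64

t = np.arange(n_samples) * sample_spacing_min          # sample times in minutes
samples = np.sin(2 * np.pi * t / true_period_min)      # what we actually record

spectrum = np.abs(np.fft.rfft(samples))
freqs = np.fft.rfftfreq(n_samples, d=sample_spacing_min)   # cycles per minute

peak = np.argmax(spectrum[1:]) + 1                     # skip the 0 Hz bin
apparent_period_min = 1.0 / freqs[peak]
print(f"True period: {true_period_min:.0f} min, "
      f"apparent period: {apparent_period_min:.0f} min")
```

The one-hour beacon shows up as a three-hour periodicity, because the true frequency (1/60 per minute) folds down to |1/60 - 1/45| = 1/180 per minute. You would still see a spike, just at the wrong place.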


    This means your sampling rate will depend on the frequency you are trying to find.

    You will need to do some math depending on what you want to find. Let me walk you through an example so you can adapt it to your use case.

    C2 beacon - 1h Periodicity

    1. Beacon Frequency

      • The beacon occurs once per hour = 1 cycle per 3600 seconds

      • Frequency of the beacon signal:

        $f_{\text{beacon}} = \frac{1}{3600} \text{ Hz} \approx 0.00028 \text{ Hz}$
    2. Nyquist Sampling Rate

      • To properly detect this beacon, the sampling rate must be at least:

        $f_s \geq 2 \times f_{\text{beacon}}$
      • $f_s \geq 2 \times 0.00028 \approx 0.00056 \text{ Hz}$
    3. Convert to More Intuitive Units

      • A sampling frequency of 0.00056 Hz means:

      • $\frac{1}{0.00056 \text{ Hz}} \approx 1800 \text{ seconds per sample} = 30 \text{ minutes per sample}$
      • So, you would need to take at least one sample every 30 minutes to reliably detect the beacon.

    Choosing a Practical Sampling Rate

    • Nyquist Rate (Minimum): 1 sample every 30 minutes

    • Better Sampling (for accuracy): 1 sample every 10-15 minutes

    • High-resolution detection: 1 sample every 5 minutes or less
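The arithmetic above fits in a tiny helper, so you can redo it for whatever beacon period you are hunting. The function name is mine, not something from a library.

```python
def max_sample_interval_seconds(beacon_period_seconds: float) -> float:
    """Longest sample spacing that still satisfies f_s >= 2 * f_beacon."""
    f_beacon = 1.0 / beacon_period_seconds   # beacon frequency in Hz
    f_s_min = 2.0 * f_beacon                 # minimum sampling rate in Hz
    return 1.0 / f_s_min                     # seconds per sample


for label, period in [("1 hour", 3600), ("24 hours", 86400)]:
    minutes = max_sample_interval_seconds(period) / 60
    print(f"Beacon every {label}: take a sample at least every {minutes:.0f} minutes")
```

Treat the result as the slowest sampling you can get away with; as the list above suggests, sampling a few times faster gives you more margin.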




    What if C2s have "Jitter"?

    If you're still reading, congratulations! That means you are REALLY interested in finding beacons, so you’ve probably figured out that, so far... everything has been pretty theoretical.

    In the real world, with real Threat Actors, they do not ping every hour exactly; they use "jitter." Jitter is a "randomness" in time and amplitude in their beacon signal, which translates to: "maybe now I ping back every 23 hours, or maybe every 25 hours, or maybe I ping back with a very strong signal, or a light one," at random intervals, making it stealthier.

    I’ve created another example to illustrate this exact problem: Example with Jitter


    3. Fourier example with Jitter, now it is harder to find, spike is smaller this time

      • Blue Graph: Your network traffic data.
      • Red Graph: A C2 beacon hidden in your network. This time, I have added some jitter in both time and amplitude. This beacon pings back every 20-29 hours.
      • Green Graph: Blue + Red signals combined. The beacon amplitude is exaggerated for learning purposes.
      • Bottom Graph: FFT of our Green signal. The beacon still produces a spike, but it is much smaller and harder to pick out than in the first example.

      Now, we can see exactly the same thing, but this time, our Red signal (beacon) has jitter in terms of both amplitude and periodicity. It will ping back between 20 and 29 hours, and the amplitude also has a modifier to make it more random.

      Upon analysis, we can see that we still have a spike, but it's much more difficult to detect. It is flagged at the frequency where we expect the beacon to be.

      Note that this example still has a beacon with significant amplitude compared to our original signal, and even with that, it is hard to find. So, what can we do?
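To get a feel for how much jitter hurts, the sketch below compares a perfectly regular beacon with a jittered one on top of the same noise. It is a simplified stand-in for my example, under my own assumptions: hourly samples over 120 days, beacons modeled as single-sample impulses, intervals drawn uniformly from 20 to 29 hours, and amplitudes drawn uniformly from 5 to 15.

```python
import numpy as np

rng = np.random.default_rng(42)
n_hours = 24 * 120                            # 120 days of hourly samples (assumed)
noise = rng.normal(0.0, 1.0, n_hours)         # stand-in for normal traffic


def beacon_train(intervals, amplitudes, n):
    """Place one impulse per beacon at the running sum of the intervals."""
    times = np.cumsum(intervals)
    keep = times < n                          # drop beacons past the end of the data
    train = np.zeros(n)
    train[times[keep]] = amplitudes[keep]
    return train


def peak_magnitude(x):
    """Largest FFT magnitude, ignoring the 0 Hz bin."""
    return np.abs(np.fft.rfft(x - x.mean()))[1:].max()


# Regular beacon: exactly every 24 hours, constant amplitude
regular = beacon_train(np.full(200, 24), np.full(200, 10.0), n_hours)
# Jittered beacon: every 20-29 hours, random amplitude
jittered = beacon_train(rng.integers(20, 30, 200), rng.uniform(5.0, 15.0, 200), n_hours)

regular_peak = peak_magnitude(noise + regular)
jittered_peak = peak_magnitude(noise + jittered)
print(f"Regular beacon peak:  {regular_peak:.0f}")
print(f"Jittered beacon peak: {jittered_peak:.0f}")
```

The regular beacon adds up coherently in one frequency bin, while the jittered one smears its energy across neighboring bins, which is why the spike shrinks.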

      Sliding Window

      Why use a window? Jittered signals may have non-stationary components, meaning the periodic pattern can drift or change over time.

      I am providing an example of how to implement it here: Sliding Window Code.

      You can divide the time series into overlapping windows (e.g., 10-20 intervals per window) to better capture the "essence" of the jittered signal.

      4. FFT with Sliding Window applied

        Wait, what? Why is it not better if we apply a Sliding Window technique?

        For this example, I have created a window of 1 week that overlaps by 1 day and keeps moving. It effectively performs tiny FFTs within each window and then aggregates them at the end.
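For readers who want the idea without opening the repository, here is a minimal sketch of a sliding-window FFT. It is not the linked script: I am assuming that "overlaps by 1 day" means consecutive windows share 24 samples, and that the per-window results are aggregated by averaging their magnitude spectra (similar in spirit to Welch's method). The test signal is a made-up 24-hour sinusoid in noise.

```python
import numpy as np


def sliding_window_spectrum(signal, window, overlap, d=1.0):
    """Average the FFT magnitude over overlapping windows of the signal."""
    step = window - overlap
    if step <= 0 or len(signal) < window:
        raise ValueError("need 0 <= overlap < window <= len(signal)")
    spectra = []
    for start in range(0, len(signal) - window + 1, step):
        chunk = signal[start:start + window]
        chunk = chunk - chunk.mean()          # remove the DC offset per window
        spectra.append(np.abs(np.fft.rfft(chunk)))
    freqs = np.fft.rfftfreq(window, d=d)
    return freqs, np.mean(spectra, axis=0)


rng = np.random.default_rng(3)
n_hours = 24 * 7 * 8                          # 8 weeks of hourly samples (assumed)
t = np.arange(n_hours)
signal = rng.normal(0.0, 1.0, n_hours) + 2.0 * np.sin(2 * np.pi * t / 24.0)

# One-week windows that overlap by one day
freqs, avg_spectrum = sliding_window_spectrum(signal, window=168, overlap=24)
peak = np.argmax(avg_spectrum[1:]) + 1        # skip the 0 Hz bin
peak_period_hours = 1.0 / freqs[peak]
print(f"Strongest periodicity: every {peak_period_hours:.1f} hours")
```

Note the trade-off: shorter windows follow a drifting beacon better, but each window has fewer frequency bins, so the frequency resolution gets coarser.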

        So why is this technique not necessarily better? Because it depends on what you're trying to find!

        I’ve modified the script to brute force through different window sizes to try to guess which one is optimal for us. You can find the code here: Fourier Optimal Window Code


        5. Optimal Sliding Window Size

          This test suggests that for my case, a window of 70 hours is optimal for detecting my beacon, which, in hindsight, sounds about right. Remember that my beacon was pinging back every 20-29 hours in a normal distribution fashion.

          If I calculate manually, the minimum window that always contains at least 2, and preferably 3, beacon pings (with overlapping windows) is about 3 days, or 72 hours, which is close to the 70 hours found above.

          Compared with not using a window at all, this increases the spike's magnitude by roughly 15-20% in our example, which makes the beacon easier to detect.
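The brute-force search can be sketched roughly as follows. Again, this is not the linked script. The scoring rule (peak magnitude divided by the median magnitude of the averaged spectrum), the 24-hour step, the candidate window sizes, and the synthetic jittered beacon are all assumptions of mine, so do not expect it to land on 70 hours.

```python
import numpy as np

rng = np.random.default_rng(7)
n_hours = 24 * 120                            # 120 days of hourly samples (assumed)

# Hypothetical jittered beacon: one impulse every 20-29 hours on top of noise
times = np.cumsum(rng.integers(20, 30, 200))
times = times[times < n_hours]
signal = rng.normal(0.0, 1.0, n_hours)
signal[times] += rng.uniform(5.0, 15.0, len(times))


def averaged_spectrum(x, window, step):
    """Mean FFT magnitude over sliding windows, without the 0 Hz bin."""
    spectra = []
    for start in range(0, len(x) - window + 1, step):
        chunk = x[start:start + window]
        spectra.append(np.abs(np.fft.rfft(chunk - chunk.mean())))
    return np.mean(spectra, axis=0)[1:]


def score(x, window, step=24):
    """How far the biggest spike stands out from the typical bin (assumed metric)."""
    spec = averaged_spectrum(x, window, step)
    return spec.max() / np.median(spec)


candidates = range(48, 241, 12)               # window sizes in hours to try
scores = {w: score(signal, w) for w in candidates}
best_window = max(scores, key=scores.get)
print(f"Best window in this run: {best_window} hours (score {scores[best_window]:.2f})")
```

Whatever metric you choose, run the search on data where you know a beacon exists (real or injected), otherwise you are only optimizing for noise.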


          Final Remarks

          This has been a very fast overview of how to detect C2 beacons using Fourier, which honestly simplifies your life quite a lot when compared to trying to achieve the same in the time domain.

          I haven’t gotten into proving any of my claims or into the math you would need to fully understand this, but if you’re curious, I am linking some resources below.

          All these results have been empirical through my testing, and they showcase that it’s not impossible to fight jitter; it just requires a more complex approach to do so.

          Most of these concepts come from Signal Processing rather than Cybersecurity itself, but I hope I’ve demonstrated their value to you with this example. Of course, think big—C2s are just the beginning. There are plenty of interesting use cases out there where this tool will shine the most.

          Lastly, you can run this either as a rule in your Jupyter Lab, if you have a detection-as-code pipeline, or as a threat hunt, if you just export data every now and then. In the end we are looking for that big spike, so it's a matter of playing with the thresholds and analyzing the top 3 or top 5 biggest spikes on your FFT graph to figure out whether they are malicious or not.
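A possible starting point for that "top spikes" triage is sketched below. The helper name, the choice to rank by magnitude relative to the median bin, and the synthetic test signal are all mine; tune the threshold on your own data.

```python
import numpy as np


def top_spikes(signal, sample_spacing, n=5):
    """Return (period, magnitude / median magnitude) for the n biggest FFT spikes."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                                    # remove the DC offset
    mags = np.abs(np.fft.rfft(x))[1:]                   # drop the 0 Hz bin
    freqs = np.fft.rfftfreq(len(x), d=sample_spacing)[1:]
    baseline = np.median(mags)                          # "typical" bin height
    order = np.argsort(mags)[::-1][:n]                  # indices of the n largest bins
    return [(1.0 / freqs[i], mags[i] / baseline) for i in order]


# Made-up test signal: 30 days of hourly samples with a 24-hour component
rng = np.random.default_rng(5)
t = np.arange(24 * 30)
signal = rng.normal(0.0, 1.0, len(t)) + 3.0 * np.sin(2 * np.pi * t / 24.0)

spikes = top_spikes(signal, sample_spacing=1.0, n=5)
for period_hours, ratio in spikes:
    flag = "<-- review" if ratio > 10 else ""           # assumed threshold
    print(f"period {period_hours:7.1f} h   {ratio:5.1f}x median {flag}")
```

A spike that clears the threshold only tells you something is periodic; deciding whether it is a C2 beacon, a backup job, or an update check is still up to you.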

          Consider taking SEC595: Applied Data Science and AI/Machine Learning for Cybersecurity Professionals if this was interesting enough to learn more about it!

          References & More Useful Information

          • My GitHub, full code explained in this post can be found - HERE
          • First Fourier example, fake data and rfft python code - HERE
          • Fourier with time and amplitude Jitter python code - HERE
          • Fourier with Sliding Window Technique Applied, python code - HERE
          • Fourier, Optimal Window Calculation, python code - HERE
          • Pandas Tutorial - HERE
          • Pandas Tutorial - Video format - HERE
          • Fourier Additional Resources - Visual Explanation - HERE

          Tags

          #technical #python #C2 #Beaconing #Python #Automation #Jitter #Fourier #FrequencyAnalysis


          Meme of the day



          6. This one, I'm not explaining
