Hortonworks Cybersecurity Platform
Also available as:
PDF

Data Generator

We can create a simple python script to generate a stream of Gaussian noise at the frequency of one message per second as a python script which should be saved at ~/rand_gen.py.

#!/usr/bin/python
import random
import sys
import time
def main():
  mu = float(sys.argv[1])
  sigma = float(sys.argv[2])
  freq_s = int(sys.argv[3])
  while True:
    print str(random.gauss(mu, sigma))
    sys.stdout.flush()
    time.sleep(freq_s)

if __name__ == '__main__':
  main()

This script will take the following as arguments:

  • The mean of the data generated

  • The standard deviation of the data generated

  • The frequency (in seconds) of the data generated

If, however, you'd like to test a longer tailed distribution, like the student t-distribution and have numpy installed, you can use the following as ~/rand_gen.py:

#!/usr/bin/python
import random
import sys
import time
import numpy as np

def main():
  df = float(sys.argv[1])
  freq_s = int(sys.argv[2])
  while True:
    print str(np.random.standard_t(df))
    sys.stdout.flush()
    time.sleep(freq_s)

if __name__ == '__main__':
  main()

This script will take the following as arguments:

  • The degrees of freedom for the distribution

  • The frequency (in seconds of the data generated