Abstract
Software-Defined Networking (SDN) differs from traditional networking by separating network control from the hardware devices, which eliminates the need to configure each device individually. A central network policy can thus be dispatched to all SDN devices, reducing deployment time and thereby increasing profits for data centers and service providers.
However, the centralized SDN architecture can introduce security challenges such as distributed denial-of-service (DDoS) attacks against the SDN controller. Thanks to the availability of processing and storage capabilities, a good number of machine learning (ML) applications have been developed for network security. In this paper, we study how ML techniques can be used in SDN environments for security applications, in particular to identify general anomalies or specific attacks.
Index Terms: SDN, Machine Learning, Network Security
I. INTRODUCTION
The idea of separating the control and data planes in Software-Defined Networking (SDN) is of great interest and is the subject of much research in both the scientific community and commercial industry. Many contributions have been made to this technology, which is still under development in academic and industrial circles. In combination with Network Function Virtualization (NFV), SDN provides a solution to traditional network problems such as scalability and the management of many devices in the network. Added to this is the ability to monitor and control all traffic within the network, which opens the way to a new approach to securing applications [1].
The SDN functional architecture can be subdivided into three layers: the application layer, the control layer and the data layer. However, the control-data interface (and more generally the communication channel between layers) has been identified as a vulnerable point because it can be subject to traffic modification and eavesdropping attacks [2]. As this architecture presents additional security issues, and as the technology develops, additional elements are required. To integrate behavioral decision-making models and reasoning processes, Clark et al. [3] suggest adding a Knowledge Plane (KP) as an additional network entity whose aim is to maintain a high-level view of the network and to assist in its operation, management and improvement. Machine Learning is one of the tools used to exploit the KP.
Processing and storage capabilities, as well as the availability of large data sets, have been significantly improved and this has made possible the use of powerful advanced techniques such as Machine Learning to provide cognitive capabilities for identifying security vulnerabilities.
Google's B4 [4] is an important reference for SDN usage. B4 is an SDN deployment across the WAN that connects multiple data centers. It includes a switch design to manage the interconnection with traditional networks and uses ONIX [5] as a controller. This technique has proved useful for the gradual integration of traditional infrastructure with SDN infrastructure. However, this implementation did not present any security-related contributions, except for the use of the Paxos algorithm [6] for fault tolerance. Because no data are available on security research in SDN, it becomes a challenge to obtain realistic datasets for intrusion detection systems (IDS).
An overview of the challenges and opportunities of using ML in new technologies such as SDN is presented in [7]. However, its description and study are not exhaustive. Other works such as [8], [9], [10] and [11] have shown different ML techniques for detecting SDN anomalies, but they focus on methods and lack an analysis from a network security point of view. More recent research on network security in SDN environments using ML techniques is presented in [12]. Its authors organize the surveyed papers by network attack, in contrast to other surveys, which are organized by the ML methods used in SDN. It also describes the testbeds and datasets commonly used in the literature.
The purpose of this paper is twofold. First, we present some of the most widely used methods for network security in an SDN environment using ML techniques. Our motivation is to contribute to the creation of a KP for SDN focused on security. Second, we present some approaches organized by network attack, in contrast to other papers, which are organized by the ML methods used in SDN.
To this end, this paper is organized as follows. The first section introduces the notions necessary for understanding the general principle and provides an overview of the SDN architecture and its security issues. In the second section, we present studies on ML-based techniques for IDS that propose a detection model, including the data collection methods used to feed the ML model as well as the mitigation schemes applied once an anomaly is detected. The third section presents a simulation using some ML techniques. Finally, the fourth section summarizes the paper and outlines future directions.
Fig. 1. Main components of the SDN architecture: data plane, control plane and application plane.
II. THEORETICAL FRAMEWORK
In this section, we examine the SDN architecture and its security issues. In addition, we discuss an approach that uses SDN as a means to improve network security. Our approach is to analyze the use of Machine Learning to achieve the desired result.
A. SDN architecture and security
SDN was born from the idea of breaking the vertical integration of network equipment. The separation of the control plane from the data plane, together with the OpenFlow protocol proposed in 2008, contributed greatly to the development of SDN. This separation also allows network functions to be defined as software applications that run on top of the control plane, for instance routing, firewalling, load balancing or bandwidth optimization. As illustrated in Figure 1, the SDN architecture has three parts: the data plane (consisting of switches), the control plane (consisting of one or more controllers), and the application plane (consisting of one or more network applications).
As discussed by Scott-Hayward et al. [13], the communication protocols between the different layers of SDN can introduce new vulnerabilities that are absent from traditional networks. For instance, the use of Transport Layer Security is optional in an OpenFlow network. These protocols can therefore introduce security issues such as DoS, insertion of fraudulent flow rules and rule modification. The different SDN components shown in Figure 1 can be subject to attacks. For example, there may be software vulnerabilities in SDN controllers (OpenDaylight, ONOS, Floodlight). In addition, the communication channels between the three tiers, i.e., the northbound and southbound APIs, can face security attacks, as shown in Figure 2.
Here are some of the attack vectors against targetable components.
Fig. 2. SDN targetable components.
Application Plane
Security issues in web applications such as Cross-Site Scripting (XSS) and Cross-Site Request Forgery (CSRF) also apply to SDN. Malicious or compromised applications can allow an attack to spread across the entire network.
Control Plane
The control plane includes one or more controllers (for instance OpenDaylight, POX or ONOS), together with other applications and plugins that manage different types of protocols.
At this level, an attacker can generate traffic from spoofed IP addresses and send a high volume of it to the controller [14]. As a result, the communication between the switch and the controller can be saturated, latency can increase or, in the worst case, the controller can stop working.
Data Plane
By forging Link Layer Discovery Protocol (LLDP) packets, an attacker can poison the global view of the network.
An attacker can also observe the delay in communication between the control plane and the data plane using specially crafted packets. This can help identify the controller's application logic and also makes it possible to target the switches. The switch responsible for updating data plane flow rules often has limited memory and can be overloaded by generating a large number of flow rules.
Communication Channels
As presented by Romao et al. [15], the communication channel between switches and controllers (southbound API), or between controllers and the application plane (northbound API), can be subjected to Man-in-the-Middle attacks. ARP poisoning is an example of such an attack. Other attacks that target the communication channel include eavesdropping on traffic between hosts and stealthily modifying traffic between hosts.
Figure 3 summarizes some of the security issues associated with different components of SDN as described above.
Fig. 3. Security Issues Associated with Different Layers of SDN
Fig. 4. Components of OpenFlow Switch
B. Problem Discussion
In an SDN environment, the switches can be OpenFlow-only switches (which only forward packets) or hybrid OpenFlow Ethernet switches (which perform bridging and routing along with forwarding). Unlike traditional network switches or routers, an OpenFlow switch separates the control function from the forwarding function. Figure 4 presents the components of an OpenFlow switch. Flow tables and a group table are used to perform packet lookup and forwarding. The OpenFlow protocol is used to create, modify and remove flow entries in the flow table. The channel that connects the controller and the switch is established over Transport Layer Security (TLS) or over plain Transmission Control Protocol (TCP); only the TLS variant is encrypted. Figure 5 shows a sample flow table entry.
Fig. 5. Components of flow entry in flow table
Cookies are used by the controller to filter the flow statistics. The instructions specify the action set for an entry. The packet header fields used to match the flow table entry with the incoming packet to the switch are shown in Figure 6.
Fig. 6. Packet Header matching fields
In the case of a DDoS (Distributed Denial of Service) attack against the controller, the controller becomes inaccessible and is no longer able to process new legitimate packets. This is because the OpenFlow switch compares each incoming packet (its header fields such as source port, destination port, source IP address, destination IP address, and so on) against the flow entries. If a match is found, the specified action is executed. Otherwise, the packet is sent to the controller using the PacketIn control message. When a large number of packets with spoofed IP addresses are sent together, no match is found in the flow table and every one of these packets is sent to the controller. By exploiting the resulting processing delay, a malicious attacker can modify the flow entries, force legitimate packets to be dropped, and clone entries in the flow table, causing the flow table to overflow. There is then not enough memory space to accept the new flow instructions issued by the controller. The controller keeps trying to process both the legitimate and the spoofed packets until its resources run out. Figure 7 illustrates this scenario.
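The table-miss behavior described above can be illustrated with a minimal sketch. It is not an OpenFlow implementation: the `FlowTable` class, its exact-match key on (source IP, destination IP, destination port), the table capacity and the addresses are all hypothetical simplifications chosen for illustration, and real flow entries also have priorities, wildcards and timeouts.

```python
from dataclasses import dataclass, field


@dataclass
class FlowTable:
    """Toy exact-match flow table (hypothetical, not a real OpenFlow API)."""
    capacity: int                       # assumed maximum number of flow entries
    entries: dict = field(default_factory=dict)
    packet_ins: int = 0                 # table misses forwarded to the controller
    rejected_rules: int = 0             # rules the full table could not accept

    def handle_packet(self, src_ip, dst_ip, dst_port):
        key = (src_ip, dst_ip, dst_port)
        if key in self.entries:
            # Match found: apply the stored action without involving the controller.
            return self.entries[key]
        # Table miss: the packet goes to the controller (PacketIn).
        self.packet_ins += 1
        # The controller answers with a new rule, installed only if space remains.
        if len(self.entries) < self.capacity:
            self.entries[key] = "forward"
        else:
            self.rejected_rules += 1
        return "packet_in"


def simulate_flood(capacity=5, spoofed_packets=10):
    table = FlowTable(capacity=capacity)
    # One legitimate flow: the first packet misses, the next two match.
    for _ in range(3):
        table.handle_packet("10.0.0.1", "10.0.0.2", 80)
    # Spoofed sources: every packet carries a new source address and misses.
    for i in range(spoofed_packets):
        table.handle_packet(f"192.0.2.{i}", "10.0.0.2", 80)
    # A new legitimate flow arrives after the table is full.
    table.handle_packet("10.0.0.3", "10.0.0.2", 80)
    return table


if __name__ == "__main__":
    t = simulate_flood()
    print(t.packet_ins, t.rejected_rules, len(t.entries))
```

With these assumed parameters, every spoofed packet triggers a PacketIn, the five available slots fill up, and the rule for the late legitimate flow is rejected, which mirrors the overflow scenario in the text.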
Fig. 7. Modification to flow table by malicious host
The use of a TLS/SSL connection between the switch and the controller does not guarantee secure communication. When the TLS connection is lost, the switch tries to connect to a backup controller if one is available. If no controller can be reached, the switch enters "fail secure mode" or "fail standalone mode". In this mode, the switch can use the flow tables as desired. It can add, modify, or delete any entry in the flow table.
The communication between the switch and the controller can be done in two ways. The operator can configure the switches with the IP address of the controller or the controller can initiate the Hello request. When the connection breaks, an attacker can send a Hello request to the switch acting as a legitimate controller and access the flow tables. This scenario is treated as a DDoS attack before establishing the communication and is shown in Figure 8. A possible solution could be deploying an Intrusion Detection System (IDS).
Fig. 8. Malicious controller getting access to the entire network
An attack against the controller causes the switches to lose their network operating system. Existing approaches are based on traditional networks and, so far, few papers address the security issues of the SDN controller.
A Machine Learning algorithm trained with attack and normal patterns can be used to detect a DDoS attack against the controller. Hence, in the next section, we explore the launching of DDoS attacks and their detection using the SVM classifier [16].
C. ML-based intrusion detection Systems in SDN
Support Vector Machine (SVM) is a supervised learning algorithm that recognizes patterns by analyzing the data and uses those patterns for classification.
Some widely used SVM methods are:
One-against-One (OVO): constructs N(N-1)/2 binary classifiers, one per pair of classes; in each, samples of the first class are trained as positive and samples of the second class as negative.
One-against-All (OVA): N binary classifiers are constructed for the N-class problem, and each class is trained against the remaining N-1 classes.
Binary Tree of SVM: classes are recursively divided into two groups by calculating the gravity centers of each group. Only about log N classifiers need to be consulted to classify a test sample.
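As a small worked example of the counts above, the following sketch computes how many binary classifiers each scheme involves for a given number of classes N. The function name is hypothetical, and the binary-tree figure assumes a balanced tree and a base-2 logarithm, which the text does not state explicitly.

```python
import math


def classifier_counts(n_classes):
    """Number of binary SVMs involved per multi-class scheme (illustrative)."""
    if n_classes < 2:
        raise ValueError("need at least two classes")
    return {
        # One classifier per unordered pair of classes.
        "ovo_trained": n_classes * (n_classes - 1) // 2,
        # One classifier per class, each against all the others.
        "ova_trained": n_classes,
        # Classifiers consulted along one root-to-leaf path,
        # assuming a balanced binary tree.
        "tree_consulted": math.ceil(math.log2(n_classes)),
    }


if __name__ == "__main__":
    print(classifier_counts(5))
```

For N = 5 this gives 10 pairwise classifiers for OVO, 5 for OVA, and 3 classifiers consulted per sample in the balanced-tree case.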
1) System Overview: The Intrusion Detection System must first be fed with traffic information. The SVM classifier is widely used to detect attacks because it can learn the pattern from few training samples and then produce an accurate classification with fewer false positives. Owing to its generalization capability, a trained SVM is able to classify unknown samples with the model learned from the training dataset. SVM finds the optimum hyperplane (the one with the largest margin) that separates two classes. However, SVM is a linear classifier, and in practice data is often not linearly separable. Kernel functions are therefore used for nonlinear classification; common kernel choices are linear, polynomial, radial basis function (RBF) and sigmoid. The system architecture of DDoS detection using the SVM classifier is shown in Figure 9 [16].
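The four kernel families named above can be written out as a short sketch. The parametric forms follow the ones commonly documented for LIBSVM (linear, polynomial, RBF, sigmoid); the function names and the default values of `gamma`, `coef0` and `degree` are illustrative assumptions, not settings taken from [16].

```python
import math


def dot(x, z):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x, z):
    return dot(x, z)


def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    return (gamma * dot(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=1.0):
    # Depends only on the squared Euclidean distance between x and z.
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, z, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(x, z) + coef0)


if __name__ == "__main__":
    x, z = [1.0, 2.0], [3.0, 4.0]
    print(linear_kernel(x, z))
    print(polynomial_kernel(x, z, degree=2))
    print(rbf_kernel(x, z, gamma=0.5))
    print(sigmoid_kernel(x, z, gamma=0.1))
```

Each kernel replaces the plain inner product in the SVM decision function, which is what lets a linear separator in the implicit feature space act as a nonlinear boundary on the original traffic attributes.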
Fig. 9. System architecture
Traffic data has attributes such as source IP address, destination IP address, source port, destination port, the protocol used for communication and the length of the packet. Nominal attributes have to be converted into binary attributes, which have only two states or values. If a multi-valued attribute has n distinct values, the conversion creates n binary attributes to represent it: each binary attribute takes the value 1 when the nominal attribute takes the corresponding value, and 0 otherwise. It is also useful to normalize the data, which prevents attributes with large values from dominating attributes with smaller ranges. During normalization, each attribute value is scaled to fit a specific range (e.g. [0, 1]). Then the SVM classifier is trained on the training dataset and a model is built from it. Using the recognized pattern, the SVM classifier predicts the category of a new, unknown traffic sample. The outcome of the SVM classifier for a test data point is either normal or attack.
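The two preprocessing steps described above, conversion of a nominal attribute into n binary attributes and scaling of a numeric attribute to [0, 1], can be sketched as follows. The record layout (a `protocol` field and a `length` field), the function names and the sample values are hypothetical and chosen only to keep the example small.

```python
def one_hot(values):
    """Turn a nominal attribute with n distinct values into n binary attributes."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories


def min_max(values, lo=0.0, hi=1.0):
    """Scale numeric values linearly into the range [lo, hi]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # A constant attribute carries no information; map it to the lower bound.
        return [lo for _ in values]
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]


def preprocess(records):
    """Build one feature vector per record: protocol bits, then scaled length."""
    protocol_bits, _ = one_hot([r["protocol"] for r in records])
    lengths = min_max([r["length"] for r in records])
    return [bits + [length] for bits, length in zip(protocol_bits, lengths)]


if __name__ == "__main__":
    sample = [
        {"protocol": "tcp", "length": 60},
        {"protocol": "udp", "length": 1500},
        {"protocol": "icmp", "length": 780},
    ]
    for row in preprocess(sample):
        print(row)
```

In this sample the protocol attribute has three distinct values, so it becomes three binary columns (ordered icmp, tcp, udp), and the packet lengths 60, 780 and 1500 map to 0.0, 0.5 and 1.0.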
III. EXPERIMENTS
This section includes some examples implemented using a free software package called LIBSVM that interfaces with MATLAB. The goal is to reproduce some theories explained in the previous sections.
To reproduce similar results, the reader should first install MATLAB and LIBSVM; the latter can easily be found on the Internet. For this experiment, and due to time constraints, we reproduce results from [16], where data classification has been performed.
The dataset used is the 2000 DARPA intrusion detection scenario provided by MIT Lincoln Laboratory. The dataset contains a DDoS attack launched by a novice attacker. The attack scenario is carried out over multiple network and audit sessions. The phases of the attack scenario are, in brief:
IPsweep of a network from a remote site.
Probe of live IPs to look for the sadmind daemon running on Solaris hosts.
Break-ins via the sadmind vulnerability, both successful and unsuccessful, on those hosts.
Installation of the Trojan mstream DDoS software on hosts in the network.
Launching the DDoS attack.
Fig. 10. Comparison of SVM with RBF Kernels
Fig. 11. Comparison of classification Methods
The dataset includes only attack traffic. For experimental purposes, the results of the SVM DDoS detection method are compared with the RBF kernel.
Figures 10 and 11 show that SVM performs better in terms of accuracy and false positive rate. The RBF network achieves results similar to those of SVM, but its training time is very high.
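The two metrics used in this comparison, accuracy and false positive rate, can be computed from predicted and true labels as in the sketch below. The label lists are made-up illustrative values, not results from [16] or from the DARPA dataset, and the function name is hypothetical.

```python
def evaluate(y_true, y_pred, positive="attack"):
    """Return accuracy and false positive rate for a two-class problem."""
    tp = fp = tn = fn = 0
    for truth, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if truth == positive:
                tp += 1          # attack correctly flagged
            else:
                fp += 1          # normal traffic wrongly flagged as attack
        else:
            if truth == positive:
                fn += 1          # attack missed
            else:
                tn += 1          # normal traffic correctly passed
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # False positive rate: share of normal samples that were flagged as attack.
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, false_positive_rate


if __name__ == "__main__":
    # Hypothetical labels for eight traffic samples.
    y_true = ["attack", "attack", "normal", "normal",
              "normal", "attack", "normal", "normal"]
    y_pred = ["attack", "normal", "normal", "attack",
              "normal", "attack", "normal", "normal"]
    print(evaluate(y_true, y_pred))
```

For these hypothetical labels, six of eight samples are classified correctly (accuracy 0.75) and one of five normal samples is flagged as an attack (false positive rate 0.2).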
IV. CONCLUSION
SDN improves programmability within the network and also provides support for the dynamic nature of future functions. To achieve this goal, security challenges in SDN have to be addressed; for example, a DDoS attack on the controller causes flow table flooding and the dropping of legitimate packets. It is important to detect a DDoS attack at an early stage. Machine Learning algorithms detect the DDoS attack using the pattern generated from the dataset. The SVM classifier has a lower false positive rate and a high classification accuracy. As future work, the traffic pattern built by the SVM could be integrated with the SDN controller in order to detect DDoS attacks online.
REFERENCES
[1] D. Huang, A. Chowdhary, and S. Pisharody, Software-Defined Networking and Security: From Theory to Practice. CRC Press, 2018.
[2] R. Klöti, V. Kotronis, and P. Smith, "Openflow: A security analysis," in ICNP, vol. 13, 2013, pp. 1–6.
[3] D. D. Clark, C. Partridge, J. C. Ramming, and J. T. Wroclawski, "A knowledge plane for the internet," in Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications. ACM, 2003, pp. 3–10.
[4] S. Jain, A. Kumar, S. Mandal, J. Ong, L. Poutievski, A. Singh, S. Venkata, J. Wanderer, J. Zhou, M. Zhu, et al., "B4: Experience with a globally-deployed software defined wan," in ACM SIGCOMM Computer Communication Review, vol. 43, no. 4. ACM, 2013, pp. 3–14.
[5] T. Koponen, M. Casado, N. Gude, and J. Stribling, "Distributed control platform for large-scale production networks," Sept. 9 2014, US Patent 8,830,823.
[6] L. Lamport, "The part-time parliament," ACM Transactions on Computer Systems (TOCS), vol. 16, no. 2, pp. 133–169, 1998.
[7] J. Pan and Z. Yang, "Cybersecurity challenges and opportunities in the new edge computing+ iot world," in Proceedings of the 2018 ACM International Workshop on Security in Software Defined Networks & Network Function Virtualization. ACM, 2018, pp. 29–32.
[8] N. Sultana, N. Chilamkurti, W. Peng, and R. Alhadad, "Survey on sdn based network intrusion detection system using machine learning approaches," Peer-to-Peer Networking and Applications, vol. 12, no. 2, pp. 493–501, 2019.
[9] D. Kwon, H. Kim, J. Kim, S. C. Suh, I. Kim, and K. J. Kim, "A survey of deep learning-based network anomaly detection," Cluster Computing, pp. 1–13, 2017.
[10] T. N. Nguyen, "The challenges in sdn/ml based network security: A survey," arXiv preprint arXiv:1804.03539, 2018.
[11] M. Coughlin, "A survey of sdn security research," University of Colorado Boulder, 2014.
[12] J. A. Herrera and J. E. Camargo, "A survey on machine learning applications for software defined network security," in International Conference on Applied Cryptography and Network Security. Springer, 2019, pp. 70–93.
[13] S. Scott-Hayward, G. O'Callaghan, and S. Sezer, "Sdn security: A survey," in 2013 IEEE SDN for Future Networks and Services (SDN4FNS). IEEE, Trento, Italy, 2013.
[14] K. Kalkan, G. Gur, and F. Alagoz, "Defense mechanisms against ddos attacks in sdn environment," IEEE Communications Magazine, vol. 55, no. 9, pp. 175–179, 2017.
[15] D. Romao, N. van Dijkhuizen, S. Konstantaras, and G. Thessalonikefs, "Practical security analysis of openflow implementation," 2013.
[16] R. Kokila, S. T. Selvi, and K. Govindarajan, "Ddos detection and analysis in sdn-based environment using support vector machine classifier," in 2014 Sixth International Conference on Advanced Computing (ICoAC). IEEE, 2014, pp. 205–210.