Before data can be used for learning, it must go through a pre-processing phase that cleans it. Another important step prior to learning, or training a model, is feature extraction. These features act as discriminators for learning and inference. In networking, there are many features to choose from. Broadly, they can be categorized by level of granularity. At the finest level of granularity, packet-level features are simply extracted or derived from collected packets. The key advantage of packet-level statistics is their insensitivity to packet sampling, which is often employed for data collection and interferes with feature characteristics [ ].
On the other hand, flow-level features are derived using simple statistics, such as mean flow duration, mean number of packets per flow, and mean number of bytes per flow [ ]. Connection-level features from the transport layer, in turn, are exploited to infer connection-oriented details. In addition to the flow-level features, transport layer details, such as throughput and advertised window size in TCP connection headers, can be employed.
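The flow-level statistics named above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the record layout `(flow_id, timestamp_seconds, size_bytes)` and the function name `flow_level_features` are assumptions made for the example, and a real pipeline would first group packets into flows by their 5-tuple.

```python
from collections import defaultdict
from statistics import mean


def flow_level_features(packets):
    """Aggregate per-flow statistics from packet records.

    packets: iterable of (flow_id, timestamp_seconds, size_bytes).
    The record layout is a hypothetical one chosen for this sketch.
    """
    flows = defaultdict(list)
    for flow_id, ts, size in packets:
        flows[flow_id].append((ts, size))

    durations, packet_counts, byte_counts = [], [], []
    for records in flows.values():
        times = [ts for ts, _ in records]
        durations.append(max(times) - min(times))  # first to last packet
        packet_counts.append(len(records))
        byte_counts.append(sum(size for _, size in records))

    return {
        "mean_flow_duration": mean(durations),
        "mean_packets_per_flow": mean(packet_counts),
        "mean_bytes_per_flow": mean(byte_counts),
    }


# Two toy flows: "a" has three packets over 2 s, "b" has a single packet.
packets = [
    ("a", 0.0, 100), ("a", 1.0, 300), ("a", 2.0, 200),
    ("b", 5.0, 1500),
]
features = flow_level_features(packets)
```

In this sketch, a single-packet flow contributes a duration of zero, which pulls the mean flow duration down; whether such flows should be kept is a choice that depends on the task.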
Though connection-level features generate high-quality data, they incur computational overhead and are highly susceptible to sampling and routing asymmetries [ ]. Feature engineering is a critical aspect of ML that includes feature selection and extraction. It is used to reduce dimensionality in voluminous data and to identify discriminating features that reduce computational overhead and increase the accuracy of ML models.
Feature selection is the removal of features that are irrelevant or redundant [ ].
Irrelevant features increase computational overhead with marginal to no gain in accuracy, while redundant features promote over-fitting. Feature extraction is often a computationally intensive process of deriving extended or new features from existing features, using techniques such as entropy, Fourier transform, and principal component analysis (PCA).
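As an illustration of an entropy-based derived feature, the sketch below computes the Shannon entropy of a byte string. Applying it to packet payloads is an assumption made for this example; the text names entropy as a technique but does not say which quantity it is applied to. The function name `shannon_entropy` is likewise hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


low = shannon_entropy(b"aaaa")             # one symbol only
two_symbols = shannon_entropy(b"abab")     # two equally likely symbols
uniform = shannon_entropy(bytes(range(256)))  # every byte value once
```

Values near 8 bits per byte indicate content that looks uniformly random, as compressed or encrypted data typically does, while repetitive content scores close to zero.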
However, in this case, the extraction and selection techniques are limited by the capability of the tool employed. Therefore, specialized filter, embedded, and wrapper-based methods are often employed for feature selection. Filtering prunes the training data after carefully analyzing the dataset to identify the irrelevant and redundant features. In contrast, wrapper-based techniques take an iterative approach, using a different subset of features in every iteration to identify the optimal subset. Embedded methods combine the benefits of filter and wrapper-based methods, and perform feature selection during model creation.
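A filter-style selection pass can be sketched as follows. This is one simple, label-free variant chosen for illustration: near-constant features are treated as irrelevant, and features that are almost perfectly correlated with an already kept feature are treated as redundant. The thresholds, the function names, and the toy feature names are all assumptions; filter methods used in practice often score relevance against the class labels instead.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def filter_features(columns, min_variance=1e-9, max_abs_corr=0.95):
    """columns: dict mapping feature name -> list of values.

    Returns the names of the features that survive the filter,
    in their original order.
    """
    kept = []
    for name, values in columns.items():
        if pvariance(values) <= min_variance:
            continue  # near-constant: carries no information (irrelevant)
        if any(abs(pearson(values, columns[k])) >= max_abs_corr for k in kept):
            continue  # nearly duplicates a kept feature (redundant)
        kept.append(name)
    return kept


columns = {
    "bytes_per_flow": [100.0, 200.0, 300.0, 400.0],
    "kilobytes_per_flow": [0.1, 0.2, 0.3, 0.4],  # rescaled copy of the above
    "ip_version": [4.0, 4.0, 4.0, 4.0],          # constant
    "flow_duration": [2.0, 0.5, 3.0, 1.0],
}
selected = filter_features(columns)
```

Because the pass runs once over the dataset and never trains a model, it is cheap compared with a wrapper-based search, at the cost of ignoring how features interact inside a particular model.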
Furthermore, it is important to consider the characteristics of the task we are addressing while performing feature engineering. To better illustrate this, consider the following scenario from network traffic classification.
One variant of the problem entails the identification of a streaming application (e.g., Netflix) from network traces. Intuitively, average packet size and packet inter-arrival times are representative features, as they play a dominant role in traffic classification. Average packet size is fairly constant in nature [ ], and packet inter-arrival times are a good discriminator for bulk data transfer (e.g., FTP) and streaming applications [ ]. However, average packet size can be skewed by intermediate fragmentation and encryption, and packet inter-arrival times and their distributions are affected by queuing in routers [ ]. Furthermore, streaming applications often behave similarly to bulk data transfer applications [ ]. Therefore, it is imperative to consider the classes of interest. Finally, it is also essential to select features that do not contradict underlying assumptions in the context of the problem.
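The two features discussed above, average packet size and packet inter-arrival time, can be derived from a trace as in the sketch below. The trace layout `(timestamp_seconds, size_bytes)` and the function name `packet_features` are assumptions for illustration, and the toy values are not taken from any real application.

```python
from statistics import mean


def packet_features(packets):
    """packets: list of (timestamp_seconds, size_bytes) with at least two
    entries, in any order. The layout is hypothetical."""
    ordered = sorted(packets)  # sort by timestamp
    times = [ts for ts, _ in ordered]
    # Inter-arrival time: gap between each packet and its predecessor.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "avg_packet_size": mean(size for _, size in ordered),
        "mean_inter_arrival": mean(gaps),
        "max_inter_arrival": max(gaps),
    }


trace = [(0.0, 1200), (0.5, 1400), (1.5, 1000), (2.0, 1200)]
stats = packet_features(trace)
```

Since queuing delays shift the timestamps observed at the capture point, the inter-arrival values computed this way reflect the network path as well as the application.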
For example, in traffic classification, features that are extracted from multi-modal application classes (e.g., WWW) tend to show non-Gaussian behavior [ ]. These relationships not only become irrelevant and redundant, they also contradict widely held assumptions in traffic classification, such as feature distributions being independent and following a Gaussian distribution. Therefore, careful feature extraction and selection are crucial for the performance of ML models [ 77 ]. Establishing the ground truth pertains to giving a formal description of the classes of interest.
There are various methods for labeling datasets using the features of a class.
Primarily, it requires hand-labeling by domain experts, with aid from deep packet inspection (DPI) [ , ], pattern matching, and techniques such as AutoClass using EM [ ]. For instance, in traffic classification, establishing ground truth for application classes in the training dataset can be achieved using application signature pattern matching [ ]. Application signatures are built using features such as average packet size, flow duration, bytes per flow, packets per flow, root mean square packet size, and IP traffic packet payload [ , ].
Average packet size and flow duration have been shown to be good discriminators [ ]. Application signatures for encrypted traffic e.
However, these application signatures must be kept up-to-date and adapted to the application dynamics [ ]. Alternatively, it is possible to design and rely on statistical and structural content models for describing the datasets and infer the classes of interest.
For instance, these models can be used to classify a protocol based on the label of a single instance of that protocol and correlations can be derived from unlabeled training data [ ]. On the other hand, common substring graphs capture structural information about the training data [ ].
These models are good at inferring discriminators for binary, textual and structural content [ ]. Ultimately, the ground truth drives the accuracy of ML models. There is also an inherent mutual dependency between the sizes of the training data of different classes of interest, which impacts model performance [ ]. An imbalance in the amount of training data across classes violates an assumption maintained by many ML techniques, namely that the data is independent and identically distributed.
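One simple way to rebalance classes is random under-sampling, in which every class is cut down to the size of the smallest one. The sketch below is a generic illustration under that assumption, not a specific method from the cited literature; the function name `undersample` and the toy labels are hypothetical.

```python
import random
from collections import Counter, defaultdict


def undersample(samples, labels, seed=0):
    """Randomly drop samples so every class has as many as the rarest class.

    Returns a shuffled list of (sample, label) pairs.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    target = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        for sample in rng.sample(items, target):  # without replacement
            balanced.append((sample, label))
    rng.shuffle(balanced)
    return balanced


samples = list(range(10))
labels = ["web"] * 8 + ["ftp"] * 2  # 8:2 imbalance
balanced = undersample(samples, labels)
class_counts = Counter(label for _, label in balanced)
```

Under-sampling discards data from the larger classes, so it trades some information for balance; over-sampling makes the opposite trade by repeating or synthesizing minority samples.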
Therefore, there is typically a need to combat class imbalance by applying under-, over-, joint-, or ensemble-sampling techniques [ ]. For example, uniform weighted threshold under-sampling creates smaller balanced training sets [ ].

- Mean absolute error (MAE): Average of the absolute error between the actual and predicted values. Facilitates error interpretability.
- Mean squared error (MSE): Average of the squares of the error between the actual and predicted values. Heavily penalizes large errors.
- Mean absolute percentage error (MAPE): Percentage of the error between the actual and predicted values. Not reliable for zero values or low-scale data.
- Root MSE (RMSE): Square root of MSE. Represents the standard deviation of the error between the actual and predicted values.
- Normalized RMSE (NRMSE): Normalized RMSE. Facilitates comparing different models independently of their working scale.
- Cross-entropy: Metric based on the logistic function that measures the error between the actual and predicted values.
- Accuracy: Proportion of correct predictions among the total number of predictions. Not reliable for skewed class-wise data.
- True positive rate (TPR) or recall: Proportion of actual positives that are correctly predicted. Represents the sensitivity or detection rate (DR) of a model.
- False positive rate (FPR): Proportion of actual negatives predicted as positives. Represents the significance level of a model.
- True negative rate (TNR): Proportion of actual negatives that are correctly predicted. Represents the specificity of a model.
- False negative rate (FNR): Proportion of actual positives predicted as negatives. Inversely proportional to the statistical power of a model.
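Several of the metrics described above can be computed directly from their definitions, as in the sketch below. Two choices here are assumptions: the RMSE is normalized by the range of the actual values (normalizing by the mean is another common convention), and class labels are taken to be 1 for positive and 0 for negative. The function names are hypothetical.

```python
import math


def regression_metrics(actual, predicted):
    """Error metrics between actual and predicted numeric values."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE divides by the actual value, so it fails if any actual is zero.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    # Assumption: normalize RMSE by the range of the actual values.
    nrmse = rmse / (max(actual) - min(actual))
    return {"mae": mae, "mse": mse, "rmse": rmse, "mape": mape, "nrmse": nrmse}


def classification_rates(actual, predicted):
    """Rates for binary labels, with 1 = positive and 0 = negative."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "tpr": tp / (tp + fn),  # sensitivity, recall, detection rate
        "fpr": fp / (fp + tn),
        "tnr": tn / (fp + tn),  # specificity
        "fnr": fn / (tp + fn),
    }


reg = regression_metrics([10, 20, 30, 40], [12, 18, 33, 37])
rates = classification_rates([1, 1, 1, 0, 0, 0, 0, 1], [1, 0, 1, 0, 1, 0, 0, 1])
```

Note that TPR and FNR sum to one, as do TNR and FPR, so reporting one rate from each pair is sufficient.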