Question

Start up the Wireshark packet sniffer and begin a Wireshark packet capture. Open the Windows Command Prompt application and type the following command:

ping -n 10 <target host>

The argument "-n 10" indicates that 10 ping messages should be sent. (The target host was omitted in the original; a placeholder is shown, since ping requires a destination.) When the Ping program terminates, stop the packet capture in Wireshark and solve the following:

1. What is the IP address of your host? What is the IP address of the destination?
2. Why is it that an ICMP packet does not have source and destination port numbers?
3. Examine one of the ping request packets sent by your host.
   a. What are the ICMP type and code numbers?
   b. What other fields does this ICMP packet have?
4. Examine the corresponding ping reply packet.
   a. What are the ICMP type and code numbers?
   b. What other fields does this ICMP packet have?
5. How many bytes are the checksum, sequence number, and identifier fields?
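For questions 3-5, it helps to see the echo-message layout concretely. Below is a minimal Python sketch that builds an ICMP echo request header; the identifier, sequence number, and payload values are arbitrary choices for illustration, while the field widths (1-byte type and code; 2-byte checksum, identifier, and sequence number) follow RFC 792:

```python
import struct

def internet_checksum(data: bytes) -> int:
    """RFC 1071 Internet checksum: one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                       # pad odd-length input
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    total = (total >> 16) + (total & 0xFFFF)  # fold the carry bits
    total += total >> 16
    return ~total & 0xFFFF

# ICMP echo request header (RFC 792): type, code, checksum, identifier, sequence.
ECHO_REQUEST_TYPE = 8                          # an echo reply uses type 0; both use code 0
identifier, sequence = 0x1234, 1               # arbitrary illustrative values
payload = b"abcdefghijklmnopqrstuvwabcdefghi"  # the 32-byte payload Windows ping typically sends

header = struct.pack("!BBHHH", ECHO_REQUEST_TYPE, 0, 0, identifier, sequence)
checksum = internet_checksum(header + payload)
header = struct.pack("!BBHHH", ECHO_REQUEST_TYPE, 0, checksum, identifier, sequence)

print(f"type=8 code=0 checksum=0x{checksum:04x} "
      f"id=0x{identifier:04x} seq={sequence} header={len(header)} bytes")
# Note there are no port fields: ICMP rides directly on IP (protocol 1),
# so replies are matched to requests by the identifier/sequence pair instead.
```

This also bears on question 5 directly: the checksum, identifier, and sequence number fields are 2 bytes each.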
Filter models are agnostic to the particular classification algorithm being used. In some cases, it may be useful to leverage the characteristics of the specific classification algorithm to select features. As you will learn later in this chapter, a linear classifier may work more effectively with a set of features where the classes are best modeled with linear separators, whereas a distance-based classifier works well with features in which distances reflect class distributions.

Therefore, one of the inputs to wrapper-based feature selection is a specific classification induction algorithm, denoted by A. Wrapper models can optimize the feature selection process to the classification algorithm at hand. The basic strategy in wrapper models is to iteratively refine a current set of features F by successively adding features to it. The algorithm starts by initializing the current feature set F to {}. The strategy may be summarized by the following two steps, which are executed iteratively:

1. Create an augmented set of features F by adding one or more features to the current feature set.
2. Use the classification algorithm A to evaluate the accuracy of the set of features F. Use the accuracy to either accept or reject the augmentation of F.

The augmentation of F can be performed in many different ways. For example, a greedy strategy may be used in which the set of features from the previous iteration is augmented with the feature that has the greatest discriminative power with respect to a filter criterion. Alternatively, features may be selected for addition via random sampling. The accuracy of the classification algorithm A in the second step may be used to determine whether the newly augmented set of features should be accepted, or whether one should revert to the set of features from the previous iteration. This approach continues until there is no improvement in the current feature set for a minimum number of iterations. Because the classification algorithm A is used for evaluation in the second step, the final set of identified features is sensitive to the choice of the algorithm A.
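As an illustration, here is a minimal sketch of the greedy wrapper strategy in Python with scikit-learn. The dataset, the choice of a k-nearest-neighbor classifier as A, and the stop-at-first-non-improvement rule are assumptions made for this example, not prescriptions from the text:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)  # stand-in dataset for illustration
A = KNeighborsClassifier(n_neighbors=5)     # the classification algorithm A

selected = []                               # current feature set F, initially {}
best_acc = 0.0
remaining = set(range(X.shape[1]))

while remaining:
    # Step 1: augment F with each candidate feature in turn.
    trial = {f: cross_val_score(A, X[:, selected + [f]], y, cv=5).mean()
             for f in remaining}
    f_best, acc = max(trial.items(), key=lambda kv: kv[1])
    # Step 2: accept the augmentation only if A's accuracy improves;
    # otherwise revert to the previous feature set and stop.  (The text's
    # randomized variant would instead continue for a minimum number of
    # non-improving iterations before terminating.)
    if acc <= best_acc:
        break
    selected.append(f_best)
    remaining.remove(f_best)
    best_acc = acc

print(f"selected features: {selected}, cross-validated accuracy: {best_acc:.3f}")
```

Because the cross-validated accuracy of A drives every accept/reject decision, swapping in a different classifier can change which features survive; this is exactly the sensitivity to A that the text describes.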
A frequent pattern mining model describes the data in terms of an underlying code book of frequent patterns. The larger the size of the code book (obtained by using frequent patterns of lower support), the more accurately the data can be described. These models are particularly popular, and some pointers are provided in the bibliographic notes.

All these models represent the data approximately in terms of individual condensed components representing aggregate trends. In general, outliers increase the length of the description in terms of these condensed components needed to achieve the same level of approximation. For example, a data set with outliers will require a larger number of mixture parameters, clusters, or frequent patterns to achieve the same level of approximation. Therefore, in information-theoretic methods, the components of these summary models are loosely referred to as "code books." Outliers are defined as data points whose removal results in the largest decrease in description length for the same error. The actual construction of the coding is often heuristic, and it is not very different from the summary models used in conventional outlier analysis. In some cases, the description length of a data set can be estimated without explicitly constructing a code book or building a summary model; examples are the entropy of a data set and the Kolmogorov complexity of a string. Readers are referred to the bibliographic notes for examples of such methods.

While information-theoretic models are approximately equivalent to conventional models in that they explore the same trade-off in a slightly different way, they do have an advantage in some cases: those in which an accurate summary model of the data is hard to construct explicitly, and measures such as the entropy or Kolmogorov complexity can be used to estimate the compressed space requirements of the data set indirectly. In such cases, information-theoretic methods can be useful. In cases where the summary models can be explicitly constructed, it is better to use conventional models, because the outlier scores are directly optimized to point-specific deviations rather than to the blunter measure of differential space impact. The bibliographic notes provide specific examples of some of the aforementioned methods.

7. Discuss the advantages and disadvantages of clustering models over distance-based models.
8. Implement a naive distance-based outlier detection algorithm with no pruning. (A minimal sketch follows this exercise list.)
9. What is the effect of the parameter k in k-nearest neighbor outlier detection? When do small values of k work well, and when do larger values of k work well?
10. Design an outlier detection approach with the use of the NMF method of Chap. 6.
11. Discuss the relative effectiveness of pruning in distance-based algorithms on data sets that are (a) uniformly distributed, and (b) highly clustered with modest ambient noise and outliers.
12. Implement the LOF algorithm for outlier detection.
13. Consider the set of 1-dimensional data points {1, 2, 2, 2, 2, 2, 6, 8, 10, 12, 14}. What are the data point(s) with the highest outlier score for a distance-based algorithm using k = 2? What are the data points with the highest outlier score using the LOF algorithm? Why the difference?
14. Implement the instance-specific Mahalanobis method for outlier detection.
15. Given a set of ground-truth labels and outlier scores, implement a computer program to compute the ROC curve for a set of data points.
16. Use the objective function criteria of various outlier detection algorithms to design corresponding internal validity measures. Discuss the bias in these measures toward favoring specific algorithms.
17. Suppose that you construct a directed k-nearest neighbor graph from a data set. How can you use the degrees of the nodes to obtain an outlier score? What characteristics does this algorithm share with LOF?
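For exercises 8, 9, and 13, the following is a minimal sketch of the naive, unpruned distance-based detector. Scoring each point by the distance to its exact k-th nearest neighbor is one common convention and an assumption of this sketch; it is applied here to the data of exercise 13:

```python
import numpy as np

def knn_outlier_scores(X: np.ndarray, k: int) -> np.ndarray:
    """Naive O(n^2) distance-based outlier score, with no pruning:
    each point is scored by the distance to its k-th nearest neighbor."""
    dist = np.abs(X[:, None] - X[None, :])   # full pairwise distance matrix
    dist.sort(axis=1)                        # row-wise; column 0 is the self-distance 0
    return dist[:, k]                        # distance to the k-th nearest neighbor

X = np.array([1, 2, 2, 2, 2, 2, 6, 8, 10, 12, 14], dtype=float)  # exercise 13 data
for x, s in sorted(zip(X, knn_outlier_scores(X, k=2)), key=lambda p: -p[1]):
    print(f"point {x:4.0f}   score {s:4.1f}")
```

For the LOF half of exercise 13, sklearn.neighbors.LocalOutlierFactor can be run on the same array; because LOF normalizes each point's local density against that of its neighbors, points adjacent to the tight cluster of 2s are scored quite differently than their raw k-NN distances would suggest.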
Image transcriptions (truncated in the original):

4) If you roll a die, what is the probability that you get a 3 facing up? Explain how you got your answer. 1/6: there are six sides on a die, one side with 3. When two events occur together …

3. The AVERAGE of the VALUES of a set of events = the sum of (PROBABILITY of each e…

Problem 1: The Chernoff bound. The Chernoff bound is a powerful tool that relies on the transform as…

Calculating Upper Tail Probabilities: P(T > t), …
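The Chernoff-bound transcription is cut off, but the standard statement it appears to introduce (via the exponential transform, i.e., the moment generating function) can be given for reference; this is the textbook bound, not a reconstruction of the hidden image text:

```latex
% Chernoff bound for an upper-tail probability P(T > t):
% apply Markov's inequality to the increasing transform T -> e^{sT}, s > 0.
\[
  P(T > t) = P\left(e^{sT} > e^{st}\right)
           \le e^{-st}\,\mathbb{E}\left[e^{sT}\right]
  \quad \text{for every } s > 0,
\]
% and the bound is tightened by minimizing over s:
\[
  P(T > t) \le \min_{s > 0} \; e^{-st}\,\mathbb{E}\left[e^{sT}\right].
\]
```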