I have started developing applications based on Mind Mapping to help professionals analyze security log files.
My first example, to start with something easy, has been creating a program to analyze Endpoint Protector log files.
Here is a presentation about this issue.
I plan to create more complex applications if they prove useful to the information security community.
I would like to hear your first impressions of the possibilities of Mind Mapping in Security Visualization.
Here is an Apache logfile visualisation created in Gephi using data extracted from logstash and elasticsearch.
See blog post at: http://data.andyburgin.co.uk/post/65706647269/visualising-logstash-apache-data-in-gephi
To see the animated version check out http://youtu.be/29FV0zj7UJg
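As a rough illustration of the kind of preprocessing such a graph needs, here is a minimal sketch that turns Apache access log lines into a weighted edge list. It is not the pipeline from the blog post: it assumes raw Common Log Format lines instead of records pulled from logstash and elasticsearch, and it assumes the graph links each client IP to the path it requested. The function names and sample lines are made up for the example. The output uses Source, Target and Weight column headers, which Gephi's spreadsheet import can read as an edge table.

```python
import csv
import io
import re
from collections import Counter

# Common Log Format prefix: client, identd, user, [timestamp], "METHOD path PROTO", status
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3})')

def edges_from_log(lines):
    """Count (client IP, requested path) pairs; lines that do not parse are skipped."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m:
            client, _method, path, _status = m.groups()
            counts[(client, path)] += 1
    return counts

def to_gephi_csv(counts):
    """Render the edge counts as a Source,Target,Weight CSV string."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (src, dst), weight in sorted(counts.items()):
        writer.writerow([src, dst, weight])
    return out.getvalue()

# Hypothetical sample lines, for illustration only.
sample = [
    '10.0.0.1 - - [10/Oct/2013:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.1 - - [10/Oct/2013:13:55:40 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2013:13:56:01 +0000] "GET /about.html HTTP/1.1" 404 512',
]
edge_counts = edges_from_log(sample)
print(to_gephi_csv(edge_counts))
```

Repeated requests collapse into a single edge with a higher weight, which keeps the graph readable when a few clients generate most of the traffic.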
I have spent quite a bit of time with the VAST 2013 Mini Challenge 1. The given network traffic log is interesting, but poses some challenges. One of them is the ominous source/destination confusion: for some records, the network flow collector didn't record the client side of the connection as the source, but recorded it as the destination. That creates all kinds of problems in your data analysis, so you have to fix it first.
I wrote a blog entry on Cleaning Up Network Traffic Logs where I am going step by step through the network logs to determine which records need to be turned around. I am using both SQL and some parallel coordinate visualizations to get the job done. The final outcome is this one-liner Perl hack to actually fix the data:
$ cat nf*.csv | perl -F, -ane 'BEGIN {@ports=(20,21,25,53,80,123,137,138,389,1900,1984,3389,5355);
%hash = map { $_ => 1 } @ports; $c=0} if ($hash{$F[7]} && $F[8]>1024)
{$c++; printf"%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s",
$F[0],$F[1],$F[2],$F[3],$F[4],$F[6],$F[5],$F[8],$F[7],$F[9],$F[10],$F[11],$F[13],$F[12],
$F[15],$F[14],$F[17],$F[16],$F[18]} else {print $_} END {print "count of reversed $c\n";}'
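For readers who find the one-liner hard to follow, here is the same logic spelled out as a minimal Python sketch. It mirrors only what the one-liner shows: the list of service ports, the test on columns 7 and 8, and the five column pairs that get swapped. That columns 7 and 8 hold the source and destination port is an inference from that test, not something stated here, and the function name and sample rows are made up for illustration.

```python
# Service ports copied from the one-liner above.
SERVICE_PORTS = {20, 21, 25, 53, 80, 123, 137, 138, 389, 1900, 1984, 3389, 5355}
# 0-based column pairs that the one-liner swaps for a reversed record.
SWAP_PAIRS = [(5, 6), (7, 8), (12, 13), (14, 15), (16, 17)]

def fix_record(fields):
    """Return (fields, was_reversed); swap the paired columns if the flow looks reversed."""
    if len(fields) < 19:
        return fields, False  # too short to be a flow record: leave untouched
    try:
        # Assumption: column 7 is the recorded source port, column 8 the destination port.
        port_a, port_b = int(fields[7]), int(fields[8])
    except ValueError:
        return fields, False  # header row or non-numeric ports: leave untouched
    if port_a in SERVICE_PORTS and port_b > 1024:
        fixed = list(fields)
        for i, j in SWAP_PAIRS:
            fixed[i], fixed[j] = fixed[j], fixed[i]
        return fixed, True
    return fields, False

# Hypothetical 19-column rows: every field is its own index except the two ports.
reversed_row = [str(i) for i in range(19)]
reversed_row[7], reversed_row[8] = "80", "51234"   # service port recorded as the source
normal_row = [str(i) for i in range(19)]
normal_row[7], normal_row[8] = "51234", "80"       # client port recorded as the source

fixed_row, flag_reversed = fix_record(reversed_row)
same_row, flag_normal = fix_record(normal_row)
print(",".join(fixed_row))
print(flag_reversed, flag_normal)
```

The heuristic is only as good as the port list: a record between two high ports, or between two service ports, passes through unchanged.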
Read the full article here: Cleaning Up Network Traffic Logs
If you want to know how to set up a columnar data store to query the network flows, I also wrote a quick step-by-step guide on loading the network traffic logs into Impala with a Parquet storage engine.
Big data and security intelligence are the two hot topics in security for 2013. We are collecting more and more information, not only from the infrastructure but increasingly also directly from our applications. This vast amount of data gets increasingly hard to understand. Terms like MapReduce, Hadoop, MongoDB, etc. are part of many discussions. But what are those technologies? And what do they have to do with security intelligence? We will see that none of these technologies are sufficient in our quest to defend our networks and information. Data visualization is the only approach that scales to the ever-changing threat landscape and infrastructure configurations. Using big data visualization techniques, you can gain a far deeper understanding of what's happening on your network right now. You can uncover hidden patterns of data, identify emerging vulnerabilities and attacks, and respond decisively with countermeasures that are far more likely to succeed than conventional methods. The attendees will learn about log analysis, big data, information visualization, and data sources for IT security, and will learn how to generate visual representations of IT data. The training is filled with hands-on exercises utilizing the DAVIX live CD.
This class features brand-new material, first presented at BlackHat USA in July 2013. Here is what students said:
"Raffy obviously put a lot of time and effort into preparing for this course. Having already read the book, I expected a lot of the material to be a re-hash of what I already saw in the book. I was surprised at how much new material there was to get out of it. Looking forward to applying a lot of these concepts in the real world."
"Raffael did a great job! He knows and understands the subject matter extremely well. I highly recommend this course and instructor."
"One of the best trainings I have ever taken!"
Dates: December 9-10 & 11-12, 2013
Location: Washington State Convention Center
Seattle, Washington, USA
Sign Up Now
Early registration discount ends October 24th!
Date: November 3-4, 2013
Location: Dubai, UAE
Sign Up Now
Tools to gather data:
We are also using a number of visualization tools to analyze example data in the labs:
Under the log management section, we are going to discuss:
The section on big data is covering the following:
Raffael Marty is one of the world's most recognized authorities on security data analytics. The author of Applied Security Visualization and creator of the open source DAVIX analytics platform, Raffy is the founder and CEO of PixlCloud, a next-generation data visualization application for big data. With a track record at companies including IBM Research and ArcSight, Raffy is thoroughly familiar with established practices and emerging trends in data analytics. He has served as Chief Security Strategist with Splunk and was a co-founder of Loggly, a cloud-based log management solution. For more than 12 years, Raffy has helped Fortune 500 companies defend themselves against sophisticated adversaries and has trained organizations around the world in the art of data visualization for security. Practicing zen has become an important part of Raffy's life.
Log Analysis
Log Management and SIEM
Visualization
Security Visualization
I was greatly honored when I got an invitation from the Conference on Knowledge Discovery and Data Mining (KDD) to give a talk about data mining and cyber security.
Knowing me, you might be able to guess the topic I chose to present: Visual Analytics. I am focusing not on the visualization layer or the data layer, but on the analytics layer. In the presentation I show what we have been doing with data analytics and data mining in cyber security. The presentation starts out with an overview of what security is and what our data looks like. While I show a few examples for different areas in cyber security, I mainly highlight the problems and challenges we have been facing within these areas with regard to analytics and data mining.
The presentation has 5 parts:
For each of the six areas in data mining, the following slide shows a couple of challenges that one will run into when trying to apply them to cyber security data:
At the end, I am presenting a number of challenges to the community; hard problems that we need help with to advance insights into cyber security of infrastructures and applications. The following slide summarizes the challenges I see in data mining for security:
Definitely not a complete list. Please comment and add other challenges! If you have any suggestions on solving the challenges, please contact me or comment on this post as well!
We are preparing for the next DAVIX release and have constructed a survey to get your input on the tools you would like included, the delivery mechanism, and general information on your security visualization needs. Your participation in the survey would be greatly appreciated!
The survey is located at http://www.surveymonkey.com/s/769KG3C.
We would like to collect all responses by July 31, 2013.
I recently released a short slide deck on AfterGlow.
AfterGlow is a security 'visualization' tool that simplifies the task of creating network graphs. It reads CSV files and converts them into a graph representation based on a set of configurations that the user defines (colors, edge thickness, node sizes, clustering, etc.). AfterGlow is a pretty powerful tool, and this slide deck finally summarizes its features and provides a couple of interesting examples of how to use it.
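To make the CSV-to-graph idea concrete, here is a minimal sketch of such a conversion. It is not AfterGlow's code and does not use its configuration format. It only assumes two-column CSV input (source, target) and emits Graphviz DOT text with a single user-chosen node color. The function name, the `node_color` parameter and the sample data are all made up for the example.

```python
import csv
import io

def csv_to_dot(csv_text, node_color="lightblue"):
    """Turn two-column CSV rows (source,target) into a Graphviz DOT digraph."""
    edges = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) >= 2:  # ignore blank or one-column rows
            edges.append((row[0].strip(), row[1].strip()))
    lines = ["digraph G {", f'  node [style=filled, fillcolor="{node_color}"];']
    for src, dst in dict.fromkeys(edges):  # drop duplicate edges, keep first-seen order
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical input: source IP, destination IP.
sample_csv = "10.0.0.1,10.0.0.5\n10.0.0.1,10.0.0.5\n10.0.0.2,10.0.0.5\n"
dot_text = csv_to_dot(sample_csv)
print(dot_text)
```

The sketch does not escape quotes inside node names, so it is only safe for simple values such as IP addresses or host names.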
These slides will also be part of my Visual Analytics workshop during BlackHat at the end of the month. There are still a couple of seats available!
The 10th Visualization for Cyber Security (VizSec) will be held in Atlanta, GA, USA on October 14, 2013 in conjunction with IEEE VIS. VizSec brings together researchers and practitioners in information visualization and security to address the specific needs of the cyber security community through new and insightful visualization techniques.
The paper deadline has been extended to July 22, 2013 at 5:00pm PDT. Full papers offering novel contributions in security visualization are solicited. Papers may present techniques, applications, practical experience, theory, analysis, or experiments and evaluations. We encourage papers on technologies and methods that promise to improve cyber security practices, including, but not limited to:
We are also soliciting posters. Poster submissions may showcase late-breaking results, work in progress, preliminary results, or visual representations relevant to the VizSec community. Accepted poster abstracts will be made available on this website. Poster submissions are due August 23, 2013 at 5:00pm PDT.
See vizsec.org for the full Call for Papers and additional details.
This year's IEEE VAST Challenge features two mini-challenges that particularly appeal to the SecViz community. These challenges are open to participation by individuals and teams in industry, government, and academia. Creative approaches to visual analytics are encouraged.
Mini-Challenge 2 tests your skills in visual design. The fictitious Big Enterprise is searching for a design for their future situation awareness display. The company's intrepid network operations team will use this display to understand the health, security, and performance of their entire computer network. This challenge is also very different from previous VAST Challenges, because there is no data to process and no questions to answer. Instead, the challenge is to show off your design talents by producing a creative new design for situation awareness. Please visit http://www.vacommunity.org/VASTchallenge2013MC2 for more information.
Mini-Challenge 3 focuses on unusual happenings on the computer network of a marketing company. Can you identify what looks amiss on the network using the network flow and network health data provided? And can you ask the right questions to help you piece together the timeline of events? Two weeks of data will be released for this challenge. Week 1 data is now available. Please visit http://www.vacommunity.org/VASTchallenge2013MC3 for more details.
For more information, please contact vast_challenge@ieeevis.org
Big data and security intelligence are the two hot topics in security for 2013. We are collecting more and more information, not only from the infrastructure but increasingly also directly from our applications. This vast amount of data gets increasingly hard to understand. Terms like MapReduce, Hadoop, MongoDB, etc. are part of many discussions. But what are those technologies? And what do they have to do with security intelligence? We will see that none of these technologies are sufficient in our quest to defend our networks and information. Data visualization is the only approach that scales to the ever-changing threat landscape and infrastructure configurations. Using big data visualization techniques, you can gain a far deeper understanding of what's happening on your network right now. You can uncover hidden patterns of data, identify emerging vulnerabilities and attacks, and respond decisively with countermeasures that are far more likely to succeed than conventional methods. The attendees will learn about log analysis, big data, information visualization, and data sources for IT security, and will learn how to generate visual representations of IT data. The training is filled with hands-on exercises utilizing the DAVIX live CD.
Log Analysis
Log Management and SIEM
Visualization
Security Visualization
Raffael Marty is one of the world's most recognized authorities on security data analytics. The author of Applied Security Visualization and creator of the open source DAVIX analytics platform, Raffy is the founder and CEO of PixlCloud, a next-generation data visualization application for big data. With a track record at companies including IBM Research and ArcSight, Raffy is thoroughly familiar with established practices and emerging trends in data analytics. He has served as Chief Security Strategist with Splunk and was a co-founder of Loggly, a cloud-based log management solution. For more than 12 years, Raffy has helped Fortune 500 companies defend themselves against sophisticated adversaries and has trained organizations around the world in the art of data visualization for security. Practicing zen has become an important part of Raffy's life.
This image shows network traffic to and from a single host in a smaller network. The visualization is an SVG embedded in HTML and rendered with D3. It is fully interactive, so the data can be explored.
A single selection is visible here. The non-selected links and nodes fade out, and the selected connections are listed in detail in the text box.
In conjunction with the 2013 IEEE International Conferences on Intelligence and Security Informatics (ISI), we present a special topics workshop on:
Evaluating Security Visualizations in Supporting Analytical Reasoning & Decision Making in Cybersecurity
Workshop Description
As the potential for visualizations in cybersecurity analysis becomes increasingly apparent, efforts to evaluate these visualizations become more imperative than ever to supporting the cybersecurity mission. As technology and big data continue to grow rampantly, so does the deployment of insufficiently evaluated cybersecurity visualizations that claim to be most aligned with how analysts think and perceive data. Before organizations can intelligently incorporate visualization into their cybersecurity analysis process, they must be prepared to pose tailored sets of questions that directly relate to the particular objective of the cyber analyst. This workshop addresses these gaps with the intent of bringing together experts from a variety of disciplines relevant to the topic of evaluating cybersecurity visualizations in their ability to support analytic reasoning and decision making in cybersecurity.
Paper Topics
We welcome paper submissions on the following or related topics:
Empowering the Human Analysts
Methods and techniques for evaluating the impact cybersecurity visualizations have on enabling the human perception and cognitive processes that are required for intelligent decision making.
Addressing current deficiencies in cybersecurity analysis
Methods and techniques for measuring the impact cybersecurity visualization tools have on addressing current deficiencies that still exist in cybersecurity analysis such as exploration and prediction.
The Unique Nature of Cybersecurity Visualization
Identifying aspects that are specific to cybersecurity visualization, and identifying relevant contributions from current research in the broader fields of information visualization and scientific visualization, and from visualizations in other domains.
Important Dates
Workshop papers due: March 31, 2013
Notices of acceptance and comments provided to authors: April 12, 2013
Camera ready paper submitted: April 29, 2013
Website: http://www.isiconference2013.org/pgs/workshop-on-cybersecurity-visualizations.php
Paper Submission:
Submission file formats are PDF and Microsoft Word. Required Word/LaTeX templates (IEEE two-column format) can be found on IEEE's Publications web pages. Submissions can be long (6,000 words, 6 pages max) or short (3,000 words, 3 pages max). Papers in English must be submitted by email to Lisa Coote at Lisa.Coote@innovative-analytics.com. The accepted workshop papers will be published by the IEEE Press in formal Proceedings. Authors who wish to present a poster and/or demo may submit a 1-page extended abstract, which, if selected, will appear in the conference proceedings.
Conference content will be submitted for inclusion into IEEE Xplore as well as other Abstracting and Indexing (A&I) databases. The selected IEEE ISI 2013 best papers will be invited for contribution to the Springer Security Informatics Journal.
Organizing Committee:
Kevin O'Connell, Innovative Analytics & Training
Lisa Coote, Innovative Analytics & Training
Program Committee:
Raffael Marty, PixlCloud
Tomas Budavari, Johns Hopkins University
Antonio Sanfilippo, Pacific Northwest National Laboratory
John T. Langton, VisiTrend LLC
Claudio Silva, NYU Polytechnic
Bernice Rogowitz, Visual Perspectives Consulting
Cullen Jackson, APTIMA
Enrico Bertini, NYU Polytechnic
John Goodall, Oak Ridge National Laboratory
VizSec 2013 will be held in Atlanta, Georgia on October 14, 2013 in conjunction with IEEE VIS. Paper submissions are due July 8, 2013 and poster abstracts are due August 23, 2013.
The 10th International Symposium on Visualization for Cyber Security (VizSec) is a forum that brings together researchers and practitioners from academia, government, and industry to address the needs of the cyber security community through new and insightful visualization and analysis techniques. VizSec will provide an excellent venue for fostering greater exchange and new collaborations on a broad range of security- and privacy-related topics. Accepted papers will appear in the ACM Digital Library as part of the ACM International Conference Proceedings Series.
Important research problems often lie at the intersection of disparate domains. Our focus is to explore effective, scalable visual interfaces for security domains, where visualization may provide a distinct benefit, including computer forensics, reverse engineering, insider threat detection, cryptography, privacy, preventing 'user assisted' attacks, compliance management, wireless security, secure coding, and penetration testing in addition to traditional network security. Human time and attention are precious resources. We are particularly interested in visualization and interaction techniques that effectively capture human analyst insights so that further processing may be handled by machines, freeing the analyst for other tasks. For example, a malware analyst might use a visualization system to analyze a new piece of malicious software and then facilitate generating a signature for future machine processing. When appropriate, research that incorporates multiple data sources, such as network packet captures, firewall rule sets and logs, DNS logs, web server logs, and/or intrusion detection system logs, is particularly desirable.
See http://www.vizsec.org/ for additional information.
It might not be the prettiest graph, but it tells a story. This graph was generated with R and shows the number of HTTP connections per Host header field. I generated it with data from Bro's http.log file.
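The counting step behind such a graph can be sketched in a few lines. This is not the R script used for the image. It is a minimal Python sketch that assumes Bro's default tab-separated log layout, in which a `#fields` header line names the columns, the Host header is stored in a column called `host`, and `-` marks an unset value. The function name and sample records are made up for illustration.

```python
from collections import Counter

def count_hosts(lines):
    """Count requests per Host header in a tab-separated Bro http.log."""
    fields = None
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            fields = line.split("\t")[1:]  # column names follow the #fields tag
            continue
        if not line or line.startswith("#") or fields is None:
            continue  # skip other header lines and anything before the column names
        record = dict(zip(fields, line.split("\t")))
        host = record.get("host", "-")
        if host != "-":  # "-" is the marker for an unset field
            counts[host] += 1
    return counts

# Hypothetical log excerpt with only three columns, for illustration.
sample_log = [
    "#separator \\x09",
    "#fields\tts\tuid\thost",
    "1372118400.0\tC1\texample.com",
    "1372118401.0\tC2\texample.com",
    "1372118402.0\tC3\texample.org",
    "1372118403.0\tC4\t-",
]
host_counts = count_hosts(sample_log)
for host, n in host_counts.most_common():
    print(host, n)
```

The resulting counts can be written to a CSV file and plotted with any charting tool, R included.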
A few other graphs and R scripts can be found on my blog, http://anthonykasza.webs.com/blog23.html
This graph shows a part of the global SSL trust relationships. It was generated with AfterGlow and Gephi (by using the -k parameter of AfterGlow to generate a GDF file). Node size is based on the number of children for each node. The big green node is the DFN CERT. And no, it doesn't mean that the DFN CERT is trusted more than any other certificate authority...
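Sizing nodes by their number of children comes down to counting distinct targets per source in the edge list. The sketch below shows that idea only. It does not reproduce AfterGlow's or Gephi's sizing options, and the function name, the `base` and `step` parameters, and the sample edges are assumptions made for the example.

```python
from collections import defaultdict

def node_sizes(edges, base=1.0, step=0.5):
    """Map each node to a size that grows linearly with its number of distinct children."""
    children = defaultdict(set)
    nodes = set()
    for parent, child in edges:
        children[parent].add(child)
        nodes.update((parent, child))
    # Leaf nodes have no children and therefore keep the base size.
    return {node: base + step * len(children[node]) for node in nodes}

# Hypothetical trust edges: (issuer, subject).
sample_edges = [("root", "a"), ("root", "b"), ("a", "c"), ("root", "a")]
sizes = node_sizes(sample_edges)
print(sorted(sizes.items()))
```

Because the size counts direct children only, a node with many leaf children looks larger than one heading a deep but narrow chain, which is why a large node here says nothing about how much a certificate authority is trusted.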
As I've been putting together an R package for mining open source IP "intelligence" data, I decided to play with visualizing malicious host categories in AlienVault's IP reputation database. This image is a network graph plot (using R & igraph) of AlienVault-identified C&C nodes as they relate to host ASNs (with ASN peers included). Red nodes are the C&C hosts, gold nodes are the ASNs.
I did the same with a subset of "Malicious Hosts" in AlienVault's db and am thinking that tracking these over a day (/week/month/year) would make for an interesting view of the ebb & flow of C&C hosts.
In December I'll be presenting on security intelligence and the interplay of visualization and data mining.
I wrote a blog post that introduces the talk in Palo Alto a little bit. It's about Supercharging Visualization with DataMining. Check it out and make sure you RSVP for the event tomorrow.