Before we can get into the real nitty-gritty of what Burp Suite is and what it does, we’ll have to take baby steps getting into it. And the first step is configuring Burp Suite to work with our browsers. This Burp Suite setup guide will show you how.

First, let’s open it up. I should mention that to run the Burp .jar file you need version 1.6 or later of Java. If you’re not sure what version you have, you can just type “java -version” into Command Prompt and it’ll tell you. Unless your computer has a virus made specifically to stop Burp Suite from running, you should see a splash screen, and then this:
New Project

I’m going to assume you didn’t already buy the premium version of Burp, so just click Next with ‘Temporary Project’ selected, then select ‘Use Burp Defaults’ and click Start Burp on the screen after that. Now we’re here:
Burp Home

I remember the reaction I had the first time I came upon this page, which was “Whoa”; that top bar has more tabs than I have immediate family members. Don’t you worry, dear reader, I’ll go over each tab one by one, and you’ll be a pro at this in no time. For now, we can ignore most of these and focus on what we’re trying to do right now, which is set up Burp with a browser of your choice. Let’s go to the second tab, ‘Proxy’, and then the ‘Options’ subtab under it. I’ll show what we’re looking for specifically:
Proxy Listener

Check to make sure that in the Proxy Listeners table there is an entry that has the values I underlined here. If there isn’t, press the gear to the left of the table and then ‘Restore Defaults’.

The next thing we’re going to do is set up your browser to use Burp as an HTTP proxy server. It’s different for every browser, so I’ll just put them all and you can skip ahead to the browser you’re working with.

Internet Explorer:
Press the gear at the top right corner and then ‘Internet Options’. This will take you to this window:
IE Internet Options

Go to the Connections tab at the top and press ‘LAN Settings’. Uncheck the ‘Automatically detect settings’ and ‘Use automatic configuration script’ boxes. Check the “Use a proxy server for your LAN” box and enter the Burp proxy listener address and port from the Proxy Listeners table (the port is 8080 by default). Uncheck the “Bypass proxy server for local addresses” box if it’s checked. Click ‘Advanced’ and check the ‘Use the same proxy server for all protocols’ box, and make sure that there are no entries in the ‘Exceptions’ field.

Chrome:
Chrome uses the same proxy settings as your computer, so you can just follow the instructions for Internet Explorer and Chrome will pick up on them as well.

Firefox:
Press the three lines in the top right corner, click on ‘Options’ and then ‘Advanced’ on the left. Click the ‘Network’ tab and click on the ‘Settings’ button under ‘Connection’. Now you’re here:
Firefox Connections Options

Select ‘Manual proxy configuration’ and enter your Burp proxy listener address in the HTTP Proxy field and 8080 for the port. Check the ‘Use this proxy server for all protocols’ box and make sure the ‘No Proxy for’ field is empty (unlike in the picture example).

After Setting Up Browser
I just made this subtitle so you wouldn’t get confused about where the Firefox heading ends. Anyway, try out what you have so far by going to any HTTP website (not HTTPS yet, I’ll get to that). The site shouldn’t load completely, and that’s what’s supposed to happen. Open up Burp again and go to the ‘Proxy’ tab and then the ‘Intercept’ tab under it. Your HTTP request should be there. This just means that Burp intercepted your HTTP request for tinkering. Click on the ‘Intercept is on’ button so it changes to ‘Intercept is off’, and that will allow the website to load. If you tried to load an HTTPS URL though, you would get a warning from your browser. To work with HTTPS URLs, you need to install Burp’s CA certificate, and the steps are different for each browser.

Internet Explorer
With Burp running, go to http://burp/ and click on CA Certificate at the top. Download the file and open it. Click ‘Install Certificate’, then ‘Next’, then ‘Place all certificates in the following store’ and ‘Browse’. Here it should give you a small window with a bunch of different folders. Select ‘Trusted Root Certification Authorities’ and then just click ‘Next’, ‘Finish’, and ‘Yes’ to complete the installation process. Restart IE and you should be able to go to any HTTPS website.

Chrome
Just as before, Chrome uses the same settings as IE does, so just follow the instructions for that.

Firefox
With Burp running, go to http://burp/ and click on CA Certificate at the top. Download the file, but you don’t have to open it. Press the three little lines at the top right and then ‘Options’. Click on the ‘Advanced’ tab, and then the ‘Certificates’ subtab. Click on ‘View Certificates’. Select the ‘Authorities’ tab, and ‘Import’. Find the file you downloaded just now and click ‘Open’. A dialog box should pop up; check ‘Trust this CA to identify web sites’ and click ‘OK’. Close everything and after restarting Firefox you should be able to go to any HTTPS website.

In The End
If everything is running smoothly, you should be able to intercept HTTP and HTTPS websites without a hitch. In a couple of days I’ll start posting about the different bits and pieces of Burp, and what makes it such a powerful tool.
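If you’d like a quick sanity check outside the browser, the same proxy settings can be sketched in Python. This is only a rough sketch: the port 8080 comes from the guide, but the 127.0.0.1 address is my assumption (use whatever your Proxy Listeners table shows), and the helper names are made up for illustration.

```python
import urllib.request

# Assumed listener values: the port comes from the guide, the loopback
# address is an assumption. Check your own Proxy Listeners table.
BURP_HOST = "127.0.0.1"
BURP_PORT = 8080


def burp_proxies(host=BURP_HOST, port=BURP_PORT):
    """Return a proxy mapping that sends HTTP and HTTPS traffic via Burp."""
    proxy_url = f"http://{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


def build_burp_opener(host=BURP_HOST, port=BURP_PORT):
    """Build a urllib opener that routes requests through the Burp listener."""
    handler = urllib.request.ProxyHandler(burp_proxies(host, port))
    return urllib.request.build_opener(handler)


# Usage (needs Burp running, with Intercept off or the request forwarded):
# opener = build_burp_opener()
# print(opener.open("http://example.com/", timeout=10).status)
```

With Intercept switched on, the request from the commented usage lines should show up in the ‘Intercept’ tab just like one from the browser would.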

One of the main aspects of security is penetration testing and vulnerability assessments. Simply put, these terms are just fancy ways of saying that the only safe way to know how you can be hacked is to hack yourself. Companies hire security consultants to legally tear apart their websites piece by piece and put them back together again, stronger and more secure than they were before. Security consultants (and malicious hackers) employ several tools to do their jobs, one of which is Burp Suite.

Burp Suite is an interception proxy. A proxy is a program, computer, or server that acts as a hub that your network uses to access the internet. Proxies are often used to anonymize the user by hiding his or her IP address and replacing it with the address of the proxy instead. This allows the user to hide their identity from the rest of the world. Burp Suite works on the same principle. It takes the internet traffic going through it and (here’s the fun part) lets us mess with this traffic. That’s where the “interception” part of “interception proxy” comes in. I’ll make a separate post on how to set up the program itself and how to configure it with your machine because there are quite a few steps to do that; this post is just to help you understand what you can do with Burp.

Burp has a number of tools that you can use to perform a wide variety of tasks, ranging from simple to incredibly advanced. These tools are shown as subsections in the program.

  • The first is Spider, which you can use to crawl a site or web application. “Crawling” is the act of sifting through every page that a site has to offer in order to gain the scope of the task. Without it, you might miss a couple of vulnerabilities that you could have caught. If you have the time for it, crawl manually without Spider, or at the very least don’t rely solely on the program to do it for you; it can make mistakes too.
  • Next is Scanner, a premium-only program that makes your job easier by scanning the site for any vulnerabilities. This is a pretty important tool and is worth Premium’s price point.
  • The Intruder tool comes next, and it’s a powerful one. This is your main attacking tool that you’ll use to prod and poke at a website to see what makes it tick. You can use it for a very large variety of purposes. For example, if the site lets a user sign up or log in, you can try to see which characters work, which don’t, and which crash the site or give administrator access by accident.
  • Repeater, similarly to Intruder, can be used to repeatedly (thus the name) issue HTTP requests, tweaking different input fields by hand each time to see how the response changes.
  • Sequencer looks over the site’s random elements, the important stuff that you want to be encrypted or randomised, and analyses just how random it is.
  • Decoder, a relatively simple tool, decodes and encodes (translates) different types of data. It takes HTML, URL, Base64, GZIP, hexadecimal, ASCII hexadecimal, Octal, and Binary.
  • Finally, the Comparer tool makes comparisons between two pieces of data. If two pieces of data are both much too long to compare by eye, you can pop them both into the Comparer and it’ll tell you how they differ.
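To give a feel for what the Decoder does, here’s a small Python sketch that pushes one string through a few of the formats listed above and back again. It uses only the standard library and has nothing to do with Burp’s own code; the function names and the sample string are just made up for the example.

```python
import base64
import html
import urllib.parse


def encode_all(text):
    """Encode text in a few of the formats the Decoder handles."""
    raw = text.encode("utf-8")
    return {
        "url": urllib.parse.quote(text, safe=""),
        "html": html.escape(text),
        "base64": base64.b64encode(raw).decode("ascii"),
        "hex": raw.hex(),
    }


def decode(kind, value):
    """Reverse one of the encodings produced by encode_all()."""
    if kind == "url":
        return urllib.parse.unquote(value)
    if kind == "html":
        return html.unescape(value)
    if kind == "base64":
        return base64.b64decode(value).decode("utf-8")
    if kind == "hex":
        return bytes.fromhex(value).decode("utf-8")
    raise ValueError(f"unknown encoding: {kind}")


sample = "a b&c"
for kind, value in encode_all(sample).items():
    # Each line shows the encoded form and the value decoded back from it.
    print(f"{kind:>6}: {value}  ->  {decode(kind, value)}")
```

Every encoding here is reversible, which is the point: none of them hides anything, they only change how the same data is written down.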

This is a very, very, very basic look at what Burp Suite is and what it can do for you. I’ll be rolling out blog posts with specific instructions and examples for each tool in the coming weeks. Keep these in mind until then and remember to always stay on your toes. See you next week.

Think about what security means to you. It’s not too hard to come up with a few lessons or adages that help us stay safe in our everyday lives. Lock your door when leaving the house, walk in well-lit areas, know your emergency numbers. Security gives us a sense of comfort, knowing that we, our loved ones, and our assets, are safe. The strategies that companies and governments employ in order to maintain their security against physical, real-world threats are well-known and can be easily observed (although just as easily misinterpreted) by anyone. I’m talking guards, cameras, vaults. Big, glaring signs of power that have “Don’t Mess With Me” written all over them. Our hi-tech age, however, is changing things. People are communicating globally, entire libraries are uploaded to the cloud, and information has never been more abundant or easier to obtain. With this come new security risks, more subtle, and yet more devastating as well. I’m talking, as you may have guessed, about hacking.

Hacking portrayed in movies and TV is at the same time exactly the same and completely different from how it is in real life. This is because the term is so broad and generalised that it can encompass a myriad of individuals and professions. Hackers who live in their vans, sustaining themselves on a steady diet of Cheetos and Diet Pepsi which they pay for by selling email accounts they acquired from phishing bots do exist, along with suit-and-tie businessmen who make good money, legal money in fact, from hacking the world’s top companies and selling them the flaws. There also exist those who would release an entire database of user information to the world for no other reason than poops and giggles. A hacker can shut down a power station, or take control of a million PCs that’ll run DDoS attacks to shut down a bank’s website. Point is, whilst planning a security breach once consisted mostly of “shoot X, blow up Y”, the possibilities of digital crime now are endless.

These new digital dangers are the reason this blog was made. Every week or so, I will make a blog post summarising a concept in security. If a concept is too big for one post (or if I just really like it), then I’ll spread it out into several. I’ll try to keep the topics as varied as possible, from how the CIA plans to open the Boston Bomber’s iPhone to why you should never trust a Nigerian Prince begging for money. However, know that I am explaining these concepts purely with the intention to help protect and inform, not breach or destroy. You are forbidden, dear reader, from going out into the world and hacking into McDonald’s corporate office using a Starbucks WiFi. Be warned that this is not only unethical but more importantly illegal as all hell. Keep this in mind and remember to always stay on your toes. See you next week.

Component                      Version
Apache HDFS                    2.3.0
Apache MapReduce (for MR1)     1.2.1
Apache YARN (for MR2)          2.3.0
Apache Hive                    0.12.0
Cloudera Impala                2.0.0
Apache HBase                   0.98.0
Apache Accumulo                1.6.0
Apache Solr                    4.4.0
Apache Oozie                   4.0.0
Cloudera Hue                   3.5.0
Apache ZooKeeper               3.4.5
Apache Flume                   1.5.0
Apache Sqoop                   1.4.4
Apache Sentry (Incubating)     1.4.0-incubating

In short:

  • Hadoop Common: A set of shared libraries
  • HDFS: The Hadoop filesystem
  • MapReduce: Parallel computation framework
  • ZooKeeper: Configuration management and coordination
  • HBase: Column-oriented database on HDFS
  • Hive: Data warehouse on HDFS with SQL-like access
  • Pig: Higher-level programming language for Hadoop computations
  • Oozie: Orchestration and workflow management
  • Mahout: A library of machine learning and data mining algorithms
  • Flume: Collection and import of log and event data
  • Sqoop: Imports data from relational databases
  • The Hadoop Distributed File System, or HDFS, is often considered the foundation component for the rest of the Hadoop ecosystem. HDFS is the storage layer for Hadoop and provides the ability to store mass amounts of data while growing storage capacity and aggregate bandwidth in a linear fashion. HDFS is a logical filesystem that spans many servers, each with multiple hard drives. This is important to understand from a security perspective because a given file in HDFS can span many or all servers in the Hadoop cluster. This means that client interactions with a given file might require communication with every node in the cluster. This is made possible by a key implementation feature of HDFS that breaks up files into blocks. Each block of data for a given file can be stored on any physical drive on any node in the cluster. The important security takeaway is that all files in HDFS are broken up into blocks, and clients using HDFS will communicate over the network to all of the servers in the Hadoop cluster when reading and writing files.
    • NameNode
      The NameNode is responsible for keeping track of all the metadata related to the files in HDFS, such as filenames, block locations, file permissions, and replication. From a security perspective, it is important to know that clients of HDFS, such as those reading or writing files, always communicate with the NameNode.
    • DataNode
      The DataNode is responsible for the actual storage and retrieval of data blocks in HDFS. Clients of HDFS reading a given file are told by the NameNode which DataNode in the cluster has the block of data requested. When writing data to HDFS, clients write a block of data to a DataNode determined by the NameNode. From there, that DataNode sets up a write pipeline to other DataNodes to complete the write based on the desired replication factor.
    • JournalNode
      The JournalNode is a special type of component for HDFS. When HDFS is configured for high availability (HA), JournalNodes take over the NameNode responsibility for writing HDFS metadata information. Clusters typically have an odd number of JournalNodes (usually three or five) to ensure majority. For example, if a new file is written to HDFS, the metadata about the file is written to every JournalNode. When the majority of the JournalNodes successfully write this information, the change is considered durable.
    • HttpFS
      HttpFS is a component of HDFS that provides a proxy for clients to the NameNode and DataNodes. This proxy is a REST API and allows clients to communicate to the proxy to use HDFS without having direct connectivity to any of the other components in HDFS. HttpFS will be a key component in certain cluster architectures.
    • NFS Gateway
      The NFS gateway, as the name implies, allows for clients to use HDFS like an NFS-mounted filesystem. The NFS gateway is an actual daemon process that facilitates the NFS protocol communication between clients and the underlying HDFS cluster. Much like HttpFS, the NFS gateway sits between HDFS and clients and therefore affords a security boundary that can be useful in certain cluster architectures.
    • KMS
      The Hadoop Key Management Server, or KMS, plays an important role in HDFS transparent encryption at rest. Its purpose is to act as the intermediary between HDFS clients, the NameNode, and a key server, handling encryption operations such as decrypting data encryption keys and managing encryption zone keys.
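The block-splitting idea described for HDFS above can be sketched in a few lines of Python. This is an illustration only: the 128 MB block size, the replication factor of 3, the node names, and the round-robin placement are all assumptions made for the example, and the real NameNode placement logic is more involved.

```python
def split_into_blocks(file_size, block_size):
    """Return the sizes of the blocks a file of file_size bytes is cut into."""
    full_blocks, remainder = divmod(file_size, block_size)
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def place_blocks(num_blocks, datanodes, replication):
    """Map each block index to the DataNodes holding a replica of it.

    Simple round-robin for illustration only; it is not the NameNode's
    actual placement policy.
    """
    if replication > len(datanodes):
        raise ValueError("replication factor exceeds number of DataNodes")
    return {
        block: [datanodes[(block + r) % len(datanodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


MB = 1024 * 1024
blocks = split_into_blocks(300 * MB, 128 * MB)  # assumed file and block sizes
placement = place_blocks(len(blocks), ["dn1", "dn2", "dn3", "dn4"], replication=3)
nodes_contacted = sorted({node for nodes in placement.values() for node in nodes})
print(len(blocks), placement, nodes_contacted)
```

Even this small file ends up with replicas on every node in the toy cluster, which is the security takeaway: a client working with one file may need network access to all DataNodes.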
  • Apache YARN
    • Originally described by Apache as a redesigned resource manager, YARN is now characterized as a large-scale, distributed operating system for big data applications.
    • Other processing frameworks and applications, such as Impala and Spark, use YARN as the resource management framework. While YARN provides a more general resource management framework, MapReduce is still the canonical application that runs on it. MapReduce that runs on YARN is considered version 2, or MR2 for short.
  • Apache MapReduce
    • MapReduce is the processing counterpart to HDFS and provides the most basic mechanism to batch process data. When MapReduce is executed on top of YARN, it is often called MapReduce2, or MR2. This distinguishes the YARN-based version of MapReduce from the standalone MapReduce framework, which has been retroactively named MR1. MapReduce jobs are submitted by clients to the MapReduce framework and operate over a subset of data in HDFS, usually a specified directory. MapReduce itself is a programming paradigm that allows chunks of data, or blocks in the case of HDFS, to be processed by multiple servers in parallel, independent of one another. While a Hadoop developer needs to know the intricacies of how MapReduce works, a security architect largely does not. What a security architect needs to know is that clients submit their jobs to the MapReduce framework and from that point on, the MapReduce framework handles the distribution and execution of the client code across the cluster. Clients do not interact with any of the nodes in the cluster to make their job run. Jobs themselves require some number of tasks to be run to complete the work. Each task is started on a given node by the MapReduce framework’s scheduling algorithm.
    • A key point about MapReduce is that other Hadoop ecosystem components are frameworks and libraries on top of MapReduce, meaning that MapReduce handles the actual processing of data, but these frameworks and libraries abstract the MapReduce job execution from clients. Hive, Pig, and Sqoop are examples of components that use MapReduce in this fashion.
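The MapReduce paradigm described above can be illustrated with a toy word count in plain Python. This is a single-process sketch of the map, shuffle, and reduce steps, not Hadoop code; the sample "blocks" and function names are invented for the example.

```python
from collections import defaultdict


def map_phase(block):
    """Map step: emit a (word, 1) pair for every word in one block of text."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


# Each string stands in for one block; in Hadoop each could be mapped
# on a different node, independently of the others.
blocks = ["the quick brown fox", "the lazy dog", "the fox"]
mapped = [pair for block in blocks for pair in map_phase(block)]
counts = reduce_phase(shuffle(mapped))
print(counts)
```

Because each call to map_phase only looks at its own block, the map step is what the framework can spread across the cluster without the client talking to individual nodes.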
  • Apache Hive
    • The Apache Hive project was started by Facebook. The company saw the utility of MapReduce to process data but found limitations in adoption of the framework due to the lack of Java programming skills in its analyst communities. Most of Facebook’s analysts did have SQL skills, so the Hive project was started to serve as a SQL abstraction layer that uses MapReduce as the execution engine.
  • Cloudera Impala
    • Cloudera Impala is a massively parallel processing (MPP) framework that is purpose-built for analytic SQL. Impala reads data from HDFS and utilizes the Hive metastore for interpreting data structures and formats.
    • New users to the Hadoop ecosystem often ask what the difference is between Hive and Impala because they both offer SQL access to data in HDFS. Hive was created to allow users that are familiar with SQL to process data in HDFS without needing to know anything about MapReduce. It was designed to abstract the innards of MapReduce to make the data in HDFS more accessible. Hive is largely used for batch access and ETL work. Impala, on the other hand, was designed from the ground up to be a fast analytic processing engine to support ad hoc queries and business intelligence (BI) tools. There is utility in both Hive and Impala, and they should be treated as complementary components.
  • Apache Sentry
    • Sentry is the component that provides fine-grained role-based access controls (RBAC) to several of the other ecosystem components, such as Hive and Impala. While individual components may have their own authorization mechanism, Sentry provides a unified authorization that allows centralized policy enforcement across components. It is a critical component of Hadoop security.

    • Sentry server
      The Sentry server is a daemon process that facilitates policy lookups made by other Hadoop ecosystem components. Client components of Sentry are configured to delegate authorization decisions based on the policies put in place by Sentry.
    • Policy database
      The Sentry policy database is the location where all authorization policies are stored. The Sentry server uses the policy database to determine if a user is allowed to perform a given action. Specifically, the Sentry server looks for a matching policy that grants access to a resource for the user. In earlier versions of Sentry, the policy database was a text file that contained all of the policies.
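The idea of looking for a matching policy that grants a user access to a resource can be sketched as a role-based lookup. The data layout, role names, and function below are hypothetical and only illustrate RBAC in general; they are not Sentry's API or its policy format.

```python
# Hypothetical policy data: role -> set of (action, resource) grants.
ROLE_GRANTS = {
    "analyst": {("select", "sales.orders")},
    "admin": {("select", "sales.orders"), ("insert", "sales.orders")},
}

# Hypothetical user-to-role assignments.
USER_ROLES = {
    "alice": {"analyst"},
    "bob": {"admin"},
}


def is_allowed(user, action, resource):
    """Allow only if some role held by the user grants the action on the resource."""
    return any(
        (action, resource) in ROLE_GRANTS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


print(is_allowed("alice", "select", "sales.orders"))
print(is_allowed("alice", "insert", "sales.orders"))
```

Note the default: a user with no roles, or a request with no matching grant, is denied.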
  • Apache HBase
    • HBase is an open source, non-relational, distributed database modeled after Google’s BigTable and written in Java. It runs on top of HDFS (Hadoop Distributed Filesystem), providing BigTable-like capabilities for Hadoop. HBase features compression, in-memory operation, and Bloom filters on a per-column basis. Tables in HBase can serve as the input and output for MapReduce jobs run in Hadoop, and may be accessed through the Java API but also through REST, Avro or Thrift gateway APIs. HBase is a column-oriented key-value data store. HBase typically utilizes HDFS as the underlying storage layer for data.
  • Apache Accumulo
    • Apache Accumulo is a sorted and distributed key/value store designed to be a robust, scalable, high-performance storage and retrieval system. Like HBase, Accumulo was originally based on the Google BigTable design, but was built on top of the Apache Hadoop ecosystem of projects (in particular, HDFS, ZooKeeper, and Apache Thrift). Accumulo uses roughly the same data model as HBase.
  • Apache Solr
    • The Apache Solr project, and specifically SolrCloud, enables the search and retrieval of documents that are part of a larger collection that has been sharded across multiple physical servers. Search is one of the canonical use cases for big data and is one of the most common utilities used by anyone accessing the Internet. Solr is built on top of the Apache Lucene project, which actually handles the bulk of the indexing and search capabilities. Solr expands on these capabilities by providing enterprise search features such as faceted navigation, caching, hit highlighting, and an administration interface.
      Solr has a single component, the server. There can be many Solr servers in a single deployment, which scale out linearly through the sharding provided by SolrCloud. SolrCloud also provides replication features to accommodate failures in a distributed environment.
  • Apache Oozie
    • Apache Oozie is a workflow management and orchestration system for Hadoop. It allows for setting up workflows that contain various actions, each of which can utilize a different component in the Hadoop ecosystem. For example, an Oozie workflow could start by executing a Sqoop import to move data into HDFS, then a Pig script to transform the data, followed by a Hive script to set up metadata structures. Oozie allows for more complex workflows, such as forks and joins that allow multiple steps to be executed in parallel, and other steps that rely on multiple steps to be completed before continuing. Oozie workflows can run on a repeatable schedule based on different types of input conditions such as running at a certain time or waiting until a certain path exists in HDFS.
      Oozie consists of just a single server component, and this server is responsible for handling client workflow submissions, managing the execution of workflows, and reporting status.
  • Apache ZooKeeper
    • Apache ZooKeeper is a distributed coordination service that allows for distributed systems to store and read small amounts of data in a synchronized way. It is often used for storing common configuration information. Additionally, ZooKeeper is heavily used in the Hadoop ecosystem for synchronizing high availability (HA) services, such as NameNode HA and ResourceManager HA. ZooKeeper itself is a distributed system that relies on an odd number of servers called a ZooKeeper ensemble to reach a quorum, or majority, to acknowledge a given transaction. ZooKeeper has only one component, the ZooKeeper server.
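The reason for an odd number of servers comes down to majority arithmetic, which a short Python sketch makes concrete. The function names are invented for the example; the same arithmetic applies to the JournalNode majority mentioned earlier.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


for size in (3, 4, 5):
    # Prints ensemble size, quorum size, and tolerated failures.
    print(size, quorum_size(size), tolerated_failures(size))
```

Going from three servers to four raises the quorum without tolerating any additional failure, so the even-sized ensemble buys nothing.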
  • Apache Flume
    • Apache Flume is an event-based ingestion tool that is used primarily for ingestion into Hadoop, but can actually be used completely independent of it. Flume, as the name would imply, was initially created for the purpose of ingesting log events into HDFS. The Flume architecture consists of three main pieces: sources, sinks, and channels. A Flume source defines how data is to be read from the upstream provider. This would include things like a syslog server, a JMS queue, or even polling a Linux directory. A Flume sink defines how data should be written downstream. Common Flume sinks include an HDFS sink and an HBase sink. Lastly, a Flume channel defines how data is stored between the source and sink. The two primary Flume channels are the memory channel and file channel. The memory channel affords speed at the cost of reliability, and the file channel provides reliability at the cost of speed. Flume consists of a single component, a Flume agent. Agents contain the code for sources, sinks, and channels. An important part of the Flume architecture is that Flume agents can be connected to each other, where the sink of one agent connects to the source of another.
  • Apache Sqoop
    Apache Sqoop provides the ability to do batch imports and exports of data to and from a traditional RDBMS, as well as other data sources such as FTP servers. Sqoop itself submits map-only MapReduce jobs that launch tasks to interact with the RDBMS in a parallel fashion. Sqoop is used both as an easy mechanism to initially seed a Hadoop cluster with data, as well as a tool used for regular ingestion and extraction routines. Sqoop1 is a set of client libraries that are invoked from the command line using the sqoop binary.
  • Cloudera Hue
    • Cloudera Hue is a web application that exposes many of the Hadoop ecosystem components in a user-friendly way. Hue allows for easy access into the Hadoop cluster without requiring users to be familiar with Linux or the various command-line interfaces the components have. Hue has a number of different security controls available. Hue is comprised of the following components:
    • Hue server
      This is the main component of Hue. It is effectively a web server that serves web content to users. Users are authenticated at first logon and from there, actions performed by the end user are actually done by Hue itself on behalf of the user. This concept is known as impersonation.
    • Kerberos Ticket Renewer
      As the name implies, this component is responsible for periodically renewing the Kerberos ticket-granting ticket (TGT), which Hue uses to interact with the Hadoop cluster when the cluster has Kerberos enabled.


  • Zed Attack Proxy (ZAP) is a web application penetration testing tool
  • It can be used as a framework for automated security tests
  • It’s a cross-platform tool and can be used on UNIX, Windows or Mac OS
  • ZAP is an intercepting proxy
  • It provides both active and passive scanners: the passive scanner just examines requests and responses, while the active scanner performs a wide range of attacks
  • It has excellent report generation
  • ZAP can also find hidden directories and files using its Brute Force component (based on OWASP DirBuster code)
  • It can also fuzz parameters, including with fuzzing libraries (fuzzdb and OWASP JBroFuzz)
  • ZAP has the following additional features:
    • Auto tagging: this feature tags messages so that you can easily see which messages have hidden fields
    • Port scanner, so you can see which ports are open on a computer
    • Parameter analysis: it analyzes all requests and shows you a summary of all the parameters that the application uses
    • Smart card support: very useful if an application you are testing uses smart cards or tokens for authentication
    • Session comparison
    • Invoke external applications
    • API + Headless mode
    • Dynamic SSL certificates, which allow ZAP to intercept HTTPS traffic
    • Anti CSRF token handling
  • During initial installation ZAP offers to create an SSL root CA certificate, which allows the proxy to intercept all HTTPS traffic. You will need it if you are planning to test any application that uses HTTPS. The steps are the following:
    • Generate SSL certificate
    • Save it
    • Import it to your browser
  • Don’t forget to amend Connection Settings in your browser and specify ZAP as your HTTP proxy
  • After a successful installation you can perform a basic penetration test
  • A basic penetration test
    • Configure your browser to use ZAP as a proxy
    • Explore the application manually
    • Use the Spider to find hidden content
    • See what issues the Passive Scanner has found
    • Use the Active Scanner to find vulnerabilities
    • Review all vulnerabilities that were found during Active Scanning
  • ZAP can be used for completely automated security tests in conjunction with Apache Ant and the Selenium framework
  • ZAP has three modes: Safe mode doesn’t allow you to do anything potentially dangerous, Protected mode allows you to do potentially dangerous things only on items in scope, and Standard mode allows you to do dangerous things on anything
  • ZAP can keep track of all HTTP sessions and allows you to switch between them
  • WebSockets are very popular nowadays, and ZAP currently has some of the best support for them
  • Password encryption
    • Never store passwords in plain text
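    • Ideal is one-way hashing, so the stored value cannot be turned back into the password
    • Don’t use MD5 or SHA-1 anymore; fast hashes such as SHA-2 (SHA-256, SHA-512), Whirlpool and Tiger are stronger but still quick to brute force, and AES and Blowfish on their own are reversible ciphers rather than hashes
    • The best is a Blowfish-based password hash (bcrypt): it’s secure, free, easy, and slow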
  • Salting passwords
    • A salt is additional data added to the password before hashing; the main purpose of salts is to defend against dictionary and precomputed (rainbow table) attacks
    • Salts should be unique to each user
    • Salts can be created from a pseudo-random string (a cryptographically secure random generator is safer than time functions); in this case salts need to be saved in the database, and salts can be hashed as well
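A minimal sketch of salted, slow password hashing, using only the Python standard library. Note the swap: it uses PBKDF2 as a stand-in because bcrypt is not in the standard library. The iteration count, salt length, and function names are assumptions for the example, not recommendations taken from these notes.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; higher is slower for attackers too
SALT_BYTES = 16       # assumed salt length


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is created when none is given."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)  # unique per user, stored next to the hash
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash the candidate with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

Because each user gets a different salt, two users with the same password end up with different stored digests, so one precomputed table cannot be reused across accounts.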
  • Password requirements
    • Require a minimum length, but do not cap the maximum length
    • Require non-alphanumeric characters
    • Ask user to confirm password
    • Report password strength to user
    • Do not record password hint
    • Security questions may be vulnerable to attacks: internet research could reveal the answers, and the user’s friends or family members might know them
  • Brute force attacks
    • Hacker tries all possible passwords one after another until the correct one is found
    • To strengthen passwords, allow all characters and long strings
    • Enforce a clipping level (a cap on failed attempts), use slow password hashing algorithms, and add timing delays and throttling
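The clipping level and throttling ideas can be sketched as a small in-memory lockout tracker. The class name, the thresholds, and the choice to pass the current time in explicitly are all assumptions made for the example; a real application would persist this state and likely throttle by IP address as well.

```python
class LoginThrottle:
    """Lock an account after too many consecutive failed logins."""

    def __init__(self, max_attempts=5, lockout_seconds=300):
        self.max_attempts = max_attempts
        self.lockout_seconds = lockout_seconds
        self._failures = {}  # username -> (failure count, time of last failure)

    def is_locked(self, username, now):
        """True while the clipping level is reached and the lockout has not expired."""
        count, last = self._failures.get(username, (0, 0.0))
        return count >= self.max_attempts and now - last < self.lockout_seconds

    def record_failure(self, username, now):
        """Count a failed login, restarting the count after an expired lockout."""
        count, last = self._failures.get(username, (0, 0.0))
        if count >= self.max_attempts and now - last >= self.lockout_seconds:
            count = 0  # the previous lockout has expired, start counting again
        self._failures[username] = (count + 1, now)

    def record_success(self, username):
        """A successful login clears the failure history."""
        self._failures.pop(username, None)


# Usage: pass a clock reading such as time.monotonic() as `now`.
throttle = LoginThrottle(max_attempts=3, lockout_seconds=60)
for second in range(3):
    throttle.record_failure("alice", now=float(second))
print(throttle.is_locked("alice", now=3.0))
```

Taking `now` as a parameter keeps the sketch easy to test without waiting for real time to pass.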
  • SSL
    • Provides communication security
    • Verifies authenticity of remote server
    • Encrypts all data exchanged with server
    • Prevents snooping, session hijacking
    • Requires all assets on a webpage such as JavaScript, CSS, images to be secure
    • Use SSL to encrypt all credit card transactions and all usernames/passwords sent to the server
  • Protecting Cookies
  • Regulating Access Privilege
    • Least privileges
    • Make privileges easy to revoke
    • Restrict access to access privilege administration tools
    • Divide restricted actions into “privilege areas”
    • Regulate access by user access level or category
  • Handling Forgotten Passwords
    • Ask about privileged information
    • Ask security challenge questions
    • Since a person’s email address serves as their identity, we can send an email with a reset token
  • Multi factor authentication
    • Authentication requires two or more factors
    • Something only the user knows, something only the user has, something only the user is
  • Cross-Site Scripting (“XSS”)
    • Hackers can inject JavaScript into a web page
    • Used to steal cookies and session data
    • Often very successful simply because the browser trusts JavaScript
      • Protection:
        • Sanitize any dynamic content that gets output to the browser (HTML, JavaScript, JSON, XML…)
        • Pay special attention to data that comes directly from URLs or forms
        • Be careful about database data, cookies, session data
        • Use Whitelists to allow certain HTML tags and sanitize everything else
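A minimal sketch of escaping dynamic content before it reaches the browser, using the standard library's `html.escape`. The `render_comment` function and its markup are hypothetical, made up for this example.

```python
import html


def render_comment(author: str, comment: str) -> str:
    """Build an HTML fragment from untrusted values, escaping both first."""
    safe_author = html.escape(author, quote=True)
    safe_comment = html.escape(comment, quote=True)
    return f'<p class="comment"><b>{safe_author}</b>: {safe_comment}</p>'


# The injected tag is output as inert text instead of being executed.
print(render_comment("mallory", "<script>alert(1)</script>"))
```

`html.escape` only covers HTML text and attribute contexts; data placed inside JavaScript, JSON, or URLs needs the encoder for that context.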
    • Cross-Site Request Forgery (CSRF)
      • A hacker tricks a user into making a request to your server
      • Used for fraudulent clicks
      • Used for forging login requests
        • Protection
          • Accept POST request only
          • Use a “form token” stored in the user’s session
          • Add a hidden field to forms with the form token as its value
          • Compare the session form token and the submitted form token
          • Store the token generation time in the user’s session so that old tokens can be expired
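The form token steps above might look like the following, assuming the session is a plain dictionary. The function names, the session keys, and `TOKEN_TTL_SECONDS` are assumptions for illustration; a web framework's built-in CSRF protection would normally be preferred.

```python
import hmac
import secrets
import time

TOKEN_TTL_SECONDS = 3600  # assumed token lifetime


def issue_form_token(session: dict, now=None) -> str:
    """Create a token, remember it and its creation time in the session."""
    token = secrets.token_urlsafe(32)
    session["form_token"] = token
    session["form_token_time"] = time.time() if now is None else now
    return token  # goes into a hidden form field


def is_valid_form_token(session: dict, submitted: str, now=None) -> bool:
    """Compare the submitted token with the session token and check its age."""
    expected = session.get("form_token")
    issued = session.get("form_token_time")
    if not expected or issued is None or not submitted:
        return False
    current = time.time() if now is None else now
    if current - issued > TOKEN_TTL_SECONDS:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), submitted.encode("utf-8"))
```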
    • SQL Injection
      • The hacker is able to execute arbitrary SQL in order to probe the database schema, steal data (usernames, passwords, credit cards, encrypted data), assign elevated privileges, and truncate or drop tables
        • Protection
          • Give the application’s database user limited privileges
          • Sanitize input
          • Escape for SQL using libraries
          • Use prepared statements
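To illustrate prepared (parameterized) statements, here is a small sketch using the standard `sqlite3` module with an in-memory database. The `users` table and `find_user` function are made up for the example.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str):
    """Look up a user; the driver binds the value, so it is never parsed as SQL."""
    cursor = conn.execute(
        "SELECT id, username FROM users WHERE username = ?",
        (username,),
    )
    return cursor.fetchone()


# Throwaway in-memory database for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO users (username) VALUES (?)", ("alice",))

print(find_user(conn, "alice"))
print(find_user(conn, "' OR '1'='1"))  # treated as a literal string, matches nothing
```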
    • URL manipulation
      • Editing the URL string to probe the site
      • Can be used to reveal private information or perform restricted actions
        • Protection
          • Remember that URLs are exposed and easily editable
          • Implement proper access control
          • Keep error messages vague
          • Be clear about your GET and POST requests; only POST requests should be used for making changes
    • Cookie Stealing
      • Cookie data is visible to users
      • Cookies can be stolen using XSS attack
      • Remember that cookies can be sniffed from network traffic with packet analyzers (Wireshark is the most popular)
        • Protection
          • Put only non-sensitive data in cookies
          • Use HttpOnly cookies
          • Use Secure (HTTPS-only) cookies
          • Set a cookie expiration date
          • Set the cookie domain, subdomain, and path
          • Encrypt cookie data
          • Use server-side sessions instead of client-side cookies
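Several of these cookie protections are attributes on the `Set-Cookie` header. The sketch below builds one with the standard `http.cookies` module; the cookie name, the 30-minute lifetime, and the `build_session_cookie` helper are assumptions for illustration.

```python
from http.cookies import SimpleCookie


def build_session_cookie(session_id: str) -> str:
    """Return the value of a Set-Cookie header with protective attributes."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    morsel = cookie["session_id"]
    morsel["httponly"] = True  # hidden from JavaScript, limits theft via XSS
    morsel["secure"] = True    # only sent over HTTPS
    morsel["max-age"] = 1800   # assumed 30-minute lifetime
    morsel["path"] = "/"       # restrict to the paths that need it
    return morsel.OutputString()


print(build_session_cookie("abc123"))
```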
    • Session hijacking
      • Stealing a session ID is similar to stealing a cookie, but much more valuable
      • Can be used to steal personal info, passwords
      • Often done by network sniffing
      • Never use open wireless networks at coffee shops for transmitting sensitive data
      • A variation of session hijacking is session fixation
      • Session fixation is the opposite of session hijacking: it tricks a user into using a hacker-provided session identifier
        • Protection
          • Use SSL
          • Save the user agent in the session and confirm it (not an ideal method)
          • Check the IP address of the computer making the request (not ideal either)
          • Use HttpOnly cookies
          • Regenerate the session ID periodically and at key points; it is especially important to regenerate it after login
          • Expire and remove old session files regularly, and keep track of the last activity in each session
          • Do not accept session identifiers from GET or POST variables; the session identifier should come from only one place: cookies
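Regenerating the session ID at login is the main defense against session fixation. A minimal sketch, assuming sessions live in a dictionary keyed by session ID (the `store` layout and both function names are hypothetical):

```python
import secrets


def regenerate_session_id(store: dict, old_id: str) -> str:
    """Move session data to a fresh, unpredictable ID and retire the old one."""
    data = store.pop(old_id, {})
    new_id = secrets.token_urlsafe(32)
    store[new_id] = data
    return new_id


def log_in(store: dict, session_id: str, user_id: int) -> str:
    """Switch to a new session ID before marking the session as authenticated."""
    new_id = regenerate_session_id(store, session_id)
    store[new_id]["user_id"] = user_id
    return new_id  # send this back to the browser in the session cookie
```

An ID planted by an attacker before login stops working as soon as the victim authenticates, because the old key no longer exists in the store.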
    • Remote system execution attack
      • The most dangerous attack: a hacker remotely runs operating system commands on a web server
        • Protection
          • Avoid system execution keywords (they are language specific)
          • Perform system execution with extra caution
          • Sanitize any dynamic data carefully
          • Understand system commands and their syntax
          • Add additional data validation
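When a system command cannot be avoided, one cautious pattern is to pass arguments as a list so that no shell ever interprets the user's input. The sketch below uses `subprocess.run` and, purely for the sake of a portable example, runs the Python interpreter as the external program; `run_echo` is a hypothetical helper.

```python
import subprocess
import sys


def run_echo(user_text: str) -> str:
    """Pass untrusted text as a single argument; no shell is involved."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_text],
        capture_output=True,
        text=True,
        check=True,
        timeout=10,
    )
    return result.stdout.rstrip("\n")


# Shell metacharacters arrive as plain text rather than being executed.
print(run_echo("hello; rm -rf /"))
```

With `shell=True` and string concatenation, the same input would have been handed to a shell for interpretation.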
    • File upload abuse
      • Can be used to upload too much data (quantity, file size)
      • Can be used to upload a worm or virus
        • Protection
          • Require user authentication, no anonymous uploads
          • Limit maximum upload size
          • Limit allowable file formats, file extensions
          • Use caution when opening uploaded files
          • Do not host uploaded files that have not been verified
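The first three upload protections can be expressed as a simple pre-check. The allowed extensions, the size limit, and the `check_upload` function are assumptions for illustration; checking the extension alone does not prove what a file really contains.

```python
import os

ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".pdf"}  # assumed allow-list
MAX_UPLOAD_BYTES = 5 * 1024 * 1024                      # assumed 5 MiB limit


def check_upload(filename: str, size_bytes: int, authenticated: bool) -> list[str]:
    """Return a list of problems; an empty list means the upload may proceed."""
    problems = []
    if not authenticated:
        problems.append("authentication required")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file too large")
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append("file type not allowed")
    return problems
```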
    • Denial of Service (DoS) attack
      • Attempt to make a server unavailable to users
      • Usually performed by overloading a server with requests
      • Includes DNS and routing disruption
      • If performed by a distributed network of computers, it is called DDoS
        • Protection
          • Properly configure firewalls, IDS, switches, load balancers and routers
          • Use a collection of reverse proxies
          • Map your infrastructure
          • Keep infrastructure up to date
          • Make network traffic visible
          • Develop a disaster recovery plan (DRP)
          • Consider changing IP address
          • “Black hole” or “null route” traffic
  • Regulate Request Method
    • Make sure that your application accepts only the request methods that you expect (for example, GET requests for URLs and links; POST requests for forms) and ignores all others
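One way to regulate request methods is an allow-list per route, with everything else rejected by default. The routes and the `method_allowed` helper below are hypothetical.

```python
# Assumed routes for illustration: reads use GET, changes use POST.
ALLOWED_METHODS = {
    "/search": {"GET"},
    "/account/update": {"POST"},
}


def method_allowed(path: str, method: str) -> bool:
    """Accept only the methods expected for this path; unknown paths allow nothing."""
    return method.upper() in ALLOWED_METHODS.get(path, set())
```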
  • Validating Input
    • Is the input acceptable?
    • Determine data expectations (preventing bugs, as well as hacks)
    • Consider application and database requirements
    • Regulate the data inputs to your application and only allow expected data
    • Set good default values, default should prevail
  • Common Validations
    • Presence of data
    • Length of data
    • Type of data
    • Format of the data
    • Uniqueness
    • Double check validation logic
    • Search the web for “logical pitfalls” in your programming language
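The common validations listed above (presence, length, type, format, uniqueness) can be combined in one function. The username rules here are assumptions chosen for the example, not requirements from any particular system.

```python
import re

# Assumed format rule: lowercase letters, digits, and underscores only.
USERNAME_PATTERN = re.compile(r"[a-z0-9_]+")


def validate_username(value, existing: set) -> list[str]:
    """Return a list of validation errors; an empty list means the value is acceptable."""
    if not isinstance(value, str):           # type of data
        return ["must be a string"]
    if value == "":                          # presence of data
        return ["is required"]
    errors = []
    if not 3 <= len(value) <= 20:            # length of data
        errors.append("must be 3 to 20 characters")
    if not USERNAME_PATTERN.fullmatch(value):  # format of data
        errors.append("may contain only a-z, 0-9 and _")
    if value in existing:                    # uniqueness
        errors.append("is already taken")
    return errors
```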
  • Sanitizing data
    • It’s the most important step that can be taken toward a more secure web server
    • To neutralize the threat, use type casting rather than type juggling; this way you maintain control over the process
    • Sanitize any SQL, HTML, JavaScript, XML, and in general any data that you receive; all powerful characters should be sanitized, and which characters are powerful depends on the programming language you are using
    • Add escape characters before powerful characters
    • Do not write custom sanitization methods, use well tested, language specific functions instead
    • Do not remove or correct invalid data, stick to encoding and escaping
    • Consider where the data goes
    • Consider where the data might go later
    • Sanitize early and keep sanitizing
  • Labeling Data
    • Use names to identify the condition of data (for example dirty, raw, unsafe…); once data is sanitized, variable names can be changed to “clean”, “filtered”, “safe”
  • Keep Code Private
    • Be sure that library directories are not publicly accessible through the web server
    • The web server should be configured properly: set the document root, allow/deny access for all directories/files, and so on
  • Keep Credentials That Your Code Uses Private
    • Plain text credentials are dangerous
    • Keep them separate from code
    • Keep credential files out of version control
    • Have as few copies of passwords as necessary
    • Don’t reuse passwords; passwords should be unique for each computer, database, environment
    • Hash passwords whenever possible; public key cryptography is an excellent choice
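One common way to keep credentials separate from code is to read them from the environment at startup. This is a sketch under assumptions: the variable names `DB_USER` and `DB_PASSWORD` and the `load_db_credentials` helper are made up for the example.

```python
import os


def load_db_credentials(env=None) -> tuple[str, str]:
    """Read database credentials from the environment instead of hard-coding them."""
    env = os.environ if env is None else env
    missing = [name for name in ("DB_USER", "DB_PASSWORD") if name not in env]
    if missing:
        # Name the missing variables, but never echo any credential values.
        raise RuntimeError("missing credentials: " + ", ".join(missing))
    return env["DB_USER"], env["DB_PASSWORD"]
```

The values themselves would be set by the deployment environment or a file that is excluded from version control.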
  • Keep Error Messages Vague
    • Turn off detailed error reporting on production servers
    • Return only generic error pages
    • Configure the web server to use the same error pages
  • Smart Logging
    • Errors
    • Sensitive actions
    • Possible attacks
    • Data worth logging:
      • Date and Time
      • Source (user, IP)
      • Action
      • Target
      • Cookie
      • Session
      • URL and all parameters
      • Backtrace
    • Review logs routinely
    • Don’t log sensitive data such as passwords; beware of POST parameters and database queries
    • Filter out passwords, keys, tokens from logging
    • Keep old content, so it can be easily restored
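Filtering secrets out of logged parameters could look like the sketch below, built on the standard `logging` module. The key names in `SENSITIVE_KEYS`, the logger name, and both helper functions are assumptions for illustration.

```python
import logging

# Assumed parameter names that must never reach the log.
SENSITIVE_KEYS = {"password", "passwd", "token", "api_key", "secret"}

logger = logging.getLogger("audit")


def redact(params: dict) -> dict:
    """Return a copy of params with sensitive values masked."""
    return {
        key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
        for key, value in params.items()
    }


def log_action(user: str, ip: str, action: str, params: dict) -> None:
    """Record who did what and from where, with secrets filtered out."""
    logger.info("user=%s ip=%s action=%s params=%s", user, ip, action, redact(params))
```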
  • Least Privilege: “Every program and every privileged user of the system should operate using the least amount of privilege necessary to complete the job.” (Jerome Saltzer)
  • Least Privilege Benefits:
    • Code stability
    • Controlled data access
    • System security
    • Vulnerabilities are limited and localized
    • Easier to test actions and interactions
  • Simple Is More Secure
    • Use clearly named functions and variables
    • Write code comments
    • Break up long sections of code into small, more manageable functions
    • Don’t repeat yourself
    • Legacy code is a security concern
    • Try to use built-in functions whenever possible
    • Disable all unused features when possible
  • Never Trust Your Users
    • People are prone to mistakes
    • Don’t trust even admins
    • Identity can be stolen
    • Use caution with contractors
    • Establish a process that allows user access to be revoked instantly
    • Remember that hacks happen offline as well (phone, printouts…)
  • Defense In Depth
    • You should have a number of layers of defense
    • Over time attacks lose momentum
    • Redundant Security
      • People (security policy, best practices implementation…)
      • Technology (IDS, SIEM, system administration, encryption, access controls…)
      • Operations (periodic security reviews, data handling procedures, threat handling…)
  • Security Through Obscurity
    • More info benefits hackers
    • Limit exposed information
    • Limit feedback
    • Obscurity doesn’t mean misdirection
  • Whitelisting Is Much More Secure Than Blacklisting
    • Whitelisting means restricting by default, which is a much more secure approach
  • Map Exposure Points
    • Incoming Exposure Points
      • URLs
      • Forms
      • Cookies
      • Sessions
      • Database reads
      • Public APIs
    • Outgoing Exposure Points
      • HTML
      • JavaScript/JSON/XML/RSS
      • Cookies
      • Sessions
      • Database writes
      • Third-party APIs
  • Map Data Passageways
    • What paths does data take?
    • Know your site topography and your environment architectural landscape
    • Ideally you should have a graphical representation of all of your access points
  • Database service-level access controls
    • Scalable Authentication: Database clusters consist of a large number of nodes, and the authentication model should scale to support authentication across such a large network
    • Impersonation: Database services should be able to impersonate the user submitting the job so that the correct user isolation can be maintained
    • Self-Served: Database jobs run for a long time, so they should be able to renew (self-heal) the delegated user authentication needed to complete the job
    • Secure Inter-Process Communication: Database services should be able to authenticate each other and ensure secure communication between themselves
  • User-level access controls
    • Users of the database should only be able to access data they are authorized for
    • Only authenticated users should be able to submit jobs to the database cluster
    • Users should be able to view, edit and kill only their own jobs