Monday, March 17, 2008

WIL, Q1 2008

Programming
Life with qmail
  • recordio: I noticed that if you run out of disk space in the location where recordio stores stuff, things might silently stop working. Since recordio doesn't run as root (for all the good reasons in the world), df (that you're then most likely running as root *grin*) will still report that there's a bit of space left on the device. But you'll certainly notice that lsof -i will report a bunch of recordio processes with connections in CLOSE_WAIT state...
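One likely explanation for the mismatch (an assumption on my part) is the set of blocks many filesystems reserve for root: the "free" and "available to ordinary users" figures differ. A minimal sketch, assuming a POSIX system, that reads both numbers with os.statvfs; the function name free_space is made up for this example:

```python
import os


def free_space(path="/"):
    """Return (bytes free including root-reserved blocks, bytes usable by non-root)."""
    st = os.statvfs(path)
    root_free = st.f_bfree * st.f_frsize   # all free blocks, reserved ones included
    user_free = st.f_bavail * st.f_frsize  # free blocks an unprivileged process can use
    return root_free, user_free
```

Calling free_space() on the directory where recordio writes, and alerting when the second number approaches zero, would catch the problem before the CLOSE_WAIT connections pile up.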
Misc.
  • How to map an IP address to an ASN? Take a look at the Webalizer project and download the source code. When you build it, you end up with a utility called asndb-reader that lets you query a database that maps IPs to ASNs. Sweet. Of course, for a fee, you can buy fully qualified databases from vendors. They will map IPs to ASNs and also tell you what type of business is behind it (for instance, whether it's an ISP or not.)
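Conceptually, the lookup is a longest-prefix match of the address against a table of announced prefixes. Here's a minimal sketch of that idea; the table below is hypothetical (documentation prefixes and documentation ASNs), and this is not how asndb-reader itself is implemented:

```python
import ipaddress

# Hypothetical prefix-to-ASN table (documentation prefixes, documentation ASNs).
PREFIX_TO_ASN = {
    "192.0.2.0/24": 64500,
    "198.51.100.0/24": 64501,
    "198.51.100.128/25": 64502,
}


def lookup_asn(ip, table=PREFIX_TO_ASN):
    """Return the ASN of the most specific prefix containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    best = None  # (network, asn) of the longest matching prefix seen so far
    for prefix, asn in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, asn)
    return best[1] if best else None
```

A real database holds hundreds of thousands of prefixes, so a production lookup would use a trie or a sorted index rather than this linear scan.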
Security
  • Don't try to see malice when idiocy is just a better explanation -- or, how appalling the finance industry best security practices really are
  • I was reading the other day about an interesting way of gathering intelligence on an entity with a discreet (as in, for instance, not exactly known to DNS) web presence: create a page with a set of URLs pointing at these IPs, wait for search engines to index them and then craft custom queries to start searching for what you're looking for.
  • Fast-flux service networks (FFSN) have been around for a little while now, but it's worth pointing to this interesting, recently published paper on the detection and measurement of FFSN.
Web
  • Someone pointed me at this app that will help you graphically visualize a website.

Wednesday, January 02, 2008

WIL, Q4 2007

Security:
  • CSRF: Cross Site Request Forgery. That's how Gmail got hacked -- some evil site you were visiting at the same time you had Gmail running would send a request to Gmail to, for instance, add a filter to CC your email to some other account. Bad. (It got fixed, but apparently it's been coming back.)
  • I learned a bit about jail, a BSD extension that implements system partitioning: what chroot does to directories and sub-directories, jail does to system resources: file system (chroot style restriction, device creation restriction, etc...), processes (seeing other processes, sending signals, etc...) and network (locked IP address per jail, no spoofing, etc...) Very useful for ISPs that give shell access to their customers.
  • Taint checking is possible with perl (see the -T option) and will check the use of tainted variables in function calls. A tainted variable is a variable that contains data that comes from the outside world. For instance, if variable $a gets its value from STDIN, it will be considered tainted. And if $b is assigned the content of $a, then $b will be tainted too. When taint checking is on, an attempt to use a tainted variable to affect the state of the system outside of the program itself will result in die being called. Regular expression memory references (the captured groups) aren't considered tainted, which is a way to produce non-tainted content from tainted content -- the idea is to force the programmer to check variables before using them (i.e. before affecting the state of the system outside of the program itself.) There's more here.
  • SMACK: as people grew tired of the SELinux administration headache, the kernel developers came up with SMACK, which works by attaching permissions to do things to running processes. While we're on the topic of SELinux and friends, don't forget AppArmor: Unix/Linux Discretionary Access Control is supplemented by Mandatory Access Control -- as the documentation says "AppArmor enforces the idea of least privilege for programs, that is, granting programs only the privileges they need to do their job and nothing else.", something reminiscent of the qmail security model.
  • Team Cymru -- a recommended read.
  • This is the paper I was looking for back in early 2006. A pity I didn't discover it then. It's now dating a bit (and maybe will be updated) but some predictions can now be checked:
    • Mobile malware will rise as mobiles are used to conduct monetary transactions. Well, it's been said already that 2008 will be the year of the iPhone hacking.
    • Open source to improve code: Yes, this is already the case.
    • Worms stealing files (pdf, doc, design files, etc...) or private keys: Gasp -- so far, personal information is more interesting. It might be because stealing other secrets yields an enormous amount of data that needs to be categorized before it can be resold, if you find the market for it.

  • Pssst, are you looking for hashes?
  • rawpacket.org has a live CD project called Hex that features a bunch of utilities on a bootable Linux image to turn a PC into a packet analyzer. The list of packages is impressive, among them, this cool utility called EtherApe that maps activities (per protocol) between systems...


Email Security:
  • Email anti-forgery measures such as SPF, first implemented (i.e. authenticating bits issued) by major mail providers, are being extended to end users: if you want to send email to Hotmail, you should have SPF enabled on your side too. While you're at it, make sure to take a look at DKIM: where SPF lets you verify that the originator of the message hasn't been forged, DKIM allows you to validate the identity associated with the message -- both are complementary in their strengths and weaknesses.
  • If a message is sent to a user that usually receives a lot of SPAM, the chances are that the message is SPAM: that's the base for Abaca's recipient's reputation SPAM fighting approach -- you'll find more details in this white paper. It's been picked up by every single blog even distantly related to the matter of message classification and I'm anxiously waiting for more detailed efficiency results.
  • For quite some time now, Unicode-obfuscated SPAM has been trying to find its way around SPAM categorization systems by replacing characters with similar-looking characters found in the vast Unicode character set. It works better for emails written using HTML where, for instance, you spell `Viagra' as `Vⅰaɡrа'. There have been a lot of papers written on the subject and there are definition files (in the megabyte range) using the Kernel Density Estimation method to compute lists of similar characters -- you can use a threshold to choose how similar your character is going to be (I found that 0.05 gives me, for Latin characters, something close enough to fool me, but note that this threshold can be a much larger number for Sino-Tibetan languages. The `Viagra' example above was generated with a 0.05 threshold on character similarity -- a very selective one -- and a .9 probability of character substitution, to get my point across.) Note that this not only applies to messages but also to domain names (since IDN -- Internationalized Domain Names using Unicode -- now exist.) All languages are affected, some more than others.
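The substitution itself is easy to sketch. The example below is a toy under stated assumptions: the look-alike table is a handful of pairs I picked by hand (not one of the KDE-derived definition files, and with no similarity threshold), and the names obfuscate and normalize are made up. It replaces each character that has a look-alike with a given probability, and shows the reverse mapping a filter could apply before matching:

```python
import random

# Tiny hand-picked look-alike table (assumption); real lists are far larger.
HOMOGLYPHS = {
    "i": "\u2170",  # SMALL ROMAN NUMERAL ONE
    "g": "\u0261",  # LATIN SMALL LETTER SCRIPT G
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
}
REVERSE = {fake: real for real, fake in HOMOGLYPHS.items()}


def obfuscate(text, probability=0.9, rng=None):
    """Replace characters that have a look-alike, each with the given probability."""
    rng = rng or random.Random()
    return "".join(
        HOMOGLYPHS[ch] if ch in HOMOGLYPHS and rng.random() < probability else ch
        for ch in text
    )


def normalize(text):
    """Fold the known look-alikes back to ASCII before pattern matching."""
    return "".join(REVERSE.get(ch, ch) for ch in text)
```

With probability=1.0 every character that has an entry in the table is replaced; normalize() only undoes substitutions it knows about, which is exactly why the real definition files run into the megabyte range.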


Testing:
  • Why not use a reputation based system to improve the quality of your testing?


Misc:
  • I've been reading about all sorts of ways to hide files. For instance, this presentation explains how one can hide files in fake bad blocks, in the swap or in unallocated space between partitions. Of course, the idea is to hide the existence of a file from common tools, but encryption is necessary otherwise low level scanners will still be able to show that some data on the disk obviously belongs to a file.
  • Ha -- I didn't know about the watch command. Combine it with lsof -i and you've got a nice utility in one line of script: watch -n 2 -d 'lsof -i'


Performance:
  • I stumbled across yet another tool to collect performance data. This one is called collectl (Collect for Linux -- Collect apparently was written for Tru64.) As a bonus, it's written in Perl.
  • Enter SystemTap, which lets you write scripts to monitor system activities at the kernel level. The main advantage is that the tool is wrapped in a way that there's only one command that does everything to get you started. Take a look at the tutorial and you'll be hooked.

Monday, November 05, 2007

Storm bot

It's hard not to get worried about the storm bot these days. In case you've been living under a rock, here are a few headlines about the beast:
  • Rub it the wrong way and it will DDoS you (there have been reports of a couple of intrusive packets and queries and you're out. The witnessed bandwidth figures have been pretty worrisome -- something in the vicinity of 5 Gbits/s.)
  • Convenient packaging, shared, improved upon by several individuals and even localized
  • Use of fast-flux networks
  • Ability to assess a campaign's effectiveness before it starts (one will actually take a close look at the targeted countries and operating systems. Windows 98 -- I wouldn't have guessed so many people are still using it.)
  • Partitioning of communications using a 40-bit key system
  • Emergence of the managed spamming appliance service, what should I call it -- "concept"?
I guess we'll just have to see who's going to hire its services -- these past few weeks, the words from the people tracking the Storm botnet's behavior have been "Precursor to..." or "Something new is upon us". Given that previous DDoS campaigns have been able to severely disrupt normal Internet operation in a small country, it's hard not to feel that a massive gun for hire is being pointed at something -- I can't help picturing it as WWI's infamous Paris Gun, don't ask me why.

Update: here's a link to a nice presentation.

Sunday, September 30, 2007

WIL, Q3 2007

Administrivia:
  • Apparently, if you're a French citizen, you can obtain, for free, your criminal records from the web. Of course, it is punishable by law to obtain the criminal records of anyone else but you, but there isn't really any verification that is done when you're using the online forms.

Networking:
  • (j)whois configuration files get quickly outdated and should be refreshed every now and then. The file /etc/jwhois.conf has a section on where to go to obtain a fresh copy. This is important: an outdated configuration file can make whois return inaccurate answers.
  • When handling UDP traffic, your code needs to track the source port of incoming datagrams to be able to reply to that same port. I guess one doesn't write UDP socket code too often and these things are easy to forget.
  • mDNS stands for Multicast DNS -- getting some DNS support on a local network without existing/configured DNS servers. It multicasts UDP packets to port 5353 and makes use of a new .local. TLD.
  • TCP/IP tarpitting: ipfilter has a tar-pit option that shrinks connection windows to a very small size and doesn't reply to ACKs. This will force the attacking system to (1) send small amounts of data and (2) increasingly retry to do so. Tarpitting is used against TCP/IP attacks -- it's better than a simple DROP that transforms a TCP flood into a SYN flood (as the connection is retried right after being denied.)

    If you want to still handle the last case gracefully, it takes a dedicated IP stack that will allocate small preconnection objects until the final SYN,ACK shows up. Incremental blacklisting can also be used and resource consumption measurement can help with stealth attacks, where traffic volume ramps up without tripping the IDSs.

  • Here's an interesting article that explains how ARP packets and networking stack idiosyncrasies can be used to detect equipment whose NICs have been set in promiscuous mode (and which might be sniffing your network traffic.)
  • Since IPv6 issues made it really hard to implement a resolver as a library, bind9 provides a lightweight DNS resolver library and daemon (lwresd). The daemon forwards queries to a real DNS server and caches results. The lwresd library talks to the daemon through a much simpler UDP protocol on port 921.
  • FEC is a Forwarding Equivalence Class. In an MPLS environment, a FEC is a group of IP packets that, because of shared characteristics, are forwarded the same way using the same MPLS label. Destination IP address or QoS can be used to define a FEC.
  • Handling DDoS: managed security service offerings can provide scrubbed pipes, where the traffic has been cleaned up. BGP Real Time Black Holing can help, or even better: BGP can be used to divert the traffic and scrub it before forwarding it. Ingress filtering (where no spoofed packets are allowed) is also used: spoofed packets are detected through uRPF -- unicast Reverse Path Forwarding: if the packet's source IP can't be reached through the interface it came from, it's considered spoofed and will be dropped. There are a lot of interesting papers and presentations on the subject.
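The uRPF test described in the last item boils down to one comparison: look up the route back to the packet's source and check that it leaves through the interface the packet arrived on. A minimal sketch of the strict variant, with a hypothetical routing table and made-up interface names:

```python
import ipaddress

# Hypothetical routing table: (prefix, outgoing interface).
ROUTES = [
    (ipaddress.ip_network("10.1.0.0/16"), "eth0"),
    (ipaddress.ip_network("10.2.0.0/16"), "eth1"),
    (ipaddress.ip_network("0.0.0.0/0"), "eth2"),  # default route
]


def best_route_interface(ip, routes=ROUTES):
    """Return the interface of the longest-prefix route to ip, or None."""
    addr = ipaddress.ip_address(ip)
    candidates = [(net.prefixlen, iface) for net, iface in routes if addr in net]
    return max(candidates)[1] if candidates else None


def urpf_accept(src_ip, in_iface, routes=ROUTES):
    """Strict uRPF: accept only if the route back to src_ip uses in_iface."""
    return best_route_interface(src_ip, routes) == in_iface
```

Here a packet claiming to come from 10.1.2.3 is accepted on eth0 and dropped on eth1. Real routers run this check against the forwarding table in hardware and also offer a looser mode; this sketch only models the strict one.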

Security:
  • I read this interesting paper about worms using search engine queries to find new hosts to infect: Mydoom.O uses domains found on the infected host in search engine queries to find email accounts to send itself to. Santy uses search engine queries to find vulnerable phpBB hosts. A well behaved search engine, such as Google, will detect these queries and challenge the issuing agent with a Captcha.
  • NACs -- just in case you haven't heard of them, as they seem to be this year's hot topic in network security. NAC stands for Network Access/Admission Control. It's something that tries to assess whether computers trying to access the network comply with the imposed security policies and, if so, grants them some access controlled by the NAC. When I say "something" I mean software installed on each computer or an agent inspecting network traffic. A NAC should be able to assess and act before the computer is granted access to the network. NACs can be implemented from vulnerability scanners such as Nessus. NACs don't like NAT, as they're not really able to tell what's behind it -- something virtualization is making even worse: you can have an unpatched XP running as a virtual slice hiding behind the NATing Linux host that's hosting the slice.
  • A Darknet is a bad packet honeypot. No legitimate packet should ever enter a Darknet and when they do, they will be captured and analyzed. Note that there's a bit of confusion on the word Darknet as it seems that it's been used with a couple of different meanings.
  • This is a really interesting article on TCP/IP connection hijacking, where one learns how to guess client ports and client and server packet sequence numbers (a nice combination of sending spoofed ACK packets and observing IP IDs -- works when the client is W2K, XP or a non IP ID randomizing FreeBSD.)
  • Here's something about data seepage, including the link to an interesting presentation where we hear about project KARMA -- or how to observe and even fake the expectations devices have when they desperately try to assume familiar network environments when they start: Wifi, DHCP, IM, etc -- all these can be observed or faked and provided as a way to gain information about a device and its user: who the user is, their IM contacts, the Wifi access points they've been connecting to recently, etc...
  • Email fragmentation is really evil. Why? Because things go past filtering email gateways unscanned (it becomes impossible to do so.) A well behaved filtering mail gateway should just reject with a 5xx any email transfer requesting fragmentation.

Programming:
  • MISRA: Motor Industry Software Reliability Association. For instance, they propose a C/C++ coding standard for the automobile industry.
  • POD: Plain Old Data (Object Oriented Programming lingo.) Designates a data structure that doesn't use any of the OO facilities that can be associated with a set of bits in order to turn them into objects (ctor/dtor, inheritance, etc...) For instance, POD would be represented in C as a struct (a C++ struct is very close to a class, except that all its members are public by default.)
  • A pthread mutex can be initialized with the PTHREAD_PROCESS_SHARED attribute, which allows the mutex to be accessed by all threads that have access to the memory where the mutex is allocated, even when this memory is shared with other processes.
  • _SVID_SOURCE? _XOPEN_SOURCE_EXTENDED? _ISOC99_SOURCE? If you sometimes have a hard time remembering what these are all about, there's a man page that summarizes what they are and what they enable in your source code: man feature_test_macros.
Internet:
  • Use OPML to share RSS feed subscriptions.
  • www.wigle.net is a Wifi access point database that aims at providing coverage for the entire world. Alas, querying addresses works only for the US.

Wednesday, July 11, 2007

Setting up a Spamtrap.

So you want to get spam? It's easy and fun -- you just need to set up a dedicated trap. Here we take a look at the situation of being behind a DSL line. Here's what to do:
  • You can give your outside IP address (which is going to be your MX record, the IP address MTAs are going to use to send you spam) a symbolic name using dynamic DNS services. The idea is that if your external IP changes, you only change the dynamic DNS record and not all your MX records for all the domains you've set up to catch spam.

    EasyDNS works well; they will send periodic emails asking you to reconfirm that your external IP address is the one they have in their records. They will also provide you with a perl script (ddclient; you'll need IO::Socket::SSL and Net::SSLeay, which are easy to get). The configuration looks like:

    use=web, web=checkip.dyndns.org/, web-skip='IP Address'
    server=members.dyndns.org, \
    protocol=dyndns2 \
    my-easydns-domain
  • Find an old server to install Linux on. Make sure you configure iptables. Whether you want to bother with SELinux is your choice. I don't. Don't give it any domain name.
  • Install qmail. Qmail is nice but it's a bit delicate. The software hasn't been updated in a while and the author doesn't allow new releases. So there's the official last distribution and a plethora of patches. After a couple of trials that didn't go very well, I settled for the directions given by qmailrocks which, well, rocks.

    I went through all the steps but didn't install the following options: ezmlm, autoresponder and maildrop. I did install vpopmail and the web-based vpopmail management interface, which makes it really easy to create new domains (you might want to register different domains to catch more spam.)

    During compilation, I had to create the following symlinks:

    cd /usr/include
    ln -s /lib/modules/2.6.9-1.667/build/include/linux
    I also did the same thing with asm-generic (to /usr/include/asm-generic) and asm-i386 (to /usr/include/asm) so that /usr/include/linux and the headers it depends on exist (that's for errno.h, for instance.)

    My recommendation is that you spend an hour or so reading all the steps to know what's coming your way and prepare everything before running the installation for real.

    Qmailrocks.com will walk you through all the steps, all the way down to starting qmail and making sure that it works. As far as domains are concerned (for the virtual domain or rcpthosts), I used my-easydns-domain, which I registered with EasyDNS.

  • Next you create a mail domain -- use the vpopmail user interface that should be running on your mail host. This domain could be the one you registered with EasyDNS, but you can also create others. For now, let's go with my-easydns-domain. Create a postmaster for each domain and also an account where all the spam will go, for instance spam.

    Make sure that these accounts provide limited access (no pop, no web access for instance.) Since you're going to advertise email addresses that do not exist (you don't want to add these users manually all the time) and since spammers are going to try their luck with possible email addresses that could exist in your domain, the easiest thing to do is to redirect all incoming email that isn't sent to postmaster to the spam account. It's easy to do: just edit the .qmail-default file that exists for a particular domain (this file lives in /home/vpopmail/domains/my-easydns-domain) so that it contains:

    | /home/vpopmail/bin/vdelivermail '' \
    /home/vpopmail/domains/my-easydns-domain/spam
  • From your internal network, you can start testing that things are working. For instance, you can telnet to port 25 of your mail host and try an SMTP session. What you type is the telnet command, then HELO, MAIL FROM, RCPT TO, DATA, the message text with its terminating dot, and QUIT:
    telnet 192.168.1.4 25
    Trying 192.168.1.4...
    Connected to host (192.168.1.4).
    Escape character is '^]'.
    220 my-easydns-domain
    HELO foo.edu
    250 my-easydns-domain
    MAIL FROM: bar@foo.edu
    250 ok
    RCPT TO: baz@my-easydns-domain
    250 ok
    DATA
    354 go ahead
    Hello!
    .

    250 ok 1184182572 qp 17173
    QUIT
    221 my-easydns-domain
  • Now take a look at /home/vpopmail/domains/my-easydns-domain/spam/Maildir/new/ and you should see a file named 1184182572.17175.hostname,S=240 which contains the raw mail you just sent to your domain. hostname is the name you gave to your mail host, the string returned when you type the command hostname.

  • It's a good idea to make qmail log all SMTP transactions for further analysis (for instance, you'll be able to write scripts to identify DHA, a simple knock on your SMTP door or transactions that fail for whatever reason.) Here's how you do this (thank you Chris for the tip!)

    Modify the file /service/qmail-smtpd/run to add /usr/local/bin/recordio before the invocation of /var/qmail/bin/qmail-smtpd. Once the modification is done, the file will look like this (the added line is the one invoking recordio; only the last few lines are shown):

    ...
    exec /usr/local/bin/softlimit -m 30000000 \
    /usr/local/bin/tcpserver -v -R -l "$LOCAL" -x /etc/tcp.smtp.cdb -c "$MAXSMTPD" \
    -u "$QMAILDUID" -g "$NOFILESGID" 0 smtp \
    /usr/local/bin/recordio \
    /var/qmail/bin/qmail-smtpd my-easydns-domain \
    /home/vpopmail/bin/vchkpw /usr/bin/true 2>&1

    The logs are going to end up in the /var/log/qmail/qmail-smtpd directory. First in a file called current and then in files named after the date of their last modification, using a hexadecimal notation: 8 hexadecimal digits for the number of seconds since the Epoch (as returned by time(2)) and the last 8 hexadecimal digits being the fractions of a second, all of this using a @40000000 prefix that I haven't tried to interpret.

  • Let's summarize what we have:
    • We have a domain name that points to the IP address of our DSL modem
    • We have a mail host that runs qmail. On this mail host we have:
      • Domains (such as my-easydns-domain) that we can send emails to; the user doesn't have to exist, everything goes into the spam account under /home/vpopmail/domains/my-easydns-domain/spam/Maildir/new.
      • qmail will save the logs of the entire SMTP transaction
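A side note on the names of those log files: the @40000000 prefix looks like DJB's TAI64N timestamp format, which is what multilog writes. A minimal sketch, assuming that format (16 hexadecimal digits of seconds offset by 2^62, then 8 digits of nanoseconds) and the fixed 10-second offset that DJB's tools apply; the function name parse_tai64n is made up for this example:

```python
from datetime import datetime, timezone

# Assumption: labels are 2**62 + 10 + Unix time, as DJB's tools produce them.
TAI64_OFFSET = 2**62 + 10


def parse_tai64n(label):
    """Convert a TAI64N label ('@' followed by 24 hex digits) to a UTC datetime."""
    hexpart = label.lstrip("@")[:24]
    if len(hexpart) != 24:
        raise ValueError("expected 24 hexadecimal digits")
    seconds = int(hexpart[:16], 16) - TAI64_OFFSET  # whole seconds since the Epoch
    nanoseconds = int(hexpart[16:], 16)             # fraction of a second
    stamp = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return stamp.replace(microsecond=nanoseconds // 1000)
```

Anything after the first 24 digits (multilog adds a short suffix to rotated files) is ignored. If DJB's tai64nlocal is installed, piping the log through it gives the same information without any scripting.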
We're now ready to have SPAM flow to your newly configured spam trap.
  • Just modify your DSL modem configuration so that incoming SMTP traffic on port 25 is redirected to your mail host, the one running qmail. You can conduct an SMTP test from the outside to make sure that (1) the port redirection works and (2) you don't have a firewall rule on the DSL box or on the mail host that prevents traffic coming from the outside from flowing through port 25.
  • Now register domains with a registrar (anyone you want that gives you control over the values you put in your records.) For instance, if you create serialhacker.org (this one is taken, sorry!) using the vpopmail web based admin interface, you want to register the serialhacker.org domain. During registration or after, you just set the MX records for that domain to point to...my-easydns-domain.

    Once the records have propagated (this takes more or less 24 hours) you will be able to send mail to serialhacker.org using a Yahoo! or Gmail account, for instance, and this mail will be sent to the IP pointed to by my-easydns-domain.

  • The next step is to advertise bogus email addresses using the domains that you registered. You can add them to web pages you maintain or post test messages to test news groups (you can automate this process using the perl Net::NNTP package in a simple script.)
  • Soon spam will flow in. It's up to you to do whatever you want with it, but I personally wrote scripts to monitor traffic: where it's coming from, its intensity, etc... These scripts are running as cronjobs to send me email if something happens...
A word of caution here: what you're going to receive is going to be extremely nasty: from offensive content to phishing attempts to viruses. You want to be extremely careful in the manner you're handling the content -- you have been warned.
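As an illustration of the kind of monitoring script mentioned above (not the scripts I actually run), here's a minimal sketch that counts trapped messages per connecting IP. It assumes the Maildir layout shown earlier and that the connecting address appears in parentheses or square brackets in a Received header; the name count_senders is made up for this example. It only reads headers and never opens attachments:

```python
import mailbox
import re
from collections import Counter

# An IPv4 address in parentheses or square brackets, as MTAs usually log it.
IP_RE = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3})\)|\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def count_senders(maildir_path):
    """Count messages per IP found in the top-most Received header carrying one."""
    counts = Counter()
    box = mailbox.Maildir(maildir_path, create=False)
    for msg in box:
        ip = "unknown"
        for received in msg.get_all("Received", []):
            match = IP_RE.search(received)
            if match:
                ip = match.group(1) or match.group(2)
                break
        counts[ip] += 1
    return counts
```

Pointed at /home/vpopmail/domains/my-easydns-domain/spam/Maildir, count_senders(...).most_common(10) gives a quick "who's hammering me today" list that a cronjob can mail out.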

Saturday, June 30, 2007

WIL, Q2 2007

Just like everybody else, I always try to take useful notes. A couple of months ago, I decided that I was going to keep a list of things that I have learned (it's easy to learn, it's even easier to forget what you learn unless you spend a lot of time applying the new knowledge, and even that only lasts for a couple of weeks -- at least for me.)

So I have a file that I stuff with short notes of things that I've learned in the course of working. It's kept on a per-week basis and I've decided that every now and then I'd consolidate everything in a quarterly document.

This is what I've learned during a fraction of Q2 2007. I'm publishing it in a form that's easy for me to consult and here's to hoping that it'll be useful for you too. There's nothing earth shattering here, most of it is trivia that one has to run into every now and then...

Productivity:

  • Tramp is a nice substitute for the emacs ange-ftp mode when FTP access just isn't feasible. I use Tramp (Transparent Remote Access, Multiple Protocols) to remotely edit files in the comfort of my local Emacs instance.

Profiling:
  • You can make valgrind and friends support a lot more threads if you recompile, defining VG_N_THREADS to be the maximum number of threads you need to handle.
  • System wide Linux profiler oprofile needs to be specially configured for VMWare -- that's if you need to profile code running on a VMWare slice. oprofile can't rely on the instruction count CPU extensions so it has to use the interrupt mode instead.

Programming:
  • Use inet_ntop to convert an IP address from binary to a string -- it just works with different address families (but everyone can trust you with the bit-shifting thingy too at least for AF_INET.)
  • Everybody uses the Perl format feature, but it's just a bit more readable to use formline and then get things directly out of $^A -- you just need to reset $^A between invocations. So now you can write, for instance:

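    # Four picture fields (left, left, centered, right) and four values
    formline ("@<<<<<<<<<<<<< @<<<<<<<<".
    "@|||| @>>>>", $v1, $v2, $v3, $v4);
    push (@res, $^A);   # $^A accumulates the formatted output
    $^A = "";           # reset the accumulator before the next call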

  • libevent is just the RightWay™ to program applications driven by events on file descriptors. Apparently, there are performance gains to be had in situations where only a small number of events occur on a large number of active file descriptors.
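Back to the inet_ntop item above: here is the same idea sketched in Python, whose socket module wraps the C call. The helper names to_text and ipv4_by_hand are made up for this example; the second one is the "bit-shifting thingy", which only works for AF_INET:

```python
import socket
import struct


def to_text(family, packed):
    """Convert a packed binary address to its string form for any supported family."""
    return socket.inet_ntop(family, packed)


def ipv4_by_hand(packed):
    """The manual way, valid for IPv4 only: unpack 32 bits and shift out each octet."""
    (value,) = struct.unpack("!I", packed)
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))
```

to_text(socket.AF_INET, ...) and to_text(socket.AF_INET6, ...) share one code path, which is the whole point; the hand-rolled version would need a rewrite for every new address family.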

Sysadmin:
  • EasyDNS will offer you free registration and a perl script to automatically update their system if your external IP changes. Every now and then they'll send you an email with an easy link to click on to renew your binding for one more month. The script is called ddclient and requires IO::Socket::SSL, which depends on Net::SSLeay. The configuration has one important section, the one starting with server=, where you specify the domain(s) to update:

    use=web, web=checkip.dyndns.org/, web-skip='IP Address'
    server=members.dyndns.org, \
    protocol=dyndns2 \
    your-domain-here

  • Hmm, if you can, stay away from the Broadcom Wifi chipset (as in the Dell Latitude D620) unless you want to update to the cutting edge Linux kernel (2.6.20 at the time) and use bcm43xx-fwcutter or download the RPM that has all the content you need. This chipset isn't well supported (I believe because the specs aren't readily available to driver developers) and even with that support, it doesn't work really well (lock-ups during initialization, loss of carrier, etc...)
  • If you want to install qmail with all the patches that are required for it to work properly on current Linux distributions (including the errno fix), just visit qmailrocks.com. There's a bit more on qmail here, where you'll learn, notably, how to use recordio.
  • smtp-sink can be used to capture emails and take dispositions with them.
  • The nagging 'Use "exit" to leave the shell' message is controlled by the IGNOREEOF variable. I had it in my .bashrc for ages, only to forget that it's there when given an account on a new machine.

Security
  • Need to create a honeypot? Just use Honeyd.

Networking
  • Anycast is mostly used for UDP and, in some limited cases, TCP traffic. It takes the form of ad-hoc routing: hosts at different points of the network have the same IP address and the traffic they see and reply to is based on some heuristic. Anycast is how the current F DNS root server is implemented. The level of service replication it offers helped thwart the February 2007 attack against core DNS servers.

Scripting
  • doexec will let you invoke a command with its arguments with the following twist: you get to specify what argv[0] is -- but not $0, it can't be used for shell scripts.

Tuesday, September 26, 2006

On offshoring...

Offshore software development management:

This is an interesting article. And I realized that we've learned very similar lessons:
  • Contact and communication
  • Give people autonomy, they crave it and love it. Apply feedback when necessary
  • Considerations for cultural changes
  • Use a Wiki to freely organize information
  • Test scripts playing a greater role in stating requirements (maybe not as much as we should here)
  • Build based feedback: we're lucky to have a large QA team that can help us test early and often
  • Regular short status meeting (yes, per functional teams)
  • Short iterations
something we haven't done nearly enough:
  • Scheduled visits
The article stresses some of the issues that we ran into: source control systems are slow over links, and if you don't have a solution that allows for caching, it's going to be painful to check out entire trees. Milestone builds (otherwise we have our local build servers though) need to be carefully planned and you need the right resources on both sides to help you if something goes wrong (along with a clear idea of what your network availability is going to be.)

Saturday, June 10, 2006

On NAC.

NAC is a buzzword these days. In a nutshell, Network Access Control tries to govern what you are allowed to do on the network based on who you are. So, there's a policy that is defined according to three parameters:
  1. Who you really are (authentication)
  2. Your end-point security (are you connecting from a patched and up-to-date desktop running a secure operating system, or from a Windows desktop that hasn't been upgraded in the last three months?)
  3. Network environmental information: are you connecting through a wireless access point or through a VPN. Are you in the main building or a remote office.
Couple points:
  1. What's NAC's ROI?
  2. NAC is reactionary, it's worrying about last week's threats
  3. It's complicated, there are tons of solutions (M$, Juniper, Cisco) and its granularity (down to individual FW policies, for instance) can make it difficult to deploy it adequately.
Lots of links are available on the topic. Here's one that's a good starting point.

Update: there have been a lot of articles published on NAC recently, as the topic is picking up more steam than ever before. The company StillSecure.com, for instance, publishes a bunch of interesting papers on the topic.

Friday, April 07, 2006

Virtualization (or: it's about time)

Virtualization is everywhere these days (duh.) Because of increases in hardware performance, security requirements and the need to run several OSes on one system, virtualization is now mainstream enough to be offered on desktop operating systems for the masses to use (Linux, OSX, Windows.) It is so far an x86 game only, although hypervisors for other architectures (like XScale) are in the works.

Hardware vendors are adding a lower level of instructions to help write hypervisors that control the VMs while the guest operating systems keep running at their current privilege level. This pushes security one level down, while opening the devastating possibility of compromising the software that manages the execution of all operating systems on the platform. This threat could be mitigated by using the TPM/PKI to ensure that the hypervisor isn't being tampered with (take a look at this diagram.)

Some articles I should have read a long time ago.

There's certainly hope that hardware will address some of the existing performance degradations brought by virtualization. Some applications that could greatly benefit from virtualization can't suffer the I/O performance degradation that existing virtualization technology fails to entirely alleviate.

Labels: ,

Monday, December 19, 2005

Firewall failover

I just ran into a nice write-up of a stateful-failover-capable firewall running BSD pf.
  • CARP (Common Address Redundancy Protocol) is used to switch identity during failover, and CARP traffic is used as a measure of node availability: the master advertises using CARP, and if the backup doesn't hear from its master, it'll start advertising itself. CARP is IP protocol #112. CARP has an ARP balance feature that can be used to direct traffic to particular hosts. It can be seen as similar to VRRP, with the further advantages of being more secure and not encumbered by patents.
  • pfsync uses IP protocol #240 and performs connection state synchronization so that a stateful failover (i.e. no TCP connections are lost) can be supported. A node joining the firewall cluster will receive a bulk update of the existing connections and will then be updated periodically on a best effort basis.
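The master/backup behaviour described above can be sketched as a tiny simulation. This is only an illustration of the idea (a backup promotes itself once it has missed advertisements for too long), not the CARP protocol itself: the class name, the threshold of three silent intervals and the tick-based clock are all assumptions made for the example.

```python
class BackupNode:
    """Toy model of a backup firewall listening for master advertisements."""

    def __init__(self, dead_after=3):
        # Hypothetical threshold: silent intervals tolerated before takeover.
        self.dead_after = dead_after
        self.silent_ticks = 0
        self.state = "BACKUP"

    def tick(self, heard_advertisement):
        """Advance one advertisement interval and return the node state."""
        if heard_advertisement:
            # The master is alive: stay (or go back to being) a backup.
            self.silent_ticks = 0
            self.state = "BACKUP"
        else:
            self.silent_ticks += 1
            if self.silent_ticks >= self.dead_after:
                # No news from the master: start advertising ourselves.
                self.state = "MASTER"
        return self.state


node = BackupNode()
history = [node.tick(heard) for heard in (True, True, False, False, False, True)]
print(history)
```

In this sketch the node steps back to BACKUP as soon as it hears the master again, which is a simplification chosen for the example.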

Labels: ,

Friday, November 04, 2005

Layer switches.

This `switch' terminology that applies to all layers of the OSI model is getting confusing. I needed a clear picture so I went to do a quick search and here's what I found:

Layer 2 switch:
  • that's your regular switch. It learns which MAC addresses show up on which port so it can direct traffic there, instead of forwarding it to all the ports like a hub does. It operates at the MAC address level.
Layer 3 switch:
  • Layer 3 switches are high performance routers. The switching part of the business is done in hardware instead of being handled by a CPU, and packets are switched based on their IP address, by comparing it against routing table entries. The routing tables are of course populated as the result of running routing protocols. More on the topic can be found here.
  • Layer 3 switches, being switches, are aware of VLANs and can route traffic between VLANs.
Layer 4 switch:
  • It does policy based switching -- a lot of what a firewall does if the device acts on layers above 4. A L4 switch can also do load balancing by identifying sessions and directing them to a dedicated server. See more here and here too.
Blurring the layers further, MPLS (Multi Protocol Label Switching) inserts layer 2 and/or layer 4 information into tags that exist at layer 3, to be handled by specialized equipment (LSR: Label Switch Routers) in order to circumvent congestion, bottlenecks or link failures.
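To make the layer 2 case concrete, here is a minimal sketch of the learning behaviour: remember which port each source MAC was seen on, and flood only when the destination is unknown. The class and the shortened MAC strings are made up for the example; real switches also age entries out and deal with VLANs.

```python
class LearningSwitch:
    """Toy layer 2 switch that learns MAC address locations."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port where it was last seen

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        # Learn: the source MAC lives behind the port the frame came in on.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood like a hub (except the ingress port).
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination is on the same segment, nothing to forward
        return [out_port]


sw = LearningSwitch(ports=[1, 2, 3])
first = sw.receive(1, "aa:aa", "bb:bb")   # bb:bb unknown, so the frame is flooded
second = sw.receive(2, "bb:bb", "aa:aa")  # aa:aa was learned on port 1
third = sw.receive(1, "aa:aa", "bb:bb")   # bb:bb is now known on port 2
```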

Labels:

Tuesday, October 04, 2005

STP based attacks.

Following a link from the Linux Bridging Ethernet project, I got to read about the security implications of bridging and the use of the STP (Spanning Tree Protocol.) The attacks are all based on abusing:
  1. The inherent trust that exists between bridging equipment in terms of Bridge Protocol Data Unit acceptance.
  2. The implementation of the topology management that mandates ports to sometimes (partially) block traffic.
Here are some possible attacks:
  • Trigger eternal elections of the bridge root. Upon detection of periodic BPDU packets, the attacker replies with a BPDU claiming that its root status supersedes the one expressed in the packet it just detected. Since traffic forwarding on ports is disabled while in bridge root election mode, the eternal elections amount to a DoS attack.
  • This type of election based attack can be localized, isolating clients on one segment served by a bridge from a server located on a segment served by another bridge, giving the attacker a chance to impersonate the server.
  • An attacker equipped with links to two independently connected bridges can sever the connection between the bridges (by initiating and winning elections to be the designated bridge for the two segments) and become trusted to forward packets between the two bridges, effectively perpetrating a MITM attack.
  • STP extensions disabling the Learning state on user ports (Port Fast, Fast Start, etc...) for reasons of responsiveness will, upon perpetual elections, force the switch to reset its switching table. The interfaces then behave as if in promiscuous mode and are subject to ARP cache poisoning and similar types of attack.
In the light of these vulnerabilities, the use of STP and the deployment of STP enabled equipment should be assessed before deciding whether STP should be used.
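The election abuse boils down to the fact that the bridge with the lowest bridge ID wins and nobody checks who is making the claim. Here's a rough sketch of that, with the bridge ID modelled as a (priority, MAC) pair; the addresses and priorities are made up, and a real election is carried by BPDUs exchanged over time rather than by a single function call.

```python
def elect_root(bridge_ids):
    """Return the winning bridge ID: the lowest (priority, MAC) pair."""
    return min(bridge_ids)


# Hypothetical bridge IDs, as (priority, MAC address) pairs.
legit = [(32768, "00:10:aa:00:00:01"), (32768, "00:10:aa:00:00:02")]
root_before = elect_root(legit)

# The attacker advertises a BPDU carrying a better (lower) priority and,
# since BPDUs are trusted, wins the election.
attacker = (0, "de:ad:be:ef:00:01")
root_after = elect_root(legit + [attacker])

print(root_before, root_after)
```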

Labels:

Tuesday, June 14, 2005

HTTP request smuggling.

This paper recently posted on /. introduces HTTP request smuggling as a way to exploit discrepancies in the way applications parse HTTP/1.1 requests and act on their content.

Specially crafted combined HTTP requests can lead one application to see a certain request with a certain content. The data then reaches a second application where it is decoded differently:
HTTP_REQUEST .... ; Seen by application A and B
...
HTTP_REQUEST ... ; Seen by application B
...
HTTP_REQUEST ... ; Seen by application A
This discrepancy is exploited to:
  • Poison a web cache: the web cache A sees content 1 but the web server B sees something different and instead serves content 2, which gets associated with content 1 by web cache A.
  • Make a firewall such as an unpatched FW-1 R55W not see malicious content in a page and pass it down to IIS where it will be wrongly absorbed (because of an IIS limitation/bug.)
  • Smuggle an XSS attack.
HTTP request smuggling revolves around malformed HTTP requests such as:
  • Double content-length statements advertising different lengths -- some applications pick the first one as being the right one, some pick the second, leading to different content interpretation.
  • GET requests plus content-length.
  • Buffer size limit anomalies, such as the IIS/48k limit
What's really to blame in HRS is that all the implicated applications are using different HTTP parsers, with different interpretations of edge and borderline cases. The HRS technique is similar to HTTP Response Splitting, presented here: HTTP response splitting relies on application bugs that will generate two responses for one request, with the second response content being controlled by the attacker -- the attack works by sending a first crafted request to the application that will generate two responses. A second request is then sent, to be matched by the second response. If the two responses are handled by a web cache, cache poisoning is effectively achieved.
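The double content-length case is easy to reproduce on paper. The sketch below feeds the same bytes to two toy parsers, one trusting the first Content-Length and one trusting the last, and shows that they disagree on where the first request ends. The host name, paths and helper names are invented for the example, and these parsers are nowhere near a real HTTP implementation.

```python
SMUGGLED = b"GET /poison HTTP/1.1\r\nHost: example.test\r\n\r\n"
RAW = (
    b"POST /a HTTP/1.1\r\n"
    b"Host: example.test\r\n"
    b"Content-Length: 0\r\n"
    b"Content-Length: " + str(len(SMUGGLED)).encode() + b"\r\n"
    b"\r\n" + SMUGGLED
)


def content_lengths(raw):
    """Return every Content-Length value found, plus the bytes after the headers."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    lengths = []
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            lengths.append(int(value.strip()))
    return lengths, rest


def body_as_seen(raw, pick):
    """Split into (body, leftover) trusting the 'first' or 'last' Content-Length."""
    lengths, rest = content_lengths(raw)
    n = lengths[0] if pick == "first" else lengths[-1]
    return rest[:n], rest[n:]


# Application A trusts the first header: empty body, and the leftover bytes
# look like a second, independent request.
body_a, leftover_a = body_as_seen(RAW, "first")
# Application B trusts the last header: the "second request" is just a body.
body_b, leftover_b = body_as_seen(RAW, "last")
```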

Labels:

Friday, May 27, 2005

Firewalk

I've been reading a bit about firewalk, a traceroute variant with knowledge of what a firewall might do in order to stop traffic at its door. Here are some notes.

On using traceroute
  • Letting traceroute use UDP packets and watching it fail to report on some hops tells us that some filtering is happening on some hosts. For instance, if we see this in our traceroute output:
    13  193.251.243.30  180.328 ms  183.666 ms  172.724 ms
    14 * * *
    15 * * *
    We know that at 193.251.243.30 some UDP filtering is happening. If instead we force traceroute to send ICMP packets (-I option), we might start to see what's behind the filtering host:

    13 193.251.243.30 164.937 ms 169.354 ms 170.566 ms
    14 193.251.251.54 175.550 ms 177.654 ms 173.038 ms
    15 193.252.117.254 173.594 ms 170.467 ms 180.090 ms
  • If the filtering host was blocking ICMP traffic as well, we could try to tickle it with some other UDP traffic it might accept, such as DNS. The trick is then to reach the host with the right port number (since traceroute increments the port number for each attempt made.) The computation is rather simple, based on the number of hops and the number of attempts per hop. With the right starting port passed to the -p option, traceroute traffic will reach the filtering host with the right port number. It's also easy to modify the traceroute source code not to increment the port number, so that tracing can start at a static port number of your choice.
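The port computation mentioned above can be written down in a couple of lines. This sketch assumes the classic traceroute behaviour where the n-th probe overall (counting from 1) goes to base port + n, with three probes per hop; implementations differ, so check yours before trusting the numbers. The function names are mine.

```python
def base_port(target_port, hop, probes_per_hop=3):
    """Base port to hand to traceroute -p so that the first probe sent
    with TTL `hop` carries `target_port` as its destination port.

    Assumption: the n-th probe overall (1-based) is sent to base + n.
    """
    probes_before = (hop - 1) * probes_per_hop
    return target_port - probes_before - 1


def probe_port(base, hop, probe_index, probes_per_hop=3):
    """Destination port of probe `probe_index` (1-based) sent with TTL `hop`."""
    return base + (hop - 1) * probes_per_hop + probe_index


# Aim a DNS-looking probe (port 53) at hop 13, as in the example above.
base = base_port(53, 13)
print(base, probe_port(base, 13, 1))
```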
Firewalk

Firewalk uses a combination of all these methods and works once you've (1) discovered a gateway (like 193.251.243.30 in the example above) and (2) discovered a host behind the gateway (like 193.251.251.54 in the example above.) It will send the IP traffic of your choice to ports, using the right TTL in an attempt to find out what's behind the filtering agent, thereby obtaining a map of interesting hosts behind a firewall.

While reading the firewalk paper, I wrote an unpretentious perl script that invokes traceroute and tries to report any host that does UDP filtering (by comparing the output of traceroute invoked with UDP and with ICMP.) Just for kicks, it'll try to add geolocation information to the hosts it discovered. Here. Here's a sample invocation:


$ geotraceroute.pl wanadoo.fr -geolocate -flag_gw
10.2.0.1: Marina Del Rey, California, United States.(lat=33.98, lon=-118.45)
10.10.72.254: Marina Del Rey, California, United States.(lat=33.98, lon=-118.45)
209.172.100.193: San Jose, California, United States.(lat=37.34, lon=-121.89)
209.172.121.229: San Jose, California, United States.(lat=37.34, lon=-121.89)
209.172.123.1: San Jose, California, United States.(lat=37.34, lon=-121.89)
140.174.37.61: Englewood, Colorado, United States.(lat=39.58, lon=-104.90)
129.250.26.38: Englewood, Colorado, United States.(lat=39.58, lon=-104.90)
193.251.250.41: Amsterdam, North Holland (province), Netherlands.(lat=52.35, lon=4.90)
193.251.240.2: Amsterdam, North Holland (province), Netherlands.(lat=52.35, lon=4.90)
193.251.241.133: Rue, Somme (department), Picardy (region), France.(lat=50.27, lon=1.67)
G 193.251.251.54: Rue, Somme (department), Picardy (region), France.(lat=50.27, lon=1.67)
193.252.117.254: Issy-les-moulineaux, Hauts-de-seine (department), Ile-de-france (region), France.(lat=48.82, lon=2.27)
193.252.122.2: Issy-les-moulineaux, Hauts-de-seine (department), Ile-de-france (region), France.(lat=48.82, lon=2.27)
193.252.122.18: Issy-les-moulineaux, Hauts-de-seine (department), Ile-de-france (region), France.(lat=48.82, lon=2.27)
193.252.122.103: Issy-les-moulineaux, Hauts-de-seine (department), Ile-de-france (region), France.(lat=48.82, lon=2.27)



Notice the entry marked with a leading G. It indicates that this host filters UDP but not ICMP. It might just be a gateway. With netpbm it should be possible to map all the entries on a world map. Supported flags are -debug, -resolve, -flag_gw and -help.
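The UDP-versus-ICMP comparison behind the G flag can be sketched like this. It is not the perl script itself, just my reading of the idea in Python: parse both traceroute outputs and report the first host that answers ICMP probes at a hop where the UDP probes got nothing back. The function names are made up, and the sample data is the traceroute output shown earlier in this post.

```python
def parse_hops(output):
    """Map hop number -> responding IP (None when the hop only shows '*')."""
    hops = {}
    for line in output.strip().splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        hops[int(fields[0])] = None if fields[1] == "*" else fields[1]
    return hops


def first_udp_filtered(udp_output, icmp_output):
    """First host that answers ICMP at a hop where UDP got no answer."""
    udp, icmp = parse_hops(udp_output), parse_hops(icmp_output)
    for hop in sorted(icmp):
        if icmp[hop] is not None and udp.get(hop) is None:
            return icmp[hop]
    return None


UDP_TRACE = """
13  193.251.243.30  180.328 ms  183.666 ms  172.724 ms
14 * * *
15 * * *
"""
ICMP_TRACE = """
13 193.251.243.30 164.937 ms 169.354 ms 170.566 ms
14 193.251.251.54 175.550 ms 177.654 ms 173.038 ms
15 193.252.117.254 173.594 ms 170.467 ms 180.090 ms
"""
flagged = first_udp_filtered(UDP_TRACE, ICMP_TRACE)
print("G", flagged)
```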

Labels:

Thursday, May 26, 2005

Metasploit

I'm looking at the Metasploit source code. It's written in Perl and fairly well organized. It seems to contain libraries that are worth looking at. A couple notes:
  • It ships with NetPacket, a perl package that allows for packet crafting -- neat, just combine it with Pex::Racksocket
  • Their lib/Pex section contains all sorts of routines for low level manipulation of data, available as packages. For instance, the x86 one can help you generate x86 machine code. It's used to parametrize an exploit or a payload.
  • But I also saw mention of InlineEgg, which per its documentation seems to be even more powerful.

Labels:

Thursday, May 19, 2005

DNS Cache poisoning

Reading this good Honeynet article on phishing, I followed links on pharming, or how to redirect traffic meant for a known site to a site of your choice, which can be achieved by DNS cache poisoning.

Basically, once you've tricked a DNS server into consulting another DNS server, the information sent back can contain new IP address assignments for some domain names -- anything goes when it comes to having the victim DNS server consult another DNS server: email to a non existent user, embedded image links in email, banner ads, etc.

If your DNS software is flawed or not configured properly (NT4 and 2000 have insecure default configuration,) it will accept these as a replacement of what it already knows about the said domain names. M$ is of course the prime target, and also some DNS packages from other well known security companies. Nice.

Labels:

Tuesday, May 17, 2005

Crypto Refresh.

I'm refreshing my knowledge on applied cryptography.

Symmetric cryptography:
  • Symmetric (secret key) crypto: parties share a secret, for instance an encryption key. Shared secrets must be communicated securely (with appropriate public/private key crypto, a generated key agreed upon after secret signing, via a third party, etc... methods abound.)
  • Two sets of primitives: symmetric encryption algorithms (ensure data secrecy) and message authentication codes (MACs) to ensure data transmission integrity.
Example of a symmetric encryption scheme:
  • A block cipher mode and encryption method must be selected
  • A key is picked, for instance, a password can be used to generate a key (involves salt, iteration numbers.)
  • The block cipher mode determines how the encryption method will be used. For instance, it can be chosen to prevent the same unencrypted input from yielding the same encrypted output (better resistance to dictionary attacks.) For instance, one could XOR the first block with a randomly generated string (an IV: initialization vector, which has to be transmitted) and then have this block encrypted. For the next block, use the previously encrypted block for the XOR operation (this is how CBC works -- just encrypting blocks as they come is ECB, a bad block cipher mode.)
  • At the heart of many modern computerized encryption schemes are Feistel rounds, which apply permutation boxes and substitution boxes (S-boxes) to subkeys and the data to encrypt in order to achieve Shannon's confusion and diffusion.
  • Recommendation would be to use CBC/AES.
  • Outside of block cipher mode consideration, decryption is achieved by either running the same encryption scheme on the encrypted data, or by running a dedicated decryption function.
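To see the difference between ECB and CBC, and how a Feistel structure decrypts by running its rounds backwards, here's a toy sketch. The block cipher is a 4-round Feistel network with a SHA-256 based round function that I made up for the demonstration: it is not AES and is not secure, and there is no padding (the input must be a multiple of 8 bytes).

```python
import hashlib
import os

BLOCK = 8  # toy block size in bytes


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _round(half, key, i):
    # Round function: any keyed function works in a Feistel network.
    return hashlib.sha256(key + bytes([i]) + half).digest()[: len(half)]


def encrypt_block(block, key, rounds=4):
    left, right = block[:4], block[4:]
    for i in range(rounds):
        left, right = right, _xor(left, _round(right, key, i))
    return left + right


def decrypt_block(block, key, rounds=4):
    # Same structure, rounds applied in reverse order.
    left, right = block[:4], block[4:]
    for i in reversed(range(rounds)):
        left, right = _xor(right, _round(left, key, i)), left
    return left + right


def ecb_encrypt(data, key):
    return b"".join(encrypt_block(data[i:i + BLOCK], key)
                    for i in range(0, len(data), BLOCK))


def cbc_encrypt(data, key, iv):
    out, prev = [], iv
    for i in range(0, len(data), BLOCK):
        # XOR with the previous ciphertext block (or the IV), then encrypt.
        prev = encrypt_block(_xor(data[i:i + BLOCK], prev), key)
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(data, key, iv):
    out, prev = [], iv
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        out.append(_xor(decrypt_block(block, key), prev))
        prev = block
    return b"".join(out)


key = b"not a real key"
iv = os.urandom(BLOCK)
plaintext = b"SAMEBLOK" * 2  # two identical 8-byte blocks
ecb = ecb_encrypt(plaintext, key)
cbc = cbc_encrypt(plaintext, key, iv)
print(ecb[:BLOCK] == ecb[BLOCK:], cbc[:BLOCK] == cbc[BLOCK:])
```

With ECB the two identical plaintext blocks encrypt to identical ciphertext blocks, which is exactly the leak CBC is meant to hide.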
Hashes and message authentication codes (MACs):
  • Cryptographic hash function: processes input and produces a fixed size output (hash value or message digest.) Properties: one wayness, noncorrelation (bit flip resistant), weak/strong/partial collision resistance.
  • Universal hash function: keyed hashes.
  • MAC: hash function processing a message with a secret key (+ possible nonce) to give output that can't be obtained without the key.
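A quick illustration of the MAC bullet, using HMAC-SHA256 from the Python standard library as one concrete keyed construction (the notes above don't name a specific one, so that choice is mine). The key and messages are placeholders.

```python
import hashlib
import hmac

key = b"shared secret"
message = b"transfer 100 to alice"

# The tag can only be computed by someone holding the key.
tag = hmac.new(key, message, hashlib.sha256).digest()


def verify(key, message, tag):
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


ok = verify(key, message, tag)
tampered = verify(key, b"transfer 900 to alice", tag)
wrong_key = verify(b"other key", message, tag)
print(ok, tampered, wrong_key)
```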
Public key cryptography, digital signature:
  • Involves large prime numbers and factorisation properties.
  • Allows for key agreement, digital signature and identity establishment.
  • Note that it's about 1000 times slower than symmetric key cryptography when comparison is applicable.
  • RSA (does all), Diffie-Hellman (key agreement only) and DSA (digital signatures only.)
  • Public/private key crypto: your public key is available for parties to use to encrypt data that only you, with your private key, can decrypt.
  • PKI is used to establish trust between entities. Before using a public key, you must be sure it belongs to who you want to send a message to. Signature: a document is hashed, and the result is encrypted with your private key. Parties can use your public key to retrieve the hash and compare it to what it's expected to be. If the hash recovered with your public key matches, you must be the one that signed it with your private key.
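The hash-then-sign idea can be shown with textbook RSA and ridiculously small numbers. This is strictly a toy: the primes (61 and 53), the exponents and the way the hash is squeezed into the modulus are chosen for readability, there is no padding, and none of it should be used for anything real.

```python
import hashlib

# Textbook toy key: n = 61 * 53, e * d = 1 mod lcm-compatible phi (3120).
N, E, D = 3233, 17, 2753


def digest(document):
    # Reduce the hash into the range of the toy modulus (demo only).
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document):
    """'Encrypt' the hash with the private exponent."""
    return pow(digest(document), D, N)


def verify(document, signature):
    """Recover the hash with the public exponent and compare."""
    return pow(signature, E, N) == digest(document)


doc = b"I owe you one beer"
sig = sign(doc)
print(verify(doc, sig), verify(doc, (sig + 1) % N))
```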
Signing and encrypting with public/private keys:
  • Concatenate the recipient's public key with the message and sign/encrypt the result
  • Simply signing and encrypting the signature and the message doesn't work, as an intermediary can decrypt and re-encrypt the signed message with somebody else's public key.
Key and certificates encoding:
  • Keys and certificates can be encoded to a binary object (DER: Distinguished Encoding Rules encoding)
  • Keys and certificates can be encoded to plaintext (PEM: Privacy Enhanced Mail encoding)
Authentication and Key Exchange:
Further reading here. A nice concise presentation on some crypto basics here.

Labels:

Monday, May 16, 2005

TiddlyWiki (GTD TiddlyWiki)

I ran into this thing today. Great: it's a mix of HTML/CSS/JavaScript providing an application that runs in your browser without requiring a server. It seems easy to modify. There's no real separation of code and data, but updating seems easy as it is described.

Put that on a thumbdrive, along with a copy of FireFox for all popular platforms, and you should be able to edit content every time you find a computer. Neat.

It's here. The GTD evolved from TiddlyWiki, adding multiple features.

Labels:

Friday, May 13, 2005

Channel Attacks

This excellent page (along with a really good paper) made me discover the concept of channel attack. Basically, a channel attack consists of discovering something about data being processed by measuring the side effects of processing it, such as latency or even power consumption. Here's another paper on channel attacks. Fascinating stuff, as they seem to be really hard to close (they rely on the nature of the data processing that is being performed.)

The page that triggered the post speaks of channel attacks in the context of hyperthreading capable CPUs. Recommendations to limit exposure are numerous:
  • HT aware cache sharing prevention
  • OS level prevention of execution of applications of different privilege level on the same core.
  • Crypto libraries should be redesigned to prevent channel attacks
Now here's something about an instance of side channel attack, and the process of a side channel attack is presented here.
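A much simpler cousin of the hyperthreading case makes the principle tangible: a comparison routine that stops at the first mismatch leaks how many leading bytes of a guess are right. In the sketch below a step counter stands in for the latency an attacker would actually measure; the secret, the alphabet and the function names are all made up for the example.

```python
def leaky_equals(secret, guess):
    """Early-exit comparison; also reports how many bytes were examined."""
    steps = 0
    for a, b in zip(secret, guess):
        steps += 1
        if a != b:
            return False, steps
    return len(secret) == len(guess), steps


def recover(secret_len, oracle, alphabet):
    """Rebuild the secret one byte at a time from the leaked step count."""
    known = b""
    for _ in range(secret_len):
        pad = bytes([alphabet[0]]) * (secret_len - len(known) - 1)

        def score(candidate):
            # A longer comparison (or an outright match) means a better guess.
            matched, steps = oracle(known + bytes([candidate]) + pad)
            return (matched, steps)

        known += bytes([max(alphabet, key=score)])
    return known


SECRET = b"s3cret"
ALPHABET = bytes(range(32, 127))  # printable ASCII
recovered = recover(len(SECRET), lambda g: leaky_equals(SECRET, g), ALPHABET)
print(recovered)
```

The attacker needs at most len(ALPHABET) tries per byte instead of guessing the whole secret at once, which is why comparisons of secrets are usually done with a constant-time routine such as hmac.compare_digest.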

Labels:

Monday, May 09, 2005

Cross Site Scripting

A recent Firefox security alert raised the possibility of cross site scripting vulnerabilities for Firefox users. As I didn't know anything about cross site scripting, Google came to the rescue and pointed me to this handy PDF. The idea is to entice the victim to visit a legitimate web site while forcing their browser to execute malicious code during the visit.

How does malicious code get embedded in a page served by an uncompromised server? For instance, by placing that code into a CGI parameter the targeted site accepts and whose content will be served back to the user. The malicious code, usually JavaScript, can then access sensitive user information stored in the browser and send it back to a web site set up by the attacker for data collection -- cookies are a prime target (i.e. document.cookie) since, under certain circumstances and with cookies set properly, one can access a web site while impersonating a legitimate user.

The <script> tag is the one that comes to mind to trigger JavaScript execution by the victim's web browser while accessing a legitimate site, but other tags like <img> could be used as well (through a src attribute set to javascript:....) Even a form input widget's value attribute could be used to carry malicious data, through the use of a "> to break out of the attribute and allow for free HTML.

Of course, besides the operator clicking through a possibly unsolicited URL they were provided with, the fault lies in the site that serves parameter values back to the user without performing any sanitization (or input validation) on their content. The web application should be written to check incoming parameters for acceptable values and discard or reject anything that could be abused -- the parameter input range should be carefully specified in that regard. Note that all input sources should be subject to filtering: query parameters, body parameters of POST requests and HTTP headers.

Output filtering could also be performed on the web server prior to sending the page back to the victim, and content inspection capable firewalls could also be deployed for ingress and egress filtering.

See also: CSRF or XSRF (Cross-Site Request Forgery): here.
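As a small illustration of output filtering, here is what escaping a reflected parameter looks like with html.escape from the Python standard library. The page template and the function name are made up; a real application would rely on its templating system's escaping and validate input on top of that.

```python
import html


def render_greeting(name):
    """Echo a user-controlled value back into a page, escaped."""
    # quote=True also escapes " and ', which matters inside attribute values.
    return "<p>Hello, {}!</p>".format(html.escape(name, quote=True))


payload = '"><script>alert(document.cookie)</script>'
page = render_greeting(payload)
print(page)
```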

Labels:

Wednesday, April 06, 2005

Secure Programming Cookbook.

A coworker lent me a copy of O'Reilly's Secure Programming Cookbook. Just by reading the table of contents you get an idea of what you should be thinking about; the rest really is (this being a cookbook) implementation details.

Some notes on file use:
  • Don't create a file using fopen as the permissions are 0666 (umask modified.) I never liked the f functions anyways.
  • Avoid TOCTOU (Time of Check Time of Use) race conditions (between the time you check and the time you use, the attacker has done something to your target and you operate on something that's not what you thought of in the first place.) The best way to avoid these when dealing with files is always to operate on the file descriptor, as the underlying file object doesn't change. If you're using functions relying on a string to determine the filename you're 1) wasting cycles and 2) exposing yourself to a TOCTOU based attack.
  • Unix/Linux doesn't support mandatory file locks very well. Windows gets it right (a file lock is a lock that, once acquired on a file, prevents other processes from accessing the file in the way described by the lock while the lock is held.)
  • To make sure a temporary file can't be used by anyone else: after the file has been opened, delete the file. The process owning the file descriptor will still be able to use the file, while no one else will because the file doesn't exist by name.
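The cookbook works in C, but the same ideas can be sketched with Python's thin wrappers over the POSIX calls: create the file with explicit permissions and O_EXCL, do the checks through the descriptor rather than the name, and unlink the name while keeping the descriptor. The file name is a placeholder, and this assumes a Unix-like system (unlinking an open file doesn't work the same way on Windows).

```python
import os
import stat
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "secret.txt")

# O_EXCL makes creation fail if something already exists under that name;
# 0o600 asks for owner-only access (the umask can only remove bits).
fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)

# Checks go through the descriptor, not the name, so nobody can swap the
# file between the check and the use.
info = os.fstat(fd)
is_regular = stat.S_ISREG(info.st_mode)
mode = stat.S_IMODE(info.st_mode)

# Remove the name: the descriptor keeps working, nobody else can open it.
os.unlink(path)
os.write(fd, b"still usable")
os.lseek(fd, 0, os.SEEK_SET)
data = os.read(fd, 100)
gone = not os.path.exists(path)

os.close(fd)
os.rmdir(workdir)
print(is_regular, oct(mode), data, gone)
```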

Labels:

Tuesday, March 29, 2005

On botnets.

The other day I was following a link from an entry of .tHE pRODUCT blog. It linked to a recent Honeynet piece on botnets. The article is fascinating and well worth the read. It yields very valuable URLs for further exploration of the topic. Here are a few notes.

Here's how a botnet is built:
  • Vast sections of the public internet are scanned
  • Attacks are launched on 445/TCP (M$ directory service), 139/TCP and 137/UDP (NetBIOS) or 135/TCP (M$ RPC.) Unpatched XP or SP1 patched XP hosts constitute the bulk of the victims
  • An IRC client (IRC bot) is installed on the system and connects to a central server from where the bots can be managed (uploaded software for instance) or used (DDoS attacks, spamming, keylogging, attacks on IRC networks, Google AdSense and other automated clicking abuses, etc...)
  • Once compromised, a machine will try to compromise more systems (such as its peers as reachable through examined system configuration)
A compromised machine is called a Bot. A collective of bots is a botnet. Botnets can comprise hundreds of thousands of machines. They are built and used for fun and profit. Note that an unpatched and unprotected Windows box will be compromised within 10 minutes or less on average.

Here's how bots are caught:

An attractive victim (honeypot) is placed behind a honeywall, a system running snort_inline, so that the IRC traffic can be observed while rendering the bot harmless. The IRC client is also replaced by something that won't be detected as foreign when the bot joins its botnet -- the Honeynet folks wrote their own, called Drone.

Labels:

Friday, March 25, 2005

XML-RPC, SOAP, etc...

1. XML-RPC

An XML file is used to describe a procedure invocation -- the method name is mapped by the server/CGI interface to some executable resource to access, and the arguments can be of scalar and non scalar (such as arrays and structures) types. The method invocation returns through an XML file that defines the set of returned results or errors.
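To see what those XML documents look like without setting up a server, Python's standard xmlrpc.client module can encode and decode them directly. The method name sample.sum and the arguments are made up for the example.

```python
import xmlrpc.client

# Encode a call: a scalar, an array and a structure as arguments.
request_xml = xmlrpc.client.dumps(
    (2, [1, 2, 3], {"name": "x"}), methodname="sample.sum"
)
params, method = xmlrpc.client.loads(request_xml)

# Encode a successful response carrying a single result...
response_xml = xmlrpc.client.dumps((6,), methodresponse=True)
result, _ = xmlrpc.client.loads(response_xml)

# ...and an error, which decodes by raising a Fault on the client side.
fault_xml = xmlrpc.client.dumps(xmlrpc.client.Fault(4, "Too many parameters"))
try:
    xmlrpc.client.loads(fault_xml)
    fault_code = None
except xmlrpc.client.Fault as fault:
    fault_code = fault.faultCode

print(request_xml)
```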

Here's a HOWTO/tutorial on XML-RPC.

BLOB upload can be problematic, an alternative is described here. An implementation on the client side can reuse the user agent created for the Frontier::Client and decode the returned XML through the client {'enc'}->decode method. On the server side, a Frontier::RPC2 object can be created in order to use the encode_response method to produce the returned XML.

2. SOAP

SOAP (Simple Object Access Protocol) is an XML based protocol to let applications exchange information over HTTP. XML is used to transport the message, which features:

  • An envelope (to identify the message as a SOAP message)
  • Optional header
  • A body containing call and response
  • Optional Fault section for error reporting on message processing
A SOAP method is an HTTP request/response that complies with the SOAP encoding rule (HTTP + XML = SOAP.) A SOAP request can be an HTTP POST or GET request. Content-Type can for instance be application/soap+xml; charset=utf-8.

3. Comparison

I found this comparison of XML-RPC/SOAP interesting.

XML-RPC is really simple and to the point. SOAP picks things up where XML-RPC left them and tends to be bigger and more bloated. It is endorsed by the big names of the industry (IBM and M$, with IBM having the most complete implementation.)

4. Implementation

XML-RPC seems to enjoy the widest array of implementations: Python, C, C++, Java and Perl to name a few. Available and open SOAP implementations seem to be written in Java, but there's at least one C++ based implementation. The SOAP C++ implementation is here (but it's unclear whether Java is required or not...)

XML-RPC seems to be fairly easy to integrate at the Perl CGI/bin level. The Frontier::RPC2 module provides what is required for XML input parsing and output formatting -- it pretty much works straight out of the box but base64 support is said to be buggy. An alternative to try would be RPC::XML.

Note on the perl stuff. If you're accessing through https, you will get a `Protocol scheme 'https' is not supported' error if IO::Socket::SSL isn't installed on the client (for a second you might think it's a Frontier package problem but it's not.) Also, once you've created a server, setting $server->{'debug'} will turn debug printout on. But for some errors, it's lacking a bit and you're better off exploring the content of the value returned by Frontier::call() (an HTTP code is always printed though.)

Something else I haven't read yet: Zope and XML-RPC here.

A modern use of SOAP is SOA (Service Oriented Architecture.) There's a description of SOA here.

Labels:

Thursday, March 24, 2005

Packet filtering with iptable

Netfilter implementation

Just as a reminder: netfilter provides a way for kernel modules to register callbacks/hooks (there are 5 of them) with the network stacks (IPv4,6 and DecNET.) iptable implements in the kernel a named array of rules. For the purpose of implementing a firewall, tracking connections or doing NAT, packets arriving at the netfilter hooks are sent traversing iptables and might or might not get out of it alive.

About the use of the limit module (found in the Packet Filtering HOWTO, from netfilter.org:)
  • By combining the number of packets to accept per unit of time (like a second) and the number of burst packets, one can implement syn-flood protection. For instance, -p tcp --syn -m limit --limit 1/s will make you accept five SYN requests in a burst, after which only one per second is allowed, the burst buffer being recharged by one shot every second.
  • With -p tcp --tcp-flags SYN,ACK,FIN,RST RST... and limits, one disables a furtive port scanner.
  • With -p icmp --icmp-type echo-request and limits, one deals with the ping of death.
Note that the rules can be injected into the packet filtering infrastructure using the iptables commands, but the libipq API is provided as a way to interact from user space with iptables. For instance, there's a version of snort that receives packets from iptables instead of getting them from libpcap and will inform iptables whether the packet should be dropped, rejected or modified.
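The behaviour of the limit match described above (a burst of five, then one per second as the burst buffer recharges) is a token bucket. Here's a small model of it; this is my own approximation for illustration, not the kernel's code, and the class name and float-based clock are assumptions.

```python
class LimitMatch:
    """Token bucket modelled on the description of -m limit above."""

    def __init__(self, rate_per_sec=1.0, burst=5):
        self.rate = rate_per_sec
        self.burst = burst
        self.tokens = float(burst)  # the burst buffer starts full
        self.last = 0.0

    def matches(self, now):
        """Return True if a packet arriving at time `now` matches the rule."""
        # Recharge for the elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


limit = LimitMatch()
flood = [limit.matches(now=0.0) for _ in range(8)]  # 8 SYNs at once
later = limit.matches(now=1.0)   # one second later: one shot recharged
again = limit.matches(now=1.0)   # but only one
print(flood, later, again)
```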

Firewalling with netfilter

Existing chains:
  • PREROUTING: before the route decision is taken (does mangling and nat)
  • FORWARD, when the firewall is routing input traffic, goes to post-routing (does mangling and filtering)
  • INPUT: on the incoming traffic (does mangling and filtering)
  • OUTPUT: the outgoing traffic, goes to post-routing (does mangling, nat and filtering)
  • POSTROUTING (mangling and nat)
Each table (filter, mangle and nat) in a chain can be filtered. Each chain implements rule based filtering on its table, and after consultation, the fate of the packet will be decided upon: ACCEPT or DROP. If no rule matches the packet under scrutiny, then the default chain policy is applied -- usually, you want to DROP the packet if you failed to successfully determine its fate after filtering.

Connection tracking:

Tracking connections at the firewall level is important -- for instance, you want to allow new and established connections to leave your network, and only established connections to enter your network. Note that after a connection has been established and a few packets have been sent, the connection is declared assured.

Tracking TCP connection closure is done by recognizing the FIN/ACK or RST sequence, and letting the connection enter a time wait status to give a chance to all packets to traverse the firewall ruleset (think about out-of-order packets reaching the firewall after the connection reset packet has been received.)

Note that UDP and ICMP traffic is tracked as connections in the sense of monitoring what is being sent to whom and back, although the UDP and ICMP protocols aren't establishing connections by themselves. Since ICMP packets can be sent back to connection originators to indicate a problem, they should be considered as related to other connection traffic.

Some complex protocols (such as FTP) that send related connection requests inside a control connection require the firewall to examine the traffic in order to mark the requested new connections as related.
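The "new and established out, established only in" policy can be modelled with a very small state table. This is a deliberately naive sketch: the real connection tracker follows TCP state, timeouts and related connections, while this one only remembers address/port tuples. All names and addresses are made up.

```python
class ConnTracker:
    """Toy connection table keyed on (src, sport, dst, dport)."""

    def __init__(self):
        self.seen = set()

    def state(self, packet):
        src, sport, dst, dport = packet
        reply = (dst, dport, src, sport)
        if packet in self.seen or reply in self.seen:
            return "ESTABLISHED"
        return "NEW"

    def outbound(self, packet):
        # New and established connections may leave the network.
        self.seen.add(packet)
        return "ACCEPT"

    def inbound(self, packet):
        # Only traffic belonging to a known connection may enter.
        return "ACCEPT" if self.state(packet) == "ESTABLISHED" else "DROP"


fw = ConnTracker()
unsolicited = fw.inbound(("203.0.113.9", 4444, "10.0.0.5", 22))
sent = fw.outbound(("10.0.0.5", 51000, "198.51.100.7", 80))
reply = fw.inbound(("198.51.100.7", 80, "10.0.0.5", 51000))
print(unsolicited, sent, reply)
```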

Further readings that I should indulge in:

Labels:

Wednesday, March 23, 2005

RAW socket programming.

RAW:

A recent OpenBSD exploit eventually got me into looking at RAW socket programming. The exploit is fairly recent and can be found here. The checksum routine doesn't work BTW (use any other implementation and it works.)

I took a look at the patch that fixes the exploit, but I don't know enough about OpenBSD to understand why it crashes some specific kernels -- but it looks like a bogus timestamp can trigger the computation of a TCP retransmit timeout that will eventually crash the system.

packet generation utility:

All this makes me want to write a generic packet generator. netcat (see tutorial here) is nice but I guess what I want is something that can let me craft packets the way I want, for instance
  • packet -src=192.168.0.1 -dst=192.168.0.2 -proto=tcp -ttl=255 -tos=0 -fragment=no -sport=1234 -dport=80 -seq=rand -ack=rand -flags=SYN,ACK -window=512
Or
  • packet -ether-type=arp -ether-src=00:... -ether-dst=broadcast

You get the idea.
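As a first step toward such a tool, here's a sketch of building an IPv4 header by hand, including a checksum routine (the RFC 1071 one's complement sum). It only builds bytes; actually sending them would need a raw socket and root privileges. The function names and default values are mine, loosely mirroring the options imagined above.

```python
import socket
import struct


def checksum(data):
    """Internet checksum: one's complement of the one's complement sum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        # Fold the carries back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src, dst, proto=socket.IPPROTO_TCP, ttl=255, tos=0,
                payload_len=0, ident=0):
    """Build a 20-byte IPv4 header (no options, no fragmentation)."""
    version_ihl = (4 << 4) | 5  # IPv4, header length of 5 32-bit words
    header = struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, tos, 20 + payload_len, ident, 0, ttl, proto,
        0,  # checksum placeholder
        socket.inet_aton(src), socket.inet_aton(dst),
    )
    # The checksum lives in bytes 10 and 11 of the header.
    return header[:10] + struct.pack("!H", checksum(header)) + header[12:]


hdr = ipv4_header("192.168.0.1", "192.168.0.2", ttl=255, tos=0)
print(hdr.hex())
```

A handy sanity check: running the checksum over a header that already contains its checksum gives zero.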

Labels:

Friday, March 18, 2005

Passive OS fingerprinting.

The classic on OS fingerprinting is here. Active OS fingerprinting relies on sending packets (mostly TCP and ICMP) to open or closed ports and observing the answer, but this is rude.

p0f is a passive OS fingerprinting tool that allows for all sorts of interesting applications. It works best when it sits waiting for packets to show up for analysis. For instance, it could be installed on a web server to look at incoming TCP packets to find out what is connecting to it.

Here's how it figures certain things:
  • the uptime: the timestamp on SYN requests here (but this depends on the OS: Linux seems to be using ctime, Windows is using some HZ increment.)
  • The link type: with the gathered MSS/MTU (packet -vs- payload size) values
  • NAT: analyzing disparities in fingerprinting received for the same IP (link type, OS identification, etc...)
Just a few hints:
  • Look at the TTL value in the received packet and do a traceroute to figure out the initial TTL: 64 is common for Linux/BSD, 128 could be a Windows box.
  • Look at the window size: 0x1600/0x2D00 or so, and somewhat constant through the connection, is common for Linux. A window size that changes through the life of the connection is common for Windows.
Mention of p0f fetched here.
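The TTL hint can be sketched in a few lines of Python. This is a shortcut, not what p0f does: instead of running a traceroute it assumes the sender started from the nearest common initial TTL at or above the observed value. The 64 and 128 mappings come from the hints above; the 255 bucket and the names guess_initial_ttl and os_hint are my own assumptions.

```python
# Common initial TTL values; 255 is an assumption added for completeness.
COMMON_INITIAL_TTLS = (64, 128, 255)

# OS hints from the notes above; anything else is reported as unknown.
HINTS = {64: "Linux/BSD", 128: "Windows"}


def guess_initial_ttl(observed_ttl):
    """Return (assumed initial TTL, estimated hop count)."""
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial, initial - observed_ttl
    raise ValueError("TTL must be between 0 and 255")


def os_hint(observed_ttl):
    """Return (OS hint, estimated hop count) for an observed TTL."""
    initial, hops = guess_initial_ttl(observed_ttl)
    return HINTS.get(initial, "unknown"), hops
```

A packet arriving with TTL 57 would be read as a Linux/BSD box about 7 hops away. The guess breaks down for hosts with a non-default TTL, which is why the traceroute is the more reliable route.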


Python.

So far, the things I like about Python:
  • the functional programming stuff: filter/map/reduce
  • ranges
  • list comprehension
  • sets and operations on sets
  • loops and iterators/generators
  • introspection
  • continuations (code invocation context awareness)
Things I like less:
  • Implicit declaration of variables and fields
  • No functions with multiple signatures, unless I'm mistaken
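A few of the liked items in one short snippet. It is written for Python 3, which is an assumption on my part (these notes predate it); there reduce lives in functools, and filter/map return iterators rather than lists.

```python
from functools import reduce

nums = range(1, 11)                                  # ranges

# Functional style: filter/map/reduce.
evens = list(filter(lambda n: n % 2 == 0, nums))
squares = list(map(lambda n: n * n, evens))
total = reduce(lambda a, b: a + b, squares)

# The same squares as a list comprehension.
squares_again = [n * n for n in nums if n % 2 == 0]

# Sets and set operations.
common = {1, 2, 3} & {2, 3, 4}
either = {1, 2, 3} | {2, 3, 4}


def countdown(n):
    """A generator: values are produced lazily, one per iteration."""
    while n > 0:
        yield n
        n -= 1


# Introspection: ask an object about itself at run time.
name = countdown.__name__
is_callable = callable(countdown)
```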

Tuesday, January 25, 2005

BPM.

Some acronym mapping:
  • BPM: Business Process Management (wiki)
  • ERP: Enterprise Resource Planning (wiki)
  • CRM: Customer Relationship Management (wiki)
  • SCM: Supply Chain Management (wiki)
BPM is a relatively recent concept. ERP, CRM and SCM are older solutions that still influence BPM. BPM has been going very strong the past few years, and will continue to grow as more companies find benefits in deploying it.


Friday, November 19, 2004

On threading.

Add: read/write locks, spinlocks, valgrind stuff, read NPTL source code (implementation details), Pthreads and MPI, NUMA

Question: Linux kernel version, old LinuxThreads or NPTL, compiler, MPI and threads?

The Open Group specs for threads (and others.)

LinuxThreads -vs- NPTL:


LinuxThreads:

Some documentation here

There's a manager thread that handles thread creation/destruction, fatal signals, memory management (allocated stack, thread local data, etc...) and waiting on dead threads.


Synchronization primitives are implemented with signals (spurious wake-ups, pressure on the kernel signal system.) Broken SIGSTOP/SIGCONT implementation (ctrl-Z doesn't work.) Limits on the number of threads.


Reliance on the manager thread imposes performance penalties: serialization of creation/deletion, monopolization of one CPU, increased context switching.

A list of all threads is maintained, to implement pthread_key_delete for instance. /proc clutter.

NPTL:
A correct implementation of POSIX signal handling in the kernel solves the fatal signal handling issues.

Kernel can do thread memory deallocation. Kernel can reap terminated threads.

Thread specific data and local storage are managed through generation counters.

Synchronization primitives are implemented with futexes, which can be placed in shared memory so that PTHREAD_PROCESS_SHARED can be implemented (a mutex shared between threads belonging to separate processes.)

Thread local storage and thread data structures are merged in one block and placed on the stack. Stack frames can be cached to improve thread creation/deletion performance (unmapping a stack frame is costly, depending on the architecture.)

Required kernel support:

  • Arbitrary thread specific data areas support

  • Extension of clone to optimize thread creation and facilitate termination (no manager thread required.)

  • POSIX signal handling for multi-threaded processes: a fatal signal terminates the entire process. Stop/continue affects the entire process (ctrl-Z works.)

  • exit_group syscall added.

  • And more...

There's a testing effort ongoing: see results here. It's based on the POSIX test suite.

There are scalability limits that have started to get documented.


1-on-1 -vs- M-on-N

Kernel threads are used (a pure user-level implementation makes multi-processor use impossible.) M-on-N schedules M user threads on N kernel threads: two schedulers are at work and need to cooperate. This doesn't fit well on Linux (a user-level context switch often requires copying register content from kernel space.) The O(1) scheduler negates the advantage of a user-level scheduler. Overhead cost. Simplifies signal delivery.



Message queues:

Inter process message exchange.
  • mq_open creates a message queue. It's identified by a path name starting with / (no directories allowed) and features R/W/RW privileges. Attributes (mq_maxmsg and mq_msgsize) can be specified if privileges for the given message queue are granted (permissions are resolved on the name as for file access.)

  • mq_send sends a message to a message queue. The priority argument determines where the message is inserted in the queue. When the queue is full, the caller blocks (unless O_NONBLOCK is set.) If several threads are blocked, the highest priority one is awakened to send the message.

  • mq_notify subscribes to message notification. A struct sigevent argument specifies how to be notified (and how the poster will be notified of the IO completion?)

  • mq_receive receives the oldest message of the highest priority. Same blocking policy as for mq_send.

  • Attributes: mq_maxmsg, mq_msgsize, mq_curmsgs and mq_flags (O_NONBLOCK)
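The ordering rule (highest priority first, oldest first within a priority) is easy to model. The sketch below is a toy in-process model in Python, not the POSIX API: the class PriorityMessageQueue and its methods are hypothetical names, and a full or empty queue raises BlockingIOError, the way a queue opened with O_NONBLOCK would report EAGAIN, instead of blocking.

```python
import heapq
import itertools


class PriorityMessageQueue:
    """Toy model of the mq_send/mq_receive ordering (no real IPC)."""

    def __init__(self, maxmsg):
        self.maxmsg = maxmsg                 # plays the role of mq_maxmsg
        self._heap = []
        self._arrival = itertools.count()    # tie-breaker: oldest first

    @property
    def curmsgs(self):
        """Number of queued messages, like mq_curmsgs."""
        return len(self._heap)

    def send(self, message, priority):
        if len(self._heap) >= self.maxmsg:
            raise BlockingIOError("queue full")
        # heapq is a min-heap, so negate the priority to pop the highest.
        heapq.heappush(self._heap, (-priority, next(self._arrival), message))

    def receive(self):
        if not self._heap:
            raise BlockingIOError("queue empty")
        negated_priority, _, message = heapq.heappop(self._heap)
        return message, -negated_priority
```

Sending "a" at priority 1, then "b" and "c" at priority 5, hands back "b", "c", "a" in that order.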
Initialization:
pthread_once: The pthread_once_t argument is initialized to PTHREAD_ONCE_INIT, and pthread_once is called with it and a pointer to the function to execute. Subsequent calls won't do anything, as the value in the pthread_once_t is modified to mark that the initialization happened.
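The same idea sketched in Python, which has no pthread_once of its own. The Once class and once_demo are hypothetical stand-ins: the flag plays the role of the pthread_once_t value and the lock makes the check-and-run step atomic.

```python
import threading


class Once:
    """Run an initializer at most once, however many threads ask."""

    def __init__(self):
        self._lock = threading.Lock()
        self._done = False      # plays the role of PTHREAD_ONCE_INIT

    def call(self, init_routine):
        with self._lock:
            if not self._done:
                init_routine()
                self._done = True   # later calls become no-ops


def once_demo(thread_count):
    """Have several threads race on the same Once; count the runs."""
    runs = []
    once = Once()
    threads = [
        threading.Thread(target=once.call, args=(lambda: runs.append(1),))
        for _ in range(thread_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(runs)
```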
Thread Specific Data:
Can be viewed as a thread private array of PTHREAD_KEYS_MAX void * addressed by keys. Keys are common to all threads, but values are private. TSDs are disposed of when cancelling or exiting a thread.
  • pthread_key_create allocates a new key (whose value is returned through a parameter) and sets the initial value to NULL. A destructor can be attached to the key for destruction at cancellation or exit. The destructor doesn't run if the associated value is NULL; otherwise the key's value is set to NULL and the destructor is called with the previous value.

  • pthread_key_delete deallocates a TSD key, but doesn't run the destructor and doesn't care about the deallocated value.
This is the old way of doing things. Nowadays, we let the compiler handle things by using the __thread keyword.
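Python's closest equivalent is threading.local, shown here as an analogy rather than as the pthread API: the attribute name is shared by all threads (like a key), but each thread sees only its own value. The names tsd and tsd_demo are made up for this sketch.

```python
import threading

# One "key" visible to every thread; the stored values are per-thread.
tsd = threading.local()


def tsd_demo(names):
    """Each worker stores its own name and reads it back."""
    seen = {}

    def worker(name):
        tsd.value = name          # private to the calling thread
        seen[name] = tsd.value

    threads = [threading.Thread(target=worker, args=(n,)) for n in names]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # The calling thread never set tsd.value, so it has none.
    return seen, hasattr(tsd, "value")
```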
Synchronization:

Mutexes

Mutex functions shouldn't be called from a signal handler (they are not async-signal-safe.) Mutex functions aren't cancellation points (don't hold mutexes for long when cancellation is deferred.)

Two states: unlocked (not owned), locked (owned by one thread.) Acquiring an owned mutex will make the calling thread block.

  • pthread_mutexattr_init: initializes a mutex attribute object, which pthread_mutex_init then uses to initialize a mutex. The mutex kind determines what happens when pthread_mutex_lock is called on a mutex the calling thread already owns: PTHREAD_MUTEX_FAST_NP suspends the calling thread forever; PTHREAD_MUTEX_RECURSIVE_NP returns immediately with a success code (the number of times the owning thread locked the mutex is recorded and must be matched by the same number of pthread_mutex_unlock calls); PTHREAD_MUTEX_ERRORCHECK_NP returns with error code EDEADLK.
  • pthread_mutex_lock: if unlocked, the mutex is set to the locked state and becomes owned by the calling thread. If locked by another thread, the caller blocks; if locked by the caller itself, the outcome depends on the mutex kind (fast, recursive or error checking.)
  • pthread_mutex_trylock: just like pthread_mutex_lock, but returns EBUSY if the mutex is owned.
  • pthread_mutex_unlock: fast mutexes are returned to the unlocked state; recursive mutexes have their reference count decremented and are unlocked when the count reaches 0; error checking mutexes are checked for lock state and ownership (an error code may be returned.) Fast and recursive mutexes can be unlocked by a non-owner. This is not portable behavior.
  • PTHREAD_MUTEX_INITIALIZER, PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP, PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP: Static initialization macros.
  • pthread_mutex_destroy does nothing on Linux except check that the mutex is unlocked; pthread_mutexattr_destroy does nothing at all.
  • pthread_mutexattr_{set,get}type set/query mutex kind attribute.
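The fast/recursive distinction has a direct counterpart in Python's threading module, used here as an analogy: threading.Lock behaves like a fast mutex and threading.RLock like a recursive one. The helper names below are hypothetical, and the non-blocking acquire stands in for pthread_mutex_trylock (False instead of EBUSY).

```python
import threading


def trylock_on_owned_lock():
    """A non-blocking acquire on an already locked Lock fails."""
    lock = threading.Lock()
    lock.acquire()
    try:
        # A plain lock.acquire() here would block forever, like a
        # fast mutex relocked by its owner.
        return lock.acquire(blocking=False)
    finally:
        lock.release()


def recursive_lock_free_after(acquires, releases):
    """Lock an RLock several times, release it, then let another
    thread report whether it can take the lock."""
    rlock = threading.RLock()
    for _ in range(acquires):
        rlock.acquire()
    for _ in range(releases):
        rlock.release()

    outcome = []

    def other_thread():
        got_it = rlock.acquire(blocking=False)
        outcome.append(got_it)
        if got_it:
            rlock.release()

    thread = threading.Thread(target=other_thread)
    thread.start()
    thread.join()
    return outcome[0]
```

recursive_lock_free_after(2, 1) shows why the unlock count has to match: after one release the lock is still owned.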

Condition variable (condition)

Additional info here. Condition variables are implemented with futexes (wait queues).

Mutexes control access to data. Condition variables provide synchronization on data value, without resorting to polling. Mutexes are required to work with conditions, because of race conditions (to avoid a thread signaling a condition before another thread can wait on it.)
  • pthread_cond_init: There are no condition attributes supported by the Linux implementation. Static initialization through PTHREAD_COND_INITIALIZER
  • pthread_cond_wait: atomically unlocks the associated mutex and waits on a condition. pthread_cond_timedwait adds a timeout (and may return ETIMEDOUT)
  • pthread_cond_signal: restarts one thread waiting on the condition. pthread_cond_broadcast restarts all waiting threads (prone to the thundering herd problem/scheduler thrashing.) When a thread is restarted, the associated mutex is automatically and atomically locked again.
  • pthread_cond_destroy: destroys resources used by the condition. Does nothing on Linux except check that no thread waits on the condition
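The wait/signal pattern, sketched with Python's threading.Condition as a stand-in for the pthread calls (the Condition object bundles the condition with its associated mutex). condition_demo and the state dictionary are names made up for the example. Re-checking the predicate in a loop is what makes the code safe whether the signal comes before or after the wait.

```python
import threading


def condition_demo():
    """One thread waits for 'ready', another sets it and signals."""
    cond = threading.Condition()
    state = {"ready": False, "observed": False}

    def waiter():
        with cond:                        # lock the associated mutex
            while not state["ready"]:     # re-check after every wake-up
                cond.wait()               # unlocks, sleeps, relocks
            state["observed"] = True

    def signaler():
        with cond:
            state["ready"] = True         # change the data under the lock
            cond.notify()                 # wake one waiting thread

    threads = [threading.Thread(target=waiter, daemon=True),
               threading.Thread(target=signaler, daemon=True)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join(timeout=5)
    return state["observed"]
```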
Semaphores
Semaphores are counters for resources shared between threads. They are incremented/decremented atomically.
  • sem_init initializes a semaphore with an initial value. Semaphores can be shared amongst processes (LinuxThreads doesn't support this.)

  • sem_wait suspends the calling thread until the semaphore has a non-zero count, then atomically decreases the count. sem_trywait is the non-blocking variant (EAGAIN is returned if the count is zero.) This is the P operation.

  • sem_post atomically increases the count of the semaphore and never blocks (this is the V operation.) It is async-signal-safe (the only POSIX synchronization function that is; the LinuxThreads implementation isn't.)

  • sem_getvalue gets the current count of a semaphore.

  • sem_destroy releases resources allocated for a semaphore. For LinuxThreads, it just checks that no thread is waiting on the semaphore.
  consumer:
    loop:
      P ();      ; semaphore initialized to 0, consumer blocks
      consume;
      goto loop;

  producer:
    loop:
      produce;
      V ();      ; semaphore brought from 0 to 1, consumer awakens
      goto loop;
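The producer/consumer pseudocode above translates almost line for line to Python's threading.Semaphore, where acquire is P and release is V. This is a sketch with a bounded number of items so that it terminates; semaphore_demo and the deque buffer are my own additions.

```python
import threading
from collections import deque


def semaphore_demo(count):
    """Producer and consumer exchange 'count' items via a semaphore."""
    items = threading.Semaphore(0)    # starts at 0: the consumer blocks
    buffer = deque()
    consumed = []

    def consumer():
        for _ in range(count):
            items.acquire()                      # P: wait for an item
            consumed.append(buffer.popleft())    # consume

    def producer():
        for item in range(count):
            buffer.append(item)                  # produce
            items.release()                      # V: wake the consumer

    threads = [threading.Thread(target=consumer, daemon=True),
               threading.Thread(target=producer, daemon=True)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join(timeout=5)
    return consumed
```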
Barrier
Barriers set the number of threads that must reach a barrier before all of them can be allowed to continue.
  • pthread_barrier_init Initializes a barrier with a count number.
  • pthread_barrier_wait Thread blocks until the required number of threads have reached the specified barrier.
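A small illustration using Python's threading.Barrier in place of the pthread barrier calls. barrier_demo is a hypothetical helper: every thread logs "before", waits at the barrier, then logs "after", so no "after" can appear until all threads have logged "before".

```python
import threading


def barrier_demo(parties):
    """Return the event log of 'parties' threads meeting at a barrier."""
    barrier = threading.Barrier(parties)   # count fixed at creation
    events = []

    def worker():
        events.append("before")
        barrier.wait()         # blocks until 'parties' threads arrived
        events.append("after")

    threads = [threading.Thread(target=worker, daemon=True)
               for _ in range(parties)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join(timeout=5)
    return events
```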

Thread management:

pthread_cancel:

Sends cancellation to thread. Receiving thread can ignore the request, honor it right away or defer until the next cancellation point. Receiving thread exits as if pthread_exit(PTHREAD_CANCELED) had been called.
  • pthread_setcancelstate sets either PTHREAD_CANCEL_{ENABLE,DISABLE}

  • pthread_setcanceltype sets PTHREAD_CANCEL_ASYNCHRONOUS (cancel as soon as the cancellation request is received) or PTHREAD_CANCEL_DEFERRED (cancel at the next cancellation point.)

When cancelling: execution of cleanup handlers (reverse order, LIFO), finalization for thread specific data and return PTHREAD_CANCELED.

pthread_cleanup_{push,pop,push_defer_np,pop_restore_np}:

Manages cleanup handlers:
  • pthread_cleanup_push: installs a cleanup handler, called when the thread terminates (cancel or exit.) LIFO.
  • pthread_cleanup_pop: removes the last cleanup handler, with the possibility of executing it.

    Note: these matching pairs should be in the same block/function (they're macros and introduce a {/} sequence.)
  • pthread_cleanup_{push_defer,pop_restore}_np: non portable extensions. The first sets PTHREAD_CANCEL_DEFERRED and pushes a cleanup handler; the second pops the cleanup handler (with possible execution) and restores the cancellation type.
Cancellation Points
In general, any function that might suspend the execution of a thread for a long time should be a cancellation point. In practice, it depends on the implementation and how POSIX it is. Cancellation points can be explicit or implicit.
  • pthread_testcancel: tests for a pending cancellation, effectively establishing a cancellation point.
  • pthread_cond_wait, pthread_cond_timedwait, pthread_join, sigwait and sem_wait.
  • All other syscalls that cause a process to block: read, select, wait and whatever in the libc uses them.
Thread linux special:

pthread_atfork:

Due to an implementation limitation, fork in a threaded application duplicates the currently running thread but not the others. Mutexes are duplicated in their current state; the handlers registered with pthread_atfork give us a chance to set things straight.
Thread attributes:
detachstate:

  • PTHREAD_CREATE_JOINABLE: thread termination synchronization possible through pthread_join (termination code available.) Allocated thread resources are reclaimed after the join.

  • PTHREAD_CREATE_DETACHED: no join synchronization, resources are reclaimed when the thread terminates.

  • pthread_detach can force PTHREAD_CREATE_DETACHED

schedpolicy:

  • SCHED_OTHER, SCHED_RR, SCHED_FIFO (the latter two require superuser privileges.)

  • pthread_setschedparam can change schedpolicy on running thread.

schedparam:

  • Sets the priority value for the SCHED_RR and SCHED_FIFO scheduling policies, within the sched_get_priority_{min,max} range (the range depends on the policy), usually 0 or 1 to 99.

inheritsched:

  • PTHREAD_EXPLICIT_SCHED: scheduling policy for newly created thread determined by schedpolicy and schedparam.

  • PTHREAD_INHERIT_SCHED: inherited from the parent (creating) thread.

scope:

  • PTHREAD_SCOPE_SYSTEM: threads contend for CPU time with all other processes running on the machine (thread priorities are interpreted relative to other processes' priorities.)
  • PTHREAD_SCOPE_PROCESS: contention occurs with other threads of the running process. Not supported on Linux.
Default attribute values

  Attribute            Value

  Threads
    detachstate        PTHREAD_CREATE_JOINABLE
    schedpolicy        SCHED_OTHER
    schedparam         0
    inheritsched       PTHREAD_EXPLICIT_SCHED
    scope              PTHREAD_SCOPE_SYSTEM
  Mutexes
    mutex kind         PTHREAD_MUTEX_FAST_NP
  Condition variables
    cond_attr          Value ignored

Pitfalls:

Race conditions
Execution results depend on the order in which code runs in different threads. Circumvent with synchronization.
Deadlock
thread1 acquires lock1, thread2 acquires lock2. thread1 blocks acquiring lock2 and thread2 blocks acquiring lock1. thread1 can't unlock lock1 as it's blocked on lock2, thread2 can't unlock lock2 as it's blocked on lock1. Deadlock ensues.
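One common way out, not mentioned in the notes above, is to make every thread take the locks in the same global order. The Python sketch below orders locks by id(), which is an arbitrary but consistent choice; with_both and ordering_demo are hypothetical names. Two threads ask for the same pair of locks in opposite orders, which is the deadlock recipe described above, yet both finish.

```python
import threading


def with_both(lock_a, lock_b, action):
    """Run action while holding both locks, always taken in id() order."""
    first, second = sorted((lock_a, lock_b), key=id)
    with first:
        with second:
            action()


def ordering_demo(rounds):
    """Return (total increments, whether any thread is still stuck)."""
    lock1, lock2 = threading.Lock(), threading.Lock()
    counter = [0]

    def bump():
        counter[0] += 1      # safe: runs with both locks held

    def worker(a, b):
        for _ in range(rounds):
            with_both(a, b, bump)

    # The two threads name the locks in opposite orders.
    t1 = threading.Thread(target=worker, args=(lock1, lock2), daemon=True)
    t2 = threading.Thread(target=worker, args=(lock2, lock1), daemon=True)
    t1.start()
    t2.start()
    t1.join(timeout=10)
    t2.join(timeout=10)
    return counter[0], t1.is_alive() or t2.is_alive()
```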
Livelock
Two or more threads change their states in response to changes in the other thread or threads, without doing any useful work. Differs from a deadlock in that neither thread is blocked or waiting for anything.
Priority inversion
A low priority thread synchronizes access to a resource with a high priority thread. When the high priority thread is scheduled, it can't proceed because the low priority thread hasn't released the resource. This gives a medium priority thread a chance to run, preventing the low priority thread from running and releasing what blocks the high priority thread. As a result, the high priority thread doesn't run when it should, and what it controls doesn't happen.

Solutions:
  • Priority inheritance: the lower priority thread temporarily inherits the priority of the high priority thread that blocks on its resource, giving it a chance to run when the medium priority thread would have run. Requires OS help.

  • Priority ceilings: associate a priority with each resource; that priority + 1 is transferred to the thread accessing the resource.
What's Linux specific

Document the NP stuff.

Does nothing: pthread_mutexattr_destroy, pthread_mutex_destroy

Error code
ESRCH is used for invalid thread scheduling parameter specification.