GeekLogs

Random geeky stuff

  • Using python's map() and reduce()

    • 29 Jan 2012
    • 3 Responses
    •  views
    • map() python reduce()
    • Edit
    • Delete
    • Tags
    • Autopost

    Sum of the Squares of the first 5 numbers

    In this post I will try to use the map() and reduce() functions to compute the sum of the squares of the first five numbers. The map() function takes a function and an iterable as a parameter. The items of the iterable are fed to the function to create a new list. The reduce() function also takes a function and an iterable as a parameter. However, this takes two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single value.

    So, the first 5 numbers are 1, 2, 3, 4, 5. And, the first task would be to find the the squares of each of these numbers. Lets start with writing a simple function that finds the square of a number.

    def sqr(x):         
        return x*x

    Now, that we know how to find the squares, lets write a function that can find the sum of two numbers.

    def sum(x,y):         
        return x+y

    We can get the first 5 numbers using the range() function into a list. The next step would be to find the square of each numbers. This can be achieved by passing each of the elements from the list into the sqr() function using python’s built-in map() function.

    a = range(1,6)     
    squares = map(sqr, a)

    Once, we have the the squares in a list, we would now want to add each of these items to find the sum of the squares.

    sum_of_squares = reduce(sum, squares)

    The full program listing

    a = range(1,6)      
    
    def sqr(x):         
        return x*x      
    
    def sum(x,y):         
        return x+y     
    
    squares = map(sqr, a)     
    sum_of_squares = reduce(sum, squares)
    • Tweet
  • What is Memoization ?

    • 7 Jan 2012
    • 5 Responses
    •  views
    • fibonacci memoization optimization python
    • Edit
    • Delete
    • Tags
    • Autopost

    Memoization is a computer science concept to optimize programs by avoiding computations that have already been done and to reuse it. This is achieved by storing the computaions in a lookup table and retrieving them if a need for it arrives in a future computation step.

    Now, lets try to walk this concept using a Fibonacci series where each number in the series is sum of the previous two numbers. In the subsequent code, we are trying to find the value of the N'th term in the series.

    Fibonacci Series -> 0, 1, 1, 2, 3, 5, 8, ....

        def fib(n):
            if n == 0:
                return 0 
            if n == 1:
                return 1
            return fib(n-1)+fib(n-2)    

    So,
    F(7) = F(6) + F(5)
           = [F(5) + F(4)] + [F(4) + F(3)]
           = [[F(4) + F(3)] + [F(3) + F(2)]] + [[F(3)+ F(2)] + [F(2)+F(1)]]  
           = [[[F(3)+F(2)] + [F(2)+F(1)]] + [[F(2)+F(1)] + [F(1)+F(0)]]] + [[F(3)+ F(2)] + [F(2)+F(1)]] 
           = ...
           = 8 

    If you notice, we are computing the fib() for all the positions less than 'N' from scratch to compute the fib(N). And we could have easily stopped this process earlier if we had some of the values computed beforehand.

    So, lets try our hand at coding a memoized version of fibonacci below.

        fib_list = [0,1]
        def fib_mem(n):
            if len(fib_list) > n:
                return fib_list[n] 
            fib_val = fib_mem(n-1) + fib_mem(n-2)
            fib_list.append(fib_val) 
            return fib_val   

    In this new approach, we look for the value of f(n) in the list. If it exists we simply return it, else we compute it and store it in the list, so we can speed up future computations.

    I was amazed to see the improvement in the speed of the computation. Infact computing using the pure recursive method for values > 39 was so slow, that I decided to not compute for N > 39. And, the meomized version is blazingly fast. 

     

    =====Non Mem=====
    fibonacci(35) ->  9227465
    T = 11.2710268497
    fibonacci(32) ->  2178309
    T = 2.53484201431
    fibonacci(34) ->  5702887
    T = 7.05076313019
    fibonacci(39) ->  63245986
    T = 77.6575329304

    ===== Mem=====
    fibonacci_mem(35) ->  9227465
    T = 6.98566436768e-05
    fibonacci_mem(32) ->  2178309
    T = 1.19209289551e-05
    fibonacci_mem(34) ->  5702887
    T = 1.21593475342e-05
    fibonacci_mem(45) ->  1134903170
    T = 2.50339508057e-05
    fibonacci_mem(55) ->  139583862445
    T = 2.88486480713e-05
    fibonacci_mem(155) ->  110560307156090817237632754212345
    T = 0.000208854675293

     

     

    • Tweet
  • Flash cards using pygame

    • 6 Nov 2011
    • 0 Responses
    •  views
    • flash cards pygame python spanish
    • Edit
    • Delete
    • Tags
    • Autopost

    Flash cards using pygame

    I am currently enrolled in a spanish class for beginners at the local community college. So, the other day one of the classmates who sits next to me asked if I was using flash cards to memorize the words. I mentioned that since I am pretty much in front of a computer all day, I will write a simple program to simulate a flash card.

    Today, I embarked upon writing the very first version of my flash card application using pygame. It’s a pretty simple program that reads a list of words from a text file and displays them in a small flash card sized window. It moves between the next and previous words using the Left/Right arrow keys. The translation between english and spanish can be toggled using the Up/Down arrow keys.

    CODE

    import pygame
    from pygame.locals import *
    import sys
    import time
    
    WINWIDTH = 800
    WINHEIGHT = 300
    
    HALF_WINWIDTH = int(WINWIDTH / 2)
    
    RUNNING = 1
    
    BGCOLOR = BRIGHTBLUE = (0, 170, 255)
    TEXTCOLOR = WHITE = (255, 255, 255)
    MEANINGCOLOR = MCOLOR = (85, 65, 0)
    
    
    
    def main():
        global MAINSURF, BASICFONT, MAINCLOCK
        pygame.init()
        MAINCLOCK = pygame.time.Clock()
        MAINSURF = pygame.display.set_mode((WINWIDTH, WINHEIGHT))
        pygame.display.set_caption('Lingua Flash Card')
        BASICFONT = pygame.font.Font('freesansbold.ttf', 36)
    
        
        topCoord = 50
        spanishText = []
        englishText = []
        file = open("lesson1.txt")
        for line in file:
            es_text, en_text = line.split(":")
            spanishText.append(es_text.strip())
            englishText.append(en_text.strip())
        list_len = len(spanishText)
        # These lists will hold the Pygame Surface and Rect objects of the text.                         
        instSpanishSurfs = []
        instEnglishSurfs = []
        instRects = []
    
        # Render the text and get the Pygame Surface and Rect objects of the text.                       
        for i in range(len(spanishText)):
            instSpanishSurfs.append(BASICFONT.render(spanishText[i], 1, TEXTCOLOR))
            textPos = instSpanishSurfs[i].get_rect()
    
        # Render the text and get the Pygame Surface and Rect objects of the text.                       
        for i in range(len(englishText)):
            instEnglishSurfs.append(BASICFONT.render(englishText[i], 1, MCOLOR))
    
        textPos.top = topCoord + 10
        textPos.centerx = HALF_WINWIDTH
    
        next = 0
        meaning = 0
    
        while True:
            for event in pygame.event.get():
                if event.type == QUIT:
                    terminate()
                if event.type == KEYDOWN:
                    if event.key == K_LEFT:
                        next -= 1
                        if next < 0:
                            next = list_len - 1
                    if event.key == K_RIGHT:
                        next += 1
                        if next > list_len - 1:
                            next = 0
                    if event.key == K_UP:
                        meaning = 0 
                    if event.key == K_DOWN:
                        meaning = 1
    
            MAINSURF.fill(BGCOLOR)
    
            if meaning == 0:
                MAINSURF.blit(instSpanishSurfs[next], textPos)
            elif meaning == 1:
                MAINSURF.blit(instEnglishSurfs[next], textPos)
    
            pygame.display.update()
    
        def terminate():
            pygame.quit()
            sys.exit()
    
        if __name__ == "__main__":
            main()

    Source

    I used the starpusher game posted at ‘the invent with python’ blog as an example. I plan to polish up and add some more functionality to the program, before pushing to my github account [github.com/amehta]

    • Tweet
  • Url Shortner in Flask - a micro web framework

    • 22 Oct 2011
    • 0 Responses
    •  views
    • bitly django example flask python sql tinyurl urlshortner webapps
    • Edit
    • Delete
    • Tags
    • Autopost

    Flask is a micro framework based on python that I came across from a post on HN a few months ago. It seemed very powerful and much more easier to follow than Django. Django is definitely a good framework too, but it comes with a lot of bells and whistles that may not be needed for simple hacks or prototyping of simple ideas.

    I am currently writing a fun app similar in concept to Yelp and in the process learning about Flask, SQL, Python  all in the same time. In the meanwhile, I decided to write a simple URL shortner like Bit.ly, tinyUrl etc and publish it out to show the ease of writing simple apps in flask. You can download the example from https://github.com/amehta/Flaskly

    And check out below to see how simple it is to write a "Hello World" web page.

    from flask import Flask
    app = Flask(__name__)
    
    @app.route("/")
    def hello():
        return "Hello World!"
    
    if __name__ == "__main__":
        app.run()

    Happy learning Flask :)

    • Tweet
  • 3n+1 Problem aka Collatz conjecture

    • 18 Oct 2011
    • 0 Responses
    •  views
    • 3n+1 collatz number python
    • Edit
    • Delete
    • Tags
    • Autopost

    The Collatz conjecture is a conjecture in mathematics named after Lothar Collatz, also called the 3n + 1 problem.

    Program

    def collatz_seq(num):
      seq = []
      seq.append(num)
      while num!= 1:
        if num == 1:
          break
        if num%2 == 0:
          num=num/2
        elif num%2 == 1:
          num=3*num+1
        seq.append(num)
      return seq
    • Tweet
  • How does DHCP work ?

    • 28 Sep 2011
    • 0 Responses
    •  views
    • dhcp lan network networking packetanaylysis udp wireshark
    • Edit
    • Delete
    • Tags
    • Autopost

    What is DHCP ?

    DHCP is a protocol used to provide an IP address to a device. It is also used to configure additional networking related parameters on the device like subnet mask, router, domain name, and dns server.  

    Why DHCP ?

    DHCP stands for dynamic host configuration protocol and so allows a device to get an IP address without any manual intervention as long as the dhcp client is enabled and running on the device.

    • Autoconfigure.
      • No longer have to manually assign address and configurations on every device
    • No conflicting IP addresses
      • No longer have to track each and every ip address that was assigned.
    • Reuse of IP addresses
      • If an ip address is not in use, it can be used by some other device

    How it works ?

    Most devices on boot up do not have an IP address unless configured statically. To get to the network the device would need to first find a dhcp server to get an IP address. Since it does not know the address of the dhcp server, the dhcp client (running in the device) sends a UDP broadcast packet (DHCP Discover) in the network. All the devices in the network will see be able to see this packet and if there is one or more dhcp server** in the network, they will respond back to the client with an offer (DHCP offer). The DHCP offer includes an IP address, subnet mask, lease time, router and other parameters as configured on the server by an administrator or requested by the client.

    If multiple offers were received by the client, then it can decide which offer to accept and accordingly send a broadcast packet requesting (DHCP Request) an IP address from one of the offers. The dhcp server can identify themselves by the "transaction id" field in the request. All the other servers on receiving the request packet will withdraw their offers and return it back to their pool of addresses. The dhcp server that got selected by the client finally concludes the process by sending an acknowledgment (DHCP Ack) and any other information that the client may have requested. 

    ** If dhcp server is not in the same subnet, a dhcp relay-agent can be used to forward the request.

    How to verify ?

    It's pretty easy to follow the DHCP Handshake on Wireshark. Since, all the handshake happens using broadcast packets, you can pretty much see any device trying to get an IP address using DHCP in your local network. Once you fire up Wireshark start taking live traces on your network interface. To filter out the other noise you can filter baded on a UDP port. (DHCP server listens on port 67). Next take some other device (laptop, iPod, iPad,) in your local network and power it on. You should be able to immediately see the entire handshake on your Wireshark window. If you do not see the first two steps and only see the DHCP Request, and DHCP Ack, it probably means that your device already has a preconfigured ip address and is requesting the server to extends it's lease.

    Screenshot

    Dhcp_handshake

    Packet Flow

    http://www.eventhelix.com/RealtimeMantra/Networking/DHCP.pdf

    References

    http://tools.ietf.org/html/rfc2131

    • Tweet
  • Traceroute implementation using scapy

    • 24 Sep 2011
    • 0 Responses
    •  views
    • icmp networking packet packetanaylysis packetcrafting ping python scapy traceroute wireshark
    • Edit
    • Delete
    • Tags
    • Autopost

    Traceroute is a network diagnostic utility used for displaying the path taken by a packet to it's destination. It uses the ICMP protocol to help traverse the path. Each IP packet has an 8 bit TTL field that gets decremented by every router on the path, to prevent the packet from indefinitely circulating the internet (or network).  When the TTL value reaches zero  an ICMP error 'Time To Live Exceeded' is sent back to the sender.  We can use this fact to our advantage by discovering all the hops between the source and the destination.

    Example:

    MyPC --- R1 --- R2 --- R3 --- R4 --- FavoriteServer

    In the above example, we can discover R1 by sending an ICMP(echo-request) based IP packet starting with a TTL value of 1 destined to the server. R1 will decrement the TTL, thereby reducing it to zero, which in turn will trigger the ICMP error message TTL exceeded. This error message is sent to the sending machine as an ICMP based IP packet. The source field of this IP packet will have the IP address of R1. We can repeate this exercise by incrementing the TTL value till we no longer get the error message and instead get an ICMP echo-reply message from the destination.

    I used Scapy a python based packet crafting library to create a bare bones version of traceroute as per the explanation above.

    Python Program 

    #traceroute.py

    from scapy.all import *
    import sys

    def main():
        host = sys.argv[1]
       print "Tracroute ", host
        flag = True
        ttl=1
        hops = []
        while flag:
            ans, unans = sr(IP(dst=host,ttl=ttl)/ICMP())
            if ans.res[0][1].type == 0: # checking for  ICMP echo-reply
                flag = False
            else:
                hops.append(ans.res[0][1].src) # storing the src ip from ICMP error message
                ttl +=1
        i = 1
        for hop in hops:
            print i, " " + hop
            i+=1

    if __name__ == "__main__":
        main()

    Sample output 

    apurva$ sudo python2.5 taceroute.py google.com
    WARNING: No route found for IPv6 destination :: (no default route?)
    Tracroute  google.com
    Begin emission:
    .Finished to send 1 packets.
    ..*
    Received 4 packets, got 1 answers, remaining 0 packets
    .... 

    1  192.168.0.1
    2  98.234.104.1
    3  68.85.190.245
    4  68.85.155.74
    5  68.86.91.225
    6  68.86.85.181
    7  68.86.86.122
    8  66.208.228.226
    9  72.14.232.136
    10  64.233.174.19

    • Tweet
  • Dropbox: LAN sync protocol

    • 10 Sep 2011
    • 2 Responses
    •  views
    • dropbox lan lansync networking packetanaylysis tcp udp wireshark
    • Edit
    • Delete
    • Tags
    • Autopost

    A year ago, I fired up wire shark on my home network to check out the packet flow of the DHCP process. I closed all my browser windows and other chat/im services, and still noticed a lot of chatter on the network. One that got my particular attention was a packet that was related to the popular service dropbox. I had not updated/edited any files recently, so was surprised to see it. I then decided to dig a little more on wire shark and with some google search came to the conclusion that it was a dropbox feature to sync data changes within the local network. 

    Earlier today(yes 1 year later) I decided to spend some more time looking at the traces and try to figure out the details of the LAN sync protocol. After an hour or so I believe that the following steps are involved in Dropbox’s LAN syncing -

    1. Discovery (LAN sync discovery protocol) 

    The dropnbox process running on your computer sends a UDP broadcast packet in the local network every 30 seconds*. The source and destination port of the packet are set to 17500. This UDP packet has some payload attached to it for identify itself to the receiver. It seems like a dictionary.
        Text: {
            "host_int": 14xxxx52,
            "version": [1,8],
            "displayname": 14xxxx52,
            "port": 17500,
            "namespaces": [263xxxx, 152xxxxx, 152*****] 
        } 

    *The Macbook was sending every 30 seconds. However, on my Win7 laptop, it seemed like there was an initial rampup time where intervals started with 22 seconds and only after 5 mins they became 30 seconds.

    2. Exchange Data (LAN sync protocol) 

    If all the dropbox clients have LAN sync enabled, then each of them should be able to understand and respond to the Discovery packet (assuming it's able to distinguish between different user accounts. I believe it uses namespaces to identify them uniquely). This response packet called DB LAN sync(DB-LSP) is a TCP packet where the dropbox clients exchange data.

    This first DB-LSP based TCP packet is destined to the other client on port 17500. The source port is an ephemeral port (say 49373) to which the receiver will send its reply. These are set to type 23. My guess is that these basically exchange information if there is data to be synced or not between the clients. If there is data to be synced, there will be some configurations that need to be exchanged to send them securely/encrypted across the local network. This is probably sent using DB-LSP packets set to type 22.

     

    Dropbox LSP Sync Protocol

       0 - 7     8 - 31       32 - 63
    +--------+-------------+--------------+
    | TYPE | MAGIC | LENGTH    |
    +--------+-------------+--------------+
    |                                         |
    |              DATA                   |
    +---------------------------------------+

    TYPE: 
        Decimal   Hex   Meaning
               20       14    <Unknown>
               22       16    Configuration
               23       17    Data
             128       80    <Uknown>

    MAGIC:
        Decimal  Hex  
        769      0301

     

    I quickly wrote this to share my notes on the analysis. Feel free to correct me in any of the assumptions. Also, pardon me for any spelling/grammar mistakes.

    PS: If you would like to experiment this on your home network, you could use the following filter to get rid of the noise.
    Wireshark Filter
    udp.port == 17500 or (ip.addr==mahcine1-ip and ip.addr==machine2-ip)

    • Tweet
  • About


    5909 Views
  • Archive

    • 2012 (4)
      • January (4)
    • 2011 (14)
      • November (5)
      • October (2)
      • September (7)

    Get Updates

    Subscribe via RSS
    Twitter