
Bangalore

PyCon India 2009: a report


Our team landed in Bangalore on the night of the 25th, at 9.15, for the first Indian Python Conference. Our colleague Mr. Sudharshan arranged accommodation for us in a men's hostel near MEI @ Bangalore (thanks to Sudharshan and Subhash). The weather was so cool that we felt as if we had travelled from the Sahara to Antarctica.

On the morning of the 26th we reached IISc around 9.45 a.m. Too late! We missed the talk "My adventures with Python" by Prabhu Ramachandran. The hall was full.

My friend Godfrey and I attended the talks in Hall L4 in the morning session. The session started with a talk by Anand B Pillai on "Python tools for Network Security". It was a really nice one. He demonstrated how we can use Python tools for network security. The next talk was by Senthil Kumaran on "Algorithms in Python". He gave a brief intro to Python 3 and his contributions to it. He demonstrated how different algorithms can be implemented in Python, along with an evaluation of those algorithms. Students asked many questions about the implementation. His presentation style really rocked!

After these two talks there were two lightning talks. Two people from Dell gave a short talk on how Python is used at Dell for hardware testing. But the code is not open!!!

During lunch I chatted with Anivar Aravind, Santhosh Thottingal and some other SMC members. Just chit-chat. After lunch I rushed to Hall L5.

The first talk in Hall L5 after lunch was by Vinay Modi from Voice Pitara on "Semantic Web and Python". He explained the concepts of the Semantic Web and Python's RDFLib in a nice way. He also mentioned the semantic web expeditions of his group. I got some threads to begin with on the semantic web. There seemed to be no session manager (I am not sure), but he finished the talk in 45 minutes and answered the queries from the audience. The next talk was by Anand Janakiraman from Strand on "An analysis of the use of Python/Jython at Strand". He introduced the services provided by Strand and the role of Python/Jython in their product. He demonstrated an analysis of the delegate registrations for the India Python Conference with Avardis(TM), and some funny things in the registration data. How many spellings for Bangalore!! The law of choice. His presentation style was incredible. Everybody enjoyed it. Some questions like "why Jython, why not Groovy?" came from the audience. Hmmm, he answered them all with a smile, like a saint.

Tea break.

After tea I rushed to Hall L4. The first talk after the tea break was by Ramakrishna Readdy on "Building Python Applications for the Linux Enterprise". I am not literate enough to give the gist of his talk. The next talk was on the "National Mission on Education" by Prabhu Ramachandran and Asokan Pichai. Dr. Prabhu introduced the National Mission on Education project and his dream about Python in education. After Prabhu, Asokan Pichai explained the plans of the project. After the talk they distributed a live DVD for experimenting with Python tools, and T-shirts too. But some people missed them!!

That was the end of day one.

On the second day, the 27th, I reached the venue by 9.45, a little bit late. I rushed to Hall L5. Keerthi Shankar was delivering his talk on "Python and .NET", and he delivered it in an attractive way. He gave some real-world examples of IronPython as well as .NET and their interoperability. The next talk was by Vikrant Patil from Strand on "Can your UI change colors like a chameleon". He demonstrated how the Avardis(TM) scripting framework works and gave a live demo with real-world examples. It is a wonder to see how Python/Jython helps to solve customer requirements within minutes. It is a good model, a really good business model. The next talk was given by me, on the "Natural Language Toolkit": a talk with a text file opened in gedit and a Python interpreter, with some plots and mistakes. Thanks to Gopalasivam for providing a laptop for the presentation.

Lunch break.

After a nice lunch I escaped from the venue with my friends, because the cool weather in Bangalore had made my health worse.

For me that was the end of the first Indian Python Conference.

It was a really great event, and well organised. All the sessions were really thought-provoking.


But there is a bug in the T-shirt: "I am not reappy a wizard I am using Python".
"really" became "reappy": a really unsolvable bug!!!

Thanks to my teammates Godfrey, Gopalasivam and Sudharsan for travelling with me to attend the conference.


Chalo PyConf 2009

Dear all Pythonists,
Wake up! Only one more day until the Indian Python Conference. Chalo Bangalore.
Let us make it a great event.

Use OpenOffice.org


Word cloud view of my blog

I tried the word cloud generator available at Wordle.

To create your own word cloud, go to this page: http://www.wordle.net/create

I generated a word cloud from my blog.

Here it is.

Wordle: Jaggu's World

I generated a cloud of en.wikipedia.org too.

Here it is.

Wordle: EN_WIKIPEDIA

Another Cloud view of my blog

Wordle: Jaggu2


Wing Commercial IDE for Python.

As you know, I am using the free version of the freeflux.net CMS, so they put some ads with each blog post. Most of my posts are on Python, so the freeflux.net people show an ad for a Python IDE called "WingIDE". It is not a free one; it is purely commercial. But you can download a trial version from the site http://wingware.com/products?gclid=CMrp_OLQh50CFQEupAodPR1Lbw .
They call it "The Intelligent Development Environment for Python Developers".

I just downloaded the trial version of the IDE. Fortunately they provide both M$ Windows and GNU/Linux versions. I downloaded the deb file and installed it on my Ubuntu 9.04 to get a feel for it. It is a nice tool, with testing and debugging facilities. The features are listed here: http://wingware.com/wingide/features .

If you are a web developer you will get Zope and Plone support too. Professional, Personal and Basic versions of the IDE are available.
You get a trial for 10 days, which can be extended up to 10 days.

It is nice, but I uninstalled it, because I don't have a licence and I can't divorce my Vim editor.


Who said that Sanskrit is a "dead language"?

People normally believe that Sanskrit is a "dead language". But it is not, because it is still in use in India as a mother tongue: the mother tongue of Mathur, a village in Karnataka state in India.

Today I watched an advertisement for the Bajaj Discovery bike. Mathur is the location of the ad. Except for one or two sentences, all the dialogue in the ad is in Sanskrit!!!!

Being a student of Sanskrit, I was excited to see an ad with Sanskrit dialogue. I am searching for a YouTube version of the ad.

Sanskrit is a living language.


File handling with 'with' in Python 2.6

I was exploring the online Python documentation and found an interesting way of opening and processing file content.

The usual process for opening a file and getting its content is:

        txt = open("your_file",'r') # Open file
        txtc = txt.read() # Read total content
        txt.close() # close file object

You can avoid the explicit file.close() call if you use the 'with' statement.
Normally we would read and print the content like this:

(1)
        txt = open("my_text.txt",'r') # Open file
        txtcont = txt.read()
        txt.close()
        print txtcont # Printing the contents of the file.

    Or
(2)
        txt = open("my_text.txt",'r') # Open file
        txtcont = txt.readlines() # Read lines from file and store to a list
        txt.close()

        for line in txtcont:
            print line # Prints line by line

The same example can be written in another way (copied from the Python documentation):
(3)
        f = open("hello.txt")
        try:
            for line in f:
                print line
        finally:
            f.close()


Items 2 and 3 can be implemented with 'with' in the following way (in Python 2.6).

    with open("my_file") as txt:
        for line in txt:
            print line

Instead of five or six lines we are using three. Beautiful code!!!

You may ask where the .close() call is. When a file is opened with 'with', it is closed automatically when the 'with' block is exited, even if an exception is raised inside the block. The name (txt here) still exists after the block, but it refers to a closed file.
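
A minimal sketch of that behaviour, written for Python 3 (so print is a function, unlike the Python 2.6 snippets in this post). The file name demo.txt and the temporary directory are made up so that the example is self-contained.

```python
import os
import tempfile

# Create a small throwaway file so the example runs anywhere.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as out:
    out.write("first line\nsecond line\n")

with open(path) as txt:
    lines = [line.rstrip("\n") for line in txt]
    inside = txt.closed    # the file is still open inside the block

outside = txt.closed       # the file was closed when the block exited
print(inside, outside, lines)
```

The data collected inside the block (lines) stays usable afterwards; only the file object itself is closed.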

I played with 'with' to check whether all the operations that can be done with a normal 'open()' are possible. I am posting the code below.

    =======Code Begin ========
    import sys

    warr = []  # List defined outside the 'with' block
    with open(sys.argv[1]) as txt:
        txtc = txt.read()      # Read the whole file
        words = txtc.split()   # Split into words on whitespace
        warr.extend(words)     # Add the words to the outer list

    # The file is closed here, but the list is still available
    for wa in warr:
        print wa

    ======= Code End =========

What I tried here is to open a file with 'with', split it into words, add them to a list defined outside the 'with' block, and print each element of the list.

    =======Code Begin ========
    import sys
       
    with open(sys.argv[1]) as txt:
        txtc = txt.readlines() #Read lines and store to list
        for l in txtc:
            print l # Print each line

    ======= Code End =========

Happy SFD !!!!!
Happy Hacking !!!!!!!!


Meaning vs interpretation: "cattle class" and other MWEs

Today (Sep. 18, 2009) "cattle class" is the most famous compound or multi-word expression (MWE) in India. The phrase has stirred up a lot of controversy in Indian politics. I am not interested in the political discussion of the topic. Being a computational linguist by profession, I am interested in the unit "cattle class" itself.

The phrase "cattle class" is an MWE. So what is an MWE? "A multiword expression (MWE) is a lexeme made up of a sequence of two or more lexemes that has properties that are not predictable from the properties of the individual lexemes or their normal mode of combination." (Wikipedia, http://en.wikipedia.org/wiki/Multiword_expression). In machine translation, MWEs are one of the bottlenecks. For example, if the unit "mango tree" is translated as "mAffA marM" (മാങ്ങാ മരം) in Malayalam, it will be considered a bad translation. The meaning of the whole unit cannot simply be the sum of the meanings of its components. If we interpret or explain the unit through its components, that can be considered a misinterpretation.

In terms of Indian language grammar, these types of word units (MWEs) can be called 'samasa' (समासः) or compounds. If we look, we can find many MWEs in our own language as well as in English, like:
Indian bread
mobile phone
mango tree
toddy tapper
etc.

MWEs are an issue not only in MT but also in human translation. An example is the recent "cattle class" case. If a translator or interpreter does not approach the problem of language translation with sufficient knowledge of the logical structure of meaning in language, it can set off any number of issues. Interestingly, the problem of understanding meaning from an utterance was deeply studied by logicians in ancient India. They called it "sAbdabOdha" (शाब्दबोधः), verbal cognition.

Anyhow, we started with "cattle class"; what does it really mean? I found a Wikipedia link for the phrase: http://en.wikipedia.org/wiki/Economy_class .
According to this page, "cattle class" means economy class in planes or trains. But Wikipedia says that the article needs verification (the article appeared in Wikipedia sometime in 2008). Anyhow, I assume that "cattle class" will take the position of a synonym for 'economy class'. Maybe we can say that it has already happened. Anyhow, we interpreted it!!!!

Finally, MWEs are one of the hottest research topics in Natural Language Processing (e.g. mwe.stanford.edu): a science that studies how to handle phrases like "cattle class" properly.

Even a corpus linguist can investigate the collocation probability or mutual information score of the term "cattle class" in a large English corpus (concordance extraction can reveal the usage context too). From your controversy our research starts!!!!!!
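
As a rough illustration of the mutual information idea, here is a small Python 3 sketch that computes pointwise mutual information (PMI) for a bigram, PMI(x, y) = log2(P(x, y) / (P(x) P(y))), from raw counts. The toy sentence is made up for the example, the function name bigram_pmi is hypothetical, and estimating the probabilities as plain relative frequencies is just one simple choice among several.

```python
import math
from collections import Counter

def bigram_pmi(tokens, first, second):
    """PMI of the bigram (first, second), estimated from relative frequencies."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    pair_count = bigrams[(first, second)]
    if pair_count == 0:
        return float("-inf")  # the two words never occur next to each other
    p_pair = pair_count / (len(tokens) - 1)
    p_first = unigrams[first] / len(tokens)
    p_second = unigrams[second] / len(tokens)
    return math.log2(p_pair / (p_first * p_second))

# Made-up toy text, only for illustration; a real study needs a large corpus.
text = "he flew cattle class and she flew business class to the class reunion"
tokens = text.split()
print(bigram_pmi(tokens, "cattle", "class"))  # positive: the pair co-occurs more than chance
print(bigram_pmi(tokens, "flew", "class"))    # -inf: the pair never occurs
```

A high PMI in a large corpus would suggest that "cattle" and "class" occur together far more often than chance, which is one common signal of a collocation.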

Links pointing to the meaning of "cattle class"

http://livestock.colostate.edu/youth/judging/index.html

en.wikipedia.org/wiki/Travel_class

www.urbandictionary.com/define.php?term=cattle+class


http://www.openwriting.com/archives/2008/02/penned_up_in_ca_1.php

http://travel.ciao.co.uk/Economy_Class__Review_5124862

Note: I am not criticising or justifying anybody. Please avoid misinterpreting this post.


Thoughts on text mining

I was reading the book 'Practical Text Mining with Perl' by Roger Bilisoly, published by Wiley. It is a nice book for beginners to learn text mining through the Perl programming language. Many practical examples are given in the text. Most of the examples are familiar to me, because I have been using Perl and Python for many years. Suddenly I thought: why can't I work out the examples in Python too? Practical text mining with Python. I am not going to write a textbook :-). Just working out the examples in both Perl and Python.

On page 22 of the book the author gives Perl code for extracting words from text. The resulting text contains no punctuation marks. I am reproducing the code here (not the exact code from the book).

===== Code Begin (Perl)=====
#!/usr/bin/env perl

$f = $ARGV[0];
open( FILE, $f ) or die("File not found or can't read\n");
while (<FILE>) {
    chomp;
    @words = split(/\s+/);
    foreach $word (@words) {
        if ( $word =~ /(\w+)/ ) {
            print "$1 \n";
        }
    }
}
===== Code End (Perl)======



The same thing can be implemented in Python in two different ways.

===== Code Begin (Python) ====
#!/usr/bin/env python

import sys
import re

txt = open(sys.argv[1],'r').read()
words = txt.split()

for word in words:
    cword = re.search('\w+',word)
    print cword.group()

===== Code End (Python) ====

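This code has a problem. Suppose the text contains a token made up only of punctuation, such as a stand-alone "--": re.search finds no match and returns None, so the program throws the error "AttributeError: 'NoneType' object has no attribute 'group'". (A word like "gr8one" is not a problem by itself, because letters and digits both match \w.) So it is my error :-).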
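
One way to avoid the error, sketched here in Python 3, is to check the result of re.search before calling .group(). The helper name extract_words and the sample string are only for illustration.

```python
import re

def extract_words(text):
    """Return the first run of word characters from each whitespace-separated token."""
    words = []
    for token in text.split():
        match = re.search(r"\w+", token)
        if match:  # None when the token has no word characters at all
            words.append(match.group())
    return words

print(extract_words('He said: "gr8one" -- really!'))
```

Like the Perl version, this keeps only the first match in each token, so a word such as "don't" comes out as "don".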

So the second implementation is given below.

======= Code Begin (Python) ====
#!/usr/bin/env python

import sys
import string

txt = open(sys.argv[1],'r').read()
for punct in string.punctuation:
    txt = txt.replace(punct," ")

words = txt.split()

for word in words:
    print word
====== Code end (Python) =======

Hey, if you have any suggestions on the programs, please put a comment.

Happy Hacking !!!!!!!



Python: removing punctuation from string/text

While performing text processing one may need to remove punctuation from the text. There is an easy way to do this in Python, using the 'string' module.
e.g.
=== Code Begin ======
#!/usr/bin/env python
import sys
import string

txt = open(sys.argv[1],'r').read()
for punct in string.punctuation:
    txt = txt.replace(punct,"")
print txt

=== Code End ========

The punctuation marks contained in 'string.punctuation' are
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
32 punctuation marks in total (the backslashes before the quote and the backslash are only escapes in the printed form).

If you would like to retain any of the punctuation marks from the above list in your text, you can modify the loop. For example, I would like to retain "%" and "$" in the text. The code for that looks like this:

=== Code Begin ======
#!/usr/bin/env python
import sys
import string

txt = open(sys.argv[1],'r').read()
excludu = ['$','%']  # Punctuation marks to keep
for punct in string.punctuation:
    if punct not in excludu:
        txt = txt.replace(punct,"")
print txt

=== Code End ========
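
In Python 3 the same filtering can be sketched with str.maketrans and str.translate, which remove all the unwanted characters in one pass instead of one replace call per punctuation mark. The function name strip_punctuation and its keep parameter are hypothetical names chosen for this example.

```python
import string

def strip_punctuation(text, keep=""):
    """Remove every character in string.punctuation except those listed in keep."""
    to_remove = "".join(ch for ch in string.punctuation if ch not in keep)
    return text.translate(str.maketrans("", "", to_remove))

print(len(string.punctuation))
print(strip_punctuation("Price: $5 (50% off)!", keep="$%"))
```

Passing keep="" (the default) removes every punctuation mark, which matches the first program above.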

Happy Hacking !!!!!!!


Plotting waveform and spectrogram the pure Python way

In one of my earlier posts we discussed how to plot a spectrogram with 'scikits audiolab' and Python. One of my friends asked me whether it is possible to do it without 'audiolab'. So I started exploring Python's wave reading module and wrote another piece of code to plot the spectrogram and waveform. In this program I also reduced the dependencies. With 'audiolab', 'numpy' and 'struct' were imported in the program; here only the 'pylab' and 'wave' modules are imported (plus 'sys' for the command-line argument). This program plots both the waveform and the spectrogram in the same window.
Here is the code.

=== Code begin ===
#!/usr/bin/env python
import sys
from pylab import *
import wave

def show_wave_n_spec(speech):
    spf = wave.open(speech,'r')
    sound_info = spf.readframes(-1)  # Read all frames as raw bytes
    sound_info = fromstring(sound_info, 'Int16')  # Convert to 16-bit integers

    f = spf.getframerate()  # Sampling rate, needed for the spectrogram

    subplot(211)  # Top panel: waveform
    plot(sound_info)
    title('Waveform and spectrogram of %s' % speech)

    subplot(212)  # Bottom panel: spectrogram
    spectrogram = specgram(sound_info, Fs = f, scale_by_freq=True,sides='default')

    show()
    spf.close()

fil = sys.argv[1]

show_wave_n_spec(fil)
=== Code End ======

To run the program, copy and paste the code into a file sp.py and run: python sp.py your_wav.wav

I just ran the program on a small .wav file; the resulting window shows the waveform in the top panel and the spectrogram in the bottom one.
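
As a side note, if only the raw samples are needed, the standard library alone is enough. The following Python 3 sketch reads a 16-bit mono WAV file with the wave and array modules, with no pylab involved. It assumes 16-bit mono input, the helper name read_samples is made up, and the short test tone is generated only to keep the example self-contained. Plotting a spectrogram would still need something like pylab.

```python
import array
import math
import os
import sys
import tempfile
import wave

def read_samples(path):
    """Return (frame_rate, samples) for a 16-bit mono WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono files")
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    samples = array.array("h", raw)  # signed 16-bit integers
    if sys.byteorder == "big":
        samples.byteswap()           # WAV sample data is little-endian
    return rate, samples

# Write a short 440 Hz test tone so the example does not need an input file.
path = os.path.join(tempfile.mkdtemp(), "tone.wav")
tone = array.array("h", (int(10000 * math.sin(2 * math.pi * 440 * n / 8000))
                         for n in range(800)))
if sys.byteorder == "big":
    tone.byteswap()
with wave.open(path, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(tone.tobytes())

rate, samples = read_samples(path)
print(rate, len(samples))
```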

