Quick MySQL to CouchDB migration with Python

I used to play a lot with text databases. Today I was thinking of migrating some of my data collections to CouchDB. I used the following script to convert one of my DB tables (almost all fields are TEXT) to a CouchDB collection.

#!/usr/bin/env python
import couchdb
import MySQLdb as mdb

# Connect to the default local CouchDB server and create the target database
couch = couchdb.Server()
db = couch.create('YOUR_COLLECTION_NAME')

# DictCursor returns each row as a {column: value} dict
con = mdb.connect(host='HOST_NAME', user='YOU', passwd='YOUR_PASS', db='YOUR_DB')
cur = con.cursor(mdb.cursors.DictCursor)
cur.execute("SELECT * FROM YOUR_DB_TABLE")
results = cur.fetchall()

# Save each row as one CouchDB document
for result in results:
    db.save(result)

The DictCursor in the Python MySQLdb API was a great help in creating the fields and values of the CouchDB collection. As my table contained only text data, the operation went smoothly and I was able to migrate about 1 GB of data to CouchDB. But life is not that easy! If your text data has encoding issues or junk values that can't be converted to Unicode, you are in trouble. Don't worry, here comes the solution: replace the last two lines of the code with the code given below.

for result in results:
    # repr() turns every value into its printable string form
    # (note that numbers become strings too)
    k = result.keys()
    v = [repr(i) for i in result.values()]
    d = dict(zip(k, v))
    db.save(d)
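
A gentler alternative to calling repr() on everything, sketched here under the assumption that the troublesome values arrive as raw byte strings: decode only those, substituting U+FFFD for bytes that cannot be decoded, and leave every other value untouched. The helper names `to_text` and `clean_row` and the UTF-8 default are illustrative and not part of the original script.

```python
def to_text(value, encoding="utf-8"):
    """Decode byte strings leniently; return any other value unchanged."""
    if isinstance(value, bytes):
        # errors="replace" substitutes U+FFFD for bytes that cannot be decoded
        return value.decode(encoding, errors="replace")
    return value


def clean_row(row, encoding="utf-8"):
    """Return a copy of a DictCursor-style row with byte values decoded."""
    return {key: to_text(value, encoding) for key, value in row.items()}
```

With this in place, the loop body would become `db.save(clean_row(result))`, and numeric fields would keep their original types.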

Hmm, so far so good. But then I tried the same code on a different table with the following structure:

+-------+--------------+------+-----+---------+----------------+
| Field | Type         | Null | Key | Default | Extra          |
+-------+--------------+------+-----+---------+----------------+
| ID    | int(11)      | NO   | PRI | NULL    | auto_increment |
| NAME  | varchar(30)  | NO   |     |         |                |
| PRICE | decimal(5,2) | NO   |     | 0.00    |                |
+-------+--------------+------+-----+---------+----------------+

Now the code threw a long list of errors. Life is not easy! I have to find a good solution for this ... Happy hacking!
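
One possible cause of these errors, and this is an assumption since the traceback is not shown, is the PRICE column: MySQLdb returns DECIMAL values as decimal.Decimal objects, which Python's standard json module cannot serialize, and CouchDB documents are JSON. A minimal sketch of a workaround converts such values before saving. The helper name `jsonable_row` is hypothetical, and storing decimals as strings is just one choice (it avoids float rounding).

```python
import datetime
import decimal
import json


def jsonable_row(row):
    """Return a copy of a row dict whose values json.dumps can encode."""
    converted = {}
    for key, value in row.items():
        if isinstance(value, decimal.Decimal):
            # Keep the exact decimal digits by storing the value as a string
            value = str(value)
        elif isinstance(value, (datetime.date, datetime.datetime)):
            # Dates and datetimes become ISO 8601 strings
            value = value.isoformat()
        converted[key] = value
    return converted


# Made-up sample row shaped like the table above
row = {"ID": 1, "NAME": "pen", "PRICE": decimal.Decimal("9.50")}
encoded = json.dumps(jsonable_row(row), sort_keys=True)
```

In the migration loop this would be used as `db.save(jsonable_row(result))`.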


CSV to CouchDB data importing, a Python hack

Last month I was playing with Apache CouchDB: just some introductory stuff, map-reduce and so on. Soon afterwards I received some linguistic data in CSV format as part of a project I was managing, and it needed to be analyzed. Usually we used MySQL or spreadsheets to store and analyze such data. Then I thought: why can't I do it with CouchDB? There was no direct option for importing CSV data into CouchDB. I searched the web and ended up with a hint. Manilal, a friend of mine, also pointed me to the same hint: http://www.apacheserver.net/Load-CSV-file-into-couchdb-at1056996.htm .

Soon I wrote a small script to do the job, that is, to load a CSV file into CouchDB. The script is available in my Bitbucket repo: https://bitbucket.org/jagan/misc/src/84cefb61c86a/csv2couch.py . It is a quick solution; you may well have a better version! I thought putting it on the web might help somebody else.
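
As a rough illustration of the idea (a sketch, not the script from the repository above), Python's csv.DictReader can turn each CSV row into a dict keyed by the header line, and each dict can then be saved as one document. The function name `load_csv` and the assumption that the first line holds the column names are mine; `db` is anything with a `save(doc)` method, such as a couchdb-python database object.

```python
import csv
import io


def load_csv(fileobj, db):
    """Save every row of a CSV file object as one document; return the count."""
    count = 0
    # DictReader uses the first line as the field names for every later row
    for row in csv.DictReader(fileobj):
        db.save(dict(row))
        count += 1
    return count


# Stand-in for a CouchDB database so the sketch runs without a server
class FakeDB(object):
    def __init__(self):
        self.docs = []

    def save(self, doc):
        self.docs.append(doc)


# Made-up sample data, only to exercise the function
fake = FakeDB()
saved = load_csv(io.StringIO("word,tag\nrun,VERB\ntree,NOUN\n"), fake)
```

Against a real server, one would pass something like `couchdb.Server()['YOUR_COLLECTION_NAME']` as `db` and an opened CSV file as `fileobj`.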


Happy Hacking !!!


New book by Packt: MySQL for Python

Packt Publishing provided me with an e-book copy of their new book "MySQL for Python" by Albert Lukaszewski. I will be reading the book during my Deepavali holidays and will post a review of it here.

They also provided a link to a free chapter (Chapter 4, Exception Handling). If you are interested in Python and MySQL, feel free to read it.

