BioPython Me!

“I am quite concerned about the API [NCBI’s EUtils] that you’re planning to use! I think we should give it a try before we start working on the database and the interface! Give me the URL! ‘PyPI’s EUtils’?” Those were my supervisor’s words at ITI, as you might have guessed!

We set up EUtils in Python and tried the example in the README file, and a very sticky error faced us: “File “/usr/local/lib/python2.6/dist-packages/EUtils/”, line 25, in _load_module mod = __import__(name) ImportError: No module named DTDs.pubmed_020114”.

>>> from EUtils import DBIdsClient
>>> import EUtils
>>> from EUtils import DBIdsClient #repeated line
>>> import EUtils #repeated line
>>> dbids = EUtils.DBIds("pubmed", ["9390282"])
>>> pom = DBIdsClient.from_dbids(dbids).fetch() #efetch() is the correction
>>> print pom #print is the correction

My supervisor refused to give up and gave it another shot, using efetch() and printing pom, and a logical error hit us: “<addinfourl at 142981388 whose fp = <socket._fileobject object at 0x8827f2c>>”!
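A note on that last message: `<addinfourl at ...>` is most likely not an error at all. In Python 2, that is how a urllib response object prints, and the data only shows up once the handle is read. Here is a minimal sketch of the difference, written for Python 3 and using `io.BytesIO` as a stand-in for the network handle (an assumption for illustration; the payload is made up and no request is sent):

```python
import io

# Stand-in for the file-like handle a network call returns (payload is made up).
handle = io.BytesIO(b"<PubmedArticleSet>...</PubmedArticleSet>")

# Printing the handle shows only the object's repr, not the data.
handle_repr = repr(handle)
print(handle_repr)

# Reading the handle is what returns the actual content.
content = handle.read().decode("utf-8")
print(content)
```

If the same holds for the EUtils client, calling `.read()` on `pom` before printing would have shown the record itself.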

After that, I tried to summarize the “central dogma” to him and to a colleague: “Imagine a class ‘rhodopsin’: you can say new rhodopsin() in the eye, or just not create a new instance in the heart!” Then we decided to install Biopython, the happiest moment of my day! I would like to say that Biopython is not just an API for NCBI databases; it’s way more than that.
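The rhodopsin analogy can be written out as a toy class. This is only a sketch of the metaphor (the gene is the class definition, the protein is an instance), not a biological model; the `express` helper and the "eye only" rule are assumptions made up for illustration:

```python
class Rhodopsin:
    """Toy 'gene as a class': the definition exists in every cell."""

    def __init__(self, tissue):
        self.tissue = tissue


def express(tissue):
    """Return a Rhodopsin instance only where the gene is switched on."""
    # Assumption for illustration: rhodopsin is expressed in the eye only.
    if tissue == "eye":
        return Rhodopsin(tissue)
    return None


eye_protein = express("eye")      # instance created: the protein is made
heart_protein = express("heart")  # no instance: the gene stays silent
print(eye_protein is not None, heart_protein is None)
```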


We downloaded the Biopython-1.57 source code and ran our first test before even reading the documentation! It’s amazing! You can connect to NCBI databases at any time, with any query, to grab data by ID or keywords, which we then insert into our database. You may also want to install BioSQL (to create a database for biological data on different DBMSs), and you have to install the NumPy package.

My script (compiled from the tutorial):

#!/usr/bin/python
# The line above is the magic line suggested by my supervisor
from Bio import Entrez
Entrez.email = '' # address omitted here; one reason to write a script was to avoid typing my email each time I run something
#handle = Entrez.esearch(db="pubmed", term="biopython")
#record = Entrez.read(handle)
#print(record["IdList"])
handle = Entrez.einfo(db="pubmed")
record = Entrez.read(handle)
#print(record["DbInfo"].keys())
##Returns: [u'Count', u'LastUpdate', u'MenuName', u'Description', u'LinkList', u'FieldList', u'DbName']
for field in record["DbInfo"]["FieldList"]:
    print("%(Name)s, %(FullName)s, %(Description)s" % field)
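For the curious, the E-utilities are plain HTTP endpoints that answer in XML, so the keyword search commented out in the script can be sketched with the standard library alone. This is a rough Python 3 illustration, not how Biopython does it internally: the helper names `build_esearch_url` and `parse_id_list` are hypothetical, and the XML is a hand-written sample shaped like an ESearch reply (reusing the two PMIDs mentioned in this post), not real NCBI output. No request is sent.

```python
import urllib.parse
import xml.etree.ElementTree as ET

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def build_esearch_url(db, term, email=""):
    """Build an ESearch URL for the given database and search term."""
    params = {"db": db, "term": term}
    if email:
        params["email"] = email
    return EUTILS_BASE + "esearch.fcgi?" + urllib.parse.urlencode(params)


def parse_id_list(xml_text):
    """Extract the IDs from an ESearch-style XML document."""
    root = ET.fromstring(xml_text)
    return [id_node.text for id_node in root.findall("./IdList/Id")]


# Hand-written sample shaped like an ESearch reply, not real NCBI output.
SAMPLE_XML = """<eSearchResult>
  <Count>2</Count>
  <IdList>
    <Id>19304878</Id>
    <Id>9390282</Id>
  </IdList>
</eSearchResult>"""

url = build_esearch_url("pubmed", "biopython")
ids = parse_id_list(SAMPLE_XML)
print(url)
print(ids)
```

In practice `Entrez.esearch` plus `Entrez.read` is the more comfortable route, since the parsed record can be indexed directly, as in `record["IdList"]`.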

Further readings:

- Biopython: freely available Python tools for computational molecular biology and bioinformatics [PMID: 19304878]


About Mariam Rizkallah

Hey, this is Mariam Reyad Rizkallah, a graduate of the Faculty of Pharmacy, Cairo University (FOPCU). I just finished my software engineering diploma and am currently a Biotechnology Master's student at the American University in Cairo. A bit interested in tropical infectious diseases, bioinformatics and phage genomics, but you can call me "Passepartout", because I "can be interested" in everything!! In my first years in college, I never imagined that one day I would read or write about science. Now, here I am, among an amazing team that started by writing about science and is now doing science, led by Dr. Ramy Aziz from FOPCU. My favorite m.o. is P. falciparum; I like it because I like Ronald Ross, and I think that he was distracted someday, just like me! Things I can't stand and want to fight: unemployment, child abuse, inaccessibility to medicine/healthcare and inaccessibility to knowledge. Favorite bands!? I listen to Linkin Park, Coldplay, Eminem, Bon Jovi, Fall Out Boy, The Killers and Chris Cornell.