Connecting to MySQL without passwd prompt

If you are (like me) annoyed by providing password at every mysql login, you can skip it. Also it makes easier programmatic access to any MySQL db, as not passwd prompting is necessary šŸ™‚
Create `~/.my.cnf` file:

[bash]
[client]
user=username
password="pass"

[mysql]
user=username
password="pass"
[/bash]

And login without `-p` parameter:
[bash]
mysql -h host -u username dbname
[/bash]

If you want to use `~/.my.cnf` file in MySQLdb, just connect using this:
[python]
import MySQLdb
cnx = MySQLdb.connect(host=host, port=port, read_default_file="~/.my.cnf")
[/python]

Advertisements

Get gene names for set of RefSeq IDs

Today I needed to annotate set of RefSeq IDs in .bed file with gene names.
Firstly, I was looking for a way to get gene names for RefSeq IDs. I have found simple solution on BioStars.

[bash]
mysql –user=genome -N –host=genome-mysql.cse.ucsc.edu -A -D danRer10
-e "select name,name2 from refGene" > refSeq2gene.txt
[/bash]

Secondly, Iā€™ve written simple Python script to add the gene name to .bed file in place of score which I donā€™t need at all.

[python]
#!/usr/bin/env python
# Add gene name instead of score to BED file

# USAGE: cat bed | bed2gene.py refSeq2gene.txt > with_gene_names.bed

import sys

fn = sys.argv[1]
refSeq2gene = {}
for l in open(fn):
refSeq, gene = l[:-1].split(‘t’)
refSeq2gene[refSeq] = gene

sys.stderr.write(" %s accesssions loaded!n"%len(refSeq2gene))

for l in sys.stdin:
ldata = l[:-1].split(‘t’)
chrom, s, e, refSeq = ldata[:4]
if refSeq in refSeq2gene:
ldata[4] = refSeq2gene[refSeq]
else:
ldata[4] = "-"
sys.stdout.write("t".join(ldata)+"n")
[/python]

Hope, some will find it useful.

Fetching from MySQL in batches

Today, I have encountered problems during fetching large MySQL table. Simply, the server was getting out of memory due to largeĀ table size.

I knew, you can fetch subset of dataĀ from MySQL tableĀ using LIMIT. Standard `LIMIT N` returns first N rows of your table.

However there is second way of calling LIMIT, `LIMIT S, N`; where S is the starting row and N is total number of rows to be returned. Thus, you can easily plug it into FOR LOOPĀ to fetch from large MySQL table in for example 1 million rows batches:

step=1000000
t=large_table_name
user=mysql_username
pass=mysql_pass
db=database_name
#get row count
c=`mysql -u$user -p$pass $db -N -A -e"select count(*) from $t"`
echo `date` $t "with $c rows";
#write table header
echo -e $header | gzip > $t.txt.gz
#get table in batch
for i in `seq 1 $step $c`; do
  cmd="select * from $t limit $i, $step"
  echo -en "\r$i / $c"
  mysql -u$user -p$pass $db -A -N -e"$cmd" | gzip >> $t.txt.gz
done
echo ""

Hope, someone will find it useful:)