My Geeky Journal: 2011

Friday, 25 November 2011

WordPress get_terms with parent id returns an empty array

I stumbled on what looked like a bug (most likely in my own code). I spent the last hour trying to understand why this code:

$subterms = get_terms("customtax",
   array(
     "hide_empty"=>0,
     'parent' => $parent_term->term_id,
   )
  );

returned an empty array, even though I was sure my custom taxonomy had lots of child terms.
Digging into the WordPress code, I found that get_terms calls a function named "_get_term_hierarchy".

This function uses a simple cache. It looks for an option named "customtax_children" in the WordPress database and builds it only if the option does not exist yet.

In my database that option existed but held an empty array, stored as the serialized string "a:0:{}", so it was never rebuilt.

All I had to do was delete that option to force WP to repopulate the cache:

delete from wp_options where option_name='customtax_children';

I hope this helps somebody.

Wednesday, 26 October 2011

Gaussian filter Python implementation

This post is, hopefully, part of a bigger tutorial about edge detection. My final goal is to implement a Canny edge detector in Python; it's just an exercise to get a better understanding of the subject.

The first step of the Canny algorithm is to apply a Gaussian filter to the image, in order to get rid of some of the noise that would make edge detection harder.

I used this guide as a reference.

One-dimensional window

Here is the algorithm that applies the Gaussian filter to a one-dimensional list. The first step is to calculate the window weights; then, for every element in the list, we place the window over it, multiply the covered elements by their corresponding weights and sum them up.


from math import exp

def get_window_weights(N):
    # squared support points for the 2N+1 offsets, spread over [-3, 3]
    support_points = [(float(3 * i)/float(N))**2.0 for i in range(-N,N + 1)]
    gii_factors = [exp(-(i/2.0)) for i in support_points]
    # normalise so that the weights sum to 1
    ki = float(sum(gii_factors))
    return [giin/ki for giin in gii_factors]

def apply_filter(index,array,window):
    # integer division, so N can be used as a list index
    N = (len(window)-1)//2
    # pad both ends with the edge values to avoid index errors
    array_l = [array[0] for i in range(N)] + array + [array[-1] for i in range(N)]
    return sum(
            (float(array_l[N + index + i]) * window[N+i]
            for i in range(-N,N+1)
            )
            )

def gaussian_filter(data,window_weights,filter_func = apply_filter):
    ret = []
    for i in range(len(data)):
        ret.append(filter_func(i,data,window_weights))
    return ret
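
As a quick sanity check of the one-dimensional idea, here is a minimal self-contained sketch (not the code above, just the same formula rewritten compactly). The names window_weights and smooth and the sample step signal are made up for this example; edge handling clamps the index, which has the same effect as repeating the first and last element.

```python
from math import exp


def window_weights(n):
    """Normalised Gaussian weights for the offsets -n..n."""
    raw = [exp(-((3.0 * i / n) ** 2) / 2.0) for i in range(-n, n + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def smooth(values, weights):
    """Weighted sum of the neighbours of every element, edges replicated."""
    n = (len(weights) - 1) // 2
    last = len(values) - 1
    out = []
    for idx in range(len(values)):
        acc = 0.0
        for off in range(-n, n + 1):
            j = min(max(idx + off, 0), last)  # clamp = repeat edge value
            acc += values[j] * weights[n + off]
        out.append(acc)
    return out


weights = window_weights(2)
step = [0, 0, 0, 10, 10, 10]     # hypothetical sample: a sharp edge
smoothed = smooth(step, weights)
print(smoothed)
```

The weights add up to 1, so flat regions keep their value and only the sharp step in the middle gets spread over its neighbours.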

In order to apply the filter to images, we need a function that can work on pixels. The basic idea is to execute the same operations on every component of the color.


from functools import reduce

def sum_filtered_pixels(pix1,pix2):
    # component-wise sum of two pixels
    return tuple([pix1[i]+pix2[i] for i in range(len(pix1))])

def apply_filter_to_pixel(index,array,window):
    N = (len(window)-1)//2
    # pad both ends with the edge pixels to avoid index errors
    array_l = [array[0] for i in range(N)] + array + [array[-1] for i in range(N)]
    # weight every colour component, then add the weighted pixels together
    return reduce(sum_filtered_pixels,
            ( tuple([float(v) * window[N+i] for v in array_l[N + index + i]])
            for i in range(-N,N+1)
            )
            )

Two-dimensional window

Obviously, we need to apply the filter to an image, so we need it to work with a two-dimensional window. As explained in the guide, we can use a divide-and-conquer approach and reuse the one-dimensional algorithm. All we have to do is run the 1D Gaussian filter over all the pixel rows, and then again over all the columns.


def gaussian_filter_2d(matrix,window_weights,filter_func = apply_filter):
    new_matrix = []
    #apply 1d gaussian filter line by line
    for i in range(len(matrix)):
        new_matrix.append(gaussian_filter(matrix[i],window_weights,filter_func))
    #then apply it again column by column, on the already filtered rows
    for i in range(len(matrix[0])):
        temp_list = gaussian_filter([new_matrix[t][i] for t in range(len(matrix))],
                                    window_weights,filter_func)
        for t in range(len(matrix)):
            new_matrix[t][i] = temp_list[t]
    return new_matrix

from PIL import Image

def gaussian_blur(img_in,img_out,window_size):
    # convert to RGB so that every pixel is a 3-component tuple
    img = Image.open(img_in).convert("RGB")
    width,height = img.size
    window_weights = get_window_weights(window_size)
    # build a matrix of pixels (list of rows) and filter it
    pixmap = gaussian_filter_2d([
                                 [img.getpixel((w,h)) for w in range(width)]
                                 for h in range(height)
                                ],
                                window_weights,
                                apply_filter_to_pixel)

    # write the filtered (float) components back as integers
    new_image = Image.new("RGB",(width,height))
    for h in range(height):
        for w in range(width):
            new_image.putpixel(
                          (w,h),(int(pixmap[h][w][0]),
                                int(pixmap[h][w][1]),
                                 int(pixmap[h][w][2]))
                        )
    new_image.save(img_out)

if __name__ == '__main__':
    gaussian_blur('wombat.jpg',"wombat_blurred.jpg",5)

Here is the result:
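
As a side note on the rows-then-columns trick: it works because the two-dimensional window is just the product of two one-dimensional windows. The sketch below is a small self-contained check of that claim, not the code used for the picture. The function names, the 3-element weights and the tiny 4x4 "image" are all made up for the example.

```python
def clamp(i, upper):
    """Keep an index inside 0..upper (same effect as repeating edge values)."""
    return min(max(i, 0), upper)


def blur_1d(values, weights):
    n = (len(weights) - 1) // 2
    last = len(values) - 1
    return [
        sum(values[clamp(i + k, last)] * weights[n + k] for k in range(-n, n + 1))
        for i in range(len(values))
    ]


def blur_separable(matrix, weights):
    """1D pass over every row, then 1D pass over every column."""
    rows = [blur_1d(row, weights) for row in matrix]
    cols = [blur_1d(list(col), weights) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def blur_direct(matrix, weights):
    """Full 2D window: the weight of cell (a, b) is weights[a] * weights[b]."""
    n = (len(weights) - 1) // 2
    h, w = len(matrix), len(matrix[0])
    return [
        [
            sum(
                weights[n + a] * weights[n + b]
                * matrix[clamp(y + a, h - 1)][clamp(x + b, w - 1)]
                for a in range(-n, n + 1)
                for b in range(-n, n + 1)
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


weights = [0.25, 0.5, 0.25]          # hypothetical normalised window
image = [
    [0, 0, 0, 0],
    [0, 8, 8, 0],
    [0, 8, 8, 0],
    [0, 0, 0, 0],
]
separable = blur_separable(image, weights)
direct = blur_direct(image, weights)
```

Both functions should give the same matrix up to floating point rounding, but the separable version does about 2 * (2N+1) multiplications per pixel instead of (2N+1) squared.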

Monday, 24 October 2011

HowTo: avoid 'Complete surveys to unblock the website'

Just a stupid hint.

Whenever you land on a page that blocks you until you complete a survey, and all you need is a simple link on the blocked page (say, a megavideo link), just do this:

Go to the page source (CTRL + U on Google Chrome);
search for the link in the source (CTRL + F on Google Chrome), using some keyword (for example "megavideo.com");
copy/paste the link (inside <a href="***">);
enjoy.
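
If you have saved the page source and would rather not search by hand, the same steps can be sketched with Python's standard html.parser module. This is only an illustration: LinkFinder, find_links and the sample HTML string are made-up names and data, and it only looks at href attributes of <a> tags.

```python
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Collects the href of every <a> tag that contains a keyword."""

    def __init__(self, keyword):
        super().__init__()
        self.keyword = keyword
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and self.keyword in value:
                self.links.append(value)


def find_links(page_source, keyword):
    finder = LinkFinder(keyword)
    finder.feed(page_source)
    return finder.links


# hypothetical page source, just to show the call
sample = (
    '<div><a href="http://example.com/survey">survey</a>'
    '<a href="http://megavideo.com/?v=ABC123">watch</a></div>'
)
links = find_links(sample, "megavideo.com")
print(links)
```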

Thursday, 11 August 2011

Simple way to solve clang include errors on Ubuntu 11.04

If clang gives you errors such as:

/usr/include/linux/errno.h:4:10: fatal error: 'asm/errno.h' file not found

just set the include path manually so that it can find the missing headers. For me (Kubuntu 11.04, 64 bit) it worked with:

clang -I/usr/include/x86_64-linux-gnu/

Tuesday, 26 July 2011

An mqueue example that actually works

As the title says, I couldn't find an example of a simple application using POSIX mqueues that worked under Ubuntu 11.04. There are mainly three pitfalls:

Queue names must begin with a "/".
When you set the O_CREAT flag you MUST also pass the mode and an mq_attr structure to mq_open.
The mq_attr structure parameters must (obviously) fit your system constraints.

So, here's a slightly different mqueue send example:

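#include <fcntl.h>      /* O_* constants */
#include <sys/stat.h>   /* mode constants */
#include <mqueue.h>
#include <stdio.h>
#include <stdlib.h>

#define PMODE 0666
#define MSGSIZE 256

/* link with -lrt */
int main (){
      int i;
      int status;
      struct mq_attr attr;
      mqd_t mqfd;

      /* these values must fit the system limits (see /proc/sys/fs/mqueue/) */
      attr.mq_maxmsg = 10;
      attr.mq_msgsize = MSGSIZE;
      attr.mq_flags   = 0;
      attr.mq_curmsgs = 0;
      mqfd = mq_open ("/my_queue3", O_WRONLY|O_CREAT,PMODE,&attr);
      if (mqfd == (mqd_t)-1)
        {
          perror("couldn't open mqueue");
          exit(1);
        }

      /* send ten short messages with priority 0 */
      for (i=0; i<10; i++)
        {
          status = mq_send(mqfd,"ciao",4,0);
          if (status == -1){
              perror("mq_send failure on mqfd");
          }
          else{
              printf("successful send, i = %d\n",i);
          }
        }

      mq_close(mqfd);
      return 0;
}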

Sunday, 12 June 2011

Digest HTTP authorization on SOAP services with Suds

It took me some time to figure out how to access a SOAP service protected by HTTP digest authentication with urllib2 and suds. Here's what I came up with.

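import urllib2
from suds.client import Client

# Digest credentials. A realm of None only matches when the password
# manager falls back to a default realm, so we pass one explicitly.
ah = urllib2.HTTPDigestAuthHandler(urllib2.HTTPPasswordMgrWithDefaultRealm())
password = "mypass"
ah.add_password(None,'http://www.example.com/','username',password)
# opener used by plain urllib2 calls
urllib2.install_opener(urllib2.build_opener(ah))

url = "http://www.example.com/wsdl/"
client = Client(url)

# make the suds transport use the digest-aware opener as well
client.options.transport.urlopener = urllib2.build_opener(ah)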

Sunday, 15 May 2011

Redis: how to support "startswith" queries

While working on django_db_indexer I wanted to implement an indexing system that could take full advantage of Redis data types.

In this post I'll explain how django_db_indexer supports startswith queries using Redis zsets. Feel free to write to me if you think of a better way.

Suppose you have some objects you want to store in a Redis database, for example:


class Musician(object):
  def __init__(self,name,group):
    self.name = name
    self.group = group

We want to be able to retrieve all the musicians whose name starts with a certain string.

To do so, I took inspiration from this post on Salvatore Sanfilippo's blog:

http://antirez.com/post/autocomplete-with-redis.html

We'll be using Redis sorted sets (zsets). A zset is a set of string values ordered by a score.

When two or more values share the same score, they are ordered lexicographically.

We'll store some musicians:


> hmset musicians:1 name "Julian Casablancas" group Strokes
> hmset musicians:2 name "Thom Yorke" group Radiohead
> hmset musicians:3 name "James Murphy" group "LCD Soundsystem"
...

For every record, we'll save the musician's name, followed by a separator and the record id, in a Redis zset like this:


> zadd musician_name_startswith 0 "Julian Casablancas_1"
> zadd musician_name_startswith 0 "Thom Yorke_2"
> zadd musician_name_startswith 0 "James Murphy_3"
> zrange musician_name_startswith 0 -1
1. "James Murphy_3"
2. "Julian Casablancas_1"
3. "Thom Yorke_2"

Now, if we want to find all the musicians whose name starts with "J", we simply follow these steps:

1) add the string "J" to our zset with score 0

2) find its rank with the "zrank" command

3) retrieve all the elements of the zset whose rank is greater than the rank of "J", using zrange

4) remove "J" from the set

For this method to work we need a transaction with "WATCH" on our zset, because if another record is inserted into the zset in the meantime, the rank of the lookup string may change.

We can now simply iterate over the list, stopping at the first string that doesn't start with "J", and get each musician's id by splitting the string on "_".

While zadd and zrank have logarithmic complexity, zrange is O(log(N)+M), where M is the number of elements returned (N in the worst case). To reduce M you could add a second string, obtained by replacing the last character of the lookup with the following one (e.g. for our "J" it would be "K"), and retrieve only the records whose rank lies between the two strings.

Here's the code:


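from redis import WatchError

SEPARATOR = '_'

def startswith(lookup,conn):
    # conn is a redis-py client; the zadd call below uses the redis-py 3+
    # mapping form, and decode_responses=True is assumed so members are str
    key = "musician_name_startswith"
    # temporarily insert the lookup string so we can ask for its rank
    conn.zadd(key,{lookup: 0})
    with conn.pipeline() as pipeline:
        while True:
            try:
                # if the zset changes before execute(), WatchError is raised
                pipeline.watch(key)
                up = pipeline.zrank(key,lookup)
                pipeline.multi()
                pipeline.zrange(key,up+1,-1)
                pipeline.zrem(key,lookup)
                res = pipeline.execute()
                return result_generator(lookup,res[0])
            except WatchError:
                # somebody touched the zset: the rank may be stale, retry
                continue

def result_generator(lookup,l):
    for i in l:
        # members are sorted, so the first mismatch ends the matches
        if not i.startswith(lookup):
            break
        yield i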
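
To see the idea without a Redis server, here is a rough in-memory sketch of the same lookup, including the "next string" trick for the upper bound. A sorted Python list stands in for the zset (all scores equal, so the order is lexicographic) and bisect stands in for zrank. The helper names are made up, the lookup is assumed to be non-empty, and plain ASCII names are assumed so that Python's string order matches the byte order Redis uses.

```python
from bisect import bisect_left

SEPARATOR = "_"


def next_prefix(prefix):
    """Same string with the last character bumped by one ("J" -> "K")."""
    return prefix[:-1] + chr(ord(prefix[-1]) + 1)


def startswith(members, lookup):
    """members must be sorted; returns every member starting with lookup."""
    low = bisect_left(members, lookup)                 # rank "J" would get
    high = bisect_left(members, next_prefix(lookup))   # rank "K" would get
    return members[low:high]


members = sorted([
    "Julian Casablancas_1",
    "Thom Yorke_2",
    "James Murphy_3",
])
matches = startswith(members, "J")
# the record id is whatever follows the last separator
ids = [int(m.rsplit(SEPARATOR, 1)[1]) for m in matches]
print(matches, ids)
```

With both bounds known, only the matching slice is read, which is the point of adding the second string in the zset version.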

Monday, 2 May 2011

Playing with Redis scripting branch

The new Redis scripting feature is what I was looking for to reduce the network latency bottleneck.

- Store something in a key based on the result of an incr call:

./redis-cli eval "t = redis('incr','x')
> redis('set','key_' .. t,'value')
> return 'key_' .. t" 0

This is especially useful when you're storing a complex object with a unique id using the Redis hash type:

./redis-cli eval "t = redis('incr','user_id')
> redis('hmset','user:' .. t,'username','user','password','pass')
> return t" 0
(integer) 2

./redis-cli hgetall user:2
1) "username"
2) "user"
3) "password"
4) "pass"

- Startswith index (I'll talk about it in an upcoming post):

 
./redis-cli zadd startswith_index 0 anthony
./redis-cli zadd startswith_index 0 bert
./redis-cli zadd startswith_index 0 carl
./redis-cli zadd startswith_index 0 carlton
./redis-cli zadd startswith_index 0 peter

./redis-cli eval "redis('zadd','startswith_index','0','car')
> r = redis('zrank','startswith_index','car')
> ret = redis('zrange','startswith_index',tonumber(r)+1,'-1')
> redis('zrem','startswith_index','car')
> return ret" 0
1) "carl"
2) "carlton"
3) "peter"

MySQL and commas in decimal numbers

Some time ago I was working on a mess of a site: I had to fix some issues to keep it working long enough to rewrite it from scratch. One of these issues involved a query that had to be ordered by a text field containing decimal numbers with commas instead of points.

In my first attempt I simply used CAST to convert the text to a decimal, something like:

ORDER BY CAST( price as decimal(10,2))

Unfortunately, MySQL doesn't like commas at all: it just "cuts" the number at the first one it finds. To get my result set ordered properly I had to get rid of the commas, so I used REPLACE:

ORDER BY CAST( replace(price,',','.') as decimal(10,2))

This tip is also useful when you have decimals with commas you want to display with points and vice versa.
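
The same problem shows up if you sort such values in application code instead of in the query. A minimal sketch in Python, with a made-up list of prices and a hypothetical parse_price helper:

```python
from decimal import Decimal


def parse_price(text):
    """Turn a comma-decimal string such as '9,99' into a Decimal."""
    return Decimal(text.replace(",", "."))


prices = ["10,50", "9,99", "100,00", "2,5"]    # hypothetical column values

as_text = sorted(prices)                       # plain string ordering
as_numbers = sorted(prices, key=parse_price)   # numeric ordering

print(as_text)
print(as_numbers)
```

Sorting the raw strings puts "100,00" before "2,5", while sorting on the converted value gives the order you would expect from the numbers.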

Thursday, 28 April 2011

The very basics of Facebook apps on Django

Working on a Facebook app I discovered that:

Pyfacebook doesn't work anymore.
Other projects are far too complicated for my app (or maybe I'm just lazy).
Most of the tutorials I could find on the internet were badly out of date.

So, here's all you need to retrieve the user data you want from Facebook using Django.

All you need is the Facebook Python SDK.

Thanks to Sulil Arora for sharing this snippet, it helped a lot.

Just create a Django app, create a view using the following code and set it as your Facebook app's canvas page.

You can find a project using this piece of code here

import base64
from django.http import HttpResponse, HttpResponseRedirect
from django.views.decorators.csrf import csrf_exempt
from django.utils import simplejson
import facebook

APP_ID = 'my_app_id'

### CHECK SIGNED REQUEST
import hmac
import hashlib
APP_SECRET = "my_app_secret"

@csrf_exempt  #or post requests will be blocked
def canvas(request):
 #Facebook sends base64 encoded data via POST

 sr = request.POST['signed_request']
 #We check the signature and retrieve basic user data
 data = parse_signed_request(sr,APP_SECRET)
 if data is None:
   # Algorithm is unknown or the signed request
   #doesn't come from Facebook. You should handle this.
   return HttpResponse('ERROR')

 user_id = data.get('user_id',False)
 oauth_token = data.get('oauth_token',False)
 # If the user has authorized the app, data contains
 # user_id and oauth_token. If not, we need to redirect
 # the user to the Facebook auth dialog.

 if not user_id or not oauth_token:
   # Need authorization for more data:
   # redirect to the Facebook OAuth dialog
   # using a script (see Facebook docs)
   redirect_url =   "https://www.facebook.com/dialog/oauth?"+\
      "client_id=%s&redirect_uri=%s"\
       % (APP_ID,'http://www.example.com/canvas/')
   
   return HttpResponse(
                  "<script> top.location.href='%s' </script>"\
                    % redirect_url
                      )

 f = facebook.GraphAPI(oauth_token)
 user_data = f.get_object(user_id)

 ### Use user's data as needed

 return HttpResponse('OK')

def base64_url_decode(inp):
 """
 Cleans a base64 encoded string and returns
 it decoded
 """
 padding_factor = (4 - len(inp) % 4) % 4
 inp += "="*padding_factor
 return base64.b64decode(
           unicode(inp).translate(
                  dict(zip(
                       map(ord, u'-_'),
                       u'+/'
                      ))
                   )
                   )

def parse_signed_request(signed_request, secret):
 """
 Check if the received signed_request is really coming
 from Facebook using your app secret key.
 Returns a dict containing facebook data or None if it fails.
 """
 l = signed_request.split('.', 2)
 encoded_sig = l[0]
 payload = l[1]

 sig = base64_url_decode(encoded_sig)
 data = simplejson.loads(base64_url_decode(payload))

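 # a missing algorithm is treated like an unknown one
 if data.get('algorithm', '').upper() != 'HMAC-SHA256':
   return None

 # recompute the signature of the payload with our app secret
 expected_sig = hmac.new(secret,
                   msg=payload,
                  digestmod=hashlib.sha256).digest()

 # the request is genuine only if the two signatures match
 if sig != expected_sig:
   return None
 return data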
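
The signature check itself doesn't depend on Django or on the Facebook SDK, so it can be tried in isolation. Below is a rough Python 3 sketch of the same check using only the standard library. It is not a drop-in replacement for the view code above: make_signed_request is a hypothetical helper that exists only to build a test token locally, and the secret and payload are made up.

```python
import base64
import hashlib
import hmac
import json


def b64url_decode(text):
    """Decode URL-safe base64, restoring the stripped '=' padding."""
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded.encode("ascii"))


def b64url_encode(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def parse_signed_request(signed_request, secret):
    """Return the decoded payload dict, or None if the check fails."""
    try:
        encoded_sig, payload = signed_request.split(".", 1)
        sig = b64url_decode(encoded_sig)
        data = json.loads(b64url_decode(payload))
    except ValueError:
        return None
    if not isinstance(data, dict):
        return None
    if str(data.get("algorithm", "")).upper() != "HMAC-SHA256":
        return None
    expected = hmac.new(
        secret.encode("utf-8"), payload.encode("ascii"), hashlib.sha256
    ).digest()
    # constant-time comparison of the two signatures
    if not hmac.compare_digest(sig, expected):
        return None
    return data


def make_signed_request(data, secret):
    """Hypothetical helper: builds a token in the same format, for testing."""
    payload = b64url_encode(json.dumps(data).encode("utf-8"))
    sig = hmac.new(
        secret.encode("utf-8"), payload.encode("ascii"), hashlib.sha256
    ).digest()
    return b64url_encode(sig) + "." + payload


token = make_signed_request(
    {"algorithm": "HMAC-SHA256", "user_id": "42"}, "my_app_secret"
)
good = parse_signed_request(token, "my_app_secret")
bad = parse_signed_request(token, "another_secret")
```

Here good holds the decoded dict and bad is None, because the signature was computed with a different secret.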
