Benchmark redis-like server

Benchmark

Benchmarking is as simple as:

redis-benchmark -p 8888 -t ping -n 10000

Redis

  • Record: 33000 requests per second
  • Response time for 90% of requests: <= 2 ms

Goroutines - Golang

Code:

package main

import (
    "net"
)

var PONG = []byte("+PONG\r\n")

func proxy(conn net.Conn) {
    buf := make([]byte, 100)

    for {
        _, err := conn.Read(buf)
        if err != nil {
            break
        }

        _, err = conn.Write(PONG)
        if err != nil {
            break
        }
    }

    conn.Close()
}

func main() {
    listener, err := net.Listen("tcp", "localhost:8888")
    if err != nil {
        panic(err)
    }

    for {
        conn, err := listener.Accept()
        if err != nil {
            panic(err)
        }
        go proxy(conn)
    }
}
  • Record: 28000 requests per second
  • Response time for 90% of requests: <= 3 ms

Gevent - Python

import gevent
from gevent import socket


PONG = '+PONG\r\n'


def proxy(conn):
    while 1:
        try:
            conn.recv(100)
        except Exception:
            break

        try:
            conn.send(PONG)
        except Exception:
            break

    conn.close()


def main():
    s = socket.socket()
    s.bind(('', 8888))
    s.listen(1)

    while 1:
        conn, _addr = s.accept()
        gevent.spawn(proxy, conn)


if __name__ == '__main__':
    main()
  • Record: 19000 requests per second
  • Response time for 90% of requests: <= 2 ms

ndbpager: Pager for Google Appengine NDB

NDB has a very useful query.fetch_page() method that does most of the work for you. It can be used like this:

query = Article.query()

# fetch_page() takes the page size as its first argument and an
# optional start_cursor keyword to continue from.
articles_for_page1, cursor, more = query.fetch_page(20)

if more:
    articles_for_page2, cursor, more = query.fetch_page(20, start_cursor=cursor)

With that method, the only thing a developer has to do is store the cursor somewhere. There are several places where you can store it:

  1. Use cursor.to_websafe_string() and pass it via URL.
  2. Store cursor in session just for one user.
  3. Store cursor in memcache for all users.

ndbpager caches the cursor in memcache for you. Usage:

import ndbpager

pager = ndbpager.Pager(query=query, page=1)
articles_for_page1, _, _ = pager.paginate(page_size=20)

Internally, ndbpager checks memcache for a cursor for the given page. If it does not find one, it uses the cursor for the previous page to compute the cursor for the current page. The cursor is updated in memcache every time a user views the page, so it should stay up to date.
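The cursor lookup described above can be sketched roughly like this. This is a simplified model, not ndbpager's actual internals: a plain dict stands in for memcache, and an integer offset stands in for a real datastore cursor.

```python
# Simplified sketch of the cursor-caching idea: a dict stands in for
# memcache, and an integer offset stands in for a real datastore cursor.
cursor_cache = {}  # page number -> cursor


def get_cursor(page, page_size, advance):
    """Return a cursor for `page`, deriving it from the previous
    page's cursor when it is not cached yet.

    `advance(cursor, n)` moves a cursor forward by n items; with real
    NDB this would be a fetch_page() call with start_cursor=cursor.
    """
    if page == 1:
        return None  # the first page starts from the beginning
    if page in cursor_cache:
        return cursor_cache[page]
    prev = get_cursor(page - 1, page_size, advance)
    cursor = advance(prev, page_size)
    cursor_cache[page] = cursor  # refreshed on every page view
    return cursor


# With integers as fake cursors, advancing is just addition:
fake_advance = lambda cursor, n: (cursor or 0) + n
print(get_cursor(3, 20, fake_advance))  # prints 40 (offset of page 3)
```

Note how asking for page 3 fills the cache for page 2 as a side effect, which is exactly why the cache tends to stay warm as users page through results.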

Having pager object you can render HTML with Jinja2 like this:

{% if pager %}
<ul class="pager">
  <li class="previous{% if not pager.has_prev %} disabled{% endif %}">
    <a href="{{ url_for_other_page(pager.prev_page) }}">Previous</a>
  </li>
  <li class="next{% if not pager.has_next %} disabled{% endif %}">
    <a href="{{ url_for_other_page(pager.next_page) }}">Next</a>
  </li>
</ul>
{% endif %}

Python tip of the day

Instead of:

if one is None or two is None or three is None:
    pass

use

if None in (one, two, three):
    pass

or

if any(x is None for x in (one, two, three)):
    pass
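A quick self-contained check that the three forms agree (the helper name is just for illustration):

```python
def has_none(one, two, three):
    # The three equivalent checks from the tip above:
    verbose = one is None or two is None or three is None
    membership = None in (one, two, three)
    any_form = any(x is None for x in (one, two, three))
    assert verbose == membership == any_form
    return membership


print(has_none(1, None, 3))  # True
print(has_none(1, 2, 3))     # False
```

The membership form is safe here because `in` on a tuple checks identity before equality, so `None in (...)` behaves like an `is None` test.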

Batch operations support for django-nonrel

I have implemented a custom backend for Django that lets you use batch operations with Django and Google App Engine. Be aware that the API is not stable yet, and I have not received official approval from the django-nonrel team yet (although there is some progress).

Get started

Get all the needed libraries:

hg clone https://bitbucket.org/wkornewald/django-testapp batch-save-test
hg clone https://bitbucket.org/wkornewald/django-nonrel django-nonrel-bs
hg clone https://bitbucket.org/wkornewald/djangoappengine djangoappengine-bs
hg clone https://bitbucket.org/wkornewald/djangotoolbox djangotoolbox-bs
hg clone https://bitbucket.org/wkornewald/django-dbindexer django-dbindexer-bs

Apply patches:

hg -R django-nonrel-bs pull -u https://bitbucket.org/vladimir_webdev/django-nonrel
hg -R djangoappengine-bs pull -u https://bitbucket.org/vladimir_webdev/djangoappengine
hg -R django-dbindexer-bs pull -u https://bitbucket.org/vladimir_webdev/django-dbindexer

Create symbolic links to fetched libraries:

cd batch-save-test
ln -s ../django-nonrel-bs/django
ln -s ../djangoappengine-bs djangoappengine
ln -s ../djangotoolbox-bs/djangotoolbox
ln -s ../django-dbindexer-bs/dbindexer

Try running the server:

./manage.py runserver

Now you should be able to use batch operations with Django.

Usage

The simplest example of using batch operations looks like this:

from __future__ import with_statement
from django.db.models import BatchOperation
with BatchOperation() as op:
    for i in range(100):
        op.save(Post(title='Title %d' % i, text='Text %d' % i))

That’s it. Internally, the code above creates two pools: one for save operations and one for delete operations. When a pool fills up, its entries are flushed to the backend, which is responsible for saving them in batches. The available configuration options are:

  • pool_size (save_pool_size, delete_pool_size) - the number of model instances stored in the pool (500 by default). If you experience memory problems, try lowering this value.
  • batch_size (save_batch_size, delete_batch_size) - the number of model instances flushed in one batch operation (100 by default). If you experience datastore timeout exceptions, try lowering this value.

We need to distinguish between pool_size and batch_size because, in theory, Django can be configured to use several databases, so a pool can contain instances destined for different databases.
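The pool-and-batch mechanics can be sketched like this. This is a simplified, hypothetical model of the idea, not the actual backend code; `backend_batch_save` stands in for whatever bulk-save call the backend provides (e.g. App Engine's db.put() on a list).

```python
class SavePool(object):
    """Buffers instances and flushes them to a backend in batches."""

    def __init__(self, backend_batch_save, pool_size=500, batch_size=100):
        self.backend_batch_save = backend_batch_save
        self.pool_size = pool_size
        self.batch_size = batch_size
        self.pool = []

    def save(self, instance):
        # Buffer the instance; flush automatically once the pool is full.
        self.pool.append(instance)
        if len(self.pool) >= self.pool_size:
            self.flush()

    def flush(self):
        # Hand the pooled instances to the backend in batch_size chunks.
        for i in range(0, len(self.pool), self.batch_size):
            self.backend_batch_save(self.pool[i:i + self.batch_size])
        self.pool = []


calls = []
pool = SavePool(calls.append, pool_size=10, batch_size=4)
for i in range(25):
    pool.save(i)
pool.flush()  # flush the remainder, as BatchOperation.__exit__ would
print([len(batch) for batch in calls])  # [4, 4, 2, 4, 4, 2, 4, 1]
```

The pool fills to 10 twice (each flush producing chunks of 4, 4, and 2), and the final explicit flush sends the remaining 5 instances as chunks of 4 and 1.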

You can configure BatchOperation() like this:

config = dict(default=dict(pool_size=50,
                           save_batch_size=50,
                           delete_pool_size=50))
with BatchOperation(config) as op:
    for p in Post.objects.all()[:100]:
        op.delete(p)

Stats

I have tested batch saves with the following views:

def plain_save(request):
    for i in range(100):
        Post.objects.create(title='Title %d' % i, text='Text %d' % i)
    return http.HttpResponse('Ok')

def batch_save(request):
    with BatchOperation() as op:
        for i in range(100):
            op.save(Post(title='Title %d' % i, text='Text %d' % i))
    return http.HttpResponse('Ok')

and got the following results from Appstats:

"GET /plain_save/" 200 real=2734ms cpu=1720ms api=6583ms overhead=21ms (100 RPCs)
"GET /batch_save/" 200 real=609ms cpu=193ms api=6516ms overhead=0ms (1 RPC)

Feedback

Feel free to leave feedback or report bugs in the django-nonrel user group.