Batch SQL Operations in Groovy

Posted by jt - 15/02/11 at 12:02 pm

Over the last few weeks I’ve been playing with the SQL capabilities in Groovy. Compared to hand rolling JDBC, Groovy is wonderful. Its simple, easy and straight forward.

This week I had a loop running over a SQL update statement. The performance rather disappointing for large operations. I was seeing about 1000 updates per minute. This was code executing off my local iMac, hitting a dev database a remote data center.

My code looked something like this:

records.each {
        conn.executeUpdate("UPDATE my_table set my_column = false where id = ?", [it[0]])
  }

This code was forcing a round trip to the server with each execution. The slow performance was mostly network overhead.

The 1.7 release of Groovy added batch updates. This allows the SQL statements to be sent to the database server in batches.

I can refactor my code as follows:

conn.withBatch { stmt ->
      records.each {
          stmt.addBatch "UPDATE my_table set my_column = false where id = ${it[0]}"
          }
      }

The results? This will update about 25,000 records per minute (vs 1,000). I’d call that a significant difference!

A couple quick notes:

  • You do not need to call execute on this; it is done automatically.
  • All the statements in the closure ‘withBatch’ will be assigned a transaction.  If any go boom, the transaction will get rolled back.
  • You can adjust the batch size via “conn.withBatch (2000) { stmt ->” (where 2000 is the batch size).  In my testing a batch size of 10 was slow, 100 reasonable, 200 was about the same as the default (I could not find documentation on what the default value actually is)
  • Your mileage WILL vary.
Share

5 Responses to “Batch SQL Operations in Groovy”

  1. nikki says:
    February 24th, 2011 at 6:46 pm

    Does addBatch() works with one parameter or can have multiple ones?

    I am trying at use addBatch() for a insert statement with parameters. The parameters are coming from the db and hence can have different characters(‘,”,@). addBatch is not escaping these characters.

    Is there easy solution?

  2. nikki says:
    February 24th, 2011 at 6:49 pm

    In cont..to the last post

    stmt.addBatch “Insert into customer(id,name,date_added) values(${it.id},${it.name})”

  3. jt says:
    February 25th, 2011 at 6:50 am

    At this time Groovy batch SQL does not support bind parameters. The SQL statement is a string. In my example, I’m using a gstring to build the sql statement with data from the collection I’m looping over. So support of multiple parameters is no big deal.

    Not ideal from a DB standpoint. Binds are far more efficient to use.

    I can’t think of a quick and easy way to escape characters for sql…

    Brings up an important point though – An untrusted data source could be malicious via a SQL injection attack. Although, I can’t think of a case where you’d bind this type of operation to a web form.

  4. Gregory Kanevsky says:
    September 1st, 2011 at 1:19 am

    Groovy version 1.8.1 introduced support for prepared statements with batching. I tried to put it all together here: http://novyden.blogspot.com/2011/09/groovy-batch-prepared-statement-nice.html

  5. avgvstvs says:
    December 15th, 2011 at 9:30 am

    Google OWASP ESAPI for a java-library that can be imported into any groovy program and allows for SQL escaping for MySQL, SQL Server, and Oracle. (DB2 under development.)
    Specifically, the encoder API. So if prepared statements are out, you can at least escape user input.

Leave a Reply