Batch SQL Operations in Groovy
Posted by jt - 15/02/11 at 12:02 pmOver the last few weeks I’ve been playing with the SQL capabilities in Groovy. Compared to hand rolling JDBC, Groovy is wonderful. Its simple, easy and straight forward.
This week I had a loop running over a SQL update statement. The performance rather disappointing for large operations. I was seeing about 1000 updates per minute. This was code executing off my local iMac, hitting a dev database a remote data center.
My code looked something like this:
records.each { conn.executeUpdate("UPDATE my_table set my_column = false where id = ?", [it[0]]) } |
This code was forcing a round trip to the server with each execution. The slow performance was mostly network overhead.
The 1.7 release of Groovy added batch updates. This allows the SQL statements to be sent to the database server in batches.
I can refactor my code as follows:
conn.withBatch { stmt -> records.each { stmt.addBatch "UPDATE my_table set my_column = false where id = ${it[0]}" } } |
The results? This will update about 25,000 records per minute (vs 1,000). I’d call that a significant difference!
A couple quick notes:
- You do not need to call execute on this; it is done automatically.
- All the statements in the closure ‘withBatch’ will be assigned a transaction. If any go boom, the transaction will get rolled back.
- You can adjust the batch size via “conn.withBatch (2000) { stmt ->” (where 2000 is the batch size). In my testing a batch size of 10 was slow, 100 reasonable, 200 was about the same as the default (I could not find documentation on what the default value actually is)
- Your mileage WILL vary.
A bunch of random technology stuff that has my attention. I work with a lot of Oracle, Java, and dabble with various open source software packages.
February 24th, 2011 at 6:46 pm
Does addBatch() works with one parameter or can have multiple ones?
I am trying at use addBatch() for a insert statement with parameters. The parameters are coming from the db and hence can have different characters(‘,”,@). addBatch is not escaping these characters.
Is there easy solution?
February 24th, 2011 at 6:49 pm
In cont..to the last post
stmt.addBatch “Insert into customer(id,name,date_added) values(${it.id},${it.name})”
February 25th, 2011 at 6:50 am
At this time Groovy batch SQL does not support bind parameters. The SQL statement is a string. In my example, I’m using a gstring to build the sql statement with data from the collection I’m looping over. So support of multiple parameters is no big deal.
Not ideal from a DB standpoint. Binds are far more efficient to use.
I can’t think of a quick and easy way to escape characters for sql…
Brings up an important point though – An untrusted data source could be malicious via a SQL injection attack. Although, I can’t think of a case where you’d bind this type of operation to a web form.
September 1st, 2011 at 1:19 am
Groovy version 1.8.1 introduced support for prepared statements with batching. I tried to put it all together here: http://novyden.blogspot.com/2011/09/groovy-batch-prepared-statement-nice.html
December 15th, 2011 at 9:30 am
Google OWASP ESAPI for a java-library that can be imported into any groovy program and allows for SQL escaping for MySQL, SQL Server, and Oracle. (DB2 under development.)
Specifically, the encoder API. So if prepared statements are out, you can at least escape user input.