Wednesday, December 28, 2011

Getting a point on a line at a certain distance using OGR

Another day, another adventure. I arrived at work and was assigned the task to look for a point at a certain distance on a line. The point didn't exist yet, I had to create it and return it. My boss, grumpy and suffering from a hangover, was threatening (like the whole week before that) to get my ass fired if I didn't prove myself more productive and resourceful. So I grabbed myself a mug of coffee and set myself on the search of that elusive point, having yet to pay the month's rent and put the food on the table for the wife and cats.

The distance was relative: it was to be the middle of the segment of the line where the middle of the line occurred. Was that clear? A line can have several segments. The segment that interests us is the one on which the line hits half of its total distance. We want the point at the middle of that segment. We want it real bad!



A quick look at the OGR API tells us there's a Value method for OGRLineString. But that's not exactly what we want, and there's no Python binding anyway. So, God, what am I gonna do?

"Don't panic." a calm voice told me. "You have good friends and they've dragged you out of a bigger mess than this one before."

So I calmed down and started coding, and very soon I was feeling better. Here's is the piece of code that flowed from my fingertips:
 for line_feat in layer:  
   geom = line_feat.geometry()  
   num_p = geom.GetPointCount()  
   current_distance = 0.0  
   half_distance = geom.Length() / 2  
   for i in xrange(num_p-1):  
     if i+1 < num_p:      
       line = ogr.Geometry(ogr.wkbLineString)  
       line.AddPoint(geom.GetPoint(i)[0],geom.GetPoint(i)[1])  
       line.AddPoint(geom.GetPoint(i+1)[0],geom.GetPoint(i+1)[1])  
       segment_len = line.Length()        
       if (current_distance + segment_len) > half_distance:  
         x_0 = line.GetPoint(0)[0]  
         y_0 = line.GetPoint(0)[1]  
         x_1 = line.GetPoint(1)[0]  
         y_1 = line.GetPoint(1)[1]  
         d_x = (x_1 - x_0) / 2  
         d_y = (y_1 - y_0) / 2  
         x_n = x_1 - d_x  
         y_n = y_1 - d_y          
         createPoint(x_n,y_n)  
         break  
       current_distance = current_distance + segment_len  

In short:
  • Get the half-length of the total line feature
  • Obtain the points from which you reconstitute each segment composing the line
  • You calculate the length of these segments and sum them up, when you've exceeded the half-length of the total line, you know you're holding your precious segment
  • You calculate the point at the middle of that segment - that's your point!

Linking a point to the nearest line using OGR with Oracle Spatial

An Oracle table contains point data. Another contains line data. Both were imported from some CAD application that did not deem it important to consign the relationships between both. That was left to the programmer as an exercise.

Now, of course, using PL/SQL and Oracle spatial functions or operators is not an option. Why not? Because Oracle is not a good spatial RDBMS and PL/SQL is not a good scripting language. Maybe you could do it if you were willing to unlearn how to do things in a logical, straightforward GIS way and were willing to modify your data in a way to accommodate the idiosyncrasies of a crappy RDBMS (assuming you have admin powers on the Oracle server), but very few of you actually would. Unless, of course, you don't know better - in which case it's time to get wiser and learn something new. OGR and Python, for example, a magical pair, thick as thieves.

The solution is so simple, I'm ashamed to give it:
  def getClosestLine(point_geom):   
   mind = {}   
   for i in line_geoms:   
    d = point_geom.Distance(i)   
    mind[d] = i   
   return mind[min(mind.keys()]   

Nice solution but a bit slow. A quicker alternative, although less precise is to use incremental point buffering and selecting the first line it intersects, like this:
 def getClosestLine(point_geom):  
   found = False  
   buf_size = 0.0  
   step = 0.01  
   closest = None  
   while not found:  
     buf = point_geom.Buffer(buf_size)  
     intersecting = [ x for x in line_geoms if x.Intersect(buf) ]  
     if len(intersecting) > 0:  
       found = True  
       closest = intersecting[0] #can be multiple, not 100% precise  
       break  
     else:  
       buf_size += step  
     if buf_size > 10.0: break  
   return closest  

It's quicker than the previous (although still very slow), but it's not quite precise, as there will always be that margin for error the size of step. The finer the step you choose, the slower it'll get. But if you're only looking for an approximation, I guess it's OK.

Since I had a rough idea of the distance in which the lines were likely to occur (by testing), I could experience with a blend of both solutions. I created a rough buffer (1m) around my point and filtered the lines intersecting with it. I then calculated the minimum distance for only those lines to select my winner.
  def getClosestLine(point_geom):   
   mind = {}   
   buf_size = 1  
   buf = geom.Buffer(buf_size)  
   intersecting = [ x for x in line_geoms if x.Intersect(buf) ]  
   for i in intersecting:   
    d = point_geom.Distance(i)   
    mind[d] = i  
   return mind[min(mind.keys()]   

This was largely efficient for the tens of thousands of points and lines I had to relate. It kept me away from the coffee machine during testing and got me the bad eye from the Oracle boys too stubborn to learn something new and less expensive. Just kidding, the bad eye wasn't directed towards me - it's just the glum stare you get when working too long with bad software! Ha ha

Tuesday, December 27, 2011

Connecting to Oracle Spatial using OGR and Python

Let there be no mistake: Oracle sucks. It's a bloated hog that wades slowly through the corrupt data morasses of corpocracy. But sometimes, as an aloof mercenary to the corporate beast in the field of GIS, you have to swallow your contempt and do that for which you are paid: solve problems. Think of yourself as Mr Wolf: composed, BS-repellent and efficient, with an unexpected inclination for the finer things in life (after the job's done).

I installed the latest mainline OSGeo4W with all the libs and command-line tools (well, at a minimum, a custom install with with python, gdal and oracle drivers. The oracle drivers (OCI) are not installed by default). Version of python is 2.5 and GDAL 1.8. Dropped a cx_Oracle library in the install folder in order to perform some classical database operations on the Oracle server (10G R2). Then I started scripting.

The OGR OCI driver has several issues, the two main of which: its connection string syntax is a mess and it doesn't support schema's. If I were an altruistic guy I'd delve into the code and arrange this for my fellow coders. But I'm a profiteering bastard in a hurry, lacking time and (most of all) a decent development enviroment (I'm a UNIX man stuck in a Windows suit). So I need quick workarounds.

A connection string that works for OGR (Python 2.5):
 ds_str = "OCI:%s/%s@localhost/database:%s" % (user,passwd,tablename)  #getting directly the table

A connection string that works for cx_Oracle (5.1 with Python 2.5):
 ds_str = "%s/%s@localhost/database" % (user,passwd)  

Intrerestingly, using cx_Oracle (5.1.1) in Python 2.7 you don't need to speciy the host, just doing this:
 ds_str = "%s/%s@database" % (user,passwd)  

Another (minor) caveat: the Python driver won't accept that you specify, on connection, a username and password that are identical. If you're working, like myself, in a environment where this can occur, give the account temporarily a different password (no quotes for either user name or password).
 alter user user_name identified by new_password;  

After this, you can finally start talking to your server and manipulating your data and geometries in a decent way - using python and OGR.