Discussion:
[Community] Shapely with RTree
David Fawcett
2013-03-26 21:17:22 UTC
Permalink
I am thinking through the logic for efficiently overlaying ~10,000 points
on ~100 polys. What I want in the end is a list of dicts holding the
string ID for the point and string ID for the intersecting Poly. I have
an example with prepared geometries, but now I am working up an RTree
example to benchmark.

Here is the logic:

- I use RTree to create an index on my polys
- at the same time, I create a dict to lookup the string ID (stationID) for
the integer index value

- I loop through the points in my collection and test them against the
index.
- if only one intersection is returned, I grab the id and look up my
countyID.

- if my intersection operation on the index returns multiple poly IDs, I
need to test against the actual geometries.
This is where I am struggling. I don't see a way to access a specific
feature from a shapely collection based on a property. My first idea is to
store the polygon features in a dictionary after reading them from the
shapefile. This would give me access by ID. At the same time, I am trying
to work within the Shapely data structures.

Does anyone have any suggestions on best practices for integrating RTree
and Shapely? Specifically, how you relate geometry collections to RTree
indexes.

I will post a code example after I modify my current example to not use
external data sources.

Thanks,

David.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gispython.org/pipermail/community/attachments/20130326/891ca131/attachment.htm>
Michael Weisman
2013-03-26 21:52:23 UTC
Permalink
Hi David,

No idea if this is best practice or if hobu and Sean will be shaking their heads in disappointment in a second, but RTree allows one to store an object (anything pickleable I think) in the index. You can store a dictionary with attributes and the geometry from a fiona feature like this:
geom = geometry.shape(feature['geometry'])
idx.insert(int(feature['id']), geom.bounds, obj={'properties': feature['properties'], 'geom':geom})

Now I can iterate over the results of the RTree intersection query with shapely:
for riding in idx.intersection((point.y, point.x), objects="raw"):
if riding['geom'].contains(point):
return riding

There may be a better way to do this, but in my experience the above works quite well.

Do I get bonus points for also using Fiona in my solution?

Thanks,

Michael
I am thinking through the logic for efficiently overlaying ~10,000 points on ~100 polys. What I want in the end is a list of dicts holding the string ID for the point and string ID for the intersecting Poly. I have an example with prepared geometries, but now I am working up an RTree example to benchmark.
- I use RTree to create an index on my polys
- at the same time, I create a dict to lookup the string ID (stationID) for the integer index value
- I loop through the points in my collection and test them against the index.
- if only one intersection is returned, I grab the id and look up my countyID.
- if my intersection operation on the index returns multiple poly IDs, I need to test against the actual geometries.
This is where I am struggling. I don't see a way to access a specific feature from a shapely collection based on a property. My first idea is to store the polygon features in a dictionary after reading them from the shapefile. This would give me access by ID. At the same time, I am trying to work within the Shapely data structures.
Does anyone have any suggestions on best practices for integrating RTree and Shapely? Specifically, how you relate geometry collections to RTree indexes.
I will post a code example after I modify my current example to not use external data sources.
Thanks,
David.
_______________________________________________
Community mailing list
Community at lists.gispython.org
http://lists.gispython.org/mailman/listinfo/community
David Fawcett
2013-03-27 04:02:49 UTC
Permalink
Michael,

Thanks for the quick and awesome answer. I was spending my time trying to
figure out how to store an association between the ID in the index and geom
and never thought about storing the actual geom in the index. I had tried
to use Objects=True with RTree, but didn't get it to work.

I created a bunch of test scripts to determine which strategy was fastest.
I posted the code as gist's for anyone that wants to look at them. There
is definitely room for cleaning up the code. I was also going for
readability over compactness. Because there really aren't a lot of
examples for fiona/shapely/rtree out there, I am interested in seeing how
other people might code it. Functional style would be interesting too.

My test data set was 7142 points in 87 polygons with mildly overlapping
bboxes.

Here are my examples in order of processing speed (fastest to slowest):

RTree index and shapely optimizations - 5.01 seconds
https://gist.github.com/fawcett/5251327

Added shapely speedups and prepared geometries - 7.07 seconds
https://gist.github.com/fawcett/5251267

Basic point in poly, no optimization - 10.14 seconds
https://gist.github.com/fawcett/5251238

Rtree index, storing the geoms in the index as objects - 66.10
(only doing true intersects on points:polys where a point hits multiple
RTree leaves)
https://gist.github.com/fawcett/5251373

Rtree index, storing the geoms in the index as objects - 68.85
(doing intersects on all point RTree leaf hits)
https://gist.github.com/fawcett/5251407

Obviously something went very wrong on the last two. They use the RTree,
but are an order of magnitude slower than the most basic RTree example. I
am definitely curious if my code is bad or if it has to do with the way
that geometries and properties are store within the RTree.


And yes, you do get extra points for using fiona! I am using it in my
scripts. (We should really push this example to gis.se for more visibility
for fiona and shapely.)

David.
Post by Michael Weisman
Hi David,
No idea if this is best practice or if hobu and Sean will be shaking their
heads in disappointment in a second, but RTree allows one to store an
object (anything pickleable I think) in the index. You can store a
geom = geometry.shape(feature['geometry'])
feature['properties'], 'geom':geom})
return riding
There may be a better way to do this, but in my experience the above works quite well.
Do I get bonus points for also using Fiona in my solution?
Thanks,
Michael
Post by David Fawcett
I am thinking through the logic for efficiently overlaying ~10,000
points on ~100 polys. What I want in the end is a list of dicts holding
the string ID for the point and string ID for the intersecting Poly. I
have an example with prepared geometries, but now I am working up an RTree
example to benchmark.
Post by David Fawcett
- I use RTree to create an index on my polys
- at the same time, I create a dict to lookup the string ID (stationID)
for the integer index value
Post by David Fawcett
- I loop through the points in my collection and test them against the
index.
Post by David Fawcett
- if only one intersection is returned, I grab the id and look up my
countyID.
Post by David Fawcett
- if my intersection operation on the index returns multiple poly IDs, I
need to test against the actual geometries.
Post by David Fawcett
This is where I am struggling. I don't see a way to access a specific
feature from a shapely collection based on a property. My first idea is to
store the polygon features in a dictionary after reading them from the
shapefile. This would give me access by ID. At the same time, I am trying
to work within the Shapely data structures.
Post by David Fawcett
Does anyone have any suggestions on best practices for integrating RTree
and Shapely? Specifically, how you relate geometry collections to RTree
indexes.
Post by David Fawcett
I will post a code example after I modify my current example to not use
external data sources.
Post by David Fawcett
Thanks,
David.
_______________________________________________
Community mailing list
Community at lists.gispython.org
http://lists.gispython.org/mailman/listinfo/community
_______________________________________________
Community mailing list
Community at lists.gispython.org
http://lists.gispython.org/mailman/listinfo/community
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gispython.org/pipermail/community/attachments/20130326/0da5f97c/attachment.htm>
Oliver Tonnhofer
2013-03-27 07:20:20 UTC
Permalink
Hi David,
Post by David Fawcett
RTree index and shapely optimizations - 5.01 seconds
https://gist.github.com/fawcett/5251327
[...]
Rtree index, storing the geoms in the index as objects - 66.10
(only doing true intersects on points:polys where a point hits multiple RTree leaves)
https://gist.github.com/fawcett/5251373
Rtree index, storing the geoms in the index as objects - 68.85
(doing intersects on all point RTree leaf hits)
https://gist.github.com/fawcett/5251407
Obviously something went very wrong on the last two. They use the RTree, but are an order of magnitude slower than the most basic RTree example. I am definitely curious if my code is bad or if it has to do with the way that geometries and properties are store within the RTree.
It's the serialization/deserialization of the objects that kills the performance. So better just store indices.

Regards,
Oliver
David Fawcett
2013-03-27 13:39:39 UTC
Permalink
Thanks Oliver.

David.
Post by Michael Weisman
Hi David,
Post by David Fawcett
RTree index and shapely optimizations - 5.01 seconds
https://gist.github.com/fawcett/5251327
[...]
Rtree index, storing the geoms in the index as objects - 66.10
(only doing true intersects on points:polys where a point hits multiple
RTree leaves)
Post by David Fawcett
https://gist.github.com/fawcett/5251373
Rtree index, storing the geoms in the index as objects - 68.85
(doing intersects on all point RTree leaf hits)
https://gist.github.com/fawcett/5251407
Obviously something went very wrong on the last two. They use the
RTree, but are an order of magnitude slower than the most basic RTree
example. I am definitely curious if my code is bad or if it has to do with
the way that geometries and properties are store within the RTree.
It's the serialization/deserialization of the objects that kills the
performance. So better just store indices.
Regards,
Oliver
_______________________________________________
Community mailing list
Community at lists.gispython.org
http://lists.gispython.org/mailman/listinfo/community
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gispython.org/pipermail/community/attachments/20130327/91cf2616/attachment.htm>
David Fawcett
2013-03-27 15:25:21 UTC
Permalink
I morphed my RTree example into a more generic function.

https://gist.github.com/fawcett/5255019

Any critique is welcome.

Thanks for your input on this.

David.
Post by David Fawcett
Thanks Oliver.
David.
Post by Michael Weisman
Hi David,
Post by David Fawcett
RTree index and shapely optimizations - 5.01 seconds
https://gist.github.com/fawcett/5251327
[...]
Rtree index, storing the geoms in the index as objects - 66.10
(only doing true intersects on points:polys where a point hits multiple
RTree leaves)
Post by David Fawcett
https://gist.github.com/fawcett/5251373
Rtree index, storing the geoms in the index as objects - 68.85
(doing intersects on all point RTree leaf hits)
https://gist.github.com/fawcett/5251407
Obviously something went very wrong on the last two. They use the
RTree, but are an order of magnitude slower than the most basic RTree
example. I am definitely curious if my code is bad or if it has to do with
the way that geometries and properties are store within the RTree.
It's the serialization/deserialization of the objects that kills the
performance. So better just store indices.
Regards,
Oliver
_______________________________________________
Community mailing list
Community at lists.gispython.org
http://lists.gispython.org/mailman/listinfo/community
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gispython.org/pipermail/community/attachments/20130327/da66b4e1/attachment.htm>
Howard Butler
2013-03-29 14:54:44 UTC
Permalink
Post by David Fawcett
Obviously something went very wrong on the last two. They use the RTree, but are an order of magnitude slower than the most basic RTree example. I am definitely curious if my code is bad or if it has to do with the way that geometries and properties are store within the RTree.
A few points about using clustered indexes (store the data in the index itself) with Rtree. First, it is very sensitive to the page size parameter that is set when the index is created. The default page size is quite low because the base example is to use Rtree in a non-clustered configuration and just store indices. Too large of a page size means bloating up the index (and on-disk footprint) of the index. Too small of a page size means reallocating and moving tons of bytes around for each insert to make the GeoJSON/WKB/WKT of the geometry fit in the index.

Secondly, you can override/control the serialization that happens when items are inserted or removed from the index. For shapely geoms, the default probably i/o's through GeoJSON (didn't look), but you could make it i/o through WKB and it would likely be faster and more compact. Rtree just stores pickles. The faster your pickles are, the faster the serialization/deserialization will happen out of the index.

Finally, clustered index storage is a lazy man's not-so-fast spatial database. There's some threshold of number of searches x number of items where it crosses the realm of usefulness and performant-enough, but it's quite sensitive to the configuration of a number of things.

Hope this helps,

Howard
Robert Sanson
2013-05-14 22:44:28 UTC
Permalink
I have an identical Python cgi-bin script running on two servers - one a Windows 2003 server and the other my desktop (Windows 7 64-bit). The script uses Shapely prepared polygon geometries to test whether a large number of points fall within given polygons. I get different results from both servers. Both are running Python 2.7.

Here are some code snippets:

from shapely.geometry import Point, Polygon
from shapely.wkt import loads
from shapely.prepared import prep

#read in file of livestock areas in WKT format and create list of Polygons
allf = open("poly_geom.txt","r")
lines = allf.readlines()
lsrs = []
polys = []
polysdict = {}
for line in lines:
lsr,geom = line.strip().split("|")
lsrs.append(lsr)
poly = loads(geom)
prpoly = prep(poly)
polysdict[lsr] = prpoly
polys.append(prpoly)

#Later, using a list of points (allpts) I do:
polyidx = 0
for prpoly in polys:
a = [pt for pt in allpts if prpoly.contains(pt)]
lsr = lsrs[polyidx]
alldict[lsr] = len(a)
print lsr + ":" + str(len(a))
polyidx = polyidx + 1

On the Windows 2003 server I get almost double the numbers of points falling within each polygon. I have checked using a GIS and spatialite, and my desktop (Windows 7) returns the correct results. On the server, if I use unprepared polygon geoemetries, I get the correct answer. ie. I replace the first section of code with:

for line in lines:
lsr,geom = line.strip().split("|")
lsrs.append(lsr)
poly = loads(geom)
#prpoly = prep(poly)
polysdict[lsr] = poly
polys.append(poly)

Is there a bug in the prepared code or does Shapely rely on other installed components (Geos) etc. such that I might have an older dll somewhere?

Many thanks,

Robert



This email and any attachments are confidential and intended solely for the addressee(s). If you are not the intended recipient, please notify us immediately and then delete this email from your system.

This message has been scanned for Malware and Viruses by Websense Hosted Security.
www.websense.com
Sean Gillies
2013-05-16 03:32:57 UTC
Permalink
Hi Robert,

Indeed, Shapely's prepared geometry operations are all handled by GEOS and
it seems possible that you may have different versions of the DLL.

Yours,


On Tue, May 14, 2013 at 4:44 PM, Robert Sanson <
Post by Robert Sanson
I have an identical Python cgi-bin script running on two servers - one a
Windows 2003 server and the other my desktop (Windows 7 64-bit). The script
uses Shapely prepared polygon geometries to test whether a large number of
points fall within given polygons. I get different results from both
servers. Both are running Python 2.7.
from shapely.geometry import Point, Polygon
from shapely.wkt import loads
from shapely.prepared import prep
#read in file of livestock areas in WKT format and create list of Polygons
allf = open("poly_geom.txt","r")
lines = allf.readlines()
lsrs = []
polys = []
polysdict = {}
lsr,geom = line.strip().split("|")
lsrs.append(lsr)
poly = loads(geom)
prpoly = prep(poly)
polysdict[lsr] = prpoly
polys.append(prpoly)
polyidx = 0
a = [pt for pt in allpts if prpoly.contains(pt)]
lsr = lsrs[polyidx]
alldict[lsr] = len(a)
print lsr + ":" + str(len(a))
polyidx = polyidx + 1
On the Windows 2003 server I get almost double the numbers of points
falling within each polygon. I have checked using a GIS and spatialite, and
my desktop (Windows 7) returns the correct results. On the server, if I use
unprepared polygon geoemetries, I get the correct answer. ie. I replace the
lsr,geom = line.strip().split("|")
lsrs.append(lsr)
poly = loads(geom)
#prpoly = prep(poly)
polysdict[lsr] = poly
polys.append(poly)
Is there a bug in the prepared code or does Shapely rely on other
installed components (Geos) etc. such that I might have an older dll
somewhere?
Many thanks,
Robert
This email and any attachments are confidential and intended solely for
the addressee(s). If you are not the intended recipient, please notify us
immediately and then delete this email from your system.
This message has been scanned for Malware and Viruses by Websense Hosted Security.
www.websense.com
_______________________________________________
Community mailing list
Community at lists.gispython.org
http://lists.gispython.org/mailman/listinfo/community
--
Sean Gillies
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gispython.org/pipermail/community/attachments/20130515/e91a15f7/attachment.htm>
Robert Sanson
2014-04-29 01:57:10 UTC
Permalink
Hi All

I am trying to use shapely to fix an invalid polygon with a massive bowtie:

import shapely
from shapely.wkt import loads

p1 = loads('POLYGON((0 0, 2 2, 2 0, 0 2 ,0 0))')
p2 = p1.buffer(0)

p2.wkt returns 'POLYGON ((1 1, 2 2, 2 0, 1 1))'

I am losing half the original polygon.

How can I get a full valid polygon back?

Many thanks,

Robert Sanson




This email and any attachments are confidential and intended solely for the addressee(s). If you are not the intended recipient, please notify us immediately and then delete this email from your system.

This message has been scanned for Malware and Viruses by Websense Hosted Security.
www.websense.com
Harasty, Daniel J
2014-04-29 14:11:38 UTC
Permalink
I a hunch that this is the crux of your problem: Since p1 -- as a "bowtie" -- is non-simple, part of it is "inside out" (with respect to the conventional winding rules). The buffer() operation is -- perhaps by design -- not returning the buffered version of the "inside out" portion.

This seems to work (and seems to corroborate the above hunch):

p1r = Polygon(list(reversed(p1.exterior.coords))
p2r = p1r.buffer(0)

Since p1r is "inside out" in the "other sense", the buffer() operation seems to return the "other part" of the bowtie.

Thus the union of p2 and p2r seem to be what you are looking for.

If that doesn't work for you and your complex polygon, I can think of a way that is more general, but probably at the expense of using more computations. Here is an outline:

* Split up the complex/non-simple polygon in to its constituent LinearRings or LineStrings or line segments.

* Use "cascaded_union" to node all the lines (where they intersect)

* Use "polygonize" to stitch back together constituent polygons

That works for me, too, but the code snippet is a bit long; let me know if you'd like me to post it, and/or send it to you directly.

Cheers,

Dan Harasty


-----Original Message-----
From: community-bounces at lists.gispython.org [mailto:community-bounces at lists.gispython.org] On Behalf Of Robert Sanson
Sent: Monday, April 28, 2014 9:57 PM
To: gispython.org community projects
Subject: [Community] bowtie problem

Hi All

I am trying to use shapely to fix an invalid polygon with a massive bowtie:

import shapely
from shapely.wkt import loads

p1 = loads('POLYGON((0 0, 2 2, 2 0, 0 2 ,0 0))')
p2 = p1.buffer(0)

p2.wkt returns 'POLYGON ((1 1, 2 2, 2 0, 1 1))'

I am losing half the original polygon.

How can I get a full valid polygon back?

Many thanks,

Robert Sanson




This email and any attachments are confidential and intended solely for the addressee(s). If you are not the intended recipient, please notify us immediately and then delete this email from your system.

This message has been scanned for Malware and Viruses by Websense Hosted Security.
www.websense.com<http://www.websense.com>
_______________________________________________
Community mailing list
Community at lists.gispython.org<mailto:Community at lists.gispython.org>
http://lists.gispython.org/mailman/listinfo/community
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gispython.org/pipermail/community/attachments/20140429/c8016349/attachment.htm>
Robert Sanson
2014-04-29 21:25:23 UTC
Permalink
Thanks for your suggestion Dan. Your first solution works if there is a single bowtie and yields a Multipolygon:

from shapely.wkt import loads
from shapely.geometry import Polygon

poly = loads('POLYGON ((0 0, 2 2, 2 0, 0 2, 0 0))')
poly1 = poly.buffer(0)
p2list = list(poly.exterior.coords)
p2list.reverse()
poly2 = Polygon(p2list).buffer(0)
poly3 = poly1.union(poly2)
geom = poly3.wkt

print geom

produces:

MULTIPOLYGON (((1.0 1.0, 2.0 2., 2.0 0.0, 1.0 1.0)), ((0.0 0.0, 0.0 2.0, 1.0 1.0, 0.0 0.0)))

However, it does not generalise if there are more bowties. See the two attached images. bowtie1.png produces bowtie2.png

I would be interested in your script for your second solution.

Many thanks,

Robert
"Harasty, Daniel J" <dharasty at appcomsci.com> 30/04/2014 2:11 a.m. >>>
I a hunch that this is the crux of your problem: Since p1 -- as a "bowtie" -- is non-simple, part of it is "inside out" (with respect to the conventional winding rules). The buffer() operation is -- perhaps by design -- not returning the buffered version of the "inside out" portion.

This seems to work (and seems to corroborate the above hunch):

p1r = Polygon(list(reversed(p1.exterior.coords))
p2r = p1r.buffer(0)

Since p1r is "inside out" in the "other sense", the buffer() operation seems to return the "other part" of the bowtie.

Thus the union of p2 and p2r seem to be what you are looking for.

If that doesn't work for you and your complex polygon, I can think of a way that is more general, but probably at the expense of using more computations. Here is an outline:

* Split up the complex/non-simple polygon in to its constituent LinearRings or LineStrings or line segments.

* Use "cascaded_union" to node all the lines (where they intersect)

* Use "polygonize" to stitch back together constituent polygons

That works for me, too, but the code snippet is a bit long; let me know if you'd like me to post it, and/or send it to you directly.

Cheers,

Dan Harasty


-----Original Message-----
From: community-bounces at lists.gispython.org [mailto:community-bounces at lists.gispython.org] On Behalf Of Robert Sanson
Sent: Monday, April 28, 2014 9:57 PM
To: gispython.org community projects
Subject: [Community] bowtie problem

Hi All

I am trying to use shapely to fix an invalid polygon with a massive bowtie:

import shapely
from shapely.wkt import loads

p1 = loads('POLYGON((0 0, 2 2, 2 0, 0 2 ,0 0))')
p2 = p1.buffer(0)

p2.wkt returns 'POLYGON ((1 1, 2 2, 2 0, 1 1))'

I am losing half the original polygon.

How can I get a full valid polygon back?

Many thanks,

Robert Sanson




This email and any attachments are confidential and intended solely for the addressee(s). If you are not the intended recipient, please notify us immediately and then delete this email from your system.

This message has been scanned for Malware and Viruses by Websense Hosted Security.
www.websense.com<http://www.websense.com>
_______________________________________________
Community mailing list
Community at lists.gispython.org<mailto:Community at lists.gispython.org>
http://lists.gispython.org/mailman/listinfo/community

-------------- next part --------------
A non-text attachment was scrubbed...
Name: bowtie1.png
Type: image/png
Size: 4250 bytes
Desc: not available
URL: <Loading Image...>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bowtie2.png
Type: image/png
Size: 2799 bytes
Desc: not available
URL: <Loading Image...>
Harasty, Daniel J
2014-04-29 21:53:21 UTC
Permalink
I'm using Shapely 1.2.14; this works for me. Let me know if it works in your general case.

Dan Harasty


'''
Created on Jul 9, 2012
@author: dharasty
'''

from shapely.wkt import loads

from shapely.geometry import LineString, Polygon
from shapely.ops import cascaded_union, polygonize

p1 = loads('POLYGON((0 1, 1 2, 3 0, 5 2 , 6 1, 0 1))')
print "wkt:", p1.wkt
print

ext1 = LineString(p1.exterior.coords)

def segments(linestring):
'''This function takes a LineString, and yields individual LineStrings
for each of its segments.'''
if isinstance(linestring, LineString):
coords = list(linestring.coords)
for n in range(len(coords)-1):
yield LineString([coords[n], coords[n+1]])

segs = list(segments(ext1))

# cascaded_union() happens to correctly "node" the LinearStrings
# (that is, break them where they intersect)
# which is the required input to polygonize(), below
noded_segments = cascaded_union(segs)

for patch in polygonize(noded_segments):
print "wkt: ", patch
print


-----Original Message-----
From: community-bounces at lists.gispython.org [mailto:community-bounces at lists.gispython.org] On Behalf Of Robert Sanson
Sent: Tuesday, April 29, 2014 5:25 PM
To: gispython.org community projects
Subject: Re: [Community] bowtie problem

Thanks for your suggestion Dan. Your first solution works if there is a single bowtie and yields a Multipolygon:

from shapely.wkt import loads
from shapely.geometry import Polygon

poly = loads('POLYGON ((0 0, 2 2, 2 0, 0 2, 0 0))')
poly1 = poly.buffer(0)
p2list = list(poly.exterior.coords)
p2list.reverse()
poly2 = Polygon(p2list).buffer(0)
poly3 = poly1.union(poly2)
geom = poly3.wkt

print geom

produces:

MULTIPOLYGON (((1.0 1.0, 2.0 2., 2.0 0.0, 1.0 1.0)), ((0.0 0.0, 0.0 2.0, 1.0 1.0, 0.0 0.0)))

However, it does not generalise if there are more bowties. See the two attached images. bowtie1.png produces bowtie2.png

I would be interested in your script for your second solution.

Many thanks,

Robert
"Harasty, Daniel J" <dharasty at appcomsci.com<mailto:dharasty at appcomsci.com>> 30/04/2014 2:11 a.m. >>>
I a hunch that this is the crux of your problem: Since p1 -- as a "bowtie" -- is non-simple, part of it is "inside out" (with respect to the conventional winding rules). The buffer() operation is -- perhaps by design -- not returning the buffered version of the "inside out" portion.

This seems to work (and seems to corroborate the above hunch):

p1r = Polygon(list(reversed(p1.exterior.coords))
p2r = p1r.buffer(0)

Since p1r is "inside out" in the "other sense", the buffer() operation seems to return the "other part" of the bowtie.

Thus the union of p2 and p2r seem to be what you are looking for.

If that doesn't work for you and your complex polygon, I can think of a way that is more general, but probably at the expense of using more computations. Here is an outline:

* Split up the complex/non-simple polygon in to its constituent LinearRings or LineStrings or line segments.

* Use "cascaded_union" to node all the lines (where they intersect)

* Use "polygonize" to stitch back together constituent polygons

That works for me, too, but the code snippet is a bit long; let me know if you'd like me to post it, and/or send it to you directly.

Cheers,

Dan Harasty


-----Original Message-----
From: community-bounces at lists.gispython.org<mailto:community-bounces at lists.gispython.org> [mailto:community-bounces at lists.gispython.org] On Behalf Of Robert Sanson
Sent: Monday, April 28, 2014 9:57 PM
To: gispython.org community projects
Subject: [Community] bowtie problem

Hi All

I am trying to use shapely to fix an invalid polygon with a massive bowtie:

import shapely
from shapely.wkt import loads

p1 = loads('POLYGON((0 0, 2 2, 2 0, 0 2 ,0 0))')
p2 = p1.buffer(0)

p2.wkt returns 'POLYGON ((1 1, 2 2, 2 0, 1 1))'

I am losing half the original polygon.

How can I get a full valid polygon back?

Many thanks,

Robert Sanson




This email and any attachments are confidential and intended solely for the addressee(s). If you are not the intended recipient, please notify us immediately and then delete this email from your system.

This message has been scanned for Malware and Viruses by Websense Hosted Security.
www.websense.com<http://www.websense.com<http://www.websense.com%3chttp:/www.websense.com>>
_______________________________________________
Community mailing list
Community at lists.gispython.org<mailto:Community at lists.gispython.org<mailto:Community at lists.gispython.org%3cmailto:Community at lists.gispython.org>>
http://lists.gispython.org/mailman/listinfo/community

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gispython.org/pipermail/community/attachments/20140429/8bf8f323/attachment-0001.htm>
Robert Sanson
2014-04-29 23:26:17 UTC
Permalink
Thanks so much Dan.

I have modified your code slightly to keep the largest polygon:

if not (p1.is_valid):
ext1 = LineString(p1.exterior.coords)
segs = list(segments(ext1))
noded_segments = cascaded_union(segs)
lgstpoly = 0.0
for patch in polygonize(noded_segments):
if patch.area > lgstpoly:
lgstpoly = patch.area
geom = patch.wkt
size = patch.area
minx,miny,maxx,maxy = patch.bounds

regards,

Robert
"Harasty, Daniel J" <dharasty at appcomsci.com> 30/04/2014 9:53 a.m. >>>
I'm using Shapely 1.2.14; this works for me. Let me know if it works in your general case.

Dan Harasty


'''
Created on Jul 9, 2012
@author: dharasty
'''

from shapely.wkt import loads

from shapely.geometry import LineString, Polygon
from shapely.ops import cascaded_union, polygonize

p1 = loads('POLYGON((0 1, 1 2, 3 0, 5 2 , 6 1, 0 1))')
print "wkt:", p1.wkt
print

ext1 = LineString(p1.exterior.coords)

def segments(linestring):
'''This function takes a LineString, and yields individual LineStrings
for each of its segments.'''
if isinstance(linestring, LineString):
coords = list(linestring.coords)
for n in range(len(coords)-1):
yield LineString([coords[n], coords[n+1]])

segs = list(segments(ext1))

# cascaded_union() happens to correctly "node" the LinearStrings
# (that is, break them where they intersect)
# which is the required input to polygonize(), below
noded_segments = cascaded_union(segs)

for patch in polygonize(noded_segments):
print "wkt: ", patch
print


-----Original Message-----
From: community-bounces at lists.gispython.org [mailto:community-bounces at lists.gispython.org] On Behalf Of Robert Sanson
Sent: Tuesday, April 29, 2014 5:25 PM
To: gispython.org community projects
Subject: Re: [Community] bowtie problem

Thanks for your suggestion Dan. Your first solution works if there is a single bowtie and yields a Multipolygon:

from shapely.wkt import loads
from shapely.geometry import Polygon

poly = loads('POLYGON ((0 0, 2 2, 2 0, 0 2, 0 0))')
poly1 = poly.buffer(0)
p2list = list(poly.exterior.coords)
p2list.reverse()
poly2 = Polygon(p2list).buffer(0)
poly3 = poly1.union(poly2)
geom = poly3.wkt

print geom

produces:

MULTIPOLYGON (((1.0 1.0, 2.0 2., 2.0 0.0, 1.0 1.0)), ((0.0 0.0, 0.0 2.0, 1.0 1.0, 0.0 0.0)))

However, it does not generalise if there are more bowties. See the two attached images. bowtie1.png produces bowtie2.png

I would be interested in your script for your second solution.

Many thanks,

Robert
"Harasty, Daniel J" <dharasty at appcomsci.com<mailto:dharasty at appcomsci.com>> 30/04/2014 2:11 a.m. >>>
I a hunch that this is the crux of your problem: Since p1 -- as a "bowtie" -- is non-simple, part of it is "inside out" (with respect to the conventional winding rules). The buffer() operation is -- perhaps by design -- not returning the buffered version of the "inside out" portion.

This seems to work (and seems to corroborate the above hunch):

p1r = Polygon(list(reversed(p1.exterior.coords))
p2r = p1r.buffer(0)

Since p1r is "inside out" in the "other sense", the buffer() operation seems to return the "other part" of the bowtie.

Thus the union of p2 and p2r seem to be what you are looking for.

If that doesn't work for you and your complex polygon, I can think of a way that is more general, but probably at the expense of using more computations. Here is an outline:

* Split up the complex/non-simple polygon in to its constituent LinearRings or LineStrings or line segments.

* Use "cascaded_union" to node all the lines (where they intersect)

* Use "polygonize" to stitch back together constituent polygons

That works for me, too, but the code snippet is a bit long; let me know if you'd like me to post it, and/or send it to you directly.

Cheers,

Dan Harasty


-----Original Message-----
From: community-bounces at lists.gispython.org<mailto:community-bounces at lists.gispython.org> [mailto:community-bounces at lists.gispython.org] On Behalf Of Robert Sanson
Sent: Monday, April 28, 2014 9:57 PM
To: gispython.org community projects
Subject: [Community] bowtie problem

Hi All

I am trying to use shapely to fix an invalid polygon with a massive bowtie:

import shapely
from shapely.wkt import loads

p1 = loads('POLYGON((0 0, 2 2, 2 0, 0 2 ,0 0))')
p2 = p1.buffer(0)

p2.wkt returns 'POLYGON ((1 1, 2 2, 2 0, 1 1))'

I am losing half the original polygon.

How can I get a full valid polygon back?

Many thanks,

Robert Sanson




This email and any attachments are confidential and intended solely for the addressee(s). If you are not the intended recipient, please notify us immediately and then delete this email from your system.

This message has been scanned for Malware and Viruses by Websense Hosted Security.
www.websense.com<http://www.websense.com<http://www.websense.com%3chttp:/www.websense.com>>
Loading...