Geocoding collision data for RoadSafe GIS

Accurately geocoding collision data is necessary when building a collision database that will be used for spatial analyses. Geocoding is the process of assigning a longitude/latitude (X/Y) coordinate to a descriptive location. For collision data, the descriptive location is typically a primary road and a secondary intersecting road. To geocode a collision, the primary and secondary roads must match a location on a digital street network. When the collision location is described without typos, it is usually easy to match. However, typos, abbreviations, and other anomalies frequently make it difficult to match the collision to the street network. In addition, the street network may fail to recognize a valid name, or may not yet include a new road, preventing the collision from being geocoded.

For these reasons it can be difficult to geocode collision data accurately, but its importance cannot be overstated. Even if you are not planning to view the collisions on a map, the geocoding process can assign collisions to the nearest road segment or intersection. This is important for basic tabular analyses, such as ranking high-collision intersections. Because the street names for the same intersection often differ slightly from one police report to the next, geocoding is necessary to standardize the intersection names.

As part of the RoadSafe GIS service, we provide automatic geocoding and data updates of new collision data. For every new client, an initial setup process lays the foundation for accurate collision data geocoding. The goal is to maximize the number of collisions that can be geocoded and minimize location errors. Without getting into the entire geocoding process, I want to highlight the value-added approach we use at RoadSafe GIS to handle specific cases. If you are interested in a longer general overview of a collision geocoding process, you can refer to this article regarding the geocoding process I helped develop at UC Berkeley years ago. The methodology in that article is over 7 years old now, but it provided some of our original inspiration. Our geocoding process at RoadSafe GIS is very different today, however, and much better equipped to deliver accurate results at the local level.

We do this by:

  1. Conducting thorough manual reviews of the data to build an extensive exception list of intersections that need special attention.
  2. Editing the street network as necessary for new street geometries or names.
  3. Developing a Python script to clean the intersection names, handle the exceptions, or directly assign an XY coordinate.

This allows us to geocode a very high percentage of collisions and ensures the accuracy of the matched locations. Most importantly, the results are repeatable with new years of data, since our Python script reads directly from the exception list and handles each record appropriately. If manual reviews in the future identify any new issues, we simply add the new intersection to the exception list and it is automatically taken care of.
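To make the exception-list idea concrete, here is a minimal sketch of how such a script could be organized. It is not our production code: the suffix table, the exception entries, the placeholder coordinate, and the function names `clean_name` and `prepare` are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical suffix table; a real one would cover many more variants.
SUFFIXES = {"STREET": "ST", "BOULEVARD": "BLVD", "BL": "BLVD",
            "AVENUE": "AVE", "AV": "AVE"}

# Hypothetical exception list keyed by the cleaned (primary, secondary) pair.
# An entry either renames the pair or assigns a coordinate directly.
EXCEPTIONS = {
    # Misspelling found during manual review: point it at the valid name.
    ("RESEDA BLVD", "COLINS ST"): {"rename": ("RESEDA BLVD", "COLLINS ST")},
    # Location missing from the street network: assign an XY directly.
    # The coordinate is a placeholder, not a surveyed location.
    ("ROSCOE BLVD", "HOME DEPOT DRIVEWAY"): {"xy": (-118.0, 34.0)},
}


def clean_name(name):
    """Uppercase, strip punctuation, and standardize street suffixes."""
    tokens = re.sub(r"[^A-Z0-9 ]", " ", name.upper()).split()
    return " ".join(SUFFIXES.get(token, token) for token in tokens)


def prepare(primary, secondary):
    """Decide how a record should be handled before it reaches the geocoder."""
    key = (clean_name(primary), clean_name(secondary))
    rule = EXCEPTIONS.get(key, {})
    if "xy" in rule:
        return {"action": "assign_xy", "xy": rule["xy"]}
    primary, secondary = rule.get("rename", key)
    return {"action": "geocode", "primary": primary, "secondary": secondary}


print(prepare("Reseda Bl.", "Colins Street"))
print(prepare("Roscoe Blvd", "Home Depot Driveway"))
```

Because the rules live in data rather than in code, handling a newly discovered problem intersection only means adding one more entry to the exception list.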

That should give you an idea of how we geocode the collisions, but over the course of our work we have come across several special scenarios. I will outline those in future blog posts and discuss how we handled them in our process. For now, I have posted a brief description of each below. Stay tuned for our next post.

Offset intersections

Collins St intersects with Reseda Blvd in two separate locations.

Location does not exist in the street network data

The private driveway into The Home Depot off Roscoe Blvd.

Invalid offset direction

State Route 1 is typically oriented north/south, but in some locations it runs east/west or otherwise departs from that orientation. This can create a mismatch with the offset direction in the collision record.
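One way to flag this kind of mismatch is to compare the recorded offset direction with the local bearing of the matched road segment. The sketch below is only an illustration under stated assumptions, not a description of how we handle the case: it assumes planar (projected) segment endpoints, the function names `segment_bearing` and `offset_is_plausible` are hypothetical, and the 45 degree tolerance is an arbitrary choice.

```python
import math

# Compass bearings in degrees clockwise from north.
DIRECTION_BEARINGS = {"N": 0.0, "E": 90.0, "S": 180.0, "W": 270.0}


def segment_bearing(x1, y1, x2, y2):
    """Bearing of a segment, in degrees clockwise from north.

    Assumes planar (projected) coordinates, not raw longitude/latitude.
    """
    return math.degrees(math.atan2(x2 - x1, y2 - y1)) % 360.0


def offset_is_plausible(offset_dir, bearing, tolerance=45.0):
    """Return True if the offset direction roughly follows the segment's axis.

    A road can be travelled in both directions, so the comparison is made
    against the axis of the segment rather than a single heading.
    """
    target = DIRECTION_BEARINGS[offset_dir.upper()]
    diff = abs(target - bearing) % 180.0
    return min(diff, 180.0 - diff) <= tolerance


# A segment running due east: a "north" offset cannot lie along it.
east_west = segment_bearing(0.0, 0.0, 1.0, 0.0)
print(offset_is_plausible("N", east_west))
print(offset_is_plausible("E", east_west))
```

Records that fail a check like this could then be routed to manual review or added to an exception list rather than offset in the wrong direction.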
